
Design and Implementation of a Network Content Filtering System

Published: 2018-11-03 17:23
[Abstract]: Campus networks bring convenience to teachers and students, but they also bring harm: large amounts of unhealthy and useless information flood the network, which poses a serious challenge for the management and maintenance of university campus networks. Network content filtering is an effective countermeasure that automatically filters specific information out of network traffic. This thesis first reviews the state of network filtering in China and abroad, the open problems, and the common filtering methods. The system implements two key functional modules: a module that captures and reassembles network packets, and a module that processes network text data. Together they provide the two key functions of the content filtering system: filtering specific URLs and filtering the body content of web pages, where the body is text only and excludes multimedia such as images and video. The packet capture module is built on network protocol parsing, covering Ethernet frames, IP packets, TCP segments, and HTTP messages. On that basis, packets are captured and analyzed on Windows with the WinPcap packet capture library. The module implements URL filtering and HTML page reassembly, and supplies the text data used by the text processing module. To suit the campus network, the URL filter library can be composed of several user-defined rule libraries, with different rule libraries applied during different time periods. The text processing module studies web page text classification. Because web page text is semi-structured, the thesis first studies and implements the extraction of text data from web pages.
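The time-dependent URL rule libraries described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation, which is built on WinPcap under Windows. The `RuleLibrary` class, the `is_blocked` function, the host-suffix matching rule, and the sample hosts and time windows are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import time
from urllib.parse import urlsplit


@dataclass(frozen=True)
class RuleLibrary:
    """A hypothetical user-defined rule library with an active time window."""
    name: str
    start: time
    end: time
    blocked_hosts: frozenset

    def active_at(self, now: time) -> bool:
        # A window such as 23:00-06:00 wraps past midnight.
        if self.start <= self.end:
            return self.start <= now < self.end
        return now >= self.start or now < self.end


def is_blocked(url: str, now: time, libraries) -> bool:
    """Return True if any library active at `now` blocks the URL's host."""
    host = (urlsplit(url).hostname or "").lower()
    for lib in libraries:
        if not lib.active_at(now):
            continue
        for blocked in lib.blocked_hosts:
            # Assumed matching rule: the host itself or any of its subdomains.
            if host == blocked or host.endswith("." + blocked):
                return True
    return False


# Illustrative libraries only; the hosts and hours are made up.
LIBRARIES = [
    RuleLibrary("class-hours", time(8, 0), time(18, 0), frozenset({"games.example"})),
    RuleLibrary("night", time(23, 0), time(6, 0), frozenset({"video.example"})),
]

if __name__ == "__main__":
    print(is_blocked("http://www.games.example/play", time(10, 0), LIBRARIES))
    print(is_blocked("http://www.games.example/play", time(20, 0), LIBRARIES))
```

Keeping each library's time window next to its rules means switching rule sets by time of day needs no separate scheduler: the lookup simply skips libraries that are inactive at the moment of the request.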
The thesis then focuses on text classification, whose two main technical difficulties are text preprocessing and classifier training. Text preprocessing also involves Chinese word segmentation, feature selection, and weight calculation. The mainstream text classifiers are analyzed and compared theoretically, and a class-center vector classifier is selected to suit the characteristics of the campus network. The classifier is trained on the training set and evaluated with cross-validation, with fairly satisfactory classification results. Finally, the thesis summarizes the system and outlines future work: a more comprehensive content filtering system that covers not only text but also multimedia such as pictures, audio, and video.
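As a rough illustration of the class-center vector (centroid) classifier named above, the following Python sketch averages the document vectors of each class and assigns a new document to the class whose center is most similar by cosine similarity. It assumes the text has already been segmented into tokens and uses plain normalized term frequency as the weight. The abstract does not say which segmentation, feature selection, or weighting scheme the thesis uses, so the function names and sample data here are hypothetical.

```python
import math
from collections import Counter, defaultdict


def tf_vector(tokens):
    """Length-normalized term-frequency vector (assumed weighting scheme)."""
    counts = Counter(tokens)
    norm = math.sqrt(sum(c * c for c in counts.values()))
    return {t: c / norm for t, c in counts.items()} if norm else {}


def train_centroids(samples):
    """Average the document vectors of each class into one center vector."""
    sums = defaultdict(lambda: defaultdict(float))
    sizes = Counter()
    for tokens, label in samples:
        sizes[label] += 1
        for term, weight in tf_vector(tokens).items():
            sums[label][term] += weight
    return {
        label: {t: w / sizes[label] for t, w in terms.items()}
        for label, terms in sums.items()
    }


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def classify(tokens, centroids):
    """Return the label whose class center is closest to the document."""
    vec = tf_vector(tokens)
    return max(centroids, key=lambda label: cosine(vec, centroids[label]))


# Made-up, pre-segmented training documents for illustration only.
SAMPLES = [
    (["course", "exam", "lecture"], "allowed"),
    (["exam", "library", "lecture"], "allowed"),
    (["casino", "bet", "jackpot"], "blocked"),
    (["bet", "casino", "odds"], "blocked"),
]

if __name__ == "__main__":
    centers = train_centroids(SAMPLES)
    print(classify(["lecture", "exam"], centers))
```

Training is a single pass over the corpus and classification costs one similarity computation per class, which makes this kind of classifier cheap to run.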
[Degree-granting institution]: University of Electronic Science and Technology of China
[Degree level]: Master's
[Year degree conferred]: 2014
[Classification number]: TP393.08



