Research on Key Technologies for Intelligent Extraction of Internet Traffic Features
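Keywords: Internet traffic identification; feature extraction; network monitoring; network security; data cleaning. Source: Beijing University of Posts and Telecommunications, 2014 doctoral dissertation. Document type: degree thesis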
[Abstract]: The rapid development of network information technology has brought the Internet and its applications into ordinary households and changed how people live. Through the Internet, people can follow the latest news from around the world, make friends across the globe, entertain themselves with multimedia tools, and buy goods from anywhere through online trade. While improving quality of life, the rapid spread of the Internet has also caused sharp growth in network traffic, greater traffic burstiness, dynamic and diverse network applications, and frequent network security incidents. Internet traffic classification makes it possible to identify the protocols used by the traffic in a network pipe and the types of applications that generate it. This technology is the foundation and prerequisite for making today's networks manageable and controllable, for fine-grained QoS (Quality of Service) guarantees, for security monitoring, and for efficient network planning and optimization. However, the adoption of evasion techniques such as dynamic ports, port masquerading, and traffic encryption has once again made accurate, efficient, real-time identification of network traffic a highly challenging research topic in the field of network traffic detection.

The features used to identify network traffic are a key factor that directly affects the accuracy, timeliness, and intelligence of a classifier. This thesis analyzes the extraction process, usage scenarios, and efficiency of the features commonly used in traffic identification: transport-layer ports, application-layer string features, traffic statistical features, and user traffic behavior features. On this basis, it studies in some depth the handling of noise in raw data, the high complexity of feature extraction, and the identification of encrypted network traffic, and obtains some research results.

The research work and contributions of the thesis are mainly as follows:

1) The thesis introduces principal component analysis to automatically purify the traffic of a target application. If the data used to extract the target application's features contains noise or other dirty data, the credibility of the extracted features suffers. The thesis therefore uses principal component analysis to filter out the traffic statistical features of the dirty data as secondary information. This method effectively makes the extracted features more specific to the target application, which in turn can improve identification accuracy.

2) The thesis studies how to extract network traffic features more efficiently. Traditional extraction of string features has high time and space complexity. In response, the thesis proposes an algorithm for extracting fixed bit-offset features, which avoids building a matrix and backtracking to a solution. Experiments show that the algorithm runs more than an order of magnitude faster than traditional algorithms such as LCS (Longest Common Subsequence). The thesis also proposes a feature extraction algorithm based on PCA (Principal Component Analysis), which treats the target application's traffic as a whole and extracts its overall information features. This is a relatively novel attempt in traffic feature extraction and opens up ideas for later research.

3) The thesis studies the identification of encrypted traffic. Drawing on existing work that identifies encrypted traffic from network traffic features, the thesis uses a neural network to identify encrypted traffic effectively. To speed up neural network modeling, it also analyzes experimentally how commonly used traffic statistics perform as neural network inputs, with the aim of achieving similar identification performance with fewer features.
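The fixed-offset idea in item 2) can be illustrated with a small sketch. The abstract does not give the algorithm itself, so this is only one plausible reading and not the thesis's method: it works on whole bytes rather than bits, and the names `fixed_offset_features` and `matches`, the `max_len` window, and the `min_support` threshold are all assumptions introduced here. What it shows is that scanning each offset once needs no dynamic-programming matrix and no backtracking, unlike LCS.

```python
from collections import Counter


def fixed_offset_features(payloads, max_len=64, min_support=0.9):
    """Hypothetical helper: find byte offsets that carry a stable value.

    Returns {offset: byte_value} for every offset (below max_len) whose
    most common byte occurs in at least min_support of all payloads.
    """
    if not payloads:
        return {}
    total = len(payloads)
    features = {}
    for offset in range(max_len):
        # Count the byte seen at this offset in every payload long enough.
        counts = Counter(p[offset] for p in payloads if len(p) > offset)
        if not counts:
            break  # no payload reaches this offset, so none reach later ones
        value, count = counts.most_common(1)[0]
        if count / total >= min_support:
            features[offset] = value
    return features


def matches(payload, features):
    """Hypothetical helper: True if payload has every fixed-offset byte."""
    return all(
        len(payload) > offset and payload[offset] == value
        for offset, value in features.items()
    )


# Toy sample flows, invented for illustration only.
samples = [b"GET /a", b"GET /b", b"GET /c"]
signature = fixed_offset_features(samples, min_support=1.0)
```

On the toy samples, `signature` keeps offsets 0 to 4 (the shared `GET /` prefix) and drops offset 5, where the payloads differ. The cost is proportional to the number of payloads times `max_len`, whereas a pairwise LCS builds a table proportional to the product of the two payload lengths.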
[Degree-granting institution]: Beijing University of Posts and Telecommunications
[Degree level]: Doctoral
[Year degree awarded]: 2014
[Classification number]: TP393.06