Research on Unsupervised Network Anomalous Traffic Detection Based on Outlier Weights and Subspace Clustering
Published: 2018-09-07 22:08
[Abstract]: With the rapid development of information and network technology, the information resources available online have become far richer, and convenient communication has brought people closer together. At the same time, these developments pose serious threats to computer security, and network security has become increasingly important. Detecting attacks and abnormal behavior in a network promptly and effectively is now a key problem in the field. Traditional network intrusion and anomaly detection algorithms generally need labeled datasets to train their models, but labeled data is expensive to obtain in real network environments, and such models cannot handle new kinds of anomalous traffic that were absent from the training data. Data mining is a widely used data processing technique that extracts latent, factually grounded rules or knowledge from large volumes of data. Clustering, one of its techniques, is an effective unsupervised learning method: it builds a detection model directly on unlabeled data to find both known and unknown anomalies, so unsupervised clustering is often combined with network anomalous traffic detection. Against this background, and based on an analysis of traffic in a real network environment, this thesis adopts an entropy-based feature extraction method that effectively reduces the complexity of raw real-time network data. Building on the density peak clustering algorithm, it proposes a density-based outlier weight measure and uses it to construct a new unsupervised anomalous traffic detection model based on density outlier weights and subspace clustering. The model computes the outlier weight of traffic in each subspace and ranks the results to identify the anomalous traffic, so detection does not have to wait until clustering has finished, which greatly reduces computational complexity.
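The abstract does not give the thesis's formulas, so the following is only a minimal sketch of how the pieces described above could fit together. It assumes each traffic window has already been summarized as a vector of entropy features, and it uses a hypothetical outlier weight, `delta / (rho + 1)`, built from the two standard density peak quantities (local density `rho` and distance `delta` to the nearest denser point). The function names, the `cutoff` parameter, the choice of two-feature subspaces, and the sample data are all illustrative assumptions.

```python
import math
from collections import Counter
from itertools import combinations


def shannon_entropy(values):
    """Shannon entropy (in bits) of the empirical distribution of `values`."""
    counts = Counter(values)
    total = len(values)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def density_outlier_weights(points, cutoff):
    """Hypothetical weight delta / (rho + 1), not the thesis's formula.

    rho   = number of other points closer than `cutoff` (local density)
    delta = distance to the nearest point with higher density; a point with
            no denser neighbor uses its largest distance to any point
    """
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]
    rho = [sum(1 for j in range(n) if j != i and dist[i][j] < cutoff)
           for i in range(n)]
    weights = []
    for i in range(n):
        higher = [dist[i][j] for j in range(n) if rho[j] > rho[i]]
        delta = min(higher) if higher else max(dist[i])
        weights.append(delta / (rho[i] + 1))
    return weights


def subspace_outlier_ranking(rows, cutoff, dims=2):
    """Sum the weights over every `dims`-feature subspace, then rank rows."""
    n_features = len(rows[0])
    total = [0.0] * len(rows)
    for subspace in combinations(range(n_features), dims):
        projected = [tuple(row[k] for k in subspace) for row in rows]
        for i, w in enumerate(density_outlier_weights(projected, cutoff)):
            total[i] += w
    # Highest accumulated weight first: the most anomalous window leads.
    order = sorted(range(len(rows)), key=lambda i: total[i], reverse=True)
    return order, total


# Assumed input: one row per time window, one entropy feature per column
# (for example source address, destination port, packet size). Made-up values.
rows = [
    (1.0, 1.0, 1.0),
    (1.1, 1.0, 0.9),
    (0.9, 1.1, 1.0),
    (1.0, 0.9, 1.1),
    (1.05, 1.05, 0.95),
    (3.0, 0.2, 2.8),  # deliberately unlike the other windows
]
order, total = subspace_outlier_ranking(rows, cutoff=0.5)
print("windows ranked from most to least anomalous:", order)
```

In this sketch no cluster assignment is ever produced: windows are ranked directly from `rho` and `delta`, which is one way to read the abstract's point that detection need not wait for clustering to finish.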
In addition, a second, distance-based outlier weight measure is proposed and combined with the K-means clustering algorithm to construct another unsupervised anomalous traffic detection model. Both methods remove the dependence of traditional network anomaly detection models on labeled datasets, substantially improve accuracy and recall for real-time anomalous traffic, and significantly reduce detection time. Finally, the detection models are evaluated on a real-world intranet dataset from an information security company and on the simulated KDD Cup99 dataset. The results show that the proposed models significantly improve detection accuracy and reduce the false detection rate.
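As a rough illustration of the distance-based variant, the sketch below runs plain Lloyd's K-means and then scores each point by its distance to the nearest centroid. That score is an assumed stand-in, since the abstract does not define the thesis's distance-based outlier weight. The evenly spaced centroid initialization is a deterministic simplification chosen for this example, and the data are made up.

```python
import math


def kmeans(points, k, iterations=20):
    """Plain Lloyd's K-means.

    Centroids start at evenly spaced input rows: a deterministic
    simplification for this sketch, not a recommended initialization.
    """
    n = len(points)
    centroids = [points[i * n // k] for i in range(k)]
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [min(range(k), key=lambda c: math.dist(p, centroids[c]))
                  for p in points]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(
                    tuple(sum(x) / len(members) for x in zip(*members)))
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids


def distance_outlier_weights(points, centroids):
    """Assumed weight: distance from each point to its nearest centroid."""
    return [min(math.dist(p, c) for c in centroids) for p in points]


# Made-up feature vectors: two tight groups of normal traffic and one stray.
points = [
    (0.0, 0.0), (0.2, 0.1), (0.1, 0.3), (-0.1, 0.1),
    (5.0, 5.0), (5.2, 4.9), (4.9, 5.1), (5.1, 5.2),
    (2.5, 9.0),  # stray point
]
centroids = kmeans(points, k=2)
weights = distance_outlier_weights(points, centroids)
ranked = sorted(range(len(points)), key=lambda i: weights[i], reverse=True)
print("points ranked from most to least anomalous:", ranked)
```

Because the stray point pulls its cluster's centroid toward itself, a raw distance score can understate how unusual it is. Any normalization, for example by each cluster's mean distance, would be a further design assumption beyond what the abstract states.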
[Degree-granting institution]: Chongqing University of Posts and Telecommunications
[Degree level]: Master's
[Year degree awarded]: 2016
[Classification number]: TP393.08
Article ID: 2229551
Article link: http://sikaile.net/guanlilunwen/ydhl/2229551.html