Network Intrusion Detection Based on a Distributed Stream Database System
Published: 2018-05-04 04:46
Topics: feature selection + SVM. Reference: University of Electronic Science and Technology of China, master's thesis, 2015
[Abstract]: With the rapid growth of the Internet, security has become an increasingly important topic. Traditional network security targets individual and enterprise users and relies mainly on system intrusion detection, antivirus software, and firewalls. These measures, however, usually do not reduce abnormal traffic in large-scale communication networks (i.e., backbone networks). To fundamentally reduce abnormal traffic in the network, and to reduce or eliminate the attacks that users suffer, large-scale communication networks and routing and switching equipment must be able to detect and identify abnormal traffic. Abnormal traffic is usually assessed in two ways: a) determining whether abnormal traffic is present, which is called traffic monitoring; and b) determining the type of the anomaly, which is called traffic identification. Current traffic monitoring falls into three types according to detection granularity: package-based, flow-based, and traffic-based. This thesis proposes a finer-grained algorithm that aggregates IP flows over a dynamic session window to select features, and combines it with the SVM algorithm to detect DoS attacks. To support the computation of the selected features, the thesis also extends Spark Stream so that it supports SQL queries on streams.

In the course of the research, existing feature selection algorithms, the working principles of Spark Stream and Hive, and the choice of SVM kernel function were surveyed thoroughly, and current traffic detection approaches were studied in depth. First, traditional entropy-based feature selection algorithms aggregate different IP sources together to compute entropy. This approach has a drawback: when an anomaly occurs, further analysis is still needed to identify the attack source and the attacked target. To address this, the thesis aggregates flow records by session key (srcIP, desIP, srcPort, desPort) and uses the information entropy of the network flow data as the training feature. Second, existing research shows that detection performance is positively correlated with the ratio of abnormal traffic to total traffic; that is, when the share of abnormal traffic is very low, detection performance is generally poor as well. The session window approach proposed in this thesis addresses this problem well. Finally, for the large volumes of data generated in an instant, support from a mainstream underlying computing model is currently lacking, and anomaly detection algorithms are not efficient enough. The thesis therefore extends Spark Stream to support SQL operations on streams, including continuous queries and window operations.

The proposed feature selection algorithm is tested and its performance is compared with the traditional ID3 and C4.5 algorithms. The most direct and effective criterion for judging a feature selection result is the similarity between the feature subset chosen by the algorithm and the optimal feature subset. In practice, however, there is no criterion for determining the optimal feature subset. To verify the effectiveness of the feature selection algorithm, the thesis therefore uses an indirect method: the quality of the selection is measured by the AUC that the selected feature subset achieves with a One-class SVM classifier. In addition, the thesis simulates different proportions of abnormal traffic within the total traffic of a window to show that the session-window-based feature selection algorithm remains stable across different abnormal traffic ratios. The experimental results also show that the SQL extension built on Spark Stream works well and meets the computational requirements.
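As a rough illustration of the session-key aggregation described in the abstract, the following Python sketch groups flow records by (srcIP, desIP, srcPort, desPort) and computes one Shannon entropy value per session. The abstract does not state which field's distribution the entropy is taken over, nor how the dynamic session window is delimited, so the `bytes` field, the record layout, the helper names, and the toy data are all assumptions made for illustration, and all records are treated as belonging to a single window.

```python
import math
from collections import Counter, defaultdict


def shannon_entropy(values):
    """Shannon entropy, in bits, of the empirical distribution of `values`."""
    counts = Counter(values)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    # H = sum over distinct values of p * log2(1 / p), with p = count / total.
    return sum((c / total) * math.log2(total / c) for c in counts.values())


def session_entropy_features(flows, field="bytes"):
    """Group flow records by session key and return one entropy per session.

    `flows` is an iterable of dicts. The key names and the choice of `field`
    are hypothetical; the thesis abstract does not specify them.
    """
    sessions = defaultdict(list)
    for flow in flows:
        key = (flow["srcIP"], flow["desIP"], flow["srcPort"], flow["desPort"])
        sessions[key].append(flow[field])
    return {key: shannon_entropy(vals) for key, vals in sessions.items()}


# Toy records invented for illustration; not data from the thesis.
key_a = ("10.0.0.1", "10.0.0.9", 1234, 80)
key_b = ("10.0.0.2", "10.0.0.9", 5678, 80)
flows = [
    {"srcIP": key_a[0], "desIP": key_a[1], "srcPort": key_a[2],
     "desPort": key_a[3], "bytes": 40}
    for _ in range(4)
] + [
    {"srcIP": key_b[0], "desIP": key_b[1], "srcPort": key_b[2],
     "desPort": key_b[3], "bytes": size}
    for size in (40, 1500)
]

features = session_entropy_features(flows)
```

Because each feature is keyed by its session, a session that stands out (here, the one whose records all carry the same size and so have zero entropy) already names its source and destination, which is the property the abstract contrasts with aggregating over all IP sources.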
[Degree-granting institution]: University of Electronic Science and Technology of China
[Degree level]: Master's
[Year degree conferred]: 2015
[Classification number]: TP311.13; TP393.08
Article ID: 1841674
Article link: http://sikaile.net/guanlilunwen/ydhl/1841674.html