An Improved Regular Expression Algorithm Based on a DPI System
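Keywords: DPI; matching algorithm; pattern matching; regular expression; automaton; guess-group-verify algorithm. Source: Jiangxi University of Science and Technology, master's thesis, 2014. Document type: degree thesis.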
[Abstract]: With the development of technology and the spread of networks, the Internet has come to play an important role in both work and daily life: shopping on Taobao, handling company documents, and storing personal data all depend on it. The Internet is a double-edged sword, however, and its security can no longer be ignored; preventing the leakage of and tampering with information and confidential files is an urgent research topic. Deep packet inspection (DPI) has become an effective and widely adopted way to address Internet security problems. A study of existing DPI technology nevertheless reveals two shortcomings in its matching algorithms: (1) when the DPI matching engine uses a pattern matching algorithm, matching becomes slow and inflexible once traffic is complex and variable, so it cannot keep up with today's network traffic; (2) when the engine uses a regular expression algorithm, converting the expressions into automata consumes too much memory and occupies a large share of system resources. To address these problems, this thesis proposes an improved regular expression algorithm based on a DPI system. The work is organized as follows.
First, the working principle of DPI is studied in depth, and a DPI system model is built to identify and block a variety of application protocols on the network. The results confirm that, in practice, a DPI system can greatly improve protection against network information leakage and can effectively identify and monitor many kinds of network applications. DPI also has a wide range of uses in network security, including antivirus, intrusion prevention, URL filtering, content filtering, file filtering, application behavior control, and mail filtering.
Second, the identification algorithms used by the network flow matching engine, the core of DPI, are analyzed. Pattern matching algorithms and regular expression algorithms are studied and compared, and the shortcomings of earlier algorithms are summarized. An improved regular expression algorithm based on a DPI system is then proposed: the guess-group-verify algorithm. The algorithm first searches for the feature sub-blocks that occur with high probability, groups them, and converts each group into a DFA. It then performs guess matching on the incoming network traffic, and traffic that completes the DFA match is fully verified with an NFA.
Finally, experiments verify the correctness and effectiveness of the proposed guess-group-verify algorithm. A comparison with the Hybrid-FA algorithm and the guess-verify algorithm shows that the proposed algorithm effectively reduces the amount of DFA conversion, lowers memory usage and resource occupancy, and is advantageous for network flow protocol identification.
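The guess, group, and verify stages described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis implementation: the signature set, the fixed-size grouping in `make_groups`, and all function names are hypothetical. The guess stage uses a small Aho-Corasick automaton over literal sub-blocks as a stand-in for the per-group DFA, and Python's `re` module stands in for the NFA verifier.

```python
import re
from collections import deque

# Hypothetical signatures: name -> (frequent literal sub-block, full regular expression).
SIGNATURES = {
    "http_get": (b"GET ", rb"GET \S+ HTTP/1\.[01]\r\n"),
    "http_post": (b"POST ", rb"POST \S+ HTTP/1\.[01]\r\n"),
    "ssh": (b"SSH-", rb"SSH-\d\.\d+-\S+"),
    "bittorrent": (b"BitTorrent protocol", rb"\x13BitTorrent protocol"),
}


def make_groups(names, group_size):
    """Naive grouping into fixed-size chunks (a placeholder for a real grouping heuristic)."""
    names = list(names)
    return [names[i:i + group_size] for i in range(0, len(names), group_size)]


def build_automaton(group):
    """Build an Aho-Corasick automaton over the literal sub-blocks of one group."""
    goto, fail, out = [{}], [0], [set()]
    for name in group:
        state = 0
        for byte in SIGNATURES[name][0]:
            if byte not in goto[state]:
                goto.append({})
                fail.append(0)
                out.append(set())
                goto[state][byte] = len(goto) - 1
            state = goto[state][byte]
        out[state].add(name)
    # Breadth-first pass to fill in failure links and inherited outputs.
    queue = deque(goto[0].values())
    while queue:
        state = queue.popleft()
        for byte, nxt in goto[state].items():
            queue.append(nxt)
            f = fail[state]
            while f and byte not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(byte, 0)
            out[nxt] |= out[fail[nxt]]
    return goto, fail, out


def guess(automaton, payload):
    """Guess stage: scan the payload once and collect candidate signature names."""
    goto, fail, out = automaton
    state, candidates = 0, set()
    for byte in payload:
        while state and byte not in goto[state]:
            state = fail[state]
        state = goto[state].get(byte, 0)
        candidates |= out[state]
    return candidates


def classify(payload, automata):
    """Verify stage: confirm each guessed candidate with its full regular expression."""
    matched = []
    for automaton in automata:
        for name in sorted(guess(automaton, payload)):
            if re.search(SIGNATURES[name][1], payload):
                matched.append(name)
    return matched


# One small automaton per group instead of one large automaton for all signatures.
AUTOMATA = [build_automaton(g) for g in make_groups(SIGNATURES, group_size=2)]
```

Calling `classify(payload, AUTOMATA)` on a payload returns the names of the signatures that pass both stages. Only the short literal sub-blocks are compiled into automata; the full expressions are evaluated only for payloads that survive the guess, which is the memory-saving idea the abstract attributes to the algorithm.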
[Degree-granting institution]: Jiangxi University of Science and Technology
[Degree level]: Master's
[Year degree conferred]: 2014
[CLC number]: TP393.08