DFA壓縮和Snort規(guī)則解析的算法研究
發(fā)布時(shí)間:2018-09-11 11:47
【摘要】:隨著網(wǎng)絡(luò)技術(shù)的發(fā)展,針對(duì)網(wǎng)絡(luò)傳輸?shù)闹匾畔⒌墓糇兊酶与[秘和復(fù)雜。從攻擊數(shù)據(jù)包的頭部到惡意代碼、入侵指令等攻擊可能隱藏在數(shù)據(jù)包的內(nèi)容中。深包檢測(cè)技術(shù)(DPI)不僅對(duì)數(shù)據(jù)包的頭部進(jìn)行檢測(cè),還對(duì)數(shù)據(jù)包的內(nèi)容進(jìn)行分析。DPI通過(guò)使用一組給定的安全規(guī)則與網(wǎng)絡(luò)中捕獲的數(shù)據(jù)包內(nèi)容進(jìn)行比較,從而發(fā)現(xiàn)數(shù)據(jù)包內(nèi)容中攜帶的攻擊特征。 早期DPI定義的安全規(guī)則中,用精確字符串描述入侵行為的特征,這些字符串組成規(guī)則過(guò)濾集。然而,為躲避檢測(cè),各種惡意代碼和入侵指令不斷刻意隱藏自己的特征,,導(dǎo)致精確字符串已經(jīng)不能充分描述這些特征。正則表達(dá)式因表達(dá)能力靈活成為描述入侵行為特征的新一代工具。將正則表達(dá)式編譯為確定性有窮自動(dòng)機(jī)(DFA)可以實(shí)現(xiàn)正則表達(dá)式與數(shù)據(jù)包的匹配。DFA自身具有高效的匹配速率,但占用存儲(chǔ)空間較大。如果用軟件實(shí)現(xiàn)匹配算法,會(huì)面臨匹配耗時(shí)過(guò)長(zhǎng)的問(wèn)題;如果用硬件實(shí)現(xiàn)匹配算法,則會(huì)面臨存儲(chǔ)空間較小的問(wèn)題,通常硬件可用空間大小為數(shù)百KB到2MB左右。因此,構(gòu)建時(shí)空高效的正則表達(dá)式匹配過(guò)濾算法是目前DPI研究的熱點(diǎn)之一。 本文重點(diǎn)研究了DFA的存儲(chǔ)壓縮問(wèn)題,實(shí)驗(yàn)中所需的正則表達(dá)式來(lái)源于Snort2.8規(guī)則集中的一些關(guān)鍵選項(xiàng)。本文也簡(jiǎn)要介紹了實(shí)驗(yàn)中對(duì)Snort2.8規(guī)則集中關(guān)鍵選項(xiàng)的解析處理方法。主要工作為以下四個(gè)方面: (1)對(duì)于同一正則語(yǔ)言,可以存在多個(gè)識(shí)別此語(yǔ)言的DFA,但任何正則語(yǔ)言都有一個(gè)唯一的(不計(jì)同構(gòu))狀態(tài)數(shù)目最少的DFA。將任意DFA轉(zhuǎn)化為等價(jià)的狀態(tài)數(shù)最少的DFA的算法稱為DFA最小化算法。本文分析了DFA最小化算法實(shí)現(xiàn)的理論依據(jù),描述并實(shí)現(xiàn)了一個(gè)DFA最小化的經(jīng)典算法。 (2)正則表達(dá)式在編譯為DFA跳轉(zhuǎn)表數(shù)據(jù)后,存在數(shù)據(jù)量大和數(shù)據(jù)冗余的問(wèn)題。通過(guò)對(duì)DFA跳轉(zhuǎn)函數(shù)表進(jìn)行分析,發(fā)現(xiàn)了數(shù)據(jù)量大與其字符集和狀態(tài)數(shù)之間的關(guān)系。分別提出了字符集壓縮算法和基于層次聚類的狀態(tài)壓縮算法。 (3)介紹了Snort2.8規(guī)則集中規(guī)則的構(gòu)成,說(shuō)明了從文本規(guī)則到存入內(nèi)存數(shù)據(jù)結(jié)構(gòu)的Snort2.8規(guī)則集解析處理過(guò)程,給出了一種抽取Snort規(guī)則中關(guān)鍵選項(xiàng)的算法。 (4)通過(guò)Snort2.8規(guī)則集進(jìn)行實(shí)驗(yàn),評(píng)估了兩種壓縮算法結(jié)合使用后的DFA壓縮效率。與原始DFA算法和D2FA相比,存儲(chǔ)空間開(kāi)銷分別減少了98.7%和63.2%。
[Abstract]:With the development of network technology, the attacks against important information transmitted by network become more secret and complex. Attacks ranging from the header of attack packets to malicious code, intrusion instructions and so on may be hidden in the contents of packets. Deep-packet detection technology (DPI) not only detects the header of data packets, but also analyzes the contents of packets. DPI compares the contents of packets captured in the network by using a set of given security rules. Thus, the attack characteristics carried in the packet content are found. In earlier security rules defined by DPI, exact strings were used to describe the characteristics of intrusion behavior, and these strings formed a rule filter set. However, in order to avoid detection, all kinds of malicious code and intrusion instructions have been deliberately hiding their own characteristics, resulting in accurate string can no longer fully describe these characteristics. Regular expression is a new generation of tools for describing intrusion behavior because of its flexible expression ability. Compiling regular expressions into deterministic finite automata (DFA) can realize matching between regular expressions and packets. If the matching algorithm is implemented by software, it will be faced with the problem of too long matching time; if the matching algorithm is implemented with hardware, it will face the problem of small storage space, usually the size of available hardware space is about hundreds of KB to 2MB. Therefore, the construction of spatiotemporal efficient regular expression matching filtering algorithm is one of the hotspots of DPI research. This paper focuses on the storage compression problem of DFA. The required regular expressions in the experiment come from some key options in the Snort2.8 rule set. This paper also briefly introduces the analytical method of key options in Snort2.8 rule set. The main work is as follows: (1) for the same regular language, there can be more than one DFA, that recognizes the language, but any regular language has a unique (excluding isomorphism) state with the least number of DFA. The algorithm of transforming arbitrary DFA into equivalent DFA with the least number of states is called DFA minimization algorithm. In this paper, the theoretical basis of DFA minimization algorithm is analyzed, and a classical DFA minimization algorithm is described and implemented. (2) after compiling to DFA jump table data, the regular expression has the problems of large amount of data and redundant data. By analyzing the DFA jump function table, the relationship between the large amount of data and its character set and state number is found. The character set compression algorithm and the state compression algorithm based on hierarchical clustering are put forward respectively. (3) the composition of Snort2.8 rule set rules is introduced, and the parsing process of Snort2.8 rule set from text rules to memory data structure is explained. An algorithm for extracting key options from Snort rules is presented. (4) the DFA compression efficiency of two compression algorithms is evaluated by Snort2.8 rule set experiments. Compared with the original DFA algorithm and D2FA, the storage space overhead is reduced by 98.7% and 63.2% respectively.
【學(xué)位授予單位】:西安電子科技大學(xué)
【學(xué)位級(jí)別】:碩士
【學(xué)位授予年份】:2014
【分類號(hào)】:TP393.08
本文編號(hào):2236603
[Abstract]:With the development of network technology, the attacks against important information transmitted by network become more secret and complex. Attacks ranging from the header of attack packets to malicious code, intrusion instructions and so on may be hidden in the contents of packets. Deep-packet detection technology (DPI) not only detects the header of data packets, but also analyzes the contents of packets. DPI compares the contents of packets captured in the network by using a set of given security rules. Thus, the attack characteristics carried in the packet content are found. In earlier security rules defined by DPI, exact strings were used to describe the characteristics of intrusion behavior, and these strings formed a rule filter set. However, in order to avoid detection, all kinds of malicious code and intrusion instructions have been deliberately hiding their own characteristics, resulting in accurate string can no longer fully describe these characteristics. Regular expression is a new generation of tools for describing intrusion behavior because of its flexible expression ability. Compiling regular expressions into deterministic finite automata (DFA) can realize matching between regular expressions and packets. If the matching algorithm is implemented by software, it will be faced with the problem of too long matching time; if the matching algorithm is implemented with hardware, it will face the problem of small storage space, usually the size of available hardware space is about hundreds of KB to 2MB. Therefore, the construction of spatiotemporal efficient regular expression matching filtering algorithm is one of the hotspots of DPI research. This paper focuses on the storage compression problem of DFA. The required regular expressions in the experiment come from some key options in the Snort2.8 rule set. This paper also briefly introduces the analytical method of key options in Snort2.8 rule set. The main work is as follows: (1) for the same regular language, there can be more than one DFA, that recognizes the language, but any regular language has a unique (excluding isomorphism) state with the least number of DFA. The algorithm of transforming arbitrary DFA into equivalent DFA with the least number of states is called DFA minimization algorithm. In this paper, the theoretical basis of DFA minimization algorithm is analyzed, and a classical DFA minimization algorithm is described and implemented. (2) after compiling to DFA jump table data, the regular expression has the problems of large amount of data and redundant data. By analyzing the DFA jump function table, the relationship between the large amount of data and its character set and state number is found. The character set compression algorithm and the state compression algorithm based on hierarchical clustering are put forward respectively. (3) the composition of Snort2.8 rule set rules is introduced, and the parsing process of Snort2.8 rule set from text rules to memory data structure is explained. An algorithm for extracting key options from Snort rules is presented. (4) the DFA compression efficiency of two compression algorithms is evaluated by Snort2.8 rule set experiments. Compared with the original DFA algorithm and D2FA, the storage space overhead is reduced by 98.7% and 63.2% respectively.
【學(xué)位授予單位】:西安電子科技大學(xué)
【學(xué)位級(jí)別】:碩士
【學(xué)位授予年份】:2014
【分類號(hào)】:TP393.08
【參考文獻(xiàn)】
相關(guān)期刊論文 前8條
1 嚴(yán)書(shū)亭;劉佳新;王新生;;Snort規(guī)則鏈表結(jié)構(gòu)的分析與改進(jìn)[J];燕山大學(xué)學(xué)報(bào);2006年03期
2 陳曙暉;蘇金樹(shù);范慧萍;侯婕;;一種基于深度報(bào)文檢測(cè)的FSM狀態(tài)表壓縮技術(shù)[J];計(jì)算機(jī)研究與發(fā)展;2008年08期
3 熊德蘭;田勝利;;DFA最小化算法的探討與改進(jìn)[J];計(jì)算機(jī)教育;2008年09期
4 徐乾;鄂躍鵬;葛敬國(guó);錢華林;;深度包檢測(cè)中一種高效的正則表達(dá)式壓縮算法[J];軟件學(xué)報(bào);2009年08期
5 于強(qiáng);霍紅衛(wèi);;一組提高存儲(chǔ)效率的深度包檢測(cè)算法[J];軟件學(xué)報(bào);2011年01期
6 唐謙,張大方;基于Snort的入侵檢測(cè)引擎比較分析[J];計(jì)算機(jī)工程與設(shè)計(jì);2005年11期
7 柳廳文;孫永;卜東波;郭莉;方濱興;;正則表達(dá)式分組的1/(1-1/k)-近似算法[J];軟件學(xué)報(bào);2012年09期
8 翁佩純;劉波;;Snort規(guī)則檢測(cè)及其優(yōu)化分析[J];網(wǎng)絡(luò)安全技術(shù)與應(yīng)用;2008年01期
本文編號(hào):2236603
本文鏈接:http://sikaile.net/guanlilunwen/ydhl/2236603.html
最近更新
教材專著