面向網(wǎng)絡(luò)安全的多維正則表達(dá)式匹配算法研究
發(fā)布時(shí)間:2018-07-25 15:51
【摘要】:近年來,網(wǎng)絡(luò)安全已發(fā)展成為民生和國(guó)本的重大威脅。隨著網(wǎng)絡(luò)帶寬的線速增長(zhǎng)和入侵模式的多樣化,正則表達(dá)式匹配作為應(yīng)用最為廣泛的DPI算法,面臨著高速低存儲(chǔ)的性能挑戰(zhàn)。DFA匹配效率高,然而多條正則表達(dá)式聯(lián)合編譯時(shí),由于“.*”相互作用會(huì)產(chǎn)生狀態(tài)爆炸問題。該問題一直以來都是正則表達(dá)式匹配算法的研究難點(diǎn)和熱點(diǎn)。研究者提出了諸多改進(jìn)算法,但仍存在以下問題:1)算法普遍缺乏數(shù)學(xué)模型的理論創(chuàng)新,對(duì)狀態(tài)和存儲(chǔ)空間的壓縮不徹底;2)算法在壓縮空間的同時(shí),往往不能保證常數(shù)級(jí)別的匹配時(shí)間復(fù)雜度,降低了系統(tǒng)匹配效率,難以應(yīng)用于IDS中。本文依托國(guó)家973計(jì)劃項(xiàng)目“可重構(gòu)信息通信基礎(chǔ)網(wǎng)絡(luò)體系研究”課題,對(duì)DFA狀態(tài)爆炸問題開展理論研究,按照模型創(chuàng)新、算法設(shè)計(jì)和算法應(yīng)用的思路,提出時(shí)空高效的正則表達(dá)式匹配模型和算法。主要研究?jī)?nèi)容如下:1)提出了基于多維立方體的狀態(tài)轉(zhuǎn)移模型M-D-Cube-STD。首先,由于冗余狀態(tài)湮沒在雜亂無章的二維STD中,因此將STD在多維空間展開,根據(jù)信息論和多維空間理論給出冗余本質(zhì)的解釋,并將冗余狀態(tài)劃分為0維冗余和1維冗余;然后,將前者按照維度壓縮,后者動(dòng)態(tài)構(gòu)建并即時(shí)更新。理論分析和實(shí)驗(yàn)結(jié)果表明,針對(duì)含“.*”的一類理想規(guī)則,M-D-Cube-STD對(duì)STD指數(shù)級(jí)別的狀態(tài)空間進(jìn)行了對(duì)數(shù)級(jí)別的壓縮,使之降低到有限自動(dòng)機(jī)的理論下界O(m)。2)提出了基于M-D-Cube-STD的確定性有限自動(dòng)機(jī)M-D-Cube-DFA。首先,算法考慮到低存儲(chǔ)要求,并不靜態(tài)存儲(chǔ)所有狀態(tài)信息,而是采用基本結(jié)構(gòu)加輔助變量的方法來獲取激活狀態(tài)信息;然后,多維空間中不同邊上1維狀態(tài)之間的跳轉(zhuǎn)不再是一跳,而是以0維狀態(tài)為“中轉(zhuǎn)站”,進(jìn)行不同維度上的二次跳轉(zhuǎn),從而實(shí)現(xiàn)與DFA等價(jià)的狀態(tài)轉(zhuǎn)移。理論分析和實(shí)驗(yàn)結(jié)果表明,針對(duì)含“.*”的一類理想規(guī)則,M-D-Cube-DFA在保證線速匹配的同時(shí),將DFA存儲(chǔ)空間降至理論下界O(m),僅付出比DFA多3~4倍的匹配時(shí)間代價(jià)。3)提出了基于多維有限自動(dòng)機(jī)MFA的正則表達(dá)式匹配算法。首先,提出有限自動(dòng)機(jī)的輸入驅(qū)動(dòng)特性理論,進(jìn)一步指出由于存在規(guī)則驅(qū)動(dòng)特性,DFA改進(jìn)算法難以對(duì)所有類型的規(guī)則進(jìn)行徹底的冗余縮減;然后,在多維立方體算法和模型的基礎(chǔ)上,以規(guī)則模板為設(shè)計(jì)導(dǎo)向,針對(duì)含“.*”的一類安全規(guī)則,修正算法和模型,不斷擴(kuò)展規(guī)則覆蓋率,最后可適用于68.9%的含“.*”的Snort規(guī)則。理論分析和實(shí)驗(yàn)結(jié)果表明,對(duì)于含“.*”的一類安全規(guī)則,與其他DFA改進(jìn)算法相比,在保證線性匹配的同時(shí),MFA在存儲(chǔ)空間、匹配時(shí)間和構(gòu)造時(shí)間方面有數(shù)量級(jí)級(jí)別的提高。多維正則表達(dá)式匹配算法具有時(shí)空高效性,在軟件設(shè)計(jì)中可作為匹配引擎的協(xié)處理器,在硬件設(shè)計(jì)中可封裝成IP核,專門處理含“.*”的一類安全規(guī)則的聯(lián)合編譯,用于解決“.*”相互作用產(chǎn)生的DFA狀態(tài)爆炸問題。
[Abstract]:In recent years, network security has become a major threat to the people's livelihood and the state. With the growth of network bandwidth and the diversification of intrusion patterns, regular expression matching, as the most widely used DPI algorithm, faces the performance challenge of high speed and low storage. Due to the interaction of ". *", the problem of state explosion will occur. This problem has always been the research difficulty and hot spot of regular expression matching algorithm. Many improved algorithms have been proposed, but the following problems still exist: 1) there is a general lack of theoretical innovation in the mathematical model. The compression of state and storage space is not complete, and the algorithm compresses the space at the same time. The complexity of matching time at constant level is often not guaranteed, which reduces the system matching efficiency and is difficult to be applied in IDS. In this paper, based on the national 973 project "Research on Reconfigurable Information and Communication basic Network system", the state explosion problem of DFA is studied theoretically, according to the ideas of model innovation, algorithm design and algorithm application. A spatiotemporal efficient regular expression matching model and algorithm are proposed. The main research contents are as follows: (1) A multi-dimensional cube based state transition model (M-D-Cube-STD) is proposed. Firstly, because the redundant state is annihilated in the chaotic two-dimensional STD, the STD is expanded in multidimensional space, and the essence of redundancy is explained according to information theory and multidimensional space theory, and the redundant state is divided into 0-dimensional redundancy and 1-dimensional redundancy. The former is then compressed according to dimension, while the latter is dynamically constructed and updated instantly. Theoretical analysis and experimental results show that M-D-Cube-STD compresses the state space of STD exponent level for a class of ideal rules with ". *" A deterministic finite automaton (M-D-Cube-DFA-based) based on M-D-Cube-STD is proposed, which is reduced to the theoretical lower bound of finite automata (O (m). 2). First, considering the low storage requirements, the algorithm does not statically store all state information, but uses the basic structure plus auxiliary variables to obtain the activation state information. The jump between 1-dimensional states on different sides in multidimensional space is no longer one-hop, but takes the 0-dimensional state as the "transfer station" and performs the secondary jump on different dimensions, so as to realize the state transition equivalent to DFA. The theoretical analysis and experimental results show that the M-D-Cube-DFA can guarantee the matching of line velocities for a class of ideal rules with ". *" The DFA storage space is reduced to the theoretical lower bound. The matching time cost of O (m), is only 3 times higher than that of DFA. (3) A regular expression matching algorithm based on multidimensional finite automata (MFA) is proposed. Firstly, the input-drive characteristic theory of finite automata is proposed, and it is further pointed out that the improved DFA algorithm is difficult to reduce all types of rules completely because of the existence of rule-driven characteristics. On the basis of multi-dimension cube algorithm and model, the rule template is taken as the design direction. For a class of security rules with ". *", the algorithm and model are modified, and the rule coverage is continuously expanded. Finally, it can be applied to 68.9% of the ".*" Snort rules. The theoretical analysis and experimental results show that for a class of security rules with ". *", compared with other improved DFA algorithms, there is an order of magnitude increase in storage space, matching time and construction time while keeping linear matching. Multi-dimensional regular expression matching algorithm has high space-time efficiency. It can be used as coprocessor of matching engine in software design, and can be encapsulated into IP core in hardware design. Used to solve the DFA state explosion problem caused by the. * interaction.
【學(xué)位授予單位】:解放軍信息工程大學(xué)
【學(xué)位級(jí)別】:碩士
【學(xué)位授予年份】:2014
【分類號(hào)】:TP393.08
本文編號(hào):2144304
[Abstract]:In recent years, network security has become a major threat to the people's livelihood and the state. With the growth of network bandwidth and the diversification of intrusion patterns, regular expression matching, as the most widely used DPI algorithm, faces the performance challenge of high speed and low storage. Due to the interaction of ". *", the problem of state explosion will occur. This problem has always been the research difficulty and hot spot of regular expression matching algorithm. Many improved algorithms have been proposed, but the following problems still exist: 1) there is a general lack of theoretical innovation in the mathematical model. The compression of state and storage space is not complete, and the algorithm compresses the space at the same time. The complexity of matching time at constant level is often not guaranteed, which reduces the system matching efficiency and is difficult to be applied in IDS. In this paper, based on the national 973 project "Research on Reconfigurable Information and Communication basic Network system", the state explosion problem of DFA is studied theoretically, according to the ideas of model innovation, algorithm design and algorithm application. A spatiotemporal efficient regular expression matching model and algorithm are proposed. The main research contents are as follows: (1) A multi-dimensional cube based state transition model (M-D-Cube-STD) is proposed. Firstly, because the redundant state is annihilated in the chaotic two-dimensional STD, the STD is expanded in multidimensional space, and the essence of redundancy is explained according to information theory and multidimensional space theory, and the redundant state is divided into 0-dimensional redundancy and 1-dimensional redundancy. The former is then compressed according to dimension, while the latter is dynamically constructed and updated instantly. Theoretical analysis and experimental results show that M-D-Cube-STD compresses the state space of STD exponent level for a class of ideal rules with ". *" A deterministic finite automaton (M-D-Cube-DFA-based) based on M-D-Cube-STD is proposed, which is reduced to the theoretical lower bound of finite automata (O (m). 2). First, considering the low storage requirements, the algorithm does not statically store all state information, but uses the basic structure plus auxiliary variables to obtain the activation state information. The jump between 1-dimensional states on different sides in multidimensional space is no longer one-hop, but takes the 0-dimensional state as the "transfer station" and performs the secondary jump on different dimensions, so as to realize the state transition equivalent to DFA. The theoretical analysis and experimental results show that the M-D-Cube-DFA can guarantee the matching of line velocities for a class of ideal rules with ". *" The DFA storage space is reduced to the theoretical lower bound. The matching time cost of O (m), is only 3 times higher than that of DFA. (3) A regular expression matching algorithm based on multidimensional finite automata (MFA) is proposed. Firstly, the input-drive characteristic theory of finite automata is proposed, and it is further pointed out that the improved DFA algorithm is difficult to reduce all types of rules completely because of the existence of rule-driven characteristics. On the basis of multi-dimension cube algorithm and model, the rule template is taken as the design direction. For a class of security rules with ". *", the algorithm and model are modified, and the rule coverage is continuously expanded. Finally, it can be applied to 68.9% of the ".*" Snort rules. The theoretical analysis and experimental results show that for a class of security rules with ". *", compared with other improved DFA algorithms, there is an order of magnitude increase in storage space, matching time and construction time while keeping linear matching. Multi-dimensional regular expression matching algorithm has high space-time efficiency. It can be used as coprocessor of matching engine in software design, and can be encapsulated into IP core in hardware design. Used to solve the DFA state explosion problem caused by the. * interaction.
【學(xué)位授予單位】:解放軍信息工程大學(xué)
【學(xué)位級(jí)別】:碩士
【學(xué)位授予年份】:2014
【分類號(hào)】:TP393.08
【參考文獻(xiàn)】
相關(guān)期刊論文 前6條
1 賀煒;郭云飛;扈紅超;;基于狀態(tài)約束的大規(guī)模正則表達(dá)式匹配算法[J];通信學(xué)報(bào);2013年10期
2 張大方;張潔坤;黃昆;;一種基于智能有限自動(dòng)機(jī)的正則表達(dá)式匹配算法[J];電子學(xué)報(bào);2012年08期
3 張樹壯;羅浩;方濱興;云曉春;;一種面向網(wǎng)絡(luò)安全檢測(cè)的高性能正則表達(dá)式匹配算法[J];計(jì)算機(jī)學(xué)報(bào);2010年10期
4 楊毅夫;劉燕兵;劉萍;郭牧怡;郭莉;;正則表達(dá)式的DFA壓縮算法[J];通信學(xué)報(bào);2009年S1期
5 徐乾;鄂躍鵬;葛敬國(guó);錢華林;;深度包檢測(cè)中一種高效的正則表達(dá)式壓縮算法[J];軟件學(xué)報(bào);2009年08期
6 張榮遠(yuǎn);n維立方體[J];數(shù)學(xué)通報(bào);1999年04期
相關(guān)博士學(xué)位論文 前1條
1 張樹壯;面向網(wǎng)絡(luò)安全的高性能特征匹配技術(shù)研究[D];哈爾濱工業(yè)大學(xué);2011年
相關(guān)碩士學(xué)位論文 前1條
1 葉荻秋;高速網(wǎng)絡(luò)的跨包正則表達(dá)式匹配技術(shù)研究[D];解放軍信息工程大學(xué);2013年
,本文編號(hào):2144304
本文鏈接:http://sikaile.net/guanlilunwen/ydhl/2144304.html
最近更新
教材專著