海量多模式串匹配算法關(guān)鍵技術(shù)研究
發(fā)布時(shí)間:2018-04-16 07:50
本文選題:海量多模式匹配 + 布爾表達(dá)式匹配; 參考:《哈爾濱工程大學(xué)》2013年碩士論文
【摘要】:經(jīng)過(guò)40余年的發(fā)展,字符串匹配已經(jīng)從單一的單模式串匹配發(fā)展成為包含多模式匹配、正則匹配、近似匹配等多個(gè)新方向的研究領(lǐng)域。而隨著計(jì)算機(jī)技術(shù)、網(wǎng)絡(luò)技術(shù)的發(fā)展,社會(huì)的進(jìn)步,字符串匹配也有了越來(lái)越廣泛的應(yīng)用場(chǎng)所。在信息檢索、入侵檢測(cè)、網(wǎng)絡(luò)數(shù)據(jù)分析等多個(gè)領(lǐng)域都能夠看到字符串匹配的身影。布爾表達(dá)式匹配也有很多的應(yīng)用,如搜索引擎中的布爾查詢(xún),病毒檢測(cè)系統(tǒng)的特征組合匹配等等。字符串匹配算法以往的研究和實(shí)驗(yàn)分析都是在模式集規(guī)模在幾千到幾萬(wàn)的情況下進(jìn)行的,對(duì)于大規(guī)模模式集下的算法沒(méi)有進(jìn)行過(guò)深入的分析。本文主要研究?jī)?nèi)容分為如下兩個(gè)部分: 研究在大規(guī)模模式集下的多模式匹配問(wèn)題的特點(diǎn)和難點(diǎn),提出了基于Wu-manber算法的三種改進(jìn)算法:基于最短關(guān)鍵字長(zhǎng)度哈希的改進(jìn)算法、多哈希值預(yù)檢改進(jìn)算法、成組比較改進(jìn)算法。 研究布爾表達(dá)式匹配問(wèn)題,提出更泛化的復(fù)雜與或布爾表達(dá)式匹配問(wèn)題,,并且針對(duì)該問(wèn)題提出擴(kuò)展位標(biāo)記算法。此外本文還提出了布爾表達(dá)式化簡(jiǎn)方法,來(lái)進(jìn)一步減少表達(dá)式數(shù)目,提高匹配效率。 綜上所述,本文在研究和總結(jié)現(xiàn)有的模式匹配算法和布爾表達(dá)式匹配算法的基礎(chǔ)上,重點(diǎn)對(duì)大規(guī)模模式集下匹配算法、復(fù)雜與或布爾表達(dá)式匹配和表達(dá)式化簡(jiǎn)進(jìn)行研究,提出優(yōu)化方案,并且通過(guò)實(shí)驗(yàn)來(lái)驗(yàn)證算法的可行性以及空間和時(shí)間效率。最后本文還展望了該領(lǐng)域的未來(lái)發(fā)展趨勢(shì)。
[Abstract]:After more than 40 years' development, string matching has developed from single single pattern string matching to many new research fields, such as multi-pattern matching, regular matching, approximate matching and so on.With the development of computer technology, network technology and society, string matching has been used more and more widely.String matching can be seen in many fields, such as information retrieval, intrusion detection, network data analysis and so on.Boolean expression matching also has many applications, such as Boolean query in search engine, feature combination matching in virus detection system and so on.The previous research and experimental analysis of string matching algorithm are carried out in the case of pattern set size ranging from thousands to tens of thousands, but no in-depth analysis has been done for the algorithm under large-scale pattern set.The main contents of this paper are as follows:This paper studies the characteristics and difficulties of multi-pattern matching problem in large-scale pattern sets, and proposes three improved algorithms based on Wu-manber algorithm: an improved algorithm based on the shortest keyword length hash, an improved multi-hash pre-checking algorithm.The improved algorithm is compared in groups.In this paper, the problem of Boolean expression matching is studied, and a more generalized problem of complex or Boolean expression matching is proposed, and an extended bit marking algorithm is proposed to solve the problem.In addition, a Boolean expression simplification method is proposed to further reduce the number of expressions and improve the matching efficiency.To sum up, on the basis of studying and summarizing the existing pattern matching algorithms and Boolean expression matching algorithms, this paper focuses on the large-scale pattern set matching algorithm, complex or Boolean expression matching and expression simplification.The optimization scheme is proposed, and the feasibility of the algorithm and space and time efficiency are verified by experiments.Finally, the future development trend of this field is prospected.
【學(xué)位授予單位】:哈爾濱工程大學(xué)
【學(xué)位級(jí)別】:碩士
【學(xué)位授予年份】:2013
【分類(lèi)號(hào)】:TP301.6
【參考文獻(xiàn)】
相關(guān)期刊論文 前7條
1 楊軍;鄧芳林;;基于Snort入侵檢測(cè)系統(tǒng)模式匹配改進(jìn)算法研究[J];計(jì)算機(jī)安全;2011年06期
2 殷麗華,張冬艷,方濱興;面向入侵檢測(cè)的單模式匹配算法性能分析[J];計(jì)算機(jī)工程與應(yīng)用;2004年24期
3 曹京;譚建龍;劉萍;郭莉;;布爾表達(dá)式匹配問(wèn)題研究[J];計(jì)算機(jī)應(yīng)用研究;2007年09期
4 宋云;龍際珍;;規(guī)則數(shù)量無(wú)關(guān)的多布爾表達(dá)式匹配算法[J];軟件導(dǎo)刊;2012年03期
5 李曉明,鳳旺森;兩種對(duì)URL的散列效果很好的函數(shù)[J];軟件學(xué)報(bào);2004年02期
6 李安懷;荊繼武;;網(wǎng)絡(luò)安全系統(tǒng)中的快速規(guī)則匹配[J];計(jì)算機(jī)工程與設(shè)計(jì);2007年06期
7 曹京;劉燕兵;劉萍;譚建龍;郭莉;;定序窗口布爾表達(dá)式匹配技術(shù)研究[J];通信學(xué)報(bào);2007年12期
本文編號(hào):1757934
本文鏈接:http://sikaile.net/kejilunwen/sousuoyinqinglunwen/1757934.html
最近更新
教材專(zhuān)著