天堂国产午夜亚洲专区-少妇人妻综合久久蜜臀-国产成人户外露出视频在线-国产91传媒一区二区三区

當前位置:主頁 > 科技論文 > 軟件論文 >

含有計數(shù)量詞的圖模式匹配研究與實現(xiàn)

發(fā)布時間:2019-03-09 18:32
【摘要】:計數(shù)量詞作為一種增強表達的方式,加入到圖模式匹配中可以更加準確地描述客觀世界。通過簡單地在查詢圖的邊上附加計數(shù)量詞可以很自然的表達出全部和存在量化表達,比例和數(shù)量聚集表達,以及否定表達。這種增加了表達能力的圖模式匹配在社會媒體營銷,知識發(fā)現(xiàn),網(wǎng)絡(luò)安全等領(lǐng)域都有廣泛且直接的應用。關(guān)于量詞化的圖模式匹配問題的研究目前仍然比較新,而且該問題本身泛化了傳統(tǒng)圖模式匹配問題條件,傳統(tǒng)圖模式匹配問題僅屬于本文問題的一個特例。因而本文所研究的量詞化的圖模式匹配問題復雜度遠難于傳統(tǒng)的圖模式匹配。用邊上計數(shù)量詞是否包含否定計數(shù)量詞可以將該問題劃分成兩個復雜度相異的兩個子問題,第一類子問題模式圖邊上只包含肯定的計數(shù)量詞,它的復雜度可以證明是NP完全的,并且在特定條件下復雜度的上下界都為NP難,而含有否定計數(shù)量詞邊的匹配問題復雜度則是DP完全的,特定條件下復雜度上下界為DP難,如此高的復雜度使得蠻力法解決該問題的開銷無法令人接受。本文實現(xiàn)了(1)正QGP問題的匹配算法QMatch,該算法通過調(diào)用子圖同構(gòu)子算法而改進實現(xiàn),本文通過實驗驗證發(fā)現(xiàn)算法在模式圖規(guī)模在4以內(nèi)時的匹配時間是比較好的,規(guī)模變大后的匹配時間呈指數(shù)式增長。(2)負QGP問題的匹配算法Inc QMatch,該算法避免了直接驗證負計數(shù)量詞的高復雜度,在正問題解決的基礎(chǔ)上,利用集差思想,構(gòu)建兩個新的查詢圖結(jié)構(gòu)得到最后的結(jié)果集,通過記錄終結(jié)結(jié)果集的方式可以避免第二次查詢從頭開始有效地減少了運行時間,并且通過多線程的方式對該問題的匹配實現(xiàn)了并行化,實驗表明當邊上的負計數(shù)量詞個數(shù)只有一時,Inc QMatch的時間開銷在查詢圖規(guī)模超過4以后同樣呈顯指數(shù)式的增長。(3)利用哈希編碼技術(shù)給節(jié)點周圍邊標簽和節(jié)點標簽進行編碼,通過與的手段壓縮周圍結(jié)構(gòu)信息,并將周圍的標簽種類信息壓縮到一串二進制字串當中,通過或運算匹配節(jié)點,得到小規(guī)模候選集合,在本文實驗中,使用8+8位編碼,可以過濾掉95%的不匹配,優(yōu)化效率非常明顯,且在查詢圖規(guī)模是6的時候都表現(xiàn)出了小于10秒的運行時間,優(yōu)化效果非常顯著。
[Abstract]:Counting quantifiers, as an enhanced expression, can describe the objective world more accurately when added to graph pattern matching. By simply appending quantifiers to the edges of the query graph, it is natural to express all and existing quantitative expressions, proportion and number aggregation, and negative expressions. This kind of graph pattern matching, which increases the expression ability, is widely and directly applied in the fields of social media marketing, knowledge discovery, network security and so on. The research on quantized graph pattern matching problem is still relatively new, and the problem itself generalizes the condition of traditional graph pattern matching problem, and the traditional graph pattern matching problem is only a special case of this problem. Therefore, the complexity of quantized graph pattern matching problem studied in this paper is far more difficult than the traditional graph pattern matching problem. Whether the edge quantifier contains negative quantifier can divide the problem into two sub-problems with different complexity. The first subproblem contains only positive quantifiers on the edge of the pattern diagram, and its complexity can be proved to be NP complete. And the upper and lower bounds of complexity are difficult for NP under certain conditions, while the complexity of matching problems with negative number of words is DP complete, and the upper and lower bounds of complexity are difficult for DP under specific conditions. Such high complexity makes the cost of brute force solving the problem unacceptable. In this paper, (1) the matching algorithm QMatch, for positive QGP problem is implemented. This algorithm is improved by calling the subgraph isomorphism sub-algorithm. The experiment shows that the matching time of the algorithm is better when the scale of the pattern graph is less than 4, and the matching time of the algorithm is better when the scale of the pattern graph is less than 4. The matching time increases exponentially after the scale increases. (2) the matching algorithm Inc QMatch, of the negative QGP problem avoids the high complexity of verifying the negative quantifier directly, on the basis of solving the positive problem, the idea of set difference is used. Building two new query graph structures to get the final result set, by recording the end result set, can prevent the second query from starting from scratch effectively reducing the run time. And the problem is parallelized by multi-thread matching. The experiment shows that when the number of negative quantifiers on the edge is only one time, the number of negative quantifiers on the edge is only one time. The time cost of Inc QMatch increases exponentially after the size of the query graph exceeds 4. (3) Hash coding technology is used to encode the edge tags and node labels around the nodes and compress the surrounding structural information by means of the same method. And the surrounding label type information is compressed into a string of binary strings, and small-scale candidate sets can be obtained by or operation of matching nodes. In this experiment, 95% mismatch can be filtered out by using 8.8-bit encoding. The optimization efficiency is very obvious, and the run time is less than 10 seconds when the query graph scale is 6, and the optimization effect is very remarkable.
【學位授予單位】:哈爾濱工業(yè)大學
【學位級別】:碩士
【學位授予年份】:2017
【分類號】:TP391.41

【參考文獻】

相關(guān)期刊論文 前6條

1 張麗霞;王偉平;高建良;王建新;;利用局部評估的分布式圖模式匹配算法[J];國防科技大學學報;2016年02期

2 張麗霞;王偉平;高建良;王建新;;面向模式圖變化的增量圖模式匹配[J];軟件學報;2015年11期

3 于靜;劉燕兵;張宇;劉夢雅;譚建龍;郭莉;;大規(guī)模圖數(shù)據(jù)匹配技術(shù)綜述[J];計算機研究與發(fā)展;2015年02期

4 呂雪棟;馮志勇;王鑫;饒國政;付宇新;;StepMatch:一種基于BSP計算模型的SPARQL基本圖模式匹配算法[J];計算機研究與發(fā)展;2013年S2期

5 張航;王宏志;李建中;高宏;;基于2-hop優(yōu)化的子圖模式匹配算法[J];黑龍江大學自然科學學報;2010年01期

6 吳祉群;吉方;黃文;;基于遺傳算法圖像模式匹配的收斂性研究[J];現(xiàn)代電子技術(shù);2007年01期



本文編號:2437744

資料下載
論文發(fā)表

本文鏈接:http://sikaile.net/kejilunwen/ruanjiangongchenglunwen/2437744.html


Copyright(c)文論論文網(wǎng)All Rights Reserved | 網(wǎng)站地圖 |

版權(quán)申明:資料由用戶336de***提供,本站僅收錄摘要或目錄,作者需要刪除請E-mail郵箱bigeng88@qq.com