基于DFSTree鄰域索引的大圖匹配算法研究
發(fā)布時間:2018-11-26 12:21
【摘要】:圖結構模型常被用于互聯(lián)網(wǎng),生物,計算機視覺和化學等領域中描述復雜的數(shù)據(jù)對象及對象之間的關聯(lián)關系。圖搜索在圖計算研究領域扮演著十分重要的角色,圖搜索可根據(jù)其應用場景的不同分為兩類:子圖查詢與子圖匹配,其中子圖查詢是在圖數(shù)據(jù)庫中找出所有與查詢圖匹配的超圖集合并返回;子圖匹配則是從圖數(shù)據(jù)庫中找到所有與查詢圖同構的子圖集并返回。無論是圖搜索中的圖查詢亦或是圖匹配,均需要借助于子圖同構算法來對查詢圖與返回結果中的圖進行同構判定。眾所周知子圖同構問題是NP完全問題,在業(yè)界中圖同構問題是通過搜索剪枝的快速啟發(fā)算法來實現(xiàn);谒阉鞯淖訄D同構算法在時間開銷過大,所以若對數(shù)據(jù)庫中的所有圖逐一掃描并判斷同構關系是不現(xiàn)實的。針對這一問題科研人員提出并運用的圖索引技術,通過為圖構建索引來將一部分不滿足查詢匹配條件的圖過濾掉。通過這種方式將需要執(zhí)行同構檢測算法的圖集規(guī)模盡可能的減少,以此來從全局提升圖匹配查詢算法的執(zhí)行效率?蒲腥藛T也基于圖索引算法提出了現(xiàn)今最為常用的,基于過濾-驗證的圖匹配查詢的算法框架。傳統(tǒng)基于挖掘與非挖掘圖索引算法受限于圖規(guī)模大小,當圖規(guī)模變的十分龐大時便失去研究價值。在本文中,我們首先提出了一種基于樹型結構的鄰域索引,并通過該索引用來提升圖搜索算法的執(zhí)行效率。同時將鄰域索引作為通用的模型,對基于鄰域索引的圖匹配框架進行抽象并總結出其算法框架的代價估算函數(shù)。不僅如此,我們還參照于使用十分廣泛的VF2子圖同構檢測算法框架設計出了一種以樹型索引項為單位的圖匹配算法,并通過該算法來獲取被查詢圖中所有與查詢圖滿足子圖同構的映射關系集合。在文章的最后,通過實驗對比數(shù)據(jù)來驗證我們所提出的基于鄰域模式的DFSTree索引和基于該索引的圖重構匹配算法要優(yōu)于當今主流的基于鄰域模式的SPath索引算法。在實驗階段我們分別在索引構建時間與空間,索引過濾候選集大小,以及基于索引的圖匹配查詢算法在返回匹配結果的平均響應時間與算法返回的匹配結果來進行對比分析。通過實驗結果數(shù)據(jù)來驗證得到,基于樹型結構的DFSTree鄰域索引雖然在構建時間與空間上稍遜色于當今主流基于最短路徑的鄰域索引SPath算法,但是在查詢響應時間與過濾后對候選集中呈負相關的候選元素的過濾強度而言都優(yōu)于SPath算法。
[Abstract]:Graph structure models are often used to describe complex data objects and their relationships in the fields of Internet, biology, computer vision and chemistry. Graph search plays a very important role in the field of graph computing. Graph search can be divided into two categories according to its application scenarios: subgraph query and subgraph matching. The subgraph query is to find out all the hypergraph sets matching the query graph in the graph database and return them. Subgraph matching is to find all the subsets isomorphic to the query graph from the graph database and return. Whether it is graph query or graph matching in graph search, it is necessary to use subgraph isomorphism algorithm to determine the isomorphism between the query graph and the graph in the returned result. It is well known that the subgraph isomorphism problem is a NP complete problem. In the industry, the graph isomorphism problem is realized by the fast heuristic algorithm of searching pruning. The time cost of the search based subgraph isomorphism algorithm is too large, so it is not realistic to scan all the graphs in the database one by one and judge the isomorphism relationship. Aiming at this problem, the graph index technology proposed and applied by researchers, by constructing indexes for graphs, filters out some graphs that do not meet the query matching conditions. In this way, the scale of graph set that needs to perform isomorphism detection algorithm will be reduced as much as possible, so as to improve the execution efficiency of graph matching query algorithm globally. Based on graph indexing algorithm, researchers also proposed the most commonly used, filter-based graph matching query algorithm framework. The traditional algorithms based on mining and non-mining graph index are limited by the size of the graph and lose the research value when the scale of the graph becomes very large. In this paper, we first propose a tree structure based neighborhood index, which is used to improve the execution efficiency of graph search algorithm. At the same time, the neighborhood index is used as a general model to abstract the graph matching framework based on neighborhood index and sum up the cost estimation function of its algorithm framework. Not only that, we also design a graph matching algorithm based on tree index terms according to the widely used framework of VF2 subgraph isomorphism detection algorithm. The algorithm is used to obtain all the mapping relation sets of the queried graph which are isomorphic to the subgraph of the query graph. At the end of the paper, we compare the experimental data to verify that our proposed DFSTree index based on neighborhood schema and graph reconstruction matching algorithm based on this index are better than the current mainstream SPath index algorithm based on neighborhood schema. In the experiment stage, we compare and analyze the time and space of index construction, the size of index filter candidate set, the average response time of graph matching query algorithm based on index and the matching result returned by the algorithm. The experimental results show that the DFSTree neighborhood index based on the tree structure is slightly inferior to the current SPath algorithm based on the shortest path in the construction time and space. However, the filtering intensity of candidate elements with negative correlation between query response time and filtering time is better than that of SPath algorithm.
【學位授予單位】:遼寧大學
【學位級別】:碩士
【學位授予年份】:2017
【分類號】:O157.5
本文編號:2358544
[Abstract]:Graph structure models are often used to describe complex data objects and their relationships in the fields of Internet, biology, computer vision and chemistry. Graph search plays a very important role in the field of graph computing. Graph search can be divided into two categories according to its application scenarios: subgraph query and subgraph matching. The subgraph query is to find out all the hypergraph sets matching the query graph in the graph database and return them. Subgraph matching is to find all the subsets isomorphic to the query graph from the graph database and return. Whether it is graph query or graph matching in graph search, it is necessary to use subgraph isomorphism algorithm to determine the isomorphism between the query graph and the graph in the returned result. It is well known that the subgraph isomorphism problem is a NP complete problem. In the industry, the graph isomorphism problem is realized by the fast heuristic algorithm of searching pruning. The time cost of the search based subgraph isomorphism algorithm is too large, so it is not realistic to scan all the graphs in the database one by one and judge the isomorphism relationship. Aiming at this problem, the graph index technology proposed and applied by researchers, by constructing indexes for graphs, filters out some graphs that do not meet the query matching conditions. In this way, the scale of graph set that needs to perform isomorphism detection algorithm will be reduced as much as possible, so as to improve the execution efficiency of graph matching query algorithm globally. Based on graph indexing algorithm, researchers also proposed the most commonly used, filter-based graph matching query algorithm framework. The traditional algorithms based on mining and non-mining graph index are limited by the size of the graph and lose the research value when the scale of the graph becomes very large. In this paper, we first propose a tree structure based neighborhood index, which is used to improve the execution efficiency of graph search algorithm. At the same time, the neighborhood index is used as a general model to abstract the graph matching framework based on neighborhood index and sum up the cost estimation function of its algorithm framework. Not only that, we also design a graph matching algorithm based on tree index terms according to the widely used framework of VF2 subgraph isomorphism detection algorithm. The algorithm is used to obtain all the mapping relation sets of the queried graph which are isomorphic to the subgraph of the query graph. At the end of the paper, we compare the experimental data to verify that our proposed DFSTree index based on neighborhood schema and graph reconstruction matching algorithm based on this index are better than the current mainstream SPath index algorithm based on neighborhood schema. In the experiment stage, we compare and analyze the time and space of index construction, the size of index filter candidate set, the average response time of graph matching query algorithm based on index and the matching result returned by the algorithm. The experimental results show that the DFSTree neighborhood index based on the tree structure is slightly inferior to the current SPath algorithm based on the shortest path in the construction time and space. However, the filtering intensity of candidate elements with negative correlation between query response time and filtering time is better than that of SPath algorithm.
【學位授予單位】:遼寧大學
【學位級別】:碩士
【學位授予年份】:2017
【分類號】:O157.5
【參考文獻】
相關期刊論文 前2條
1 王楠;王斌;李曉華;楊曉春;;支持動態(tài)圖數(shù)據(jù)的子圖查詢方法[J];計算機科學與探索;2014年02期
2 宋美娜;金遠平;;一個基于DFS編碼的圖形匹配算法[J];計算機與數(shù)字工程;2009年09期
相關博士學位論文 前3條
1 鄒曉紅;用于圖分類的頻繁子結構挖掘算法研究[D];燕山大學;2011年
2 張碩;圖數(shù)據(jù)庫查詢處理技術的研究[D];哈爾濱工業(yè)大學;2010年
3 鄒磊;圖數(shù)據(jù)庫中的子圖查詢算法研究[D];華中科技大學;2009年
相關碩士學位論文 前3條
1 王會會;精確子圖數(shù)據(jù)庫查詢技術研究[D];哈爾濱工業(yè)大學;2014年
2 李辰辰;基于非挖掘索引的圖查詢研究[D];遼寧大學;2014年
3 鄭超;大規(guī)模圖集的頻繁子圖挖掘算法研究[D];燕山大學;2010年
,本文編號:2358544
本文鏈接:http://sikaile.net/kejilunwen/yysx/2358544.html
最近更新
教材專著