異構(gòu)圖中的Top-K興趣子圖匹配算法研究
本文選題:異構(gòu)圖 + 靜態(tài)Top-K子圖匹配 ; 參考:《遼寧大學(xué)》2017年碩士論文
【摘要】:圖是一種用于描述實(shí)體間復(fù)雜關(guān)聯(lián)關(guān)系的通用數(shù)據(jù)結(jié)構(gòu),而異構(gòu)圖是一種帶標(biāo)簽的特殊圖結(jié)構(gòu),在多種不同實(shí)體組成的復(fù)雜數(shù)據(jù)建模中被廣泛應(yīng)用,例如信息網(wǎng)絡(luò)、生物網(wǎng)絡(luò)等實(shí)際應(yīng)用領(lǐng)域。在異構(gòu)數(shù)據(jù)圖中如何進(jìn)行有效的子圖匹配成為近些年的研究熱點(diǎn)。子圖匹配是用戶給定查詢圖,從大數(shù)據(jù)圖中找到與查詢圖的結(jié)構(gòu)相同且節(jié)點(diǎn)標(biāo)簽相同的所有子圖。但在實(shí)際應(yīng)中,人們往往關(guān)注興趣度比較高的匹配結(jié)果,因此引出更具針對性的Top-K興趣子圖匹配問題。Top-K興趣子圖匹配主要分為兩個(gè)部分,一是根據(jù)查詢圖在數(shù)據(jù)圖中找到所有的匹配子圖;二是將所有的匹配子圖根據(jù)興趣度排名獲得興趣度最大的K個(gè)興趣子圖。本文針對異構(gòu)圖中的Top-K興趣子圖匹配問題進(jìn)行研究發(fā)現(xiàn),目前已有方法存在若干問題,如僅涉及靜態(tài)圖上中針對無權(quán)查詢圖的匹配問題,沒有考慮用戶的個(gè)性化需求,即對有權(quán)查詢圖的匹配處理;在實(shí)際應(yīng)用中,隨著時(shí)間推移、實(shí)際應(yīng)用語義的改變,圖將發(fā)生拓?fù)浣Y(jié)構(gòu)的變化,即圖的動態(tài)演進(jìn),對圖的動態(tài)演進(jìn)過程中的Top-K興趣子圖匹配研究關(guān)注很少。因此,為解決上述問題,本文考慮用戶的個(gè)性化需求,提出一種高效的針對有權(quán)查詢圖的靜態(tài)Top-K興趣子圖匹配方法;同時(shí),針對圖的動態(tài)演進(jìn)過程中Top-K興趣子圖匹配問題展開研究,提出一種有效的基于增量的動態(tài)Top-K興趣子圖匹配方法。主要內(nèi)容如下:(1)提出一種高效的靜態(tài)Top-K興趣子圖匹配方法(ERWM)。首先,根據(jù)異構(gòu)圖的特性,提出一種類鄰接表的儲存結(jié)構(gòu)存儲異構(gòu)數(shù)據(jù)圖;其次,建立兩種新異的索引:節(jié)點(diǎn)的拓?fù)浣Y(jié)構(gòu)特性索引(NTFI)、邊特性索引(EFI),充分利用它們提供的節(jié)點(diǎn)和邊的信息,在查詢檢索階段過濾掉不合格(不匹配)的節(jié)點(diǎn)和邊,獲得相對較少的候選集,避免了匹配階段進(jìn)行不必要的節(jié)點(diǎn)、邊連通性檢測,從而減少了算法的冗余匹配步驟;最后,提出查詢圖邊的標(biāo)簽設(shè)定方法,利用其確定查詢邊進(jìn)行子圖匹配驗(yàn)證的順序,同時(shí)引入RWM算法的邊匹配邊排序的思想,從而提高圖的子圖匹配效率,減少冗余的子圖匹配步驟。(2)提出一種有效的動態(tài)Top-K興趣子圖匹配方法(DRWM)。該方法在ERWM算法基礎(chǔ)上進(jìn)行動態(tài)擴(kuò)展。首先,利用滑動窗口模型處理動態(tài)異構(gòu)圖的局部動態(tài)變化數(shù)據(jù);其次,利用滑動窗口的思想,實(shí)現(xiàn)基于圖的增量變化動態(tài)更新的Top-K興趣子圖匹配。(3)在模擬數(shù)據(jù)集和真實(shí)數(shù)據(jù)上,將本文提出的方法與現(xiàn)有的RAM和RWM方法進(jìn)行對比,分別對算法索引構(gòu)建、Top-K子圖匹配的實(shí)驗(yàn)結(jié)果進(jìn)行比較和分析,驗(yàn)證了本文提出的Top-K興趣子圖匹配方法在模擬和現(xiàn)實(shí)網(wǎng)絡(luò)上,都具有較好的查詢匹配能力,Top-K興趣子圖匹配的效率得到了極大的提高。
[Abstract]:A graph is a general data structure used to describe complex relationships between entities, while a different composition is a special graph structure with labels, which is widely used in complex data modeling of many different entities, such as information networks. Biological networks and other practical applications. How to perform effective subgraph matching in heterogeneous data graph has become a hot topic in recent years. Subgraph matching is a user given query graph, from big data graph to find the same structure as the query graph and the same node label of all subgraphs. However, in practice, people tend to pay attention to the matching results with high interest degree, so the more targeted subgraph matching problem of Top-K interest. Top-K interest subgraph matching is divided into two parts. One is to find all the matching subgraphs in the data graph according to the query graph, and the other is to get K interest subgraphs with the highest interest degree according to the interest rank of all the matching subgraphs. In this paper, we study the problem of Top-K subgraph matching in heterogeneous graphs, and find that there are some problems in the existing methods, such as the matching problem of unauthorized query graphs in static graphs, and not considering the personalized needs of users. In practical application, the topology of graph will change with the change of semantic of actual application, that is, the dynamic evolution of graph. Little attention has been paid to subgraph matching of Top-K interest in the dynamic evolution of graphs. Therefore, in order to solve the above problems, this paper proposes an efficient matching method of static Top-K interest subgraph for the authorized query graph, considering the personalized requirements of the user. At the same time, Based on the research of subgraph matching of Top-K interest in the process of dynamic evolution of graphs, an effective method of dynamic subgraph matching of Top-K interest based on increment is proposed. The main contents are as follows: 1) an efficient static Top-K interest subgraph matching method is proposed. Firstly, according to the characteristics of heterogeneous graphs, a storage structure of a class of adjacent tables is proposed to store heterogeneous data graphs. Two kinds of new and different indexes are established: the topological structure characteristic index of nodes and the edge characteristic index of EFI, which make full use of the information of nodes and edges provided by them, and filter out the unqualified nodes and edges in the query and retrieval stage. Obtain relatively few candidate sets, avoid unnecessary nodes in the matching stage, edge connectivity detection, thereby reducing the redundant matching steps of the algorithm. Finally, a label setting method of query graph edge is proposed. Using it to determine the order of subgraph matching verification by query edge, and introducing the idea of edge matching edge sorting of RWM algorithm to improve the matching efficiency of subgraph. This paper presents an effective subgraph matching method for dynamic Top-K. The method is dynamically extended on the basis of ERWM algorithm. Firstly, the sliding window model is used to deal with the local dynamic change data of the dynamic isomerous graph. Secondly, using the idea of sliding window, the Top-K interest subgraph matching. 3) based on the incremental change of graph, is realized in the simulation data set and the real data. The proposed method is compared with the existing RAM and RWM methods, and the experimental results of constructing a Top-K subgraph matching algorithm are compared and analyzed respectively. It is verified that the proposed method of Top-K interest subgraph matching is in the simulated and realistic network. Both have better query matching ability and the efficiency of Top-K interest subgraph matching has been greatly improved.
【學(xué)位授予單位】:遼寧大學(xué)
【學(xué)位級別】:碩士
【學(xué)位授予年份】:2017
【分類號】:O157.5
【參考文獻(xiàn)】
相關(guān)期刊論文 前10條
1 楊艷;紀(jì)安娜;金虎;;大規(guī)模數(shù)據(jù)圖上的個(gè)性化子圖匹配算法[J];計(jì)算機(jī)研究與發(fā)展;2015年S1期
2 羅清華;焉曉貞;彭宇;彭喜元;;基于滑動窗口模式匹配的動態(tài)距離估計(jì)方法[J];儀器儀表學(xué)報(bào);2015年03期
3 王楠;王斌;李曉華;楊曉春;;支持動態(tài)圖數(shù)據(jù)的子圖查詢方法[J];計(jì)算機(jī)科學(xué)與探索;2014年02期
4 王爽;王國仁;;基于滑動窗口的Top-K概率頻繁項(xiàng)查詢算法研究[J];計(jì)算機(jī)研究與發(fā)展;2012年10期
5 湯克明;戴彩艷;陳];;一種基于滑動窗口的不確定數(shù)據(jù)流Top-K查詢算法[J];南京大學(xué)學(xué)報(bào)(自然科學(xué)版);2012年03期
6 張忠平;王浩;薛偉;夏炎;;動態(tài)滑動窗口的數(shù)據(jù)流聚類方法[J];計(jì)算機(jī)工程與應(yīng)用;2011年07期
7 張碩;高宏;李建中;鄒兆年;;不確定圖數(shù)據(jù)庫中高效查詢處理[J];計(jì)算機(jī)學(xué)報(bào);2009年10期
8 周傲英;金澈清;王國仁;李建中;;不確定性數(shù)據(jù)管理技術(shù)研究綜述[J];計(jì)算機(jī)學(xué)報(bào);2009年01期
9 宋寶燕;張衡;于洋;奚麗娜;王大玲;;基于滑動窗口的支持泛在應(yīng)用的流聚類挖掘算法[J];小型微型計(jì)算機(jī)系統(tǒng);2008年12期
10 常建龍;曹鋒;周傲英+;;基于滑動窗口的進(jìn)化數(shù)據(jù)流聚類[J];軟件學(xué)報(bào);2007年04期
相關(guān)博士學(xué)位論文 前1條
1 苗又山;大規(guī)模動態(tài)演化圖的存儲與分析系統(tǒng)研究[D];中國科學(xué)技術(shù)大學(xué);2015年
相關(guān)碩士學(xué)位論文 前1條
1 單曉歡;大規(guī)模演化圖的分割技術(shù)研究[D];遼寧大學(xué);2014年
,本文編號:1783993
本文鏈接:http://sikaile.net/kejilunwen/yysx/1783993.html