重疊社區(qū)發(fā)現(xiàn)中的邊聚類算法研究
發(fā)布時間:2018-06-20 16:57
本文選題:重疊社區(qū)發(fā)現(xiàn) + 社區(qū)發(fā)現(xiàn); 參考:《吉林大學(xué)》2016年博士論文
【摘要】:重疊社區(qū)發(fā)現(xiàn)問題是復(fù)雜網(wǎng)絡(luò)分析中很熱門的一個問題。重疊社區(qū)發(fā)現(xiàn)旨在揭示復(fù)雜網(wǎng)絡(luò)中的重疊的社區(qū)結(jié)構(gòu)。重疊的社區(qū)結(jié)構(gòu)接近真實網(wǎng)絡(luò)中存在的社區(qū)結(jié)構(gòu),因此,對于重疊社區(qū)發(fā)現(xiàn)問題的研究是有著實際意義的。傳統(tǒng)的重疊社區(qū)發(fā)現(xiàn)研究都是以節(jié)點作為主要研究目標(biāo)。近年來,一些工作開始以邊作為研究目標(biāo),旨在通過對網(wǎng)絡(luò)中存在的邊社區(qū)進(jìn)行發(fā)現(xiàn),進(jìn)而達(dá)到解決重疊社區(qū)發(fā)現(xiàn)問題的目的。我們籠統(tǒng)地稱這類算法為邊社區(qū)發(fā)現(xiàn)算法。與傳統(tǒng)的重疊社區(qū)發(fā)現(xiàn)算法不同,邊社區(qū)發(fā)現(xiàn)算法將社區(qū)看作是由邊構(gòu)成的。由于將邊作為研究對象,邊社區(qū)發(fā)現(xiàn)算法在解決重疊社區(qū)發(fā)現(xiàn)問題時,會具有獨特的優(yōu)勢。例如,通常,邊在真實網(wǎng)絡(luò)中是屬于單一社區(qū)的。邊社區(qū)發(fā)現(xiàn)算法在對邊社區(qū)進(jìn)行發(fā)現(xiàn)工作時,會自然地形成節(jié)點的重疊社區(qū)結(jié)構(gòu);近來的一些邊社區(qū)發(fā)現(xiàn)算法會使用層次聚類算法對邊進(jìn)行聚類,這樣做能夠同時考慮到相應(yīng)的節(jié)點的層次關(guān)系和重疊關(guān)系。邊社區(qū)發(fā)現(xiàn)算法雖然有著獨特優(yōu)勢,但仍存在著發(fā)現(xiàn)的邊社區(qū)質(zhì)量低、邊相似關(guān)系考慮不全和節(jié)點的重疊度過高的問題。為進(jìn)一步利用相似性關(guān)系拓展傳統(tǒng)重疊社區(qū)發(fā)現(xiàn)策略,本文從邊相似關(guān)系角度開展了針對邊相似關(guān)系模型的階段性改進(jìn)工作,從考慮擴展邊之間的關(guān)系的基于線圖相似度,到考慮具有共同鄰居邊之間的關(guān)系的拓展余弦邊距離,再到考慮具有公共鄰居邊之間的極值情況的極值非相鄰邊的邊相似度,進(jìn)行了遞推式改進(jìn)。(1)利用線圖模型來對邊聚類問題建模?紤]到邊與邊之間的相似關(guān)系不僅存在于具有共同節(jié)點的邊之間,還存在于具有共同鄰居的邊之間以及不相鄰的邊之間。提出了一種基于線圖的邊相似度計算方法;結(jié)合馬爾科夫聚類算法和基于線圖的邊相似度計算方法,提出了基于線圖的邊聚類LCLG算法;在真實網(wǎng)絡(luò)和生物網(wǎng)絡(luò)上的實驗結(jié)果驗證了LCLG算法的有效性。(2)考慮到具有公共鄰居的邊之間的相似關(guān)系,結(jié)合余弦相似度,提出了一個拓展的余弦邊距離計算方法。將快速密度峰值搜索聚類算法引入到邊社區(qū)發(fā)現(xiàn)中,并提出了基于盒圖的社區(qū)中心邊的自動選取策略;結(jié)合拓展的余弦邊距離計算方法、快速密度峰值搜索聚類算法及基于盒圖的社區(qū)中心邊自動選取策略,提出了邊密度聚類LDC算法;在真實網(wǎng)絡(luò)上的實驗結(jié)果表明,LDC算法能夠發(fā)現(xiàn)具有良好模塊度和高覆蓋率的重疊社區(qū)。(3)考慮到具有共同鄰居的邊之間的兩種極值情況,提出了極值非相鄰邊的邊相似度計算方法;使用拓展的模塊度評價指標(biāo)來改善層次聚類算法的劃分結(jié)果,并提出了極值非相鄰邊的邊聚類MLC算法;在真實網(wǎng)絡(luò)上的實驗結(jié)果表明,MLC算法能夠發(fā)現(xiàn)具有良好模塊度的重疊社區(qū)結(jié)構(gòu)。本文提出的LCLG算法、LDC算法和MLC算法,能夠?qū)?fù)雜網(wǎng)絡(luò)中的邊社區(qū)進(jìn)行發(fā)現(xiàn),進(jìn)而達(dá)到解決重疊社區(qū)發(fā)現(xiàn)問題的目的,在理論和實際應(yīng)用方面都具有一定意義。
[Abstract]:Overlapping community discovery is a hot issue in complex network analysis. Overlapping community discovery aims to reveal overlapping community structures in complex networks. Overlapping community structures are close to the community structure in the real network. Therefore, the study of overlapping community discovery is of practical significance. In recent years, some work has begun to use the edge as a research goal. In recent years, the aim of the research is to find the side community in the network to solve the problem of overlapping community discovery. In general, we call this kind of algorithm a side community discovery algorithm. The edge community discovery algorithm takes the community as a side. As the edge is used as the research object, the edge community discovery algorithm has a unique advantage when solving the problem in the overlapping community. For example, the edge is a single community in the real network. The edge community discovery algorithm is found in the border community. It will naturally form the overlapping community structure of nodes; recently, some edge community discovery algorithms use hierarchical clustering algorithms to cluster edges, which can take into account the hierarchy and overlapping relations of the corresponding nodes. Although the edge community finds that the algorithm has a unique advantage, there is still a low quality of the border community found. In order to further utilize the similarity relationship to extend the traditional overlapping community discovery strategy, this paper develops a phase improvement work on the edge similarity relation model from the angle of edge similarity. The extended cosine edge distance between the neighbors and the neighbourhood is extended, and then the edge similarity of the extremal non adjacent edges with the extremal conditions between the common neighbors is considered, and a recursive improvement is carried out. (1) the linear graph model is used to model the edge clustering problem. In addition, a line graph based edge similarity calculation method is proposed. A line graph based edge clustering LCLG algorithm is proposed with a line graph and a line graph based edge similarity calculation method. Experimental results on real network and biological network are tested. The effectiveness of the LCLG algorithm is proved. (2) considering the similarity relation between the edges of the public neighbours and the cosine similarity, a extended cosine edge distance calculation method is proposed. The fast density peak search clustering algorithm is introduced into the border community discovery, and the automatic selection strategy based on the community center edge of the box graph is proposed. The extended cosine edge distance calculation method, the fast density peak search clustering algorithm and the box graph based community center edge automatic selection strategy are proposed, and the edge density clustering LDC algorithm is proposed. The experimental results on real network show that the LDC algorithm can find overlapping communities with good modularity and high coverage. (3) considering the common neighbour. Two extreme value cases between the sides of the house are presented, and the edge similarity calculation method of extreme value non adjacent edges is proposed. Using the extended module degree evaluation index to improve the division results of the hierarchical clustering algorithm and the edge clustering MLC algorithm of extreme value non adjacent edges is proposed. The actual results on the real network show that the MLC algorithm can find a good model. Overlapping community structure blocks. The proposed LCLG algorithm, LDC algorithm and MLC algorithm to complex edges in the network community, and achieve the purpose of solving the problem that the overlapping community, have certain significance in theory and practical application.
【學(xué)位授予單位】:吉林大學(xué)
【學(xué)位級別】:博士
【學(xué)位授予年份】:2016
【分類號】:TP311.13
【參考文獻(xiàn)】
相關(guān)期刊論文 前1條
1 潘磊;金杰;王崇駿;謝俊元;;社會網(wǎng)絡(luò)中基于局部信息的邊社區(qū)挖掘[J];電子學(xué)報;2012年11期
,本文編號:2044979
本文鏈接:http://sikaile.net/shoufeilunwen/xxkjbs/2044979.html
最近更新
教材專著