基于聚類分析的社會網(wǎng)絡(luò)社團劃分方法研究
發(fā)布時間:2018-05-07 20:17
本文選題:社會網(wǎng)絡(luò) + 社團劃分 ; 參考:《南京郵電大學》2015年碩士論文
【摘要】:近年來,隨著Web2.0的發(fā)展,社會網(wǎng)絡(luò)越來越受到更多學者們的關(guān)注和研究。在社會網(wǎng)絡(luò)的眾多性質(zhì)中,社團結(jié)構(gòu)是其最重要同時也是最具有研究意義的性質(zhì)之一。通過社團的劃分,我們不僅能夠了解和分析網(wǎng)絡(luò)的結(jié)構(gòu)特點,也能夠挖掘出表面數(shù)據(jù)關(guān)系之中的隱性信息。在目前的研究中已經(jīng)有了很多經(jīng)典的社團劃分算法,比如Girvan-Newman算法,Keinighan-Lin算法等,但是準確度和時間復(fù)雜度之間的平衡仍然是社團劃分算法的主要問題之一。本文對社會網(wǎng)絡(luò)的社團劃分方法進行了研究,首先系統(tǒng)地分析比較了現(xiàn)有劃分算法的優(yōu)缺點,然后基于聚類分析的思想,提出了兩種改進的社團劃分方法:基于節(jié)點相似度系數(shù)聚類的社團劃分方法(Community Division Method Based on Node Similarity Clustering,NSCCDM)和基于遺傳算法聚類的社團劃分方法(Genetic Algorithm Clustering Based Community Division Method,GACCDM)。NSCCDM方法首先通過歐幾里得公式和最短路徑構(gòu)造了一種新的相似度系數(shù)度量方法,并生成相似度系數(shù)矩陣,然后通過比較節(jié)點之間的相似度系數(shù),進行節(jié)點的聚類,最后再使用模塊度函數(shù)Q對被重復(fù)劃分的節(jié)點進行最優(yōu)社團選擇。GACCDM方法主要采用了遺傳算法的思想。該方法使用了模塊度函數(shù)Q作為適應(yīng)度函數(shù)來控制遺傳算法的進化過程;采用指向性的變異策略代替隨機變異策略加快算法的收斂速度;并結(jié)合網(wǎng)絡(luò)的拓撲結(jié)構(gòu)和節(jié)點之間的相似度系數(shù)關(guān)系進行遺傳聚類,從而完成對社團的劃分。本文使用java語言對以上的兩種方法進行編碼實現(xiàn),并將它們分別應(yīng)用于真實社會網(wǎng)絡(luò):Zachary空手道俱樂部網(wǎng)絡(luò)、海豚網(wǎng)絡(luò)以及美國足球網(wǎng)絡(luò)。實驗得出的劃分結(jié)果與實際情況相符,從而表明了本文算法的準確性和可用性。
[Abstract]:In recent years, with the development of Web2.0, more and more scholars pay attention to social network. Among the many properties of social network, community structure is one of the most important and significant properties. Through the division of communities, we can not only understand and analyze the structural characteristics of the network, but also mine the hidden information in the surface data relationship. There have been many classical community partitioning algorithms, such as Girvan-Newman algorithm, Keinighan-Lin algorithm, and so on. However, the balance between accuracy and time complexity is still one of the main problems of community partitioning algorithm. In this paper, the social network community division method is studied. Firstly, the advantages and disadvantages of the existing partitioning algorithms are systematically analyzed and compared, and then based on the idea of clustering analysis, Two improved community classification methods are proposed: community Division Method Based on Node Similarity clustering method based on node similarity coefficient clustering and genetic Algorithm Clustering Based Community Division method based on genetic Algorithm Clustering Based Community Division clustering. Euclidean formula and shortest path construct a new similarity coefficient measurement method. The similarity coefficient matrix is generated, and then the clustering of nodes is carried out by comparing the similarity coefficients between nodes. In the end, the modular degree function Q is used to select the optimal community of the repeated partitioned nodes. The GACCDM method mainly adopts the idea of genetic algorithm (GA). In this method, the modular degree function Q is used as the fitness function to control the evolutionary process of genetic algorithm, and the directivity mutation strategy is used instead of the random mutation strategy to accelerate the convergence of the algorithm. Combining the topological structure of the network and the similarity coefficient relationship between nodes, genetic clustering is carried out to complete the division of communities. In this paper, we use java language to encode the above two methods, and apply them to the real social network:: Zachary karate club network, dolphin network and American football network respectively. The experimental results are consistent with the actual situation, which shows the accuracy and availability of the proposed algorithm.
【學位授予單位】:南京郵電大學
【學位級別】:碩士
【學位授予年份】:2015
【分類號】:TP393.09;TP311.13
【參考文獻】
相關(guān)期刊論文 前1條
1 張慧哲;王堅;;基于初始聚類中心選取的改進FCM聚類算法[J];計算機科學;2009年06期
,本文編號:1858326
本文鏈接:http://sikaile.net/guanlilunwen/ydhl/1858326.html
最近更新
教材專著