Research on Complex Network Clustering Based on a MapReduce Clustering Algorithm
Published: 2018-10-31 18:12
[Abstract]: Complex real-world systems such as transportation networks and social networks can be modeled as complex networks composed of several "communities" or "clusters". Revealing this community structure, in order to understand network structure and analyze network behavior, has attracted wide attention from researchers. Community clustering aims to find the community structure present in a complex network and then extract the important information each community contains. In recent years, however, the number of nodes in complex networks such as the mobile Internet, social networks, and the Internet of Things has kept growing, and network scale has grown with it. Traditional single-machine community clustering methods can no longer meet the needs of large-scale complex network analysis, and handling community clustering at that scale has become a research hotspot. This thesis addresses the complex network clustering problem with a MapReduce-based clustering algorithm. The specific work is as follows. First, for the community clustering problem, it proposes a complex network clustering method based on a neighborhood-search clustering algorithm. The algorithm uses a neighborhood search strategy to control the selection of cluster centers, which overcomes the randomness and limitations of cluster-center selection in traditional clustering algorithms and yields better community clustering results. Experimental results show that the algorithm achieves high detection accuracy on relatively small complex networks. Second, because complex networks keep growing and the traditional single-machine mode can no longer meet the needs of large-scale community clustering, the thesis adapts the neighborhood-search clustering method to MapReduce. The adapted method proceeds in sequence through data preprocessing, computing shortest paths between nodes, computing neighborhood density, and selecting cluster-center nodes and clustering, so that community clustering of large-scale complex networks can be carried out under MapReduce. Finally, an experimental platform based on a Hadoop cluster is designed and built. Experimental results show that, as the scale of the complex network grows, the MapReduce-parallelized algorithm has a clear advantage in execution speed over the single-machine version, while showing high accuracy and reliability.
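The four stages named in the abstract can be illustrated with a small single-machine sketch. The abstract does not define neighborhood density or the center-selection rule, so everything below is an assumption made for illustration and not the thesis's algorithm: the graph is unweighted, density is the number of nodes within `radius` hops, centers are chosen greedily by density subject to a minimum hop gap `min_center_gap`, and the names `bfs_distances` and `cluster` are hypothetical. In a MapReduce setting each stage would presumably become its own job, which this sketch does not attempt.

```python
from collections import deque


def bfs_distances(adj, source):
    """Hop-count shortest paths from `source` in an unweighted graph."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def cluster(edges, radius=1, min_center_gap=2):
    """Illustrative pipeline; density and center rules are assumptions."""
    inf = float("inf")

    # Stage 1 (preprocessing): build an undirected adjacency list.
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    # Stage 2: shortest paths from every node.
    dist = {u: bfs_distances(adj, u) for u in adj}

    # Stage 3: neighborhood density = other nodes within `radius` hops.
    density = {
        u: sum(1 for d in dist[u].values() if 0 < d <= radius) for u in adj
    }

    # Stage 4a: pick centers greedily, densest first, skipping any node
    # closer than `min_center_gap` hops to an already chosen center.
    centers = []
    for u in sorted(adj, key=lambda n: (-density[n], n)):
        if all(dist[u].get(c, inf) >= min_center_gap for c in centers):
            centers.append(u)

    # Stage 4b: assign each node to its nearest center (ties by center id).
    labels = {
        u: min(centers, key=lambda c: (dist[u].get(c, inf), c)) for u in adj
    }
    return centers, labels


# Two hub-and-spoke groups joined by the single bridge edge (3, 8).
example_edges = [
    (0, 1), (0, 2), (0, 3), (1, 2),
    (5, 6), (5, 7), (5, 8), (6, 7),
    (3, 8),
]
centers, labels = cluster(example_edges, radius=1, min_center_gap=3)
```

Computing all-pairs shortest paths on one machine is exactly the step that stops scaling as the network grows, which is the motivation the abstract gives for moving the pipeline to MapReduce.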
【Degree-granting institution】: Nanjing University of Posts and Telecommunications
【Degree level】: Master's
【Year degree awarded】: 2017
【Classification number】: TP311.13; O157.5
Article ID: 2303127
Article link: http://sikaile.net/kejilunwen/yysx/2303127.html