Parallel Incremental Dynamic Community Detection Algorithm Based on Spark
Published: 2018-07-17 05:42
[Abstract]: Dynamic community detection is a key step in studying network evolution. As data volumes grow rapidly, single-machine community detection algorithms become inefficient. This paper proposes a parallel incremental dynamic community detection algorithm based on Spark (PIDCDS). To detect communities by maximizing permanence on the GraphX parallel graph computing platform, the algorithm modifies the formula used to compute node permanence. In each time slice, PIDCDS computes the permanence of the incremental nodes only and updates their community membership, which reduces the amount of computation while preserving a certain level of partition accuracy. Compared with the FacetNet dynamic community detection algorithm, PIDCDS achieves better stability and finds community partitions closer to the ground truth. A comparison of PIDCDS running times on networks of different scales shows that the running time grows slowly as the numbers of nodes and edges increase, and that increasing the number of executor cores speeds up the algorithm to some extent.
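[Author affiliations]: Beijing Key Laboratory of Intelligent Telecommunications Software and Multimedia, Beijing University of Posts and Telecommunications; School of Computer Science, Beijing University of Posts and Telecommunications
[Funding]: National Basic Research Program of China ("973" Program) (2013CB329606); special funding from a Beijing Municipal co-construction project
[Classification number]: TP301.6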
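The incremental step described in the abstract can be illustrated with a small single-machine sketch. It assumes that "permanence" refers to the commonly used vertex permanence metric, Perm(v) = (I(v) / E_max(v)) / D(v) - (1 - c_in(v)), where I(v) is the number of neighbors inside v's community, E_max(v) is the largest number of neighbors v has in any single other community, D(v) is the degree, and c_in(v) is the clustering coefficient among internal neighbors. The abstract does not give the paper's modified formula, so the unmodified one is used here. The function names `permanence` and `reassign`, the toy graph, and the conventions for isolated vertices and for fewer than two internal neighbors are all assumptions made for illustration. This is not the GraphX implementation.

```python
from collections import Counter
from itertools import combinations


def permanence(adj, community, v):
    """Unmodified permanence of vertex v (not the paper's modified formula)."""
    neighbors = adj[v]
    degree = len(neighbors)
    if degree == 0:
        return 0.0  # assumed convention for isolated vertices
    internal = [u for u in neighbors if community[u] == community[v]]
    external = Counter(community[u] for u in neighbors
                       if community[u] != community[v])
    # E_max is taken as 1 when v has no external neighbors.
    e_max = max(external.values()) if external else 1
    if len(internal) < 2:
        c_in = 0.0  # assumed convention: no pairs of internal neighbors
    else:
        links = sum(1 for a, b in combinations(internal, 2) if b in adj[a])
        c_in = links / (len(internal) * (len(internal) - 1) / 2)
    return (len(internal) / e_max) / degree - (1.0 - c_in)


def reassign(adj, community, v):
    """Move v to the neighboring community that maximizes its permanence."""
    best_label, best_score = community[v], permanence(adj, community, v)
    for label in sorted({community[u] for u in adj[v]}):
        trial = dict(community)
        trial[v] = label
        score = permanence(adj, trial, v)
        if score > best_score:
            best_label, best_score = label, score
    community[v] = best_label
    return best_label


# Toy snapshot: two triangles joined by edge c-d, plus a new node x
# that arrives in the current time slice with edges to a, b and d.
adj = {
    "a": {"b", "c", "x"},
    "b": {"a", "c", "x"},
    "c": {"a", "b", "d"},
    "d": {"c", "e", "f", "x"},
    "e": {"d", "f"},
    "f": {"d", "e"},
    "x": {"a", "b", "d"},
}
community = {"a": "A", "b": "A", "c": "A",
             "d": "B", "e": "B", "f": "B", "x": "X"}

# Only the incremental node is re-evaluated; existing labels are untouched.
new_label = reassign(adj, community, "x")
```

Only the newly arrived node is scored against its neighboring communities, which is the source of the reduced computation that the abstract claims. A parallel version would distribute this per-node evaluation across graph partitions.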
Article ID: 2129263
Article link: http://sikaile.net/kejilunwen/ruanjiangongchenglunwen/2129263.html