基于圖特征的介度中心近似算法研究
發(fā)布時間:2018-04-20 10:00
本文選題:圖特征 + 介度中心�。� 參考:《曲阜師范大學(xué)》2015年碩士論文
【摘要】:目前復(fù)雜網(wǎng)絡(luò)的研究領(lǐng)域已經(jīng)涉及到了計(jì)算機(jī)科學(xué)、生物工程、物理以及城市交通學(xué)等各個學(xué)科,與人類生活等各個方面的聯(lián)系越來越緊密。所以復(fù)雜網(wǎng)絡(luò)分析是目前研究的一個熱點(diǎn)。在復(fù)雜網(wǎng)絡(luò)分析中一個重要研究方向就是分析網(wǎng)絡(luò)中節(jié)點(diǎn)的重要程度,尋找網(wǎng)絡(luò)中的關(guān)鍵節(jié)點(diǎn),例如通過控制恐怖組織的領(lǐng)導(dǎo)者可以控制整個恐怖組織,進(jìn)而避免一些恐怖襲擊案件的發(fā)生;對網(wǎng)絡(luò)中的關(guān)鍵服務(wù)器加以保護(hù),可以防止其受到病毒或者黑客的攻擊,從而達(dá)到整個網(wǎng)絡(luò)正常運(yùn)行的目的;通過隔離傳染源,有效預(yù)防傳染病毒的傳播與擴(kuò)散;根據(jù)人體致命性的蛋白質(zhì)組成研制出新的藥物等等。上述應(yīng)用中都需要使用的算法是介度中心算法,因此介度中心算法在復(fù)雜網(wǎng)絡(luò)分析中占有重要位置。雖然介度中心算法在實(shí)際應(yīng)用中的使用比較廣泛,但是還存在兩個問題。首先,介度中心算法的應(yīng)用場景不確定,目前的衡量網(wǎng)絡(luò)節(jié)點(diǎn)重要程度的算法比較多,還有一些其他常用關(guān)鍵點(diǎn)發(fā)現(xiàn)算法,如PageRank算法等,對于何時選擇介度中心算法進(jìn)行關(guān)鍵點(diǎn)發(fā)現(xiàn)目前還沒有研究,因此確定介度中心算法的應(yīng)用場景是目前亟需解決的一個問題;其次,介度中心算法的計(jì)算量過大,運(yùn)行時間長,在大數(shù)據(jù)時代不適用,現(xiàn)在最快的介度中心的算法時間復(fù)雜度為O(V*E),大規(guī)模的圖完成該算法的時間一般比較長,對于一個擁有百萬節(jié)點(diǎn)的圖數(shù)據(jù),完成計(jì)算就需要十幾個月的時間,而且雖然目前的相關(guān)工作降低了介度中心算法的運(yùn)行時間,但是完成一個節(jié)點(diǎn)規(guī)模超過百萬的圖的介度中心值計(jì)算仍然需要數(shù)月的時間,因此介度中心算法在大數(shù)據(jù)時代不適用。為了確定介度中心算法的應(yīng)用場景,本文通過比較介度中心算法和PageRank算法得到的關(guān)鍵點(diǎn)集合所處在網(wǎng)絡(luò)上位置的差異,根據(jù)7種網(wǎng)絡(luò)類型下的15個真實(shí)數(shù)據(jù)集的實(shí)驗(yàn)結(jié)果分析出了這兩種關(guān)鍵點(diǎn)發(fā)現(xiàn)算法的應(yīng)用場景。介度中心算法用于查找整個網(wǎng)絡(luò)中的重要節(jié)點(diǎn),PageRank算法用于查找某個領(lǐng)域內(nèi)熟知度比較高的點(diǎn)。為了解決介度中心算法計(jì)算量過大的問題,本文通過近似算法的角度來降低介度中心算法的計(jì)算量。經(jīng)研究發(fā)現(xiàn),目前的復(fù)雜網(wǎng)絡(luò)都呈現(xiàn)小世界網(wǎng)絡(luò)和冪律分布的特點(diǎn),可以考慮將圖本身的特征或者再加上其他的直觀圖特征與近似算法結(jié)合來降低介度中心算法的計(jì)算時間,使之能達(dá)到實(shí)用的程度。本文根據(jù)圖數(shù)據(jù)本身的特征提出了一種基于頂點(diǎn)加權(quán)的介度中心近似算法,具體做法是選取高影響力的源點(diǎn)與頂點(diǎn)加權(quán)的方式來降低介度中心算法的計(jì)算量,使用該算法的加速比平均為25,而近似結(jié)果的誤差率小于0.01%,符合在實(shí)際應(yīng)用中只關(guān)心部分重要節(jié)點(diǎn)排名的要求。所以,基于頂點(diǎn)加權(quán)的介度中心近似算法在保證近似結(jié)果精確度的前提下,大大降低了介度中心算法的計(jì)算量。
[Abstract]:At present, the research field of complex network has been involved in computer science, biological engineering, physics, urban transportation and other disciplines, and has become more and more closely related to all aspects of human life. Therefore, complex network analysis is a hot research topic at present. An important research direction in complex network analysis is to analyze the importance of the nodes in the network, to find the key nodes in the network, for example, to control the entire terrorist organization by controlling the leader of the terrorist organization. Thus avoiding some terrorist attack cases, protecting the key servers in the network, preventing them from being attacked by viruses or hackers, so as to achieve the purpose of normal operation of the entire network; by isolating the source of infection, To prevent the spread and spread of infectious virus effectively; to develop new drugs according to human lethal protein composition, and so on. The algorithm that needs to be used in the above applications is the meshing center algorithm, so it plays an important role in the analysis of complex networks. Although the dielectric center algorithm is widely used in practical applications, there are still two problems. First of all, the application scene of the mesoscale center algorithm is uncertain, there are many algorithms to measure the importance of network nodes, and there are some other commonly used key point discovery algorithms, such as PageRank algorithm, etc. At present, there is no research on when to select the key point of the algorithm, so it is urgent to solve the problem of determining the application scene of the algorithm. Secondly, the computation of the algorithm is too large and the running time is long. It was not applicable in big data's time. The time complexity of the fastest mesoscale center algorithm is OVS, and it takes a long time to complete the algorithm on a large scale graph, so for a graph with millions of nodes, It takes more than a dozen months to complete the calculation, and although the current work reduces the running time of the mesoscale center algorithm, it still takes months to complete the calculation of the mesoscale center value of a graph with a node size of more than a million. Therefore, the mesoscale center algorithm is not applicable in big data era. In order to determine the application scene of the meshing center algorithm, this paper compares the difference of the location of the key points in the network between the meshing center algorithm and the PageRank algorithm. According to the experimental results of 15 real data sets under 7 kinds of network types, the application scenarios of these two key point discovery algorithms are analyzed. The mesoscale center algorithm is used to find the most important node in the whole network, the PageRank algorithm, which is used to find the point with high degree of familiarity in a certain domain. In order to solve the problem that the computation of the dielectric center algorithm is too large, this paper uses the approximate algorithm to reduce the computational complexity of the dielectric center algorithm. It is found that the current complex networks all present the characteristics of small-world networks and power-law distribution. We can consider combining the characteristics of graph itself or other intuitionistic graph features with approximate algorithms to reduce the computing time of the mesoscale center algorithm. Make it practical. According to the characteristics of graph data, this paper proposes a vertex-weighted permittivity center approximation algorithm. The specific method is to select the high-influence source and vertex weighting method to reduce the computational complexity of the mesoscale center algorithm. The average speedup ratio of the proposed algorithm is 25 and the error rate of the approximate result is less than 0.01 which is in line with the requirement of only paying attention to the ranking of some important nodes in practical application. Therefore, the vertex weighted permittivity center approximation algorithm can greatly reduce the computational complexity of the mesoscale center algorithm on the premise of guaranteeing the accuracy of the approximate results.
【學(xué)位授予單位】:曲阜師范大學(xué)
【學(xué)位級別】:碩士
【學(xué)位授予年份】:2015
【分類號】:O157.5
【參考文獻(xiàn)】
相關(guān)期刊論文 前1條
1 涂登彪;譚光明;孫凝暉;;無鎖同步的細(xì)粒度并行介度中心算法[J];軟件學(xué)報(bào);2011年05期
,本文編號:1777308
本文鏈接:http://sikaile.net/kejilunwen/yysx/1777308.html
最近更新
教材專著