基于虛擬坐標(biāo)的社交網(wǎng)絡(luò)距離估計(jì)改進(jìn)算法研究
發(fā)布時(shí)間:2019-01-28 18:35
【摘要】:隨著互聯(lián)網(wǎng)技術(shù)的不斷發(fā)展,社交網(wǎng)絡(luò)的發(fā)展也如火如荼,衍生出一系列社交網(wǎng)絡(luò)應(yīng)用。這些應(yīng)用大多以網(wǎng)絡(luò)節(jié)點(diǎn)之間的距離信息為基礎(chǔ),社交網(wǎng)絡(luò)中的距離是對用戶之間邏輯關(guān)系的一種描述,與物理距離無關(guān),一般使用節(jié)點(diǎn)間最短路徑的長度表示。隨著現(xiàn)代社交網(wǎng)絡(luò)的規(guī)模日漸龐大,使用傳統(tǒng)方法對網(wǎng)絡(luò)中節(jié)點(diǎn)間距離進(jìn)行精確計(jì)算所花費(fèi)代價(jià)已難以承受,F(xiàn)有的大規(guī)模社交網(wǎng)絡(luò)中節(jié)點(diǎn)間距離估計(jì)算法大多采用將社交網(wǎng)絡(luò)嵌入低維空間的技術(shù),也就是利用參考節(jié)點(diǎn)進(jìn)行坐標(biāo)分配的方法實(shí)現(xiàn)距離估計(jì)。但現(xiàn)有的方法中存在一個(gè)共性問題,在為普通節(jié)點(diǎn)分配坐標(biāo)時(shí),僅僅使用了參考節(jié)點(diǎn)中的一少部分用于坐標(biāo)分配,影響了距離估計(jì)算法的精度。本文針對現(xiàn)有方法的不足進(jìn)行研究,提出了基于坐標(biāo)嵌入技術(shù)的距離估計(jì)算法SDE(ShortestPath Distance Estimation)。該算法選擇將社交網(wǎng)絡(luò)嵌入不受對稱性約束的內(nèi)積空間,使其能同時(shí)適用于無向圖與有向圖進(jìn)行距離估計(jì)。為提高距離估計(jì)的精確度,該算法使用全部的參考節(jié)點(diǎn)信息為節(jié)點(diǎn)分配坐標(biāo),但這就存在會(huì)過多增加計(jì)算壓力的問題。為了解決這一問題,本文提出了基于組矩陣分解的坐標(biāo)分配算法RGMD(Robust Group Matrix Decomposition)。RGMD借鑒了魯棒性主成分分析提取低秩矩陣、去除異常值的思想,將坐標(biāo)分配過程轉(zhuǎn)化為抽取低秩矩陣的優(yōu)化問題。RGMD算法采用把所有參考節(jié)點(diǎn)分成若干組的方式,通過組與組之間并行計(jì)算完成對其它節(jié)點(diǎn)的坐標(biāo)分配。這種方法將坐標(biāo)分配算法的優(yōu)化過程由對單一變量的優(yōu)化轉(zhuǎn)化為對一組變量的同時(shí)優(yōu)化,不僅將坐標(biāo)分配算法的計(jì)算壓力從指數(shù)增加降低到常數(shù)倍增加,而且將坐標(biāo)分配與坐標(biāo)校正兩個(gè)過程合二為一,進(jìn)一步提高了距離估計(jì)算法的精確度。本文在6個(gè)開源的真實(shí)社交網(wǎng)絡(luò)數(shù)據(jù)集上對算法的性能進(jìn)行了驗(yàn)證,主要從算法的精確度、收斂速度以及對參數(shù)的敏感性等方面展開。除此之外,還實(shí)現(xiàn)了4個(gè)社交網(wǎng)絡(luò)中常用的基于距離計(jì)算的應(yīng)用,從應(yīng)用的角度對算法的性能進(jìn)行驗(yàn)證。實(shí)驗(yàn)結(jié)果表明,相較于已有的算法,本文提出的算法雖然時(shí)間開銷略有增加,但在精確度方面有大幅度提升。
[Abstract]:With the continuous development of Internet technology, the development of social networks is in full swing, derived a series of social network applications. Most of these applications are based on the distance information between network nodes. The distance in social network is a description of the logical relationship between users, which is independent of physical distance, and is usually expressed by the shortest path length between nodes. With the increasing scale of modern social networks, the cost of accurately calculating the distance between nodes in the network by traditional methods has become unbearable. Most of the existing algorithms for estimating distance between nodes in large-scale social networks are based on the technique of embedding social networks into low-dimensional space, that is, using reference nodes to allocate coordinates to realize distance estimation. However, there is a common problem in the existing methods. Only a few of the reference nodes are used for coordinate assignment, which affects the accuracy of the distance estimation algorithm. In this paper, a distance estimation algorithm, SDE (ShortestPath Distance Estimation)., based on coordinate embedding technique, is proposed to study the shortcomings of the existing methods. The algorithm chooses to embed the social network into the inner product space which is not constrained by symmetry so that it can be used to estimate the distance between the undirected graph and the directed graph at the same time. In order to improve the accuracy of distance estimation, the algorithm allocates coordinates to the nodes using all the reference node information, but there is a problem that the calculation pressure will be increased too much. In order to solve this problem, a coordinate allocation algorithm based on group matrix decomposition (RGMD (Robust Group Matrix Decomposition). RGMD) is proposed in this paper, which uses robust principal component analysis (PCA) to extract low rank matrix and remove outliers. The coordinate assignment process is transformed into an optimization problem of extracting low rank matrix. RGMD algorithm divides all reference nodes into several groups and accomplishes coordinate assignment to other nodes by parallel computation between groups. This method transforms the optimization process of coordinate allocation algorithm from the optimization of a single variable to the optimization of a set of variables at the same time. It not only reduces the computational pressure of the coordinate allocation algorithm from exponential to constant times, In addition, coordinate assignment and coordinate correction are combined to improve the accuracy of the distance estimation algorithm. In this paper, the performance of the algorithm is verified on six open source real social network datasets, mainly from the accuracy, convergence speed and sensitivity to parameters of the algorithm. In addition, four commonly used distance computing applications in social networks are implemented, and the performance of the algorithm is verified from the application point of view. The experimental results show that, compared with the existing algorithms, the proposed algorithm increases the time cost slightly, but greatly improves the accuracy.
【學(xué)位授予單位】:哈爾濱工業(yè)大學(xué)
【學(xué)位級別】:碩士
【學(xué)位授予年份】:2017
【分類號】:TP393.09
本文編號:2417197
[Abstract]:With the continuous development of Internet technology, the development of social networks is in full swing, derived a series of social network applications. Most of these applications are based on the distance information between network nodes. The distance in social network is a description of the logical relationship between users, which is independent of physical distance, and is usually expressed by the shortest path length between nodes. With the increasing scale of modern social networks, the cost of accurately calculating the distance between nodes in the network by traditional methods has become unbearable. Most of the existing algorithms for estimating distance between nodes in large-scale social networks are based on the technique of embedding social networks into low-dimensional space, that is, using reference nodes to allocate coordinates to realize distance estimation. However, there is a common problem in the existing methods. Only a few of the reference nodes are used for coordinate assignment, which affects the accuracy of the distance estimation algorithm. In this paper, a distance estimation algorithm, SDE (ShortestPath Distance Estimation)., based on coordinate embedding technique, is proposed to study the shortcomings of the existing methods. The algorithm chooses to embed the social network into the inner product space which is not constrained by symmetry so that it can be used to estimate the distance between the undirected graph and the directed graph at the same time. In order to improve the accuracy of distance estimation, the algorithm allocates coordinates to the nodes using all the reference node information, but there is a problem that the calculation pressure will be increased too much. In order to solve this problem, a coordinate allocation algorithm based on group matrix decomposition (RGMD (Robust Group Matrix Decomposition). RGMD) is proposed in this paper, which uses robust principal component analysis (PCA) to extract low rank matrix and remove outliers. The coordinate assignment process is transformed into an optimization problem of extracting low rank matrix. RGMD algorithm divides all reference nodes into several groups and accomplishes coordinate assignment to other nodes by parallel computation between groups. This method transforms the optimization process of coordinate allocation algorithm from the optimization of a single variable to the optimization of a set of variables at the same time. It not only reduces the computational pressure of the coordinate allocation algorithm from exponential to constant times, In addition, coordinate assignment and coordinate correction are combined to improve the accuracy of the distance estimation algorithm. In this paper, the performance of the algorithm is verified on six open source real social network datasets, mainly from the accuracy, convergence speed and sensitivity to parameters of the algorithm. In addition, four commonly used distance computing applications in social networks are implemented, and the performance of the algorithm is verified from the application point of view. The experimental results show that, compared with the existing algorithms, the proposed algorithm increases the time cost slightly, but greatly improves the accuracy.
【學(xué)位授予單位】:哈爾濱工業(yè)大學(xué)
【學(xué)位級別】:碩士
【學(xué)位授予年份】:2017
【分類號】:TP393.09
【參考文獻(xiàn)】
相關(guān)期刊論文 前5條
1 劉井蓮;王大玲;趙衛(wèi)績;馮時(shí);張一飛;;一種面向度中心性及重疊網(wǎng)絡(luò)社區(qū)的發(fā)現(xiàn)算法[J];計(jì)算機(jī)科學(xué);2016年03期
2 張方風(fēng);劉軍;;復(fù)雜網(wǎng)絡(luò)拓?fù)浣Y(jié)構(gòu)與演化模型研究綜述(二)[J];系統(tǒng)科學(xué)學(xué)報(bào);2015年01期
3 張富國;;基于社交網(wǎng)絡(luò)的個(gè)性化推薦技術(shù)[J];小型微型計(jì)算機(jī)系統(tǒng);2014年07期
4 張一飛;張宏莉;;網(wǎng)絡(luò)距離預(yù)測方法的研究[J];電信科學(xué);2010年12期
5 王意潔;李小勇;;網(wǎng)絡(luò)距離預(yù)測技術(shù)研究[J];軟件學(xué)報(bào);2009年06期
,本文編號:2417197
本文鏈接:http://sikaile.net/guanlilunwen/ydhl/2417197.html
最近更新
教材專著