Research on a Community Detection Algorithm Based on User Affinity and Density Peaks
Keywords: social network; community detection; user affinity; density peaks; modularity. Source: Jilin University, master's thesis, 2017. Document type: degree thesis
[Abstract]: With the rapid development of information technology and the spread of smart hardware, society has entered the information age. Online social networks have changed how people live and entertain themselves, and tools such as Weibo, WeChat, and Zhihu keep emerging. They make communication more convenient and faster, bring people closer together, and in turn drive the rapid growth of online social networks. These networks record information about a large number of users whose relationships range from close to distant, and the tendency of social networks to organize into communities is increasingly evident. To better understand the characteristics of community structure and the laws of community evolution, many researchers have turned to the study of social networks. Community detection divides the whole network into finer-grained communities and so gives a clearer picture of the network structure. This thesis addresses the community detection problem in social networks, and its main work is as follows.

First, an improved method for measuring user similarity is given. Most community detection algorithms identify communities effectively, but they consider only direct, undirected relationships between nodes. This is unreasonable for real online social networks, because direct, undirected relationships alone cannot accurately measure how similar two nodes are. This thesis takes both direct and indirect relationships between nodes into account, considers how the direction of a relationship affects the measured similarity between nodes, and gives a new affinity calculation method based on user relationships.
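To make the idea of combining direct and indirect, directed relationships concrete, here is a minimal Python sketch. The abstract does not state the thesis's formulas, so everything below is an assumption for illustration only: the function names, the weights `alpha` and `beta`, the weighted-mix form of direct affinity, and the "strongest two-step path" form of indirect affinity are all hypothetical and are not the author's definitions.

```python
def follow_matrix(n, follows):
    """Build an n x n 0/1 matrix; follows holds (u, v) pairs meaning u follows v."""
    mat = [[0] * n for _ in range(n)]
    for u, v in follows:
        mat[u][v] = 1
    return mat


def fan_matrix(follow):
    """Transpose of the follow matrix: entry [u][v] is 1 when v follows u."""
    n = len(follow)
    return [[follow[v][u] for v in range(n)] for u in range(n)]


def direct_affinity(follow, alpha=0.6):
    """Assumed form: alpha * (i follows j) + (1 - alpha) * (j follows i)."""
    n = len(follow)
    fans = fan_matrix(follow)
    return [
        [0.0 if i == j else alpha * follow[i][j] + (1 - alpha) * fans[i][j]
         for j in range(n)]
        for i in range(n)
    ]


def indirect_affinity(direct):
    """Assumed form: score of the strongest two-step path i -> k -> j."""
    n = len(direct)
    ind = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                ind[i][j] = max(
                    (direct[i][k] * direct[k][j]
                     for k in range(n) if k not in (i, j)),
                    default=0.0,
                )
    return ind


def user_affinity(follow, alpha=0.6, beta=0.7):
    """Assumed combination: beta * direct + (1 - beta) * indirect."""
    direct = direct_affinity(follow, alpha)
    ind = indirect_affinity(direct)
    n = len(follow)
    return [
        [beta * direct[i][j] + (1 - beta) * ind[i][j] for j in range(n)]
        for i in range(n)
    ]


# Toy network: users 0 and 1 follow each other, 1 follows 2, 3 follows 2.
F = follow_matrix(4, [(0, 1), (1, 0), (1, 2), (3, 2)])
A = user_affinity(F)
for row in A:
    print([round(x, 3) for x in row])
```

In this toy network, users 0 and 1 (mutual follow) score highest, user 1 scores higher toward user 2 than the reverse because only 1 follows 2, and users 0 and 2 get a small nonzero score purely through the intermediate user 1.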
The thesis first gives an algorithm for generating the follow matrix and the fan matrix, together with definitions of direct affinity and indirect affinity. A formula for direct affinity is derived from the directed follow and fan relationships, and a method for computing indirect affinity is derived from the indirect relationships between nodes. Finally, a user affinity measure that comprehensively captures the structural relationship between nodes is given, along with its calculation procedure.

Second, the clustering algorithm based on density peaks and fast search is improved. As an efficient and novel clustering method, it can identify the size of communities automatically and can find clusters of arbitrary shape. When identifying community centers, however, it may split a single cluster into two, which degrades the result. This thesis applies its clustering idea to community detection in social networks and, drawing on the characteristics of social networks, gives an improved method that identifies community centers more accurately, yielding a community detection algorithm based on density peaks.

The two improvements are then combined. The user affinity matrix is obtained with the relationship-based affinity calculation method, and the density-peak-based community detection algorithm uses it to compute each user's importance and distance, which makes these attributes more reasonable. The result is a complete community detection algorithm based on user affinity and density peaks.

Finally, the algorithm is validated on a Weibo dataset and on public datasets. The experiments show that the algorithm is feasible and effective.
The algorithm's parameter adjustment strategy gives it good flexibility, and the algorithm also applies to undirected user relationship networks, which shows that it generalizes well.
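The density-peak clustering idea referred to in the abstract can be sketched as follows. This is the standard density-peaks procedure (local density, distance to the nearest denser point, centers chosen by the product of the two), not the thesis's improved center-identification method, which the abstract does not specify. The function name `density_peaks`, the cutoff `d_c`, and the fixed `n_centers` are assumptions for illustration. Reading "importance" as the density `rho` and "distance" as `delta` is likewise an interpretation, and a distance matrix could plausibly be derived from an affinity matrix (for example as one minus a symmetrized affinity), which is also an assumption.

```python
def density_peaks(dist, d_c, n_centers):
    """Standard density-peaks clustering on a symmetric distance matrix.

    Returns (rho, delta, centers, labels). Not the thesis's improved variant.
    """
    n = len(dist)
    # Local density: number of other points closer than the cutoff d_c.
    rho = [sum(1 for j in range(n) if j != i and dist[i][j] < d_c)
           for i in range(n)]
    # Visit points from densest to sparsest; ties are broken by index.
    order = sorted(range(n), key=lambda i: (-rho[i], i))
    rank = {i: pos for pos, i in enumerate(order)}
    delta = [0.0] * n
    nearest_denser = [-1] * n
    for pos, i in enumerate(order):
        if pos == 0:
            # Densest point: by convention, its largest distance to any point.
            delta[i] = max(dist[i])
            continue
        # Closest point among those visited earlier (treated as denser).
        j = min(order[:pos], key=lambda k: dist[i][k])
        delta[i] = dist[i][j]
        nearest_denser[i] = j
    # Centers combine high density with a large distance to denser points.
    centers = sorted(range(n),
                     key=lambda i: (-rho[i] * delta[i], rank[i]))[:n_centers]
    labels = [-1] * n
    for label, c in enumerate(centers):
        labels[c] = label
    # Every other point inherits the label of its nearest denser point.
    for i in order:
        if labels[i] == -1:
            labels[i] = labels[nearest_denser[i]]
    return rho, delta, centers, labels


# Two well-separated groups on a line.
points = [0, 1, 2, 50, 51, 52]
dist = [[abs(a - b) for b in points] for a in points]
rho, delta, centers, labels = density_peaks(dist, d_c=1.5, n_centers=2)
print(centers)  # [1, 4]
print(labels)   # [0, 0, 0, 1, 1, 1]
```

Fixing `n_centers` in advance is a simplification. In the original density-peaks method, centers are usually picked by inspecting a decision graph of `rho` against `delta`.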
[Degree-granting institution]: Jilin University
[Degree level]: Master's
[Year conferred]: 2017
[Classification number]: TP311.13
[Citation]: Sui Peng; Research on a Community Detection Algorithm Based on User Affinity and Density Peaks [D]; Jilin University; 2017
Article ID: 1463598
Article link: http://sikaile.net/kejilunwen/ruanjiangongchenglunwen/1463598.html