Research on Detecting Zombie Users on Weibo
Published: 2019-02-17 18:49
[Abstract]: With the rise of online social networks, microblogging (Weibo), a convenient and fast medium for spreading information, has become an important way for people to communicate and interact. Microblogging services bring Internet users closer together and let them publish, receive, and spread information quickly. As microblogging has rapidly gained popularity in China and abroad, follower count has gradually become an important metric for gauging a user's prominence and ranking. The zombie followers (that is, zombie users) that emerged as a result disrupt the normal order of microblogging and have triggered a crisis of trust in it. After a long period of evolution, zombie users behave more and more like real users, so identifying them quickly and accurately has become an urgent problem for maintaining the credibility of microblogging.
This thesis takes Sina Weibo, one of the most influential and fastest-growing microblogging platforms in China, as the object of data analysis, and uses the Sina API to collect user data for analysis and for validating the model. Through data analysis, the thesis finds a clear difference between zombie users and real users in whether their follower-relationship networks exhibit clustering. In addition, drawing on the differences between zombie users and real users in follower count, following count, and posting frequency, it proposes an algorithm for computing user credibility and a method for computing user activity, and builds a zombie-user detection model based on follower clustering. Experiments show that the model performs well in detection accuracy and stability, but that its detection efficiency is low.
Because the volume of Weibo user data is huge and processing it is time-consuming, the study also combines the original detection model with cloud computing: the four most time-consuming modules of the model are reimplemented with MapReduce to improve the model's usability. A comparison of the model before and after the change, run on a purpose-built Hadoop cluster, shows that the improved model is markedly more efficient while keeping the original detection accuracy and stability. Moreover, as the number of nodes in the Hadoop cluster increases, detection efficiency grows with a near-linear speedup.
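The abstract does not give the formula used to measure clustering in a user's follower network. As a minimal sketch only, one common way to quantify it is the fraction of follower pairs that are connected by a follow relation in either direction. The function name `follower_clustering`, the `follows` dictionary layout, and the toy user ids below are all assumptions made for illustration; they are not taken from the thesis, and the sketch makes no claim about which kind of user scores higher.

```python
from itertools import combinations


def follower_clustering(followers, follows):
    """Return the fraction of follower pairs linked by a follow edge.

    followers: set of user ids that follow the target user (hypothetical input).
    follows:   dict mapping a user id to the set of ids that user follows.
    A pair counts as linked if either member follows the other.
    """
    members = sorted(followers)
    if len(members) < 2:
        # With fewer than two followers there are no pairs to compare.
        return 0.0
    pairs = list(combinations(members, 2))
    linked = sum(
        1
        for a, b in pairs
        if b in follows.get(a, set()) or a in follows.get(b, set())
    )
    return linked / len(pairs)


# Toy follow graph (illustrative data only).
follows = {"a": {"b", "c"}, "b": {"c"}, "c": set(), "d": set()}

dense = follower_clustering({"a", "b", "c"}, follows)   # every pair is linked
sparse = follower_clustering({"a", "b", "d"}, follows)  # only (a, b) is linked
```

A score like this could be one input to a detection model alongside follower count, following count, and posting frequency, but how the thesis combines these signals is not stated in the abstract.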
[Degree-granting institution]: Zhengzhou University
[Degree level]: Master's
[Year degree conferred]: 2014
[Classification number]: TP393.092
Article ID: 2425478
Article link: http://sikaile.net/guanlilunwen/ydhl/2425478.html