Research on Spam User Detection in Social Networks Based on Co-Training

Published: 2019-06-17 19:06

[Abstract]: In recent years, as Web 2.0 technology has developed and matured, social networks have become an everyday communication tool and have made communication between people far more convenient. However, the large volume of spam and spam users in social networks seriously disrupts that communication: they consume substantial network resources and may harm the rights and interests of legitimate users. Existing techniques for detecting spam and spam users in social networks typically rely on large amounts of labeled data and adopt a supervised learning strategy. Manual labeling, however, is complex and error-prone and costs considerable manpower and material resources, so it is necessary to study how to detect spam and spam users with less labeled data. To address this problem, this thesis proposes a semi-supervised classification framework for detecting spam users in social networks. The framework combines co-training with a clustering algorithm. It first uses the K-medoids clustering algorithm to identify and label informative, representative samples as the initial seed set for semi-supervised learning, and then performs co-training on users' content features and behavior features. The co-training framework repeatedly predicts user labels, takes users whose predictions have high confidence and meet a given threshold as new training data, and retrains the learning models. Iterating this process yields an optimized classification model.
The thesis first introduces the harm caused by social network spam and the need to detect spam users, then surveys spam detection techniques and related theory, then describes in detail the algorithm and implementation of the proposed co-training-based semi-supervised detection framework, and finally presents experiments and analysis on a real Twitter dataset. The results verify the effectiveness and correctness of the proposed framework and show that it can still train an accurate model when few labeled samples are available.
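
The pipeline described in the abstract (K-medoids seed selection, then co-training on a content view and a behavior view with a confidence threshold) can be illustrated with a minimal sketch. The abstract does not name the base classifiers, the distance measure, the number of clusters, or the threshold value, so everything below is an assumption made for illustration: a nearest-centroid classifier per view, Euclidean distance, farthest-first medoid initialisation, `k=4`, `threshold=0.8`, and the function names `k_medoids`, `fit_centroids`, `predict`, and `co_train`. The data are synthetic, and seed labels are read from the ground truth to stand in for manual labeling.

```python
import math
import random


def dist(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def k_medoids(points, k, max_iter=20):
    """Return the sorted indices of k medoids (assumes at least k distinct points)."""
    medoids = [0]  # farthest-first initialisation: deterministic, an assumption of this sketch
    while len(medoids) < k:
        medoids.append(max(range(len(points)),
                           key=lambda i: min(dist(points[i], points[m]) for m in medoids)))
    for _ in range(max_iter):
        clusters = {m: [] for m in medoids}
        for i, p in enumerate(points):  # assign every point to its nearest medoid
            clusters[min(medoids, key=lambda m: dist(p, points[m]))].append(i)
        # move each medoid to the member with the smallest total distance to its cluster
        updated = [min(members or [m],
                       key=lambda c: sum(dist(points[c], points[j]) for j in members))
                   for m, members in clusters.items()]
        if sorted(updated) == sorted(medoids):
            break
        medoids = updated
    return sorted(medoids)


def fit_centroids(X, y):
    """Nearest-centroid 'model': one mean vector per class (0 = legitimate, 1 = spam)."""
    cents = {}
    for label in (0, 1):
        rows = [x for x, t in zip(X, y) if t == label]
        if not rows:
            raise ValueError("the labeled set must contain both classes")
        cents[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return cents


def predict(cents, x):
    """Return (label, confidence); confidence lies in [0.5, 1.0]."""
    d0, d1 = dist(x, cents[0]), dist(x, cents[1])
    total = d0 + d1
    return (0 if d0 <= d1 else 1), (0.5 if total == 0 else max(d0, d1) / total)


def co_train(view_a, view_b, seeds, threshold=0.8, max_rounds=10):
    """Grow the labeled set from `seeds` (index -> label) using two feature views."""
    labeled = dict(seeds)
    for _ in range(max_rounds):
        idx = sorted(labeled)
        y = [labeled[i] for i in idx]
        model_a = fit_centroids([view_a[i] for i in idx], y)
        model_b = fit_centroids([view_b[i] for i in idx], y)
        added = {}
        for i in range(len(view_a)):
            if i in labeled:
                continue
            for model, view in ((model_a, view_a), (model_b, view_b)):
                label, conf = predict(model, view[i])
                if conf >= threshold:
                    added.setdefault(i, label)  # on disagreement the first view wins
        if not added:  # nothing confident enough: stop iterating
            break
        labeled.update(added)
    return labeled


# Synthetic toy data (not from the thesis): 20 legitimate users (0) and 20 spam users (1).
rng = random.Random(0)
truth = [0] * 20 + [1] * 20
content = [[10 * t + rng.uniform(-0.5, 0.5) for _ in range(2)] for t in truth]
behavior = [[10 * t + rng.uniform(-0.5, 0.5) for _ in range(2)] for t in truth]
seed_ids = k_medoids([c + b for c, b in zip(content, behavior)], k=4)
seeds = {i: truth[i] for i in seed_ids}  # stands in for manual labeling of the medoids
labels = co_train(content, behavior, seeds)
print(sum(labels[i] == truth[i] for i in labels), "of", len(labels), "labels match")
```

Only the four medoids are labeled by hand in this sketch; every other label is produced by the two view-specific models. In a real setting the nearest-centroid models would be replaced by whatever classifiers the thesis actually uses, and the threshold would need tuning on held-out data.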
[Degree-granting institution]: Dalian University of Technology
[Degree level]: Master's
[Year degree granted]: 2016
[Classification number]: TP393.09; TP311.13


Article ID: 2501193


Article link: http://sikaile.net/kejilunwen/ruanjiangongchenglunwen/2501193.html
