Research on a Rumor Identification Mechanism in Micro-Network Environments
Published: 2019-02-16 05:58
[Abstract]: The widespread use of social platforms such as Weibo and WeChat has shortened the information dissemination cycle and widened its reach, amplifying the impact and harm of rumors. How to identify rumors, and thereby block them, has become a hot topic in the field of information dissemination. Based on the maximum entropy model, an improved maximum entropy model, and the explosiveness of rumors, this thesis constructs a mechanism for identifying rumor information in the micro-network environment. The thesis describes its main work as four tasks.

First, the maximum entropy model is applied to rumor identification. Feature functions are determined according to the characteristics of rumors, a training set is designed, and experiments are run with different numbers of features to find the number best suited to rumor identification. A comparison with the results of a support vector machine (SVM) model, a BP neural network model, a Bayesian model, and the K-means algorithm shows that the accuracy of rumor identification based on the maximum entropy model is comparable to that of the Bayesian model and the K-means algorithm, which leaves room for improvement.

Second, the maximum entropy model is improved to raise identification accuracy. A new sample construction method, the center distance clipping method, is proposed to address blurred class boundaries and isolated samples in imbalanced data classification. The method represents each message as a weighted vector, uses the distance between vectors to express the similarity of messages, defines isolated points by the distance from a sample to the center of each class, and clips boundary samples. This addresses the many isolated points and blurred boundaries in the original samples. A new feature selection method, the difference calculation method, is also proposed. It takes into account both the effect of a feature's occurrence count on rumor identification and the low reference value of features that occur frequently in both rumor and non-rumor messages. On this basis, a difference value DC(f) is calculated for each feature f, the features are ranked by difference value, and the n features with the largest values are selected for rumor identification. The feature functions of the maximum entropy model are also modified to suit rumor identification better. With the identification mechanism based on the improved model in place, rumor identification experiments are carried out. In the experimental design, the selection of the training set is improved and then optimized with the center distance clipping method, and the best number of features for rumor identification in the micro-network environment is found experimentally. The results of the improved model are compared with those of the original maximum entropy model and with those of the SVM model, the BP neural network model, the Bayesian model, and the K-means algorithm. The experiments show that identification with the optimized training set and feature functions is clearly better than before optimization, and that its accuracy exceeds that of the other classification methods compared.

Third, messages that the maximum entropy model classifies ambiguously are examined further on the basis of rumor explosiveness. A game model between rumor makers and rumor spreaders and an ET (Explosion-Trust) model of rumors are established. Experiments identify a characteristic shared by widely spread rumors: their explosiveness value falls in the range [0.695, 0.795]. The explosiveness value of a rumor therefore becomes an important basis for rumor identification.
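The center distance clipping and difference calculation steps described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the thesis's own formulation: the abstract gives neither the rules for isolated and boundary samples nor the formula for DC(f). The Euclidean distance, the thresholds `max_radius` and `margin`, and the scoring rule in `difference_scores` are therefore hypothetical stand-ins, chosen only so that features common to both classes score low and frequent, class-specific features score high.

```python
import math
from collections import Counter

import numpy as np


def center_distance_clip(X, y, max_radius, margin):
    """Return a boolean mask of training samples to keep.

    Assumed rules (not taken from the thesis):
    - isolated sample: farther than `max_radius` from its own class center;
    - boundary sample: less than `margin` closer to its own class center
      than to the nearest center of any other class.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    classes = np.unique(y)  # sorted unique labels
    centers = np.stack([X[y == c].mean(axis=0) for c in classes])
    # dists[i, k] = Euclidean distance from sample i to the center of class k
    dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    own_idx = np.searchsorted(classes, y)
    rows = np.arange(len(y))
    own = dists[rows, own_idx]
    masked = dists.copy()
    masked[rows, own_idx] = np.inf  # ignore each sample's own class center
    other = masked.min(axis=1)
    isolated = own > max_radius
    boundary = (other - own) < margin
    return ~(isolated | boundary)


def difference_scores(docs, labels):
    """Score each feature by a hypothetical difference value.

    docs: list of token lists; labels: 1 for rumor, 0 for non-rumor.
    Assumed score: |p_rumor - p_normal| * log(1 + total document count),
    where p_* is the fraction of documents in a class containing the feature.
    """
    rumor, normal = Counter(), Counter()
    for tokens, label in zip(docs, labels):
        (rumor if label == 1 else normal).update(set(tokens))
    n_rumor = max(sum(1 for label in labels if label == 1), 1)
    n_normal = max(sum(1 for label in labels if label != 1), 1)
    scores = {}
    for feature in set(rumor) | set(normal):
        p_rumor = rumor[feature] / n_rumor
        p_normal = normal[feature] / n_normal
        total = rumor[feature] + normal[feature]
        scores[feature] = abs(p_rumor - p_normal) * math.log1p(total)
    return scores


def top_n_features(scores, n):
    """Pick the n highest-scoring features; ties are broken alphabetically."""
    return sorted(scores, key=lambda f: (-scores[f], f))[:n]
```

In such a sketch, the mask returned by `center_distance_clip` would be applied to the training set before the classifier is fitted, and the output of `top_n_features` would determine which features the maximum entropy model's feature functions are built on.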
[Degree-granting institution]: Shandong Normal University
[Degree level]: Master's
[Year degree conferred]: 2017
[Classification number]: G206
Article ID: 2424116
Article link: http://sikaile.net/xinwenchuanbolunwen/2424116.html