A Weibo Retweet Behavior Prediction Method Combining Anomaly Detection and Random Forest
Published: 2019-05-17 22:49
[Abstract]: To address the arbitrary feature selection and low accuracy of current Weibo retweet behavior prediction, a prediction method combining anomaly detection and random forest is proposed. First, basic user features, basic post features, and post content topic features are extracted, and user activity and post influence are computed based on relative entropy. Second, a key feature group is selected by combining filter and wrapper feature selection methods. Finally, anomaly detection is combined with the random forest algorithm to predict retweet behavior from the selected key feature group, and out-of-bag error estimation is used to set the number of decision trees and the number of features in the random forest. On a real Sina Weibo dataset, the method is compared with retweet prediction methods based on logistic regression, decision tree, naive Bayes, and random forest. The results show that the prediction accuracy of the proposed method (90.5%) is higher than that of random forest, the best of the baseline methods, and they also confirm the effectiveness of the feature selection method.
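The abstract states that user activity and post influence are computed from relative entropy, but it does not say which distributions are compared. As a rough illustration only, the sketch below implements the standard definition D(p || q) = Σ p_i log(p_i / q_i) for discrete distributions. The function name, the smoothing constant `eps`, and the example histogram are all assumptions made for this sketch and are not taken from the paper.

```python
import math


def relative_entropy(p, q, eps=1e-12):
    """Kullback-Leibler divergence D(p || q), in nats, for discrete inputs.

    p and q are non-negative weights of equal length; each is normalised
    to sum to 1 before the divergence is computed.
    """
    if len(p) != len(q):
        raise ValueError("p and q must have the same length")
    p_sum, q_sum = sum(p), sum(q)
    total = 0.0
    for p_raw, q_raw in zip(p, q):
        pi, qi = p_raw / p_sum, q_raw / q_sum
        if pi > 0:  # terms with p_i == 0 contribute nothing
            # eps (an assumption of this sketch) guards against q_i == 0
            total += pi * math.log(pi / max(qi, eps))
    return total


# Hypothetical data: a user's post counts per time slot vs. a uniform baseline.
user_hist = [8, 1, 1, 0]
uniform = [1, 1, 1, 1]
divergence = relative_entropy(user_hist, uniform)
```

A larger divergence means the user's posting pattern is further from the baseline. How such a value would be turned into an activity or influence score is not described in the abstract, so that step is left open here.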
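[Author affiliations]: School of Computer Science and Information Security, Guilin University of Electronic Technology; Guangxi Key Laboratory of Trusted Software, Guilin University of Electronic Technology
[Funding]: Supported by the Guangxi Science and Technology Key Project (桂科攻1598019-6)
[Classification number]: TP391.1; TP393.092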
Article ID: 2479465
Article link: http://sikaile.net/guanlilunwen/ydhl/2479465.html