Research on User Attribute Prediction Based on Co-training
Published: 2018-06-10 06:59
Topic: user attributes + Co-training; Reference: Advanced Engineering Sciences (《工程科學與技術》), 2017, Issue S2
[Abstract]: Existing algorithms that predict user attributes from third-party application data rarely take into account how long an application is actually used in the foreground. This paper therefore constructs, on the basis of application usage frequency and usage duration, an average foreground usage duration feature, which further characterizes a user's degree of interest in an application. In addition, to make full use of the large amount of unlabeled data and to predict user attributes from multiple feature views, the paper adopts a Co-training framework containing two network structures, each combining a stacked autoencoder with a neural network. In the experiments, the stacked autoencoder is first trained on unlabeled data to initialize the network parameters, placing them in a relatively good position; labeled data are then used with an accuracy-based gradient descent algorithm to update the parameters until convergence. The experimental results show that the proposed algorithm improves accuracy, recall, and F1 score.
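The following is a minimal sketch of the two ideas summarized in the abstract, not the authors' implementation. It assumes the average foreground usage duration is total foreground time divided by the number of uses (the abstract says only that the feature is built from frequency and duration), and it swaps in a simple nearest-centroid classifier for each view in place of the paper's stacked autoencoder plus neural network. All names (`avg_foreground_duration`, `CentroidClassifier`, `co_train`) are hypothetical.

```python
import math


def avg_foreground_duration(total_seconds, launches):
    """Assumed feature: foreground time per use; 0.0 if the app was never used."""
    return total_seconds / launches if launches > 0 else 0.0


class CentroidClassifier:
    """Stand-in for one view's model (the paper uses a stacked autoencoder + NN)."""

    def fit(self, X, y):
        sums, counts = {}, {}
        for x, label in zip(X, y):
            acc = sums.setdefault(label, [0.0] * len(x))
            for i, v in enumerate(x):
                acc[i] += v
            counts[label] = counts.get(label, 0) + 1
        self.centroids = {c: [v / counts[c] for v in s] for c, s in sums.items()}
        return self

    def predict_with_confidence(self, x):
        # Confidence is a crude score: the closer to the winning centroid, the higher.
        dists = {c: math.dist(x, m) for c, m in self.centroids.items()}
        label = min(dists, key=dists.get)
        total = sum(dists.values())
        conf = 1.0 - dists[label] / total if total > 0 else 1.0
        return label, conf


def co_train(lab_a, lab_b, labels, unl_a, unl_b, rounds=5, per_round=1):
    """Co-training over two feature views of the same samples.

    lab_a / lab_b: labeled samples seen through view A / view B.
    unl_a / unl_b: unlabeled samples seen through the same two views.
    """
    lab_a, lab_b, labels = list(lab_a), list(lab_b), list(labels)
    pool = list(range(len(unl_a)))  # indices of still-unlabeled samples
    clf_a, clf_b = CentroidClassifier(), CentroidClassifier()
    for _ in range(rounds):
        if not pool:
            break
        clf_a.fit(lab_a, labels)
        clf_b.fit(lab_b, labels)
        picked = {}
        # Each view pseudo-labels the pool samples it is most confident about.
        for clf, unl in ((clf_a, unl_a), (clf_b, unl_b)):
            scored = sorted(
                ((clf.predict_with_confidence(unl[i]), i) for i in pool),
                key=lambda t: t[0][1],
                reverse=True,
            )
            for (label, _conf), i in scored[:per_round]:
                picked.setdefault(i, label)
        # Pseudo-labeled samples join the shared labeled set in both views.
        for i, label in picked.items():
            lab_a.append(unl_a[i])
            lab_b.append(unl_b[i])
            labels.append(label)
        pool = [i for i in pool if i not in picked]
    # Final fit so the last batch of pseudo-labels is used.
    clf_a.fit(lab_a, labels)
    clf_b.fit(lab_b, labels)
    return clf_a, clf_b
```

In this sketch view A could hold duration-based features such as `avg_foreground_duration(...)` and view B frequency-based features. The unsupervised pretraining step described in the abstract is not modeled here; it would replace the `fit` call of each view's model.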
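[Author affiliation]: College of Computer Science, Sichuan University
[Funding]: National Natural Science Foundation of China (61332066; 81373239)
[Classification number]: TP301.6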
Article ID: 2002369
Article link: http://sikaile.net/kejilunwen/ruanjiangongchenglunwen/2002369.html