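Research on Collaborative Filtering Models Based on Deep Learning
Published: 2018-04-22 06:33
Topic: stacked denoising autoencoder + collaborative filtering. Source: Shenzhen University, master's thesis, 2017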
[Abstract]: With the rapid development of Internet technology, people's way of life has changed profoundly. In an Internet era of abundant information and fierce competition, helping users quickly and accurately pick out the items that interest them is critical for an Internet company, and recommender systems emerged in response to this need. Collaborative filtering is the most widely used and most popular technique in recommender systems. Traditional collaborative filtering uses only the user-item rating matrix, but this matrix is usually very sparse, which severely reduces recommendation accuracy, and traditional collaborative filtering also suffers from the cold-start problem for new items. To address these problems, the main work of this thesis consists of the following two parts.
(1) When a restricted Boltzmann machine (RBM) is used for collaborative filtering, its recommendation performance depends strongly on the sparsity of the rating matrix: performance is poor when the matrix is sparse. Moreover, RBM-based recommendation uses only the rating matrix, so it has a cold-start problem for new items. To address this, the thesis proposes an RBM collaborative filtering method that incorporates item content similarity, named CS-RBM. The method uses Word2vec to represent item content as vectors and computes the similarity between items. The resulting item similarity measure is then added to the rating predicted by the RBM model, so that the final predicted rating reflects both the latent factors in the rating matrix and the similarity between item contents.
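The way CS-RBM combines a content-similarity term with an RBM prediction can be illustrated with a minimal sketch. The abstract does not give the exact combination formula, so everything specific here is an assumption for illustration: the item vectors stand in for Word2vec-derived content vectors, `rbm_score` is a placeholder for the rating an RBM would predict, and the convex blend with weight `alpha` is one plausible way to add the similarity measure to the prediction, not necessarily the thesis's formula. All function names are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def content_score(target, user_ratings, item_vecs):
    """Similarity-weighted average of the ratings the user has already given.

    Returns None when no rated item has positive similarity to the target.
    """
    num = 0.0
    den = 0.0
    for item, rating in user_ratings.items():
        # Negative similarities are clipped to zero (an assumption of this sketch).
        sim = max(cosine(item_vecs[target], item_vecs[item]), 0.0)
        num += sim * rating
        den += sim
    return num / den if den > 0.0 else None


def blended_score(rbm_score, target, user_ratings, item_vecs, alpha=0.7):
    """Blend a (placeholder) RBM prediction with the content-based score."""
    c = content_score(target, user_ratings, item_vecs)
    if c is None:
        return rbm_score  # no usable content signal: keep the RBM prediction
    return alpha * rbm_score + (1.0 - alpha) * c


# Toy data: item A has the same content vector as B and is orthogonal to C.
item_vecs = {"A": [1.0, 0.0], "B": [1.0, 0.0], "C": [0.0, 1.0]}
user_ratings = {"B": 5.0, "C": 1.0}
score = blended_score(3.0, "A", user_ratings, item_vecs, alpha=0.5)
print(score)
```

In this toy case the user's high rating for the content-identical item B pulls the blended prediction for A above the placeholder RBM score, which is the kind of effect the method relies on for items with few or no ratings.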
Experiments on the ml-100k, ml-1m, and Netflix datasets show that CS-RBM achieves better recommendation performance than the original RBM model.
(2) The CS-RBM method uses item content information only in a simple way: it cannot capture deeper latent factors from item content to improve the model, and it does not consider the influence of user features. To address these problems, the thesis builds on the collaborative deep learning model (CDL) and proposes DCDL, a deep collaborative model that imposes bidirectional constraints on both user features and item features. The model jointly trains a deep stacked denoising autoencoder (SDAE) and probabilistic matrix factorization (PMF) to automatically learn latent item features and latent user features from item content and the rating matrix, so that it accounts for the influence of both item content and user features on recommendation. Experiments on the citeulike-a, citeulike-t, and Netflix datasets show that DCDL achieves better recommendation performance than collaborative topic regression (CTR) and CDL.
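The idea of constraining both user and item latent vectors can be sketched as a matrix-factorization objective with two extra penalty terms. The abstract does not state the DCDL objective, so this is an assumed form: squared rating error plus penalties that pull each user vector toward an encoding of the user's features and each item vector toward an encoding of the item's content. In the model described above the encodings would come from an SDAE trained jointly with PMF; here they are fixed toy vectors, and `objective`, `gradient_step`, `lam_u`, `lam_v`, and `lr` are hypothetical names and parameters.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def objective(ratings, U, V, user_enc, item_enc, lam_u=0.1, lam_v=0.1):
    """Half squared rating error plus penalties tying U and V to their encodings."""
    err = sum((r - dot(U[i], V[j])) ** 2 for (i, j), r in ratings.items())
    pen_u = sum(sq_dist(U[i], user_enc[i]) for i in U)
    pen_v = sum(sq_dist(V[j], item_enc[j]) for j in V)
    return 0.5 * err + 0.5 * lam_u * pen_u + 0.5 * lam_v * pen_v


def gradient_step(ratings, U, V, user_enc, item_enc, lam_u=0.1, lam_v=0.1, lr=0.05):
    """One full-batch gradient descent step on U and V (encodings held fixed)."""
    # Gradient of the penalty terms.
    grad_U = {i: [lam_u * (a - b) for a, b in zip(U[i], user_enc[i])] for i in U}
    grad_V = {j: [lam_v * (a - b) for a, b in zip(V[j], item_enc[j])] for j in V}
    # Gradient of the rating error, accumulated over observed ratings only.
    for (i, j), r in ratings.items():
        e = dot(U[i], V[j]) - r
        for k in range(len(U[i])):
            grad_U[i][k] += e * V[j][k]
            grad_V[j][k] += e * U[i][k]
    new_U = {i: [u - lr * g for u, g in zip(U[i], grad_U[i])] for i in U}
    new_V = {j: [v - lr * g for v, g in zip(V[j], grad_V[j])] for j in V}
    return new_U, new_V


# Toy data: two users, two items, three observed ratings.
ratings = {("u1", "i1"): 4.0, ("u1", "i2"): 1.0, ("u2", "i1"): 5.0}
U = {"u1": [0.1, 0.1], "u2": [0.1, 0.1]}
V = {"i1": [0.1, 0.1], "i2": [0.1, 0.1]}
user_enc = {"u1": [1.0, 0.0], "u2": [0.0, 1.0]}  # stand-in for encoded user features
item_enc = {"i1": [1.0, 1.0], "i2": [0.0, 0.0]}  # stand-in for encoded item content

before = objective(ratings, U, V, user_enc, item_enc)
U, V = gradient_step(ratings, U, V, user_enc, item_enc)
after = objective(ratings, U, V, user_enc, item_enc)
print(before, after)
```

Holding the encodings fixed keeps the sketch short. A joint scheme would also update the autoencoder weights, typically alternating with the latent-vector updates.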
[Degree-granting institution]: Shenzhen University
[Degree level]: Master's
[Year degree granted]: 2017
[Classification number]: TP391.3
Article ID: 1786043
Article link: http://sikaile.net/kejilunwen/ruanjiangongchenglunwen/1786043.html