Research on Recommendation Algorithms Based on Rating Selection

Published: 2018-11-01 20:40

[Abstract]: Recommender systems have become one of the most important information-filtering tools of the big-data era. They help users quickly locate valuable information in massive data sets and present it as a list of items the user is likely to find interesting. The explosive growth of information on the Internet, together with the rapid increase in the number of users and items, poses many challenges for recommender systems, and scalability is among the most important. Collaborative filtering is the most successful and most widely used technique in the field. To improve the scalability of collaborative filtering, researchers have proposed a variety of clustering-based and parallel-computing-based schemes. These schemes typically use all of the user rating data in the modeling phase without considering the quality of that data, and most existing papers address scalability only for neighborhood-based collaborative filtering.

Starting from the input data set, this thesis argues that not all user behavior data contribute equally to the final prediction model, especially for active users with a large number of recorded behaviors. For such users, a representative subset of their behavior data can already carry enough information to model them accurately, so a good recommendation can be obtained in less time. Based on this view, the thesis first explores, through a series of experiments, the relationship between the number of user behaviors used in the modeling phase and the performance of the recommendation algorithm, and proposes a recommendation algorithm based on rating selection. All experiments consider both the rating-prediction and the TopN recommendation tasks.

The thesis then proposes a general rating-selection framework that takes both user and movie factors into account, along with three partition-based rating-selection strategies and five strategies based on statistics and information theory, to select each user's most representative ratings. Finally, extensive experiments on the MovieLens and Netflix data sets show that using only a representative subset of active users' behaviors reaches the expected recommendation accuracy while reducing the running time of the algorithm, which improves the scalability of the recommender system. The proposed scheme is applicable to all collaborative filtering algorithms.
[Degree-granting institution]: Zhejiang University
[Degree level]: Master's
[Year degree granted]: 2016
[Classification number]: TP391.3
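
The abstract does not spell out the thesis's eight selection strategies, so the following is only a minimal sketch of the general idea of rating selection: users with more ratings than a threshold keep just their top-scoring ratings before the data reach a collaborative filtering model. The function name `select_ratings`, the parameters `active_threshold` and `keep`, and the use of per-item rating variance as a representativeness score are all illustrative assumptions, not the author's method.

```python
from collections import defaultdict
from statistics import pvariance


def select_ratings(ratings, active_threshold, keep):
    """Return a reduced rating list for model training.

    ratings: list of (user, item, rating) tuples.
    Users with more than `active_threshold` ratings are treated as active
    and keep only `keep` ratings; all other users keep everything.
    """
    by_user = defaultdict(list)
    by_item = defaultdict(list)
    for user, item, value in ratings:
        by_user[user].append((item, value))
        by_item[item].append(value)

    # Hypothetical representativeness score (an assumption for this sketch):
    # the population variance of each item's ratings across all users.
    item_score = {item: pvariance(values) for item, values in by_item.items()}

    selected = []
    for user, rated in by_user.items():
        if len(rated) > active_threshold:
            # Highest score first; ties broken by item id for determinism.
            rated = sorted(rated, key=lambda iv: (-item_score[iv[0]], iv[0]))[:keep]
        selected.extend((user, item, value) for item, value in rated)
    return selected


if __name__ == "__main__":
    toy = [
        ("u1", "a", 5), ("u1", "b", 3), ("u1", "c", 4),
        ("u2", "a", 1), ("u2", "b", 3),
        ("u3", "c", 4),
    ]
    for row in select_ratings(toy, active_threshold=2, keep=2):
        print(row)
```

The reduced list can then be passed to any collaborative filtering trainer in place of the full rating set; a different scoring function can be swapped in by changing how `item_score` is computed.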


Article ID: 2305040


Article link: http://sikaile.net/kejilunwen/ruanjiangongchenglunwen/2305040.html
