Research on Key Problems of Collaborative Filtering Algorithms in Recommender Systems
Topic: recommender systems + collaborative filtering; Source: Yangzhou University, master's thesis, 2016
[Abstract]: With the development of Web technology, users no longer simply retrieve information from the network; they actively generate it. The rapid growth in the number of users, together with this user-centered mode of information generation, has caused the volume of information on the Internet to grow rapidly, a phenomenon known as "information overload": faced with massive amounts of information, people cannot quickly and accurately find what is useful to them. Recommender systems emerged to address this problem. A recommender system does not require users to state precise needs; instead, it analyzes their past behavior to infer what information they are likely to need in the future. Among the many recommendation techniques, collaborative filtering has been widely applied in e-commerce because of its distinctive advantages. Although research on collaborative filtering has produced many results, problems such as "cold start", "scalability", and "data sparsity" remain unsolved and affect the accuracy of the algorithms. Solving these problems and improving the performance of collaborative filtering has long been a key research topic in recommender systems. The main work of this thesis is as follows.
First, to address the "cold start" and "scalability" problems, a collaborative filtering recommendation algorithm combined with user attribute clustering, ID-CF, is proposed. It combines item-based collaborative filtering with the K-means algorithm through a weighting scheme, which significantly improves recommendation accuracy. Because item-item similarities and user clusters can be computed offline, the approach addresses the scalability problem. When a new user joins the system, the clustering algorithm assigns the user to the closest user group, so the user's ratings for items can be predicted quickly, which also mitigates the cold start problem.
Second, because data sparsity strongly affects the accuracy of collaborative filtering, a collaborative filtering recommendation algorithm combined with a graph model, NG-CF, is proposed. It introduces a new similarity measure in which the similarity between users or items is derived from the relationships between vertices in a graph, and then uses the K-nearest-neighbor algorithm to generate predictions. Experiments show that the predictions remain stable even when the data sparsity is varied.
"Cold start", "scalability", and "data sparsity" are active topics in collaborative filtering research. Building on previous work, this thesis offers only some exploration and analysis, and many problems remain to be improved.
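The first contribution in the abstract rests on two generic building blocks: item-based collaborative filtering and assigning a new user to the nearest user cluster. The sketch below illustrates only those building blocks and is not the thesis's ID-CF algorithm, whose weighting scheme the abstract does not specify. The toy ratings, the attribute vectors, the centroids, and all function names are assumptions made for illustration, and the centroids are taken as given (for example, from an offline K-means run) rather than computed here.

```python
import math

def item_cosine(ratings, a, b):
    """Cosine similarity between items a and b over users who rated both."""
    common = [u for u in ratings if a in ratings[u] and b in ratings[u]]
    dot = sum(ratings[u][a] * ratings[u][b] for u in common)
    norm_a = math.sqrt(sum(ratings[u][a] ** 2 for u in common))
    norm_b = math.sqrt(sum(ratings[u][b] ** 2 for u in common))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # no co-rating users, or all-zero ratings
    return dot / (norm_a * norm_b)

def predict_item_based(ratings, user, item, k=2):
    """Similarity-weighted average of the user's ratings on the k items
    most similar to `item`; returns None if no similar item is available."""
    scored = [(item_cosine(ratings, item, j), r)
              for j, r in ratings[user].items() if j != item]
    top = sorted(scored, reverse=True)[:k]
    total = sum(s for s, _ in top)
    if total == 0:
        return None
    return sum(s * r for s, r in top) / total

def nearest_cluster(attrs, centroids):
    """Index of the centroid closest (Euclidean distance) to `attrs`."""
    return min(range(len(centroids)),
               key=lambda c: math.dist(attrs, centroids[c]))

def cold_start_predict(ratings, members, item):
    """Mean rating of `item` among cluster members who rated it."""
    vals = [ratings[u][item] for u in members if item in ratings.get(u, {})]
    return sum(vals) / len(vals) if vals else None

# Hypothetical toy data: user -> {item: rating on a 1-5 scale}.
ratings = {
    "alice": {"A": 5, "B": 3, "C": 4},
    "bob":   {"A": 4, "B": 2, "C": 5},
    "carol": {"A": 1, "B": 5},
}
# Existing user: predict carol's rating for item C from her other ratings.
pred_existing = predict_item_based(ratings, "carol", "C", k=2)

# New user with no ratings: fall back to the nearest cluster's mean rating.
# Centroids over (age, encoded gender) are assumed to be precomputed offline.
centroids = [(25.0, 0.0), (50.0, 1.0)]
cluster_members = {0: ["alice", "bob"], 1: ["carol"]}
cluster = nearest_cluster((27.0, 0.0), centroids)
pred_new = cold_start_predict(ratings, cluster_members[cluster], "C")
```

In a deployed system the pairwise item similarities and the cluster centroids would be computed offline and cached, which is the property the abstract credits for scalability; here they are recomputed on each call only to keep the sketch short.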
[Degree-granting institution]: Yangzhou University
[Degree level]: Master's
[Year degree conferred]: 2016
[Classification number]: TP391.3
Article No.: 2064381
Article link: http://sikaile.net/jingjilunwen/dianzishangwulunwen/2064381.html