
Research on a Spark-Based Recommender System

Published: 2018-04-20 15:10

  Topic: recommender systems + collaborative filtering algorithms; Source: Zhejiang Sci-Tech University, 2017 master's thesis


[Abstract]: With the rapid development of the Internet and information technology, massive amounts of data are being generated, and extracting valuable data from this mass of information is a pressing problem. Recommender systems are one effective answer: they recommend products to target users based on users' historical behavior and preferences, and are widely used in e-commerce, video and music portals, and many other fields. They still suffer, however, from data sparsity, cold start, and unsatisfactory prediction accuracy. In particular, as the numbers of users and items keep growing, traditional single-machine recommendation algorithms hit a scalability bottleneck and struggle to meet today's commercial needs; parallel implementations on distributed computing platforms offer a new way to address this. Spark is a newer, in-memory, general-purpose parallel engine for big data computation that has attracted wide attention in big data processing because of its advantages in iterative parallel computation. This thesis studies neighborhood-based and model-based recommendation algorithms, improves them to address sparsity, cold start, and unsatisfactory prediction accuracy, and designs and implements the improved algorithms in parallel on a Spark cluster. The specific work is as follows:
(1) For user-based collaborative filtering, whose prediction accuracy is unsatisfactory when rating data is sparse, the thesis introduces a user attribute feature similarity. When computing user similarity, it combines the user attribute feature similarity with the collaborative filtering similarity between users, which mitigates the effect of rating sparsity on the user similarity computation. The optimized algorithm is implemented on the Spark platform, and the experimental results show that it improves recommendation prediction accuracy and also improves execution efficiency.
(2) For item-based collaborative filtering, whose prediction accuracy is unsatisfactory under cold start, the thesis introduces an item attribute feature similarity. When computing item similarity, it combines the item attribute feature similarity with the rating-based similarity, which reduces the negative effect of the cold start problem on the item similarity computation. The optimized algorithm is designed and implemented in parallel on the Spark platform, and the experimental results show that it improves the system's prediction accuracy.
(3) For the recommendation algorithm based on the ALS model, the thesis designs a new objective function that incorporates user and item similarity information available before model training. The ALS-based algorithm is designed and implemented in parallel on the Spark platform, and the experimental results show that the new objective function yields better prediction accuracy and also improves execution efficiency.
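The similarity combination described in items (1) and (2) can be sketched in a few lines. The abstract does not state how the two similarities are combined, so the linear blend below, the weight `alpha`, the cosine measure for ratings, and the attribute-matching measure are all assumptions made for illustration, not the thesis's actual formulas. The function names are hypothetical, and the sketch runs on a single machine rather than on Spark.

```python
import math


def cosine_similarity(ratings_a, ratings_b):
    """Cosine similarity between two sparse rating vectors stored as dicts."""
    common = set(ratings_a) & set(ratings_b)
    if not common:
        return 0.0
    dot = sum(ratings_a[i] * ratings_b[i] for i in common)
    norm_a = math.sqrt(sum(v * v for v in ratings_a.values()))
    norm_b = math.sqrt(sum(v * v for v in ratings_b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def attribute_similarity(attrs_a, attrs_b):
    """Fraction of shared attribute fields on which two profiles agree (assumed measure)."""
    keys = set(attrs_a) & set(attrs_b)
    if not keys:
        return 0.0
    return sum(1 for k in keys if attrs_a[k] == attrs_b[k]) / len(keys)


def hybrid_similarity(ratings_a, ratings_b, attrs_a, attrs_b, alpha=0.5):
    """Assumed linear blend: alpha weights attributes, (1 - alpha) weights ratings."""
    return (alpha * attribute_similarity(attrs_a, attrs_b)
            + (1.0 - alpha) * cosine_similarity(ratings_a, ratings_b))


# Two users with no co-rated items but identical profiles: rating-based
# similarity alone is 0, while the blend still gives a usable signal.
u_ratings, v_ratings = {"item1": 5.0}, {"item2": 4.0}
profile = {"age_group": "20s", "occupation": "student"}
print(hybrid_similarity(u_ratings, v_ratings, profile, dict(profile)))  # 0.5
```

The same blend applies to items by swapping in item attributes and per-item rating vectors. With sparse or cold-start data the rating term is often zero, which is where the attribute term contributes most.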
[Degree-granting institution]: Zhejiang Sci-Tech University
[Degree level]: Master's
[Year degree awarded]: 2017
[Classification number]: TP391.3





Article link: http://sikaile.net/shoufeilunwen/xixikjs/1778289.html

