

Research and Implementation of a Personalized Recommendation System Based on the Hadoop Big Data Framework

Published: 2018-11-22 19:42
[Abstract]: Information overload is an increasingly prominent problem. Three mature approaches address it: website directories, search engines, and recommendation systems. Website directories collect well-known sites and organize them by category, while search engines build indexes over massive numbers of web pages. When users cannot state their needs clearly, however, both fall short, and this is the case recommendation systems are designed for: by analyzing a user's historical behavior records, a recommendation system proactively suggests content the user is likely to be interested in. As the Internet grows and the volume of information multiplies, traditional recommendation systems tend to hit computational bottlenecks on massive data. They also do not fully account for the fact that user interests change over time and are somewhat scattered. To address these problems, this thesis draws on earlier recommendation system designs and, targeting personalized book recommendation in a search engine setting, studies and implements a hybrid recommendation scheme based on latent semantic analysis and piecewise clustering. The Hadoop big data processing framework is used to handle the system's massive data.

The thesis first studies how to collect user behavior data in a search engine setting. It analyzes the types of user behavior and their characteristics, and applies a collection method and a standardization method suited to each data type.

Second, to address the distinctive nature of user behavior in a search engine setting and the variability of user interests, it proposes a latent semantic analysis model and a piecewise clustering model that mine long-term interests and immediate interests, respectively, from large-scale user behavior data. The latent semantic analysis model recommends by content, which alleviates the cold-start problem for both users and books and improves recommendation coverage. The collaborative filtering model based on piecewise clustering splits user behavior into slices by attribute and content, which makes it possible to extract a user's interests in different periods; this further improves recommendation performance and gives the results a degree of novelty. In addition, for the user similarity computation required during piecewise clustering, the thesis proposes an improved similarity measure for mixed-type data based on users' search terms.

Finally, the thesis studies how to parallelize user behavior preprocessing and the recommendation algorithms on the Hadoop big data processing framework, and completes the design and implementation of the personalized book recommendation system. With the Hadoop platform and the parallelized recommendation algorithms, the system's capacity to process massive data is greatly improved. With the latent semantic analysis model and the piecewise clustering model working together, the precision and coverage of personalized book recommendation also improve to some extent. System tests and algorithm experiments verify the correctness of the system.
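The abstract does not describe the parallel algorithms themselves, so the following is only a minimal sketch of the general map, shuffle, and reduce pattern that Hadoop jobs follow, applied to a common recommendation building block: counting how often two books appear in the same user's history. It runs in memory in plain Python and uses no Hadoop API. The click log, the two-job layout, and every function name here are illustrative assumptions, not the thesis's design.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical click log of (user_id, book_id) records; not data from the thesis.
LOG = [
    ("u1", "b1"), ("u1", "b2"), ("u1", "b3"),
    ("u2", "b1"), ("u2", "b2"),
    ("u3", "b2"), ("u3", "b3"),
]


def run_job(records, mapper, reducer):
    """Run one in-memory map -> shuffle -> reduce pass (a stand-in for one Hadoop job)."""
    groups = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)  # shuffle: group mapped values by key
    output = []
    for key in sorted(groups):
        output.extend(reducer(key, groups[key]))
    return output


# Job 1: group books by user, then emit every unordered pair of books per user.
def map_user_book(record):
    user, book = record
    yield user, book


def reduce_to_pairs(user, books):
    for a, b in combinations(sorted(set(books)), 2):
        yield (a, b), 1


# Job 2: sum the emitted ones for each book pair.
def map_identity(record):
    yield record  # each record is already a (key, value) tuple


def reduce_sum(pair, ones):
    yield pair, sum(ones)


def cooccurrence(log):
    """Chain the two jobs and return {(book_a, book_b): shared_user_count}."""
    pairs = run_job(log, map_user_book, reduce_to_pairs)
    return dict(run_job(pairs, map_identity, reduce_sum))


if __name__ == "__main__":
    for pair, count in sorted(cooccurrence(LOG).items()):
        print(pair, count)
```

Because each mapper and reducer only sees one record or one key's values at a time, the same functions could in principle be distributed across nodes; on a real cluster the grouping step would be performed by the framework rather than by a dictionary.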
[Degree-granting institution]: University of Electronic Science and Technology of China
[Degree level]: Master's
[Year conferred]: 2016
[Classification number]: TP391.3





Article link: http://sikaile.net/kejilunwen/sousuoyinqinglunwen/2350346.html

