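Research and Implementation of a Personalized Recommendation System Based on Logistics Data
Topic keywords: web crawler + domain feature values; Source: Nanjing University of Posts and Telecommunications, master's thesis, 2017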
[Abstract]: As Internet technology has matured, online shopping has become one of the main ways people shop. The resulting surge in users and goods has produced massive volumes of logistics data. In this big-data setting, information asymmetry between logistics resources and user demand leads to excessive transport costs across the logistics network, poorly scheduled logistics resources, and delayed logistics decisions by enterprises. Traditional logistics data processing cannot accurately predict user demand, so enterprises cannot plan their logistics in advance. Personalized recommendation, by contrast, uses data mining to extract user preference information from logistics data, identifies items of interest to each user through similarity computation, delivers precise recommendations to users, and gives enterprises effective data support for decision-making.

Personalized recommendation systems currently face three main questions: first, how to mine user information from massive data so that it fully reflects users' real preferences; second, how to train an effective preference model from the resulting preference data set; and third, how to choose a suitable recommendation algorithm. To study and implement a personalized recommendation system based on logistics data, this thesis takes a book sales system developed for an online bookstore as its basis, focuses on researching and implementing the recommendation system built on it, and addresses the data sparsity, cold start, and scalability problems of traditional book recommendation systems.

The thesis proposes a Domain Features-Aware Recommendation Method (DFAR) based on collaborative filtering. The method uses the feature values of items that hint at user preferences to mine user information indirectly, extracts item domain feature values automatically with existing tools, and optimizes the construction of the user preference model with the Analytic Hierarchy Process (AHP), a multi-attribute decision-making method. Finally, the user preference model is combined with a collaborative filtering algorithm to produce recommendations. Simulation experiments show that the method extracts item domain feature values effectively, alleviates data sparsity and cold start, and considerably improves recommendation accuracy.

The personalized book recommendation system is also studied and implemented on the Hadoop platform. To handle massive data, a distributed, parallel web crawler is implemented with the MapReduce parallel computing framework. Because traditional recommendation algorithms incur high time costs in data processing and computation, a Hadoop-based parallel version of DFAR is implemented, which greatly improves the algorithm's efficiency and meets users' needs. Finally, the performance of the Hadoop-based parallel DFAR method is analyzed against practical application scenarios, and a book recommendation system is designed and implemented using Java Web development technology.
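The abstract does not give DFAR's formulas, so the following is only a rough sketch of the general idea it describes: weight item features with AHP, build user preference profiles from those weighted features, and feed the profiles into user-based collaborative filtering. Everything concrete here is an assumption made for illustration: the three features (fiction, technical, price level), the pairwise comparison matrix, the book and user names, the row geometric-mean approximation of AHP weights, and the use of cosine similarity. It is not the thesis's implementation.

```python
import math

def ahp_weights(pairwise):
    """Approximate AHP priority weights with the row geometric-mean method."""
    n = len(pairwise)
    geo_means = [math.prod(row) ** (1.0 / n) for row in pairwise]
    total = sum(geo_means)
    return [g / total for g in geo_means]

def user_profile(purchased, item_features, weights):
    """Average the AHP-weighted feature vectors of the items a user bought."""
    profile = [0.0] * len(weights)
    for item in purchased:
        for k, w in enumerate(weights):
            profile[k] += w * item_features[item][k]
    return [p / len(purchased) for p in profile]

def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def recommend(target, purchases, item_features, weights, k=1):
    """Recommend unseen items bought by the k users most similar to `target`."""
    profiles = {u: user_profile(items, item_features, weights)
                for u, items in purchases.items()}
    sims = {u: cosine(profiles[target], p)
            for u, p in profiles.items() if u != target}
    neighbours = sorted(sims, key=sims.get, reverse=True)[:k]
    seen = set(purchases[target])
    scores = {}
    for u in neighbours:
        for item in purchases[u]:
            if item not in seen:
                scores[item] = scores.get(item, 0.0) + sims[u]
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical pairwise comparisons: genre features matter more than price level.
pairwise = [
    [1.0, 3.0, 5.0],
    [1 / 3, 1.0, 3.0],
    [1 / 5, 1 / 3, 1.0],
]
weights = ahp_weights(pairwise)

# Hypothetical item feature values: [fiction, technical, price level].
item_features = {
    "book_a": [1.0, 0.0, 0.2],
    "book_c": [0.0, 1.0, 0.5],
    "book_d": [1.0, 0.0, 0.5],
    "book_e": [0.0, 1.0, 0.9],
}
purchases = {
    "alice": ["book_a"],
    "bob": ["book_a", "book_d"],
    "carol": ["book_c", "book_e"],
}

recommended = recommend("alice", purchases, item_features, weights, k=1)
print(recommended)
```

Because the profiles come from item features rather than from co-rated items, two users can be compared even when their purchase histories barely overlap, which is one plausible reading of how a feature-based preference model eases data sparsity and cold start.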
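[Degree-granting institution]: Nanjing University of Posts and Telecommunications
[Degree level]: Master's
[Year degree conferred]: 2017
[Classification number]: TP391.3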