Design and Implementation of an Intelligent Recommendation Logistics System Based on the Hadoop Cloud Platform

Published: 2018-05-15 22:10

Topic: recommender systems + Hadoop. Source: Shenyang Normal University, master's thesis, 2015

[Abstract]: Hadoop, the open-source cloud platform framework, has kept improving alongside the rapid growth of the Internet, releasing higher-performance and more stable versions, and has drawn wider attention in today's data-driven environment. It provides an open-source implementation of MapReduce, Google's important distributed parallel programming model, offers a rich set of service interfaces, and can be deployed on clusters of thousands of nodes to handle massive data-computing workloads. When an algorithm is parallelized with the MapReduce programming model, developers only need to focus on the computing task they want to solve and submit their custom MapReduce classes to the corresponding platform interface, which makes developing and researching cloud computing services and big data processing far more convenient.

The main research work of this thesis is built on the Hadoop cloud platform. Four worker nodes were set up on VMware-virtualized servers, and the application study of the intelligent recommendation algorithm was carried out on this small cluster. The thesis studies in detail the deployment and configuration of the Hadoop platform and the programming methods for implementing distributed parallel computation with the MapReduce programming model. It examines the existing customer-relationship information of a logistics business platform and builds a Hadoop-based recommender system framework. In offline experiments, the raw experimental data are extracted from the business platform's Oracle database and transformed by a simple ETL module so that they are better suited to MapReduce algorithms.

The recommendation method is item-based collaborative filtering. Its core is to construct an item-item co-occurrence matrix from the user-item rating matrix and then use the co-occurrence matrix to quickly compute the items a user is likely to be interested in. The basic implementation is relatively simple and fairly efficient on data sets of moderate scale. In the study, the algorithm is implemented with the MapReduce programming model and combined with the logistics business platform to provide recommendation services to enterprise users in the logistics industry. Because Hadoop splits the data set, this implementation suffers from localized recommendation results. To solve this problem, and drawing on an analysis of the platform's data growth and of the overall system structure, the thesis proposes using Redis to build a cache data layer for the recommender system that stores the co-occurrence matrix used by the algorithm, and adjusts the program flow of the original implementation accordingly.

The two methods are analyzed and compared on several evaluation metrics. Experimental results for the adjusted program, which caches the co-occurrence matrix in Redis, show a clear improvement in performance and in the evaluation metrics, an acceptable running time, and good recommendation quality, together with good real-time behavior and scalability as the data set grows.
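The co-occurrence approach described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's MapReduce code: the function names, the sample ratings, and the two-way split are all hypothetical, a plain dictionary stands in for the Redis cache layer, and the reading of "localized results" as "each data split only sees its own co-occurrence counts" is an assumption made for illustration.

```python
from collections import defaultdict
from itertools import permutations


def cooccurrence(ratings):
    """Count, for each ordered item pair (i, j), how many users rated both.

    `ratings` maps user -> {item: score}. This plays the role of the
    map/reduce pass over one data split (hypothetical helper).
    """
    counts = defaultdict(int)
    for items in ratings.values():
        for i, j in permutations(items, 2):
            counts[(i, j)] += 1
    return counts


def merge(partials):
    """Sum per-split counts into one shared matrix.

    In the thesis this shared store is Redis; a dict stands in here.
    """
    total = defaultdict(int)
    for partial in partials:
        for pair, count in partial.items():
            total[pair] += count
    return total


def recommend(user_ratings, counts, top_n=3):
    """Score each unrated item as sum(co-occurrence count * user's rating)."""
    scores = defaultdict(float)
    for (i, j), count in counts.items():
        if j in user_ratings and i not in user_ratings:
            scores[i] += count * user_ratings[j]
    # Highest score first; ties broken by item name for a stable order.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]


# Hypothetical sample data, divided into two splits.
split_1 = {"u1": {"A": 5.0, "B": 3.0}, "u2": {"A": 4.0, "C": 2.0}}
split_2 = {"u3": {"B": 4.0, "C": 5.0}, "u4": {"A": 2.0}}

local_counts = cooccurrence(split_2)
global_counts = merge([cooccurrence(split_1), cooccurrence(split_2)])

# u4 lives in split 2, where item A never co-occurs with anything.
print(recommend(split_2["u4"], local_counts))   # []
print(recommend(split_2["u4"], global_counts))  # [('B', 2.0), ('C', 2.0)]
```

With only the counts from its own split, user `u4` gets no recommendations; once the per-split counts are merged into a shared matrix, items B and C surface. Keeping that merged matrix in a shared cache is the role the abstract assigns to Redis.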

[Degree-granting institution]: Shenyang Normal University
[Degree level]: Master's
[Year degree granted]: 2015
[Classification number]: TP391.3


Article ID: 1894150


Article link: http://sikaile.net/guanlilunwen/kehuguanxiguanli/1894150.html
