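Mobile Internet User Behavior Analysis Based on Cloud Computing
Published: 2018-03-11 13:50
Topic: cloud computing. Focus: user behavior analysis. Source: Beijing University of Posts and Telecommunications, 2014 master's thesis. Type: degree thesis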
[Abstract]: The Internet has become part of everyday life, moving people from an era of information scarcity into one of information explosion. The total volume of information is growing so quickly that no individual has the energy to review even the new information added each day, and people feel at a loss when facing such massive amounts of data. To address this problem, we need to analyze user behavior to learn users' interests and preferences, and then recommend information targeted to those interests, helping users obtain information selectively. This approach benefits both users and information providers: users spend less effort filtering and reach the information they want directly, while providers push information to relevant users rather than broadcasting indiscriminately, which lowers the cost of pushing information.
This thesis first introduces the research background, significance, and current state of research on the topic. It then describes the characteristics of mobile Internet user behavior, the content and methods of mobile Internet user behavior analysis, and the application of data mining to that analysis. Next, it gives an overview of cloud-computing-based analysis of massive user behavior data, covering the difficulties of processing massive data and briefly introducing Hadoop, the MapReduce programming framework, and the Hadoop Distributed File System. It then briefly describes how the data used in the thesis was collected and preprocessed. After that, it presents in detail user access server pattern mining, the analysis of the correlation between user traffic and space-time, and the study of collaborative filtering algorithms together with a user interest recommendation system.
The user access server pattern mining performs statistical analysis from two angles, the number of server IP addresses each user accesses and the number of users per server IP address, and uses the results to divide users and server IP addresses into simple groups. The analysis of the correlation between user traffic and space-time examines two relationships: between user mobility and traffic, and between users' temporal activity and traffic. Finally, the part on collaborative filtering and the user interest recommendation system introduces recommendation systems, collaborative filtering algorithms, and Mahout, runs experiments on the thesis data using Mahout, and analyzes the experimental results.
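The two access-pattern statistics named in the abstract (distinct server IP addresses per user, and distinct users per server IP address) can be illustrated with a small sketch. This is not the thesis's implementation: it runs in memory in plain Python rather than as a Hadoop MapReduce job, the `records` data is made up, and the threshold-based grouping in `group_by_threshold` is a hypothetical stand-in, since the abstract does not say how the "simple groups" were formed.

```python
from collections import defaultdict

# Hypothetical (user_id, server_ip) access records; the real log format
# used in the thesis is not given in the abstract.
records = [
    ("u1", "10.0.0.1"),
    ("u1", "10.0.0.2"),
    ("u1", "10.0.0.1"),  # repeat visit, counted only once
    ("u2", "10.0.0.1"),
    ("u3", "10.0.0.3"),
]


def distinct_counts(pairs):
    """Return (distinct servers per user, distinct users per server)."""
    servers_by_user = defaultdict(set)
    users_by_server = defaultdict(set)
    for user, server in pairs:
        servers_by_user[user].add(server)
        users_by_server[server].add(user)
    servers_per_user = {u: len(s) for u, s in servers_by_user.items()}
    users_per_server = {s: len(u) for s, u in users_by_server.items()}
    return servers_per_user, users_per_server


def group_by_threshold(counts, threshold):
    """Split keys into 'high' and 'low' groups (illustrative rule only)."""
    groups = {"high": [], "low": []}
    for key, n in sorted(counts.items()):
        groups["high" if n >= threshold else "low"].append(key)
    return groups


servers_per_user, users_per_server = distinct_counts(records)
user_groups = group_by_threshold(servers_per_user, threshold=2)
server_groups = group_by_threshold(users_per_server, threshold=2)
```

In a MapReduce setting the same idea would typically be split into a map step that emits the key (user or server IP) with the paired value and a reduce step that counts distinct values per key; the sets above play the role of the reducer's de-duplication.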
[Degree-granting institution]: Beijing University of Posts and Telecommunications
[Degree level]: Master's
[Year degree granted]: 2014
[Classification number]: TP393.01; TN929.5
Article ID: 1598455
Article link: http://sikaile.net/guanlilunwen/ydhl/1598455.html