Research on User Browsing Behavior Based on Path and Page Mining

Published: 2018-10-21 12:50
[Abstract]: When users interact with Internet products, and especially when they browse the web, the network records a large volume of behavioral data. How to mine and analyze this data in depth, in order to understand users' behavior, psychology, and preferences and thereby improve products and the user experience, has become a topic of interest to many Internet companies.

An effective way to study the browsing behavior of Internet users is to collect the web logs generated during browsing and to analyze that behavior through web log mining, an approach that many researchers have already applied successfully. Building on this earlier work, this thesis combines the currently popular Hadoop platform with data warehouse technology to make web-log-based user behavior analysis systematic and engineered, so that it can run as a project in the day-to-day production of Internet companies and better support product development, operations, and management.

Based on path and page mining, this thesis studies users' page browsing behavior. The work covers four aspects:
(1) It introduces the Hadoop data processing platform and the Hive data warehouse. Through distributed storage and computation, the platform supports fast and effective analysis of massive data. Based on the characteristics of the Hive data warehouse, a data-warehouse-based framework for studying user browsing behavior is proposed.
(2) It builds a base data layer and a topic layer on the data warehouse. The topic layer is centered on the user browsing behavior topic.
(3) After studying association rule algorithms and common path mining algorithms, it proposes Hive-CFAP, a data-warehouse-based algorithm for mining frequent access paths.
(4) Using the user browsing behavior topic and the Hive-CFAP algorithm, it analyzes and applies users' frequent access paths, the relationship between page views and page distance, and the clustering of users with similar browsing behavior.
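To make the idea of a frequent access path concrete, the following is a minimal sketch and not the Hive-CFAP algorithm itself, whose details are not given in this abstract. It assumes each session is already an ordered list of page URLs, treats a path as a contiguous run of pages, and measures support as the number of sessions that contain the path. The function name `frequent_paths`, its parameters, and the sample URLs are all hypothetical.

```python
from collections import Counter


def frequent_paths(sessions, length=2, min_support=2):
    """Return contiguous page paths of `length` seen in at least `min_support` sessions.

    Illustrative only: this is plain per-session n-gram counting,
    not the Hive-CFAP algorithm named in the abstract.
    """
    support = Counter()
    for pages in sessions:
        # A set keeps each distinct path once per session, so repeats
        # inside one session do not inflate the support count.
        seen = {
            tuple(pages[i:i + length])
            for i in range(len(pages) - length + 1)
        }
        support.update(seen)
    return {path: n for path, n in support.items() if n >= min_support}


# Hypothetical sessions, as they might look after log cleaning and sessionization.
sessions = [
    ["/home", "/list", "/item", "/cart"],
    ["/home", "/list", "/item"],
    ["/home", "/search", "/item", "/cart"],
]
result = frequent_paths(sessions, length=2, min_support=2)
```

In a data warehouse setting the same counting would more likely be expressed as a grouped aggregation over a session table. The in-memory version here only shows what is being counted.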
[Degree-granting institution]: Beijing University of Posts and Telecommunications
[Degree]: Master's
[Year degree granted]: 2014
[Classification number]: TP393.092

