Research and Application of Click-Stream Data Warehouse in E-Commerce
[Abstract]: Advances in database technology have greatly improved enterprise office efficiency, and as databases have spread, the volume of stored business data has grown rapidly. Much of this data cannot be turned into useful information, a situation described as "rich data, poor information", so the investment enterprises make in databases does not translate into returns. A data warehouse, which can store large amounts of historical data, addresses this problem. Traditional data warehouses load data only from operational business databases. With the growth of the Internet, Web data has become an increasingly important data source. Within it, Web logs are especially valuable behavioral data: they help decision makers understand user habits and plan targeted deployments. This paper constructs a click-stream data warehouse, implements a user clustering algorithm based on implicit association pages, and describes the application of that algorithm in e-commerce. The click-stream data warehouse is built for an e-commerce environment, with Web logs as an important data source. Its design follows the architecture advocated by Inmon, a data warehouse with dependent data marts: the data warehouse is built with a relational model, and the data marts are built with a dimensional model. As the data foundation for managerial decision making, the data warehouse stores a large amount of low-granularity (detailed) historical business data in third normal form, while the dependent data marts are built according to user needs. This architecture balances access efficiency against the flexibility to adjust the structure. On top of the click-stream data warehouse, the paper presents a vector-based click-stream user clustering algorithm.
The algorithm maps each user's click-stream data to a vector and judges the similarity between users by the size of the angle between their vectors. The association page groups produced by the implicit association page mining algorithm serve as the dimensions of the vector. Implicit association pages reflect users' visiting habits well and highlight the topics they are interested in. The algorithm is verified on an experimental data warehouse. The experiments show that it can effectively identify users' target pages and find implicit associations among more than two pages, and that the resulting user clustering adapts better to the complex Internet environment.
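The abstract does not give the details of the algorithm, so the following is only a minimal sketch of the general idea of angle-based user similarity, not the thesis's implementation. The page groups, URLs, the 0.8 threshold, and the greedy single-pass grouping are all hypothetical choices made for illustration.

```python
import math
from typing import Dict, List, Set, Tuple

# Hypothetical association page groups; each group is one vector dimension.
# In the thesis these would come from the implicit association page mining step.
PAGE_GROUPS: List[Set[str]] = [
    {"/phones", "/phone-cases"},
    {"/laptops", "/laptop-bags"},
    {"/books", "/e-readers"},
]


def to_vector(clicks: List[str]) -> List[float]:
    """Map a click-stream to a vector: one click count per page group."""
    return [float(sum(1 for page in clicks if page in group))
            for group in PAGE_GROUPS]


def cosine_similarity(u: List[float], v: List[float]) -> float:
    """Cosine of the angle between u and v; 0.0 if either vector is zero."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def cluster_users(sessions: Dict[str, List[str]],
                  threshold: float = 0.8) -> List[List[str]]:
    """Greedy single-pass grouping (an assumed strategy): a user joins the
    first cluster whose seed vector is similar enough, else starts a new one."""
    clusters: List[Tuple[List[float], List[str]]] = []
    for user, clicks in sessions.items():
        vec = to_vector(clicks)
        for seed, members in clusters:
            if cosine_similarity(seed, vec) >= threshold:
                members.append(user)
                break
        else:
            clusters.append((vec, [user]))
    return [members for _, members in clusters]


# Made-up sessions for illustration only.
sessions = {
    "alice": ["/phones", "/phone-cases", "/phones"],
    "bob": ["/phone-cases", "/phones"],
    "carol": ["/books", "/e-readers"],
}
clusters = cluster_users(sessions)
```

A smaller angle means a larger cosine, so users whose clicks concentrate on the same page groups end up together regardless of how many pages each one visited.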
[Degree-granting institution]: Liaoning University of Technology
[Degree level]: Master's
[Year degree conferred]: 2014
[Classification number]: TP311.13