Research on a Personalized Recommendation System Model Based on Web Log Mining and Association Rules
[Abstract]: With the rapid development of science and technology, the wealth of information on the Internet has driven the upgrading of many industries, but it has also created problems: the explosive growth of information readily leads to "information overload". At the same time, to offer Internet users more comprehensive information resources, website operators and administrators keep adding content to their sites, which makes site topology increasingly complex. Because newly added resources may not match users' real needs, users easily become disoriented while browsing a site ("resource disorientation"). How to find the information that people are actually interested in within massive data is therefore the problem addressed here. Applying data mining to Web site analysis gives rise to Web mining, a comprehensive technology that draws on Web technology, data mining, and informatics. Web mining is useful in many areas, such as mining search engine structure, identifying authoritative pages, classifying Web documents, Web usage mining, intelligent query, and building a Metaweb data warehouse. Web usage mining, in particular, discovers user behavior characteristics and navigation patterns from server logs. This thesis systematically describes the whole process of data mining and Web usage mining and studies three aspects: the Web log preprocessing process, the association rule mining model, and the sliding window recommendation model. First, Web log preprocessing comprises data cleaning, user identification, session identification, path completion, and transaction identification. Preprocessing removes a large amount of irrelevant data from the user access information.
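The first three preprocessing steps can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the record fields (`ip`, `agent`, `time`, `status`, `url`), the list of static-file suffixes, and the 30-minute session timeout are all assumptions chosen for illustration, and path completion and transaction identification are left out.

```python
from collections import defaultdict

# Assumed values for illustration only; the abstract does not specify them.
STATIC_SUFFIXES = (".css", ".js", ".png", ".jpg", ".gif", ".ico")
SESSION_TIMEOUT = 30 * 60  # seconds


def clean(records):
    """Data cleaning: keep successful requests for non-static pages."""
    return [
        r for r in records
        if r["status"] == 200 and not r["url"].lower().endswith(STATIC_SUFFIXES)
    ]


def identify_sessions(records, timeout=SESSION_TIMEOUT):
    """User and session identification.

    Users are approximated by the (ip, agent) pair. A gap longer than
    `timeout` between consecutive requests of one user starts a new session.
    """
    by_user = defaultdict(list)
    for r in records:
        by_user[(r["ip"], r["agent"])].append(r)

    sessions = []
    for user, reqs in by_user.items():
        reqs.sort(key=lambda r: r["time"])
        current = [reqs[0]["url"]]
        for prev, cur in zip(reqs, reqs[1:]):
            if cur["time"] - prev["time"] > timeout:
                sessions.append((user, current))
                current = []
            current.append(cur["url"])
        sessions.append((user, current))
    return sessions


# Hypothetical log records; timestamps are seconds since an arbitrary origin.
log = [
    {"ip": "10.0.0.1", "agent": "A", "time": 0, "status": 200, "url": "/index.html"},
    {"ip": "10.0.0.1", "agent": "A", "time": 5, "status": 200, "url": "/style.css"},
    {"ip": "10.0.0.1", "agent": "A", "time": 60, "status": 200, "url": "/news.html"},
    {"ip": "10.0.0.1", "agent": "A", "time": 4000, "status": 200, "url": "/about.html"},
    {"ip": "10.0.0.2", "agent": "B", "time": 10, "status": 404, "url": "/missing.html"},
    {"ip": "10.0.0.2", "agent": "B", "time": 20, "status": 200, "url": "/index.html"},
]
sessions = identify_sessions(clean(log))
```

In this example the stylesheet request and the failed request are discarded during cleaning, and the first user's visit to `/about.html` falls into a second session because it comes more than 30 minutes after the previous request.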
At the same time, the user access information can be structured and stored in a relational database as transactions or sessions. Next, the thesis mines the preprocessed data with weighted association rules. Apriori, a classical association rule mining algorithm, can not only discover relationships between Web pages but also play an important role in discovering users' preferred navigation patterns. However, applying the Apriori algorithm to Web log mining has its limitations. The algorithm implicitly assumes that all pages are equally important and does not take the differences between pages into account, so some pages of interest to users may be missing from the mined results. To address this deficiency, the thesis introduces the concept of "page weight", which reflects users' real preference for pages. The page weight is defined from two factors, browsing time and visit frequency, and on this basis the W-Apriori algorithm is proposed. The algorithm describes the transaction database with an extended Boolean matrix, which helps compress the transaction database. The weights also help distinguish between pages and address the problem of important pages being missed during mining. Finally, the thesis designs a Web log recommendation model based on association rule mining by combining the mined rule base with sliding window technology. The model helps alleviate both information overload and resource disorientation, and it recommends pages of likely interest to the relevant Web users, providing personalized recommendations.
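The ideas of page weight, weighted support, and sliding-window matching can be sketched in Python as follows. The abstract does not give the W-Apriori formulas, so everything here is an assumption made for illustration: the weight is taken as the mean of a page's share of total browsing time and its share of total visits, the weighted support is plain support scaled by the itemset's mean page weight, and the function names, window size, and rule tuples are hypothetical. The extended Boolean matrix representation is not modeled.

```python
def page_weights(sessions):
    """Assumed page weight: mean of time share and visit share per page."""
    time_sum, visits = {}, {}
    for session in sessions:
        for page, seconds in session:
            time_sum[page] = time_sum.get(page, 0.0) + seconds
            visits[page] = visits.get(page, 0) + 1
    total_time = sum(time_sum.values())
    total_visits = sum(visits.values())
    return {
        p: 0.5 * time_sum[p] / total_time + 0.5 * visits[p] / total_visits
        for p in visits
    }


def weighted_support(itemset, transactions, weights):
    """Assumed form: plain support scaled by the itemset's mean page weight."""
    support = sum(itemset <= t for t in transactions) / len(transactions)
    mean_weight = sum(weights[p] for p in itemset) / len(itemset)
    return support * mean_weight


def recommend(recent_pages, rules, window=2):
    """Match the last `window` pages of the active session against rule
    antecedents; return unseen consequents ranked by confidence."""
    active = set(recent_pages[-window:])
    hits = {}
    for antecedent, consequent, confidence in rules:
        if antecedent <= active and consequent not in active:
            hits[consequent] = max(hits.get(consequent, 0.0), confidence)
    return sorted(hits, key=hits.get, reverse=True)


# Hypothetical sessions: (page, browsing time in seconds).
sessions = [
    [("A", 30), ("B", 90)],
    [("A", 10), ("B", 60), ("C", 10)],
    [("A", 20), ("C", 20)],
]
weights = page_weights(sessions)
transactions = [{page for page, _ in s} for s in sessions]

ws_ab = weighted_support(frozenset({"A", "B"}), transactions, weights)
ws_ac = weighted_support(frozenset({"A", "C"}), transactions, weights)

# Hypothetical rule base: (antecedent, consequent, confidence).
rules = [
    (frozenset({"A"}), "B", 0.67),
    (frozenset({"A", "B"}), "C", 0.5),
    (frozenset({"C"}), "D", 0.9),
]
suggested = recommend(["X", "A", "B"], rules, window=2)
```

Here `{A, B}` and `{A, C}` have the same plain support (two of three transactions), but `{A, B}` gets the higher weighted support because users spend far longer on page B than on page C. This is the kind of distinction that an unweighted Apriori pass cannot make.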
[Degree-granting institution]: Southwest University
[Degree level]: Master's
[Year degree awarded]: 2014
[Classification number]: TP391.3