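Research on an Adaptive Information Recommendation Mechanism Based on Social Media
Topic: recommender systems + social media; Source: Southwestern University of Finance and Economics, master's thesis, 2011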
[Abstract]: Because publishing on the Internet is so convenient, the amount of information online is growing at a nearly explosive rate. There is too much even to skim once, let alone to find the parts a user actually cares about. Traditional search presents the same ranked results to every user and cannot tailor its service to individual interests and preferences. The explosion of information thus lowers the rate at which information is actually used, a phenomenon known as "information overload". A recommender system is an intelligent agent system proposed to address information overload on the Internet: from the large volume of online information, it automatically recommends resources that match a user's interests, preferences, or needs.
In the current Web 2.0 environment, the emergence of social media has made users not only consumers of online content but also its producers, and this has further intensified the information explosion. Traditional recommender systems learn users' interests by having them answer questions or set up explicit subscriptions, and then recommend accordingly. However, a user's interests are not fixed; they change over time. To address this, this thesis proposes an adaptive information recommendation mechanism that tracks changes in user interest in a timely manner and recommends resources the user is interested in. Social media takes many forms, such as forums, blogs, content communities, and social networks. In these forms, a user can publish or repost an article, other users can read or comment on it, and those comments can in turn be read or commented on by others. The comments reveal the topics users are currently interested in. Traditional content-based recommendation methods generally recommend related articles based on the content of the original article. As a discussion continues, however, its topic shifts, which means user interest shifts as well. If recommendations rely only on the original article body, the returned articles are often not what users currently find most interesting, which lowers user satisfaction. This thesis therefore combines user comments with the original article body to build a topic model and uses that model to select related articles. Observation suggests that comments should not all influence the recommendation equally: some offer deep insight into the original content, while others are entirely meaningless chatter. So when comments are used to track topic evolution, it is important to differentiate the influence of each comment. Here, we extract the semantic relations among comments, their structural relations, and user authority to differentiate each comment's influence on the recommendation.
Analysis of how event reports spread online reveals four kinds of overlap: reprint overlap, report overlap, containment overlap, and follow-up overlap. These characteristics cause a serious problem for content-based recommender systems, namely redundant recommendation: a recommended article carries the same information as the original, which adds to the user's reading burden. This thesis therefore proposes a method for explaining the logical relation between a recommended article and the original article body (generalization, specialization, or duplication), so as to reduce the recommendation of duplicate content and recommend articles that meet users' needs.
The first part of the thesis introduces the research background, purpose, and significance of the topic and briefly introduces the basic concepts involved. It covers the definition of a recommender system; the four main approaches, namely content-based recommendation, collaborative filtering, hybrid recommendation, and recommendation based on data mining techniques; an example system illustrating how each of the four approaches works; and a summary of evaluation criteria for recommender systems. It also introduces the concept of social media and the characteristics that distinguish it from traditional media. Finally, it summarizes the main work and contributions of the thesis as follows:
(1) This study is the first, in China or abroad, to incorporate user comments into information recommendation services. It offers a new line of research for social-media-based information recommendation and extends recommendation research from the traditional static media of Web 1.0 to the social media model of Web 2.0.
(2) To make full use of the interactive nature of social media, we designed an original graph-theoretic mechanism for mining user comments. It accurately captures what users focus on in an event and combines this with the content of the original article body, so that the recommendations reflect both the author's and the readers' viewpoints.
(3) To reduce users' cognitive burden, we propose a novel mechanism, based on information entropy theory, for judging the logical relation between texts. With it we can obtain the logical relation between a recommended article and the original article. The result can also be applied broadly to judging content logic in text analysis, for example in the presentation of search engine results and in content-based ad placement.
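The abstract does not say how information entropy is turned into the generalization, specialization, and duplication labels, so the following is only a minimal sketch of the underlying quantity: the Shannon entropy of a text's term distribution. The function name `shannon_entropy` and the whitespace tokenization are assumptions made for illustration, not the thesis's implementation.

```python
import math
from collections import Counter

def shannon_entropy(tokens):
    """Shannon entropy (in bits) of the empirical term distribution of a token list."""
    counts = Counter(tokens)
    total = sum(counts.values())
    # H = -sum(p * log2(p)) over the observed terms.
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

# Hypothetical example texts, tokenized naively on whitespace.
uniform_text = "flood rescue flood rescue".split()
repeated_text = "flood flood flood flood".split()
print(shannon_entropy(uniform_text), shannon_entropy(repeated_text))
```

A text that spreads its mass evenly over more terms has higher entropy than one that repeats a single term; any rule that maps such values to relation labels would be an additional design choice beyond what the abstract states.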
The second part presents the research foundation and background. First, it surveys related work on the two kinds of data used in the experiments, news and blogs; news recommendation is covered from both the commercial-system and the academic-research perspectives. Next, because the work must deal with topic drift, it summarizes the development of topic detection and tracking techniques. Finally, it briefly introduces the theoretical background used in the thesis, such as language models, the PageRank algorithm, information entropy, and the t-test.
The third part is the core of the thesis and presents the design of the adaptive information recommendation mechanism. It first shows the overall system framework and briefly describes its workflow, then describes each module in detail. User authority is computed by modeling the relations between users, which here comprise quoting and replying. Across the whole community, a graph model is built from one user's quoting of or replying to another user's posts, and the PageRank algorithm is then used to compute each user's authority. Next, comment weights are computed. A graph model is used again, but this time it is built on the relations between comments, which comprise semantic, quoting, and reply relations. A semantic relation is the content similarity between two comments; a quoting or reply relation means that one comment quotes or replies to another. Once the model is built, PageRank is again used to obtain each comment's weight. The quality of a comment is determined jointly by its author's authority and by the comment itself, so we combine user authority and comment weight to compute each comment's final weight. These weights are then fed, together with the original article body and the user comments, into a synthesizer that builds the topic model, and the topic model is used to retrieve related articles from the database. Finally, information entropy theory is used to explain the logical relation between the related articles and the original article body, and the articles that match the user's interests are returned.
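The user-authority step can be illustrated with a minimal sketch, assuming a directed graph in which an edge (u, v) means that user u quoted or replied to user v. The function `pagerank`, the damping factor 0.85, the iteration count, and the user names are all assumptions chosen for illustration; the abstract does not give the thesis's actual parameters or edge weighting.

```python
from collections import defaultdict

def pagerank(edges, damping=0.85, iterations=50):
    """Power-iteration PageRank over a directed graph given as (source, target) pairs."""
    nodes = sorted({n for edge in edges for n in edge})
    out_links = defaultdict(list)
    for src, dst in edges:
        out_links[src].append(dst)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by nodes without outgoing links is spread evenly over all nodes.
        dangling = sum(rank[node] for node in nodes if not out_links[node])
        new_rank = {node: (1.0 - damping) / n + damping * dangling / n for node in nodes}
        for src in nodes:
            targets = out_links[src]
            for dst in targets:
                # Each node splits its current rank equally among the nodes it points to.
                new_rank[dst] += damping * rank[src] / len(targets)
        rank = new_rank
    return rank

# Hypothetical interactions: (u, v) means user u quoted or replied to user v.
interactions = [("bob", "alice"), ("carol", "alice"), ("dave", "alice"), ("alice", "bob")]
authority = pagerank(interactions)
print(max(authority, key=authority.get))
```

The same function could in principle be run on a comment graph whose edges come from quoting, reply, and semantic-similarity relations. How the resulting comment weight is combined with the author's authority is not specified in the abstract, so that step is left out here.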
The fourth part covers experimental design and analysis. It describes the system development environment and how the experimental data were obtained, along with their details. The data comprise two parts: a news dataset and a blog dataset. Because whole web pages were collected, the pages had to be parsed to extract the required parts. It also describes the choice of evaluation metrics: besides several commonly used metrics, we introduce a new one, novelty, to measure the topical diversity of the returned articles. A series of experiments follows: 1) the proposed method is compared with two commonly used methods, and the results show that it clearly outperforms both on the news and blog datasets; 2) the effect of user authority and comments on recommendation quality is analyzed, and the results show that combining user authority with comment information improves recommendation quality; 3) the effect of the relations between comments is analyzed, and the results differ by text type: for news data, incorporating the content relations between user comments lowers recommendation quality, whereas for blog data it improves it; 4) the explanation of recommendation relations is evaluated.
The last part summarizes the work and looks ahead to future research. It summarizes the overall design of the social-media-based adaptive information recommender system, points out some shortcomings of the present work, and suggests directions for future development.
[Degree-granting institution]: Southwestern University of Finance and Economics
[Degree level]: Master's
[Year degree granted]: 2011
[Classification number]: TP391.3