Interest and Personality Analysis of Weibo Users

Published: 2019-01-13 08:54

[Abstract]: With the rapid spread of the Internet, people have moved from an era of information scarcity to an era of information explosion, in which accurately obtaining the information one needs has become difficult. Weibo (microblogging) is a new kind of platform that combines social networking services with Internet media functions, and it helps users obtain and publish information in closer to real time. Offering more personalized services on such a platform would further help users reach the content they care about accurately and promptly. To do so, the system must first analyze information such as the user's interests. Addressing this research topic, this thesis analyzes user information based on microblog text, specifically posts on the Sina Weibo platform. Its distinguishing feature is that it uses only microblog text as its data and analyzes the user from three angles: interest, sentiment, and personality.

The first part analyzes user interest from microblog text and represents it at both a macro level and a micro level. A ternary (three-element) filtering method removes the noise that may be present in the word-topic relation matrix produced by directly training a topic model. The WTMR (Word-themes mutual reinforcement) model is then used to obtain the topic probability distribution of each single post. On that basis, WTMR is applied a second time to obtain the topic probability distribution of the whole set of posts, which serves as the topic probability distribution of the user's interests. In parallel, pseudo-long texts and a greedy strategy are used to extract keywords that pin down the specific subjects the user is interested in. Combining the topic probability distribution with the keywords yields the user's interest. Experiments show that this method extracts the user interest contained in microblog text fairly accurately and simply.

The second part analyzes the sentiment in microblog text. It improves LDA (Latent Dirichlet Allocation), the classic topic-model method, and proposes the DLDA (Double Latent Dirichlet Allocation) model, which treats the semantics and the sentiment carried by a text as being on an equal footing. In addition, different word weights are used to optimize Gibbs sampling. The experimental results show that this method analyzes the sentiment polarity of microblog text more accurately.

The third part analyzes the user personality reflected in microblog text. Using the SC-LIWC dictionary and Big Five Model theory, each word in a microblog text is assigned values for the Big Five personality factors. The WTMR model is then used to mine the user's personality information from the text. The experimental results show that the method computes the user personality contained in microblog text fairly accurately.

To present these results, the thesis designs a personalized Weibo service system built on the three research results above. It offers three personalized services: personalized microblog ranking, a word-cloud reading guide, and a user personality analysis assistant. Besides helping users obtain information more efficiently, these services can also be used to analyze users' personality information.
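The two-stage idea in the interest analysis (a topic distribution per post, then one distribution for the whole set of posts) can be illustrated with a minimal sketch. The abstract does not define how WTMR works, so the sketch below substitutes plain averaging for the second stage; it is not the thesis's method. The function name `aggregate_topic_distributions` and the sample numbers are hypothetical.

```python
from typing import List


def aggregate_topic_distributions(post_topics: List[List[float]]) -> List[float]:
    """Combine per-post topic distributions into one user-level distribution.

    ASSUMPTION: simple averaging stands in for the WTMR model, whose
    details are not given in the abstract.
    """
    if not post_topics:
        raise ValueError("need at least one post")
    n_topics = len(post_topics[0])
    if any(len(dist) != n_topics for dist in post_topics):
        raise ValueError("all posts must use the same number of topics")

    # Sum the probability mass assigned to each topic across all posts.
    totals = [0.0] * n_topics
    for dist in post_topics:
        for k, p in enumerate(dist):
            totals[k] += p

    # Renormalize so the result is again a probability distribution.
    mass = sum(totals)
    if mass == 0:
        raise ValueError("topic distributions carry no probability mass")
    return [t / mass for t in totals]


# Hypothetical per-post topic distributions for three posts and three topics.
posts = [
    [0.5, 0.25, 0.25],
    [0.25, 0.25, 0.5],
    [0.75, 0.25, 0.0],
]
user_interest = aggregate_topic_distributions(posts)
print(user_interest)
```

In this toy input the first topic carries half of the total mass, so it would be read as the user's dominant interest at the macro level; the keyword extraction described in the abstract would then supply the micro-level detail.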
[Degree-granting institution]: Shanghai University
[Degree level]: Master's
[Year degree granted]: 2015
[Classification number]: TP393.092;TP391.1

[Citation]: Tang Wenqing (湯文清). Interest and Personality Analysis of Weibo Users [D]. Shanghai University, 2015.


Article ID: 2408290

Article link: http://sikaile.net/guanlilunwen/ydhl/2408290.html
