Research and Implementation of Key Technologies for Sentiment Analysis of Weibo Short Texts
[Abstract]: With the rise of social networks and the arrival of the Weibo self-media era, hundreds of millions of posts are generated on the Internet every day. This massive body of Weibo text carries rich information about individuals, society, enterprises, and government. Analyzing post content, monitoring online public opinion, and determining the sentiment orientation expressed in posts therefore has both theoretical and practical value. This thesis collects large volumes of Weibo data through simulated user login, builds a vector space model using natural language processing techniques such as word segmentation, part-of-speech tagging, and topic word extraction in combination with a sentiment lexicon and a Weibo corpus, and dynamically adjusts the weights of sentiment factors and other parameters to perform sentiment analysis on the data. The work is as follows. First, large volumes of Weibo data are collected using simulated-browser technology together with packet capture and analysis in HttpWatch 8.5. Second, a Chinese word segmenter (SkyLightAnalyzer) is built on a hidden Markov model and an N-Gram language model; its main functions include word segmentation, part-of-speech tagging, word sense disambiguation, and out-of-vocabulary word recognition. Third, building on this segmenter, topic word extraction and sentiment unit extraction for posts are implemented with an algorithm that combines statistics and rules. Fourth, an algorithm based on the vector space model and dynamic adjustment of sentiment influence factors is proposed, and a sentiment orientation analysis method based on personalized modeling and content analysis is designed and implemented. Experiments and practical use show that the proposed algorithm is effective. The thesis also discusses its shortcomings and plans for future work.
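The abstract does not spell out the scoring algorithm, so the following is only a minimal sketch of the general idea of a lexicon-weighted term vector with adjustable influence factors. The toy lexicon, the negator and intensifier sets, the factor values, and all function names are hypothetical and are not taken from the thesis or from SkyLightAnalyzer. The input is assumed to be already segmented into tokens, and the factors here are fixed parameters rather than dynamically adjusted ones.

```python
from collections import defaultdict

# Hypothetical toy lexicon (word -> polarity in [-1, 1]); not from the thesis.
LEXICON = {"good": 1.0, "happy": 0.8, "bad": -1.0, "terrible": -0.9}
# Hypothetical modifier sets and influence factors, chosen only for illustration.
NEGATORS = {"not", "never"}
INTENSIFIERS = {"very", "really"}
DEFAULT_FACTORS = {"negation": -1.0, "intensifier": 1.5}


def weighted_vector(tokens, factors=DEFAULT_FACTORS):
    """Build a term -> weight vector for one segmented post.

    Modifiers (negators, intensifiers) are not stored as terms; they scale
    the weight of the next non-modifier token, after which the weight resets.
    """
    vec = defaultdict(float)
    weight = 1.0
    for tok in tokens:
        if tok in NEGATORS:
            weight *= factors["negation"]
        elif tok in INTENSIFIERS:
            weight *= factors["intensifier"]
        else:
            vec[tok] += weight
            weight = 1.0
    return dict(vec)


def sentiment_score(tokens, lexicon=LEXICON, factors=DEFAULT_FACTORS):
    """Dot product of the weighted term vector with the lexicon polarities."""
    vec = weighted_vector(tokens, factors)
    return sum(w * lexicon.get(term, 0.0) for term, w in vec.items())


def classify(tokens, threshold=0.0, lexicon=LEXICON, factors=DEFAULT_FACTORS):
    """Map the score to a coarse label; the threshold is an assumed parameter."""
    score = sentiment_score(tokens, lexicon, factors)
    if score > threshold:
        return "positive"
    if score < -threshold:
        return "negative"
    return "neutral"
```

Passing a different `factors` dictionary changes how strongly negation and intensification affect the score, which is the simplest stand-in for tuning influence factors; a per-user or per-corpus adjustment scheme would replace these constants.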
[Degree-granting institution]: Hebei University of Science and Technology
[Degree level]: Master's
[Year degree awarded]: 2014
[Classification number]: TP391.1; TP393.092