Research and Application of Sentiment Analysis Based on Weibo Topic Comments
Published: 2019-01-09 15:47
[Abstract]: Weibo is one of today's most popular platforms for social networking and information dissemination. In 2016, as news of the Rio Olympics and of Wang Baoqiang's divorce spread, Weibo demonstrated its importance as an information dissemination platform. In September 2016, Weibo had 297 million monthly active users, up 34% year on year, and 132 million average daily active users, up 32% year on year. People use Weibo to publish messages, repost news, comment, and like posts, expressing their views on people and events and exchanging opinions with others. Analyzing the posts that Weibo users repost and comment on makes it possible to learn quickly where public opinion is heading, both in general and on specific matters, which is a valuable reference for decision makers. For a company, the content that users publish, repost, and comment on reveals how much they like its products and services, and this is the starting point of this thesis. A sentiment analysis system based on Weibo topics can quickly and accurately summarize the public opinion surrounding a company or product, which has important practical value for rapid decision making, crisis public relations, and public opinion guidance. This thesis analyzes Weibo comments and determines whether their sentiment polarity is positive or negative. The main work is as follows. First, a crawler is designed to collect a company's Weibo posts and the corresponding comments. Second, the data are preprocessed with stop-word removal, word segmentation, and related steps. Third, word vectors for the comment text are obtained with word2vec, and three classifiers are trained, based on a support vector machine (SVM), a convolutional neural network (CNN), and a long short-term memory (LSTM) network; their precision, recall, F1 score, and computation time are analyzed and compared in order to select an economical and practical algorithm. Fourth, a UI is designed. To verify the effectiveness of the algorithms, they are evaluated on the public COAE2013 dataset, and the results show that the LSTM network performs best. An optimized stacked LSTM network is then compared experimentally on COAE2013 and a Shenzhen Airlines dataset, where it performs about 1% better than the ordinary LSTM network. This thesis thus compares the currently popular methods for classifying short Weibo texts. In addition, to address the shortage of Weibo-based corpora, it designs a crawler system that collects a large Weibo corpus and, for specific accounts, crawls all comments under the relevant posts. Finally, the stacked LSTM model is chosen as the comment sentiment analysis method for the Weibo topic comment sentiment analysis system, and a visual, easy-to-use sentiment analysis system is built.
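The abstract compares the classifiers by precision, recall, and F1 score. As a minimal sketch of how these three metrics could be computed for binary sentiment labels, the function below counts true positives, false positives, and false negatives directly. The function name `binary_prf`, the label encoding (1 = positive, 0 = negative), and the toy labels are illustrative assumptions, not taken from the thesis.

```python
def binary_prf(y_true, y_pred, positive=1):
    """Return (precision, recall, f1) for the class given by `positive`.

    Hypothetical helper for illustration; labels are assumed to be
    1 for positive sentiment and 0 for negative sentiment.
    """
    pairs = list(zip(y_true, y_pred))
    # True positives: predicted positive and actually positive.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    # False positives: predicted positive but actually negative.
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    # False negatives: predicted negative but actually positive.
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return precision, recall, f1


# Toy labels, made up for this example only.
y_true = [1, 1, 1, 0, 0, 1]
y_pred = [1, 0, 1, 1, 0, 1]
precision, recall, f1 = binary_prf(y_true, y_pred)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

Running the same function on each classifier's predictions over one shared test set would give directly comparable scores; computation time would have to be measured separately.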
【學(xué)位授予單位】:哈爾濱工業(yè)大學(xué)
【學(xué)位級(jí)別】:碩士
【學(xué)位授予年份】:2017
【分類號(hào)】:TP391.1;TP393.092
本文編號(hào):2405826
Article link: http://sikaile.net/guanlilunwen/ydhl/2405826.html