Research and Implementation of a Weibo-Based Sentiment Orientation Analysis System

Published: 2018-05-27 04:36

  Topics: sentiment classification + sentiment orientation; Source: Beijing University of Posts and Telecommunications, 2016 master's thesis


[Abstract]: In recent years the Internet has developed rapidly, and social networking sites have become a main platform for people to express their opinions. Weibo, one of the most popular of these sites, generates large volumes of user behavior data every day, and these data have research value in many fields. Sentiment orientation analysis is a currently popular research area that applies statistical and machine learning methods to analyze and mine user behavior data and uses the results to predict users' sentiment. This thesis studies and implements a sentiment analysis system for Weibo text. The work covers six aspects. First, it studies commonly used sentiment analysis algorithms, namely support vector machines, naive Bayes, AdaBoost, and neural networks, examining the principles of the four algorithms and comparing them. Second, it studies the page layout of the Weibo platform and designs a distributed Weibo crawler, which mainly collects hot-topic data, including post text and comments. Third, it designs a data preprocessing system and defines three kinds of preprocessing rules: emoticon conversion, de-duplication, and invalid-data cleaning. Fourth, it analyzes the characteristics of Weibo text and selects feature extraction methods accordingly; the chi-square test and TF-IDF are the main methods used to extract and represent text features. Fifth, it builds Weibo text classifiers with the first three algorithms above, classifying posts as positive, negative, or neutral, and compares and analyzes the results of the three algorithms. Sixth, it designs and implements a display system that retrieves topic data and presents it on the web. Finally, the sentiment analysis system is tested on Weibo topic data, and the results show that it performs well at predicting Weibo sentiment.
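The abstract names the preprocessing rules and the feature methods but gives no formulas or code. The following is only a minimal sketch of those steps, not the thesis's implementation. It assumes posts are already tokenized, that emoticons appear as bracketed tags, and that links count as invalid data. The emoticon map, the function names, and the particular TF-IDF variant (term frequency times log(N/df)) are all illustrative assumptions.

```python
import math
import re
from collections import Counter

# Hypothetical emoticon-to-word map; the thesis's actual conversion rules are not given.
EMOTICON_WORDS = {"[haha]": " happy ", "[tears]": " sad "}


def preprocess(posts):
    """Emoticon conversion, invalid-data cleaning, and de-duplication (assumed rules)."""
    seen, cleaned = set(), []
    for text in posts:
        for tag, word in EMOTICON_WORDS.items():
            text = text.replace(tag, word)
        text = re.sub(r"https?://\S+", " ", text)  # assumption: links are noise
        text = " ".join(text.split())              # normalize whitespace
        if not text or text in seen:               # drop empty and duplicate posts
            continue
        seen.add(text)
        cleaned.append(text)
    return cleaned


def chi_square(docs, labels, term, cls):
    """Chi-square statistic of a term for one class, from a 2x2 document count table."""
    a = b = c = d = 0
    for tokens, label in zip(docs, labels):
        has_term = term in tokens
        if label == cls and has_term:
            a += 1      # in class, contains term
        elif label != cls and has_term:
            b += 1      # other class, contains term
        elif label == cls:
            c += 1      # in class, lacks term
        else:
            d += 1      # other class, lacks term
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    return n * (a * d - b * c) ** 2 / denom if denom else 0.0


def select_features(docs, labels, k):
    """Keep the k terms with the highest chi-square score over any class."""
    classes = set(labels)
    terms = {t for tokens in docs for t in tokens}
    score = {t: max(chi_square(docs, labels, t, c) for c in classes) for t in terms}
    return set(sorted(terms, key=lambda t: (-score[t], t))[:k])


def tf_idf(docs, vocabulary):
    """Sparse TF-IDF vectors restricted to the selected vocabulary."""
    n = len(docs)
    df = Counter(t for tokens in docs for t in set(tokens) if t in vocabulary)
    vectors = []
    for tokens in docs:
        counts = Counter(tokens)
        vectors.append({t: counts[t] / len(tokens) * math.log(n / df[t])
                        for t in vocabulary if counts[t]})
    return vectors


# Toy, pre-tokenized data (invented for illustration only).
docs = [["good", "phone"], ["great", "phone"], ["bad", "phone"], ["awful", "phone"]]
labels = ["pos", "pos", "neg", "neg"]
vocab = select_features(docs, labels, k=4)
vectors = tf_idf(docs, vocab)
```

In this toy data, "phone" appears in every post, so its chi-square score is zero and it is not selected, while the sentiment-bearing words are kept. The resulting vectors are the kind of input that classifiers such as the SVM, naive Bayes, and AdaBoost models named in the abstract would be trained on.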
[Degree-granting institution]: Beijing University of Posts and Telecommunications
[Degree level]: Master's
[Year degree granted]: 2016
[Classification number]: TP391.1; TP393.092

