Research on Sentiment New-Word Discovery and Orientation Analysis for Weibo Short Texts
[Abstract]: As social networks have spread worldwide, many new words and even new emojis have emerged. They often arise alongside trending news and act as a weather vane for online public opinion. Extracting new words effectively from massive volumes of Weibo posts and analyzing their sentiment is important for topic tracking and public opinion analysis of Weibo content. These neologisms carry strong emotion and, to some extent, reflect users' feelings. However, existing sentiment orientation analysis focuses mainly on domains such as product reviews and news reports. Orientation analysis of Weibo neologisms still relies on traditional methods and lacks analysis of the features specific to these neologisms, so results are poor.
The main work of this thesis covers three aspects. First, it designs and implements a candidate new-word extraction method based on repeated-string statistics, using a generalized suffix tree to extract all possible candidate strings. Second, it proposes a new-word detection algorithm that combines linguistic rules with statistics to filter the candidate neologisms. It compares the performance of several classical statistical measures for neologism detection and selects mutual information as the internal statistic and left and right adjacency entropy as the external statistic. It also analyzes the distinction between ordinary neologisms and sentiment neologisms. Third, building on this practical work, it proposes a neural-network-based algorithm for determining the sentiment of new words. The algorithm uses the context of a sentiment neologism to judge its polarity and represents the semantic and syntactic features of new words with word vectors, effectively combining local and global context information.
A multi-source language model determines multiple sense vectors for a polysemous word by clustering its contexts; semantic analysis of the new words is then carried out to determine their sentiment orientation in different contexts.
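The two statistics named in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the function names, the brute-force n-gram counting (the thesis uses a generalized suffix tree instead), the use of the total character count as the normalizer for all n-gram probabilities, and the treatment of string boundaries as a single pseudo-neighbor (`<s>` or `</s>`) are all assumptions made for illustration.

```python
import math
from collections import Counter


def ngram_counts(corpus, max_len):
    """Count every character n-gram of length 1..max_len in a list of strings."""
    counts = Counter()
    for text in corpus:
        for n in range(1, max_len + 1):
            for i in range(len(text) - n + 1):
                counts[text[i:i + n]] += 1
    return counts


def min_pmi(candidate, counts, total_chars):
    """Internal statistic: lowest pointwise mutual information over all
    two-way splits of the candidate. A high value suggests that the parts
    co-occur far more often than chance, i.e. the string is cohesive."""
    if len(candidate) < 2:
        raise ValueError("candidate must contain at least two characters")
    if counts[candidate] == 0:
        return float("-inf")  # candidate never occurs in the corpus
    p_whole = counts[candidate] / total_chars
    scores = []
    for k in range(1, len(candidate)):
        p_left = counts[candidate[:k]] / total_chars
        p_right = counts[candidate[k:]] / total_chars
        scores.append(math.log2(p_whole / (p_left * p_right)))
    return min(scores)


def _entropy(counter):
    """Shannon entropy (in bits) of the distribution given by a Counter."""
    total = sum(counter.values())
    if total == 0:
        return 0.0
    return -sum(c / total * math.log2(c / total) for c in counter.values())


def adjacency_entropy(candidate, corpus):
    """External statistic: entropy of the characters seen immediately to the
    left and to the right of the candidate. High entropy on both sides
    suggests the string is used freely in many contexts, like a real word."""
    left, right = Counter(), Counter()
    n = len(candidate)
    for text in corpus:
        start = text.find(candidate)
        while start != -1:
            # Assumption: a string boundary counts as one pseudo-character.
            left[text[start - 1] if start > 0 else "<s>"] += 1
            end = start + n
            right[text[end] if end < len(text) else "</s>"] += 1
            start = text.find(candidate, start + 1)  # allow overlapping hits
    return _entropy(left), _entropy(right)
```

A candidate string would typically be kept only if its `min_pmi` score and both adjacency entropies exceed chosen thresholds; the abstract does not state threshold values, so any thresholds would have to be tuned on data.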
[Degree-granting institution]: Beijing University of Posts and Telecommunications
[Degree level]: Master's
[Year degree conferred]: 2016
[Classification number]: TP391.1