Research on Sentiment Target Recognition in Microblogs Based on Conditional Random Fields
Published: 2019-04-12 13:20
[Abstract]: Social networks have grown rapidly in recent years, and more and more people exchange and share information through microblogs (Weibo). Microblogs are popular because posts are short, easy to write, and spread quickly. Users readily share their opinions and experiences, so microblogs contain a large volume of user comments that carry sentiment. As this volume grows, manual methods can no longer cope with processing and analyzing it. How to use computers to process and mine microblog comment data effectively has therefore become an active research problem, and sentiment target recognition is one effective way to address it.
This thesis studies sentiment target recognition in Chinese microblog text. Recognizing sentiment targets in unstructured text is difficult in itself, and existing work has shortcomings. First, microblog text differs from traditional text: it is short, loosely written, and often not standard Chinese, so existing basic Chinese text processing tools do not work well on it, which makes sentiment target recognition harder. To address this, the thesis normalizes microblog text and builds several dictionaries, including an internet slang dictionary, an emoticon dictionary, a sentiment dictionary, and a negation word dictionary. This improves word segmentation and syntactic dependency parsing of microblog text by existing tools, and it also allows context to be used more effectively in feature extraction. Second, existing methods can recognize sentiment targets that appear explicitly in the text, but they struggle with implicit ones. When the sentiment target appears in the text, the thesis therefore combines a conditional random field (CRF) model with a classification model to recognize it; when the target does not appear in the text, the thesis abstracts the implied target and proposes a modified CRF model with hidden nodes to recognize it.
The core idea is to treat sentiment target recognition as a sequence labeling problem: a CRF model labels targets in sentence-level microblog text, combining multiple features to improve recognition accuracy. Experiments on two data sets, a public evaluation data set and a self-built data set, show that the model recognizes explicit sentiment targets in microblogs well and can also recognize hidden sentiment targets.
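The sequence labeling view described in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the abstract does not list its dictionary entries, feature templates, or tag set, so the lexicon contents, the feature names, and the `B-T`/`I-T`/`O` labels are hypothetical, and the tokens are segmented by hand rather than by a real segmenter. The sketch shows dictionary-based normalization, per-token feature dictionaries of the kind a CRF toolkit typically consumes, and the decoding of a label sequence back into target spans. It does not train a CRF.

```python
from typing import Dict, List

# Hypothetical lexicon entries, for illustration only.
SLANG = {"给力": "好", "神马": "什么"}      # internet slang -> standard form
SENTIMENT = {"好", "差", "喜欢", "讨厌"}    # sentiment words
NEGATION = {"不", "没", "别"}               # negation words


def normalize(tokens: List[str]) -> List[str]:
    """Replace slang tokens with a standard form via dictionary lookup."""
    return [SLANG.get(tok, tok) for tok in tokens]


def token_features(tokens: List[str], i: int) -> Dict[str, object]:
    """Build features for token i: the word, lexicon flags, and a +/-1 window."""
    tok = tokens[i]
    # Distance to the nearest sentiment word, or -1 if the sentence has none.
    dists = [abs(i - j) for j, t in enumerate(tokens) if t in SENTIMENT]
    return {
        "word": tok,
        "is_sentiment": tok in SENTIMENT,
        "is_negation": tok in NEGATION,
        "prev_word": tokens[i - 1] if i > 0 else "<BOS>",
        "next_word": tokens[i + 1] if i < len(tokens) - 1 else "<EOS>",
        "dist_to_sentiment": min(dists) if dists else -1,
    }


def bio_to_spans(tokens: List[str], labels: List[str]) -> List[str]:
    """Collect target spans from B-T/I-T/O labels; a stray I-T is ignored."""
    spans: List[str] = []
    current: List[str] = []
    for tok, lab in zip(tokens, labels):
        if lab == "B-T":
            if current:
                spans.append("".join(current))
            current = [tok]
        elif lab == "I-T" and current:
            current.append(tok)
        else:
            if current:
                spans.append("".join(current))
            current = []
    if current:
        spans.append("".join(current))
    return spans


# Hand-segmented example sentence: "the screen of this phone is great".
tokens = normalize(["这", "手机", "的", "屏幕", "很", "给力"])
features = [token_features(tokens, i) for i in range(len(tokens))]
# Hand-written labels; in practice a trained CRF would predict them.
labels = ["O", "O", "O", "B-T", "O", "O"]
targets = bio_to_spans(tokens, labels)
```

In this sketch the feature dictionaries would be the input to a linear-chain CRF, and `bio_to_spans` would be applied to its predicted labels. Implicit targets, which the thesis handles with a hidden-node CRF variant, are outside the scope of the example.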
[Degree-granting institution]: Guangdong University of Technology
[Degree level]: Master's
[Year degree conferred]: 2014
[Classification number]: TP393.092; TP391.1
Article ID: 2457053
Article link: http://sikaile.net/guanlilunwen/ydhl/2457053.html