Research on Event-Driven Text Emotion Cause Discovery
Published: 2018-04-23 10:44
Topic: text emotion cause discovery + event-driven; Source: Harbin Institute of Technology (哈爾濱工業大學), master's thesis, 2017
[Abstract]: With the rapid development of Internet technology, the vast amount of text data in cyberspace holds both distilled knowledge and potential risks. Against this background, research based on natural language processing, such as public opinion monitoring, opinion extraction, and emotion analysis, is becoming increasingly important. The focus of this research is shifting from the increasingly mature task of text emotion analysis toward mining the causes of the emotions expressed in text, that is, from "knowing what" to "knowing why". This is the task studied in this thesis: text emotion cause discovery. Progress on this task depends not only on the algorithms used but also on the availability of corpora annotated with emotion causes, and the current lack of such corpora has held back research in the field. This thesis therefore first designs and builds a moderately sized corpus annotated with emotion causes, and on that basis studies event-driven methods for text emotion cause discovery. The work consists of three parts.

First, to address the lack of annotated data, an emotion cause corpus based on news text is designed and built. Based on observation and analysis of how emotion causes are expressed, a complete and comprehensive annotation scheme is designed. Following this scheme, 2,105 instances containing emotion causes were manually selected from 15,687 news documents and annotated, yielding an emotion cause annotated corpus.

Second, using this corpus, the thesis studies an event-driven method for text emotion cause discovery. Based on analysis and observation of how emotion causes are expressed in text, a method is designed that abstracts the external stimuli that trigger or change an emotion into an event tuple structure. A candidate emotion cause event extraction algorithm based on dependency parsing is then designed and implemented, together with an emotion cause event recognition algorithm based on a support vector machine with a polynomial kernel. Experiments on the corpus built in this thesis show that this method improves the F-measure for emotion cause recognition by 3.34% over the baseline method.

Third, to address the limited expressive power of the event tuple structure, emotion cause event tuples are further converted into an event tree structure, giving an effective mapping of emotion causes from text to event trees. By combining a tree kernel with a polynomial kernel, a more effective emotion cause discovery method is designed and implemented. Experimental results show that this method improves the F-measure by 10.61% over the baseline system.

The event-driven emotion cause discovery method proposed in this thesis abstracts and maps emotion cause text well, and in the emotion cause discovery experiments it achieves the best results among currently known methods. In addition, the Chinese emotion cause annotated corpus built in this thesis, released as an open research resource, can help advance research in this field.
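The abstract does not give the form of the event tuples, the event trees, or the kernels, so the following is only a minimal sketch of how a tree kernel and a polynomial kernel might be combined. Everything in it is an assumption made for illustration: the `EventTuple` fields, the tree layout in `event_to_tree`, the use of a Collins and Duffy style subset-tree kernel, and the weighted-sum combination controlled by `alpha`. It is not the thesis's implementation.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple

# A tree node is (label, children); a leaf has an empty children tuple.
Tree = Tuple[str, tuple]


@dataclass(frozen=True)
class EventTuple:
    """Hypothetical event tuple; the field names are assumptions."""
    subject: str
    predicate: str
    obj: str


def event_to_tree(ev: EventTuple) -> Tree:
    """Map an event tuple to a small labeled tree (illustrative layout only)."""
    return ("EVENT", (
        ("SUBJ", ((ev.subject, ()),)),
        ("PRED", ((ev.predicate, ()),)),
        ("OBJ", ((ev.obj, ()),)),
    ))


def internal_nodes(tree: Tree):
    """Yield every node that has at least one child."""
    _, children = tree
    if children:
        yield tree
        for child in children:
            yield from internal_nodes(child)


def production(node: Tree):
    """A node's label together with the ordered labels of its children."""
    label, children = node
    return (label, tuple(child[0] for child in children))


def common_subtrees(n1: Tree, n2: Tree, lam: float) -> float:
    """Decay-weighted count of shared tree fragments rooted at n1 and n2."""
    if production(n1) != production(n2):
        return 0.0
    score = lam
    for c1, c2 in zip(n1[1], n2[1]):
        if c1[1] and c2[1]:  # recurse only where both children are internal
            score *= 1.0 + common_subtrees(c1, c2, lam)
    return score


def tree_kernel(t1: Tree, t2: Tree, lam: float = 0.5) -> float:
    """Subset-tree kernel: sum of shared fragments over all node pairs."""
    return sum(common_subtrees(a, b, lam)
               for a in internal_nodes(t1) for b in internal_nodes(t2))


def poly_kernel(x: Sequence[float], y: Sequence[float],
                degree: int = 2, coef0: float = 1.0) -> float:
    """Polynomial kernel (x . y + coef0) ** degree on flat feature vectors."""
    return (sum(a * b for a, b in zip(x, y)) + coef0) ** degree


def composite_kernel(t1: Tree, x1: Sequence[float],
                     t2: Tree, x2: Sequence[float],
                     alpha: float = 0.5, lam: float = 0.5) -> float:
    """Weighted sum of the two kernels; alpha in [0, 1] is an assumed knob."""
    return (alpha * tree_kernel(t1, t2, lam)
            + (1.0 - alpha) * poly_kernel(x1, x2))


if __name__ == "__main__":
    a = event_to_tree(EventTuple("team", "won", "match"))
    b = event_to_tree(EventTuple("team", "lost", "match"))
    print(tree_kernel(a, a), tree_kernel(a, b))
    print(composite_kernel(a, [1.0, 0.0], b, [1.0, 1.0]))
```

A weighted sum with non-negative weights is used here because it keeps the combined function a valid kernel whenever both parts are valid kernels, so the resulting Gram matrix could be passed to any SVM implementation that accepts precomputed kernels.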
[Degree-granting institution]: Harbin Institute of Technology
[Degree level]: Master's
[Year degree granted]: 2017
[Classification number]: TP391.1
Article ID: 1791622
Article link: http://sikaile.net/kejilunwen/ruanjiangongchenglunwen/1791622.html