Chinese Opinion Target Extraction and Sentiment Orientation Analysis Based on Dependency Parsing
Published: 2019-01-28 08:26
[Abstract]: With the development of the Internet, text containing opinions and reviews has appeared in large quantities. People read the reviews posted by others while continually sharing their own opinions and feelings about particular people or things. Sentiment analysis can mine group-level opinions from review text on the Internet, which provides important guidance for economic development, political decision-making, and individual behavior. Sentiment analysis is divided into coarse-grained and fine-grained analysis. Coarse-grained sentiment analysis currently achieves good results, while the results of fine-grained sentiment analysis remain unsatisfactory. Opinion target extraction and sentiment orientation analysis form an important subtask of fine-grained sentiment analysis, and opinion target extraction is the bottleneck for improving performance on this task. There are four main approaches to opinion target extraction: finding frequently occurring nouns and noun phrases, exploiting the relations between opinion words and opinion targets, supervised learning, and topic models. Many current methods that exploit the relations between opinion words and opinion targets struggle to extract the target that an opinion word is actually associated with, especially when the target and the opinion word are not in the same clause. To address this problem, this thesis builds on the dependency relations between words in Chinese review sentences and applies semantic role labeling, additional extraction rules, and a search algorithm to improve the performance of sentiment analysis.
The main work of the thesis is as follows. (1) On the basis of existing lexicons, sentiment lexicons for sentiment analysis are constructed, including a positive emotion lexicon, a negative emotion lexicon, a positive evaluation lexicon, a negative evaluation lexicon, an opinion-citation lexicon, a subjunctive mood lexicon, an adversative (transition) lexicon, and a nominal sentiment lexicon. These lexicons are mainly used to handle the interference that useless components in evaluative sentences, or non-evaluative sentences that merely express ideas or wishes, cause for sentiment analysis, and to provide the lexical support required by the semantic rules and the orientation analysis. (2) On the basis of dependency parsing, semantic role labeling is used and a series of extraction rules is added for sentiment analysis. In addition, attributive-head phrases (phrases composed of an attributive modifier and its head word) replace the usual noun phrases when extracting candidate opinion targets, in order to improve the precision of extracting opinion targets and opinion words. These rules mainly take into account the influence of Chinese semantic knowledge and common sentence patterns on sentiment analysis. Experimental results show that, on the NLPCC 2013 microblog evaluation corpus, the dependency-parsing-based method with added semantic rules significantly improves opinion target extraction performance. (3) An opinion target search method is proposed to improve the precision of finding the true opinion target in the context when only a pronoun is extracted or when the syntactic relations contain no opinion target. The method mainly combines word sense with a word-similarity computation algorithm, narrowing the search range for potential opinion targets in the context. Experimental results show that the method improves the precision of opinion target extraction on the experimental corpus.
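The rule-based extraction in (2) and the context search in (3) can be pictured with a small sketch. This is not the thesis's implementation: the abstract does not name a parser, list its rules, or specify its similarity measure. The sketch assumes a hand-written dependency parse with LTP-style labels (`SBV` for subject, `ATT` for attributive modifier, `ADV`, `HED`), a toy two-word opinion lexicon, and, in place of the word-sense and similarity search, a much simpler fallback that reuses the most recently extracted target. All names (`Token`, `extract_pairs`, `attributive_phrase`) are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Token:
    text: str
    pos: str   # coarse POS tag: "n" noun, "a" adjective, "r" pronoun, "d" adverb
    head: int  # index of the head token in the sentence, -1 for the root
    rel: str   # dependency label (LTP-style labels assumed for illustration)


# Toy lexicon (assumption): opinion word -> polarity.
OPINION_WORDS = {"清晰": 1, "差": -1}
PRONOUNS = {"它", "这", "那"}


def attributive_phrase(tokens: List[Token], idx: int) -> str:
    """Join a head word with the ATT modifiers immediately to its left."""
    start = idx
    while (start > 0 and tokens[start - 1].head == idx
           and tokens[start - 1].rel == "ATT"):
        start -= 1
    return "".join(t.text for t in tokens[start:idx + 1])


def extract_pairs(tokens: List[Token],
                  context_targets: List[str]) -> List[Tuple[str, str, int]]:
    """Return (target, opinion word, polarity) triples for one sentence."""
    pairs = []
    for i, tok in enumerate(tokens):
        if tok.text not in OPINION_WORDS:
            continue
        target = None
        # Rule 1: the subject (SBV) of a predicative opinion word is the target.
        for j, dep in enumerate(tokens):
            if dep.head == i and dep.rel == "SBV":
                target = attributive_phrase(tokens, j)
                break
        # Rule 2: an opinion word that modifies a noun (ATT) targets that noun.
        if (target is None and tok.rel == "ATT" and tok.head >= 0
                and tokens[tok.head].pos == "n"):
            target = tokens[tok.head].text
        # Fallback (simplified stand-in for the similarity-based search):
        # a pronoun or missing target resolves to the most recent earlier target.
        if target is None or target in PRONOUNS:
            target = context_targets[-1] if context_targets else None
        if target is not None:
            pairs.append((target, tok.text, OPINION_WORDS[tok.text]))
            context_targets.append(target)
    return pairs


# "手机 屏幕 很 清晰" (the phone screen is very clear)
s1 = [Token("手机", "n", 1, "ATT"), Token("屏幕", "n", 3, "SBV"),
      Token("很", "d", 3, "ADV"), Token("清晰", "a", -1, "HED")]
# "它 很 差" (it is very bad): the subject is a pronoun
s2 = [Token("它", "r", 2, "SBV"), Token("很", "d", 2, "ADV"),
      Token("差", "a", -1, "HED")]

context: List[str] = []
first = extract_pairs(s1, context)
second = extract_pairs(s2, context)
print(first, second)
```

Taking the attributive-head phrase rather than the bare subject noun is what turns "屏幕" into the more specific target "手机屏幕"; the pronoun in the second sentence is resolved against the targets collected from earlier context.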
[Degree-granting institution]: Southeast University
[Degree level]: Master's
[Year degree awarded]: 2016
[Classification number]: TP391.1
Article ID: 2416882
Article link: http://sikaile.net/kejilunwen/ruanjiangongchenglunwen/2416882.html