Chinese Opinion Target Extraction and Sentiment Orientation Analysis Based on Dependency Parsing
Published: 2019-01-28 08:26
[Abstract]: With the growth of the Internet, text containing opinions and reviews has appeared in large quantities. People browse reviews posted by others while continually sharing their own opinions and feelings about particular people or things. Sentiment analysis can mine collective opinions from review text on the Internet, which provides important guidance for economic development, political decision-making, and individual behavior. Sentiment analysis is divided into coarse-grained and fine-grained analysis. Coarse-grained sentiment analysis currently achieves good results, while the results of fine-grained sentiment analysis remain unsatisfactory. Opinion target (evaluation object) extraction and sentiment orientation analysis form an important subtask of fine-grained sentiment analysis, and opinion target extraction is the bottleneck for improving performance on this task. There are four main approaches to opinion target extraction: finding frequently occurring nouns and noun phrases, exploiting the relations between opinion words and opinion targets, supervised learning, and topic models. Many current methods that rely on the relations between opinion words and opinion targets struggle to extract the target that an opinion word actually refers to, especially when the target and the opinion word are not in the same clause. To address this problem, this thesis builds on the dependency relations between words in Chinese review sentences and applies semantic role labeling, additional extraction rules, and a search algorithm to improve the performance of sentiment analysis.
The main work of this thesis is as follows: (1) Building on existing lexicons, we construct sentiment lexicons for sentiment analysis, including a positive emotion lexicon, a negative emotion lexicon, a positive evaluation lexicon, a negative evaluation lexicon, an opinion-citation lexicon, a subjunctive-mood lexicon, an adversative (transition) lexicon, and a nominal sentiment lexicon. These lexicons are mainly used to handle the interference caused by useless components in evaluative sentences and by non-evaluative sentences that merely express ideas or wishes, and they provide the vocabulary that the semantic rules and the orientation analysis rely on. (2) On the basis of dependency parsing, we use semantic role labeling and add a series of extraction rules for sentiment analysis. We also use attributive-head phrases (phrases composed of an attributive modifier and its head word) in place of the usual noun phrases when extracting candidate opinion targets, in order to improve the precision of extracting opinion targets and opinion words. The rules mainly take into account the effects of Chinese semantic knowledge and common sentence patterns on sentiment analysis. Experiments on the NLPCC 2013 Weibo (microblog) evaluation corpus show that the dependency-parsing-based method with the added semantic rules significantly improves opinion target extraction performance. (3) We propose an opinion target search method that improves the precision of finding the true opinion target in the context when only a pronoun is extracted or when the syntactic relations contain no opinion target. The method mainly combines word sense with a word similarity algorithm and narrows the search range for potential opinion targets in the context. Experiments show that the method improves the precision of opinion target extraction on the experimental corpus.
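As a rough illustration of points (2) and (3), the sketch below applies a single dependency rule to a hand-written parse: a lexicon opinion word takes its subject as the opinion target, the target is expanded to an attributive-head phrase, and a pronoun subject falls back to a candidate from the preceding context. Everything in it is an assumption made for illustration: the `Token` structure, the two tiny lexicons, the LTP-style relation labels (`SBV`, `ATT`, `ADV`, `HED`, `RAD`), and the `extract` helper are hypothetical, and the "nearest preceding candidate" fallback is a simple stand-in for the thesis's search method based on word sense and word similarity, which the abstract does not specify in detail.

```python
from dataclasses import dataclass


@dataclass
class Token:
    word: str
    pos: str   # coarse POS tag: "n" noun, "a" adjective, "r" pronoun, ...
    head: int  # index of the head token, -1 for the root
    rel: str   # dependency label (LTP-style labels are an assumption here)


# Tiny illustrative lexicons (hypothetical entries, not the thesis's dictionaries).
POSITIVE = {"清晰", "好", "漂亮"}   # clear, good, pretty
NEGATIVE = {"差", "模糊", "慢"}     # poor, blurry, slow


def polarity(word):
    """Return +1, -1, or 0 depending on which lexicon contains the word."""
    if word in POSITIVE:
        return 1
    if word in NEGATIVE:
        return -1
    return 0


def attributive_phrase(tokens, idx):
    """Join a head noun with its direct ATT modifiers, in surface order."""
    parts = [i for i, t in enumerate(tokens) if t.head == idx and t.rel == "ATT"]
    parts.append(idx)
    return "".join(tokens[i].word for i in sorted(parts))


def extract(tokens, context_nouns=()):
    """Return (target, opinion word, polarity) triples.

    One rule only: an opinion word whose dependent is a subject (SBV).
    A pronoun subject is replaced by the most recent candidate in
    context_nouns (a stand-in for a similarity-based search).
    """
    results = []
    for i, tok in enumerate(tokens):
        pol = polarity(tok.word)
        if pol == 0:
            continue
        for j, dep in enumerate(tokens):
            if dep.head != i or dep.rel != "SBV":
                continue
            if dep.pos == "r":
                target = context_nouns[-1] if context_nouns else None
            else:
                target = attributive_phrase(tokens, j)
            if target is not None:
                results.append((target, tok.word, pol))
    return results


# "手机的屏幕很清晰" (the phone's screen is very clear), parsed by hand.
sent1 = [
    Token("手机", "n", 2, "ATT"),
    Token("的", "u", 0, "RAD"),
    Token("屏幕", "n", 4, "SBV"),
    Token("很", "d", 4, "ADV"),
    Token("清晰", "a", -1, "HED"),
]

# "它很慢" (it is very slow): the subject is a pronoun.
sent2 = [
    Token("它", "r", 2, "SBV"),
    Token("很", "d", 2, "ADV"),
    Token("慢", "a", -1, "HED"),
]

print(extract(sent1))                           # [('手机屏幕', '清晰', 1)]
print(extract(sent2, context_nouns=["系统"]))   # [('系统', '慢', -1)]
```

In a real pipeline the parses would come from a dependency parser rather than being written by hand, and the fallback would rank context candidates by similarity instead of taking the most recent one.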
[Degree-granting institution]: Southeast University
[Degree level]: Master's
[Year degree conferred]: 2016
[Classification number]: TP391.1
Article ID: 2416882
Article link: http://sikaile.net/kejilunwen/ruanjiangongchenglunwen/2416882.html