Construction of a Chinese Emotion Expression Commonsense Knowledge Base and Its Application in Emotion Analysis

Published: 2018-07-22 11:01
[Abstract]: As human-computer interaction becomes widely known and used, computers are expected to handle emotion and affect as people do. In recent years, the rise of social media has put large volumes of user-generated text online, especially microblogs, blogs, and comments that carry personal emotion. This web text has driven research on analyzing and tracking the emotions of large numbers of real individuals, with significant research value and broad application prospects in social, political, economic, and other fields. This thesis studies the construction of basic Chinese emotion resources and their application in text emotion analysis, from three angles: the emotion taxonomy model, the construction of basic emotion-word resources, and automatic multi-label text emotion classification. The thesis comprises four pieces of work. First, to address the scarcity of Chinese emotion lexicons, it takes the English emotion lexicon WordNet-Affect and, through machine translation, noise filtering, and synonym expansion, automatically builds a Chinese emotion word list of relatively high quality and coverage, providing a reliable basic resource for text emotion analysis. Second, existing Chinese emotion lexicons are generally lacking in completeness and precision; in earlier work, an emotion-word entry usually records only a simple emotion category and an intensity value. The thesis holds that a word's emotion is of two types, expression and cognition. It focuses on mining the deeper information carried by a word's emotion expression, uses HowNet concept definitions to distinguish word senses, proposes a new annotation scheme on that basis, and builds a fine-grained Chinese emotion expression commonsense knowledge base.
Third, because web text and vocabulary keep growing, a rule-based new-word discovery method expands the knowledge base automatically. Because sentences are short, carry little information, and often express emotion through non-emotion words that are hard to recognize, word sense concepts are used to expand sentences automatically. Fourth, the emotion-word resources are applied in multi-label text emotion classification algorithms, both semantic-rule-based and machine-learning-based. Comparative experiments show that the Chinese emotion word list and the emotion expression knowledge base built in this thesis outperform traditional emotion-word resources in classification, and that feature representations incorporating knowledge-base information effectively improve the performance of machine-learning classifiers. The contributions are as follows. First, the thesis builds a high-quality Chinese emotion word list and what is, as far as is known, the most fine-grained Chinese emotion expression commonsense knowledge base to date. Second, discovering new emotion words by rule enlarges the knowledge base, and expanding sentences with word concepts helps improve text emotion analysis results. Third, compared with traditional Chinese emotion lexicons and existing feature representations in multi-label text emotion classification, the new lexicon and the new fine-grained knowledge base improve classification performance, which demonstrates their advantages and their effectiveness in text emotion computing applications.
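The lexicon-construction steps named in the abstract (machine translation, noise filtering, synonym expansion) might be sketched as follows. This is only an illustration under stated assumptions: the abstract does not say how noise is filtered, so the rule used here (drop a translated word whose source words disagree on the emotion category) is a guess, and the toy tables and the names `build_lexicon` and `label_sentence` are hypothetical rather than taken from the thesis.

```python
from collections import defaultdict

# Toy English seed lexicon: word -> emotion category. Hypothetical entries
# standing in for a resource such as WordNet-Affect.
SEED = {"happy": "joy", "glad": "joy", "furious": "anger", "moved": "joy"}

# Toy English -> Chinese translation table (hypothetical; a real pipeline
# would query a machine translation system instead).
TRANSLATE = {
    "happy": ["高兴", "快乐"],
    "glad": ["高兴"],
    "furious": ["愤怒", "激动"],
    "moved": ["感动", "激动"],
}

# Toy Chinese synonym table (hypothetical).
SYNONYMS = {"高兴": ["开心"], "愤怒": ["恼怒"]}


def build_lexicon(seed, translate, synonyms):
    """Build a Chinese word -> emotion map from an English seed lexicon."""
    # Step 1: translate each seed word and collect the candidate labels.
    candidates = defaultdict(set)
    for en_word, emotion in seed.items():
        for zh_word in translate.get(en_word, []):
            candidates[zh_word].add(emotion)
    # Step 2: noise filtering (assumed rule): keep only words whose
    # source words all agree on a single emotion category.
    lexicon = {w: next(iter(e)) for w, e in candidates.items() if len(e) == 1}
    # Step 3: synonym expansion: a synonym inherits the label of its
    # source word unless it is already in the lexicon.
    for word, emotion in list(lexicon.items()):
        for syn in synonyms.get(word, []):
            lexicon.setdefault(syn, emotion)
    return lexicon


def label_sentence(sentence, lexicon):
    """Return every emotion category with at least one matching word."""
    return sorted({emo for word, emo in lexicon.items() if word in sentence})
```

With these toy tables, the ambiguous candidate 激动 is filtered out because its two source words carry different categories, while 开心 and 恼怒 enter the lexicon through synonym expansion. `label_sentence` uses plain substring matching and can return more than one category for a sentence, which is the multi-label setting the abstract describes; a real system would segment the text first.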
[Degree-granting institution]: Harbin Institute of Technology
[Degree level]: Master's
[Year degree awarded]: 2014
[Classification number]: TP391.1
