
Research on a Topic-Based Text Sentiment Classification Model

Published: 2018-11-17 13:46
[Abstract]: With the spread of Web 2.0 technology and the growth of e-commerce applications, it has become easier for people to post their opinions and suggestions about products on websites. Extracting and analyzing this sentiment information helps enterprises improve their products and guides users toward better choices, so sentiment classification has become a research hotspot. This thesis first studies and discusses the theories and tools involved in processing subjective text, and then builds the SO-LDA model on top of the original latent Dirichlet allocation (LDA) model. Sentiment words and non-sentiment words in review texts are identified with the help of a sentiment corpus and a word segmentation tool, the SO-LDA model is used for text representation, and finally an SVM classifier performs sentiment orientation classification.
The work covers two main aspects. (1) It studies sentiment text representation models and related techniques, and proposes an LDA-based model of sentiment topics and other topics. Before sentiment orientation can be classified, a document representation model must be built for the subjective text. Because the traditional vector space model (VSM) suffers from high dimensionality and sparsity, this thesis uses the LDA topic model instead. It then improves LDA to obtain a new text representation model, the SO-LDA topic model, and applies it to text sentiment orientation classification. (2) It applies the LDA and SO-LDA models separately to text sentiment classification, tests them on sentiment corpora, and runs experiments on two domains, hotels and computers, with different numbers of topics. The experiments show that the SO-LDA model achieves higher classification accuracy than the earlier LDA model.
In the experiments, the SO-LDA model is used to model the collected texts, dividing the words in each text into two classes: sentiment words and non-sentiment words. Words are drawn according to the latent sentiment topics and other topics in the text, the parameters of the SO-LDA model are then estimated with the Gibbs sampling algorithm, and classification is performed last. The experiments show that SO-LDA is more effective than LDA for sentiment classification.
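
The abstract does not spell out the SO-LDA update rules, so the following is only a minimal sketch of the underlying step it builds on: estimating a plain LDA model with collapsed Gibbs sampling and reading off per-document topic mixtures as a low-dimensional text representation. It is not the thesis's SO-LDA model. The function name `lda_gibbs`, the hyperparameters `alpha` and `beta`, the iteration count, and the toy review tokens are all assumptions made for illustration.

```python
import random

def lda_gibbs(docs, num_topics, alpha=0.1, beta=0.01, iters=200, seed=0):
    """Collapsed Gibbs sampling for plain LDA (illustrative, not SO-LDA).

    docs: list of token lists. Returns (theta, phi, vocab), where theta[d][k]
    is the topic mixture of document d and phi[k][v] is the probability of
    vocab[v] under topic k.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    V, K = len(vocab), num_topics

    n_dk = [[0] * K for _ in docs]      # tokens in doc d assigned to topic k
    n_kw = [[0] * V for _ in range(K)]  # times word v is assigned to topic k
    n_k = [0] * K                       # total tokens assigned to topic k
    z = []                              # topic assignment of every token

    # Random initialisation of the topic assignments.
    for d, doc in enumerate(docs):
        z_d = []
        for w in doc:
            k = rng.randrange(K)
            z_d.append(k)
            n_dk[d][k] += 1
            n_kw[k][word_id[w]] += 1
            n_k[k] += 1
        z.append(z_d)

    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v, k = word_id[w], z[d][i]
                # Remove the current token from the counts.
                n_dk[d][k] -= 1
                n_kw[k][v] -= 1
                n_k[k] -= 1
                # p(z = t | rest) is proportional to
                # (n_dk + alpha) * (n_kw + beta) / (n_k + V * beta).
                weights = [
                    (n_dk[d][t] + alpha) * (n_kw[t][v] + beta) / (n_k[t] + V * beta)
                    for t in range(K)
                ]
                k = rng.choices(range(K), weights=weights)[0]
                # Add the token back under its newly sampled topic.
                z[d][i] = k
                n_dk[d][k] += 1
                n_kw[k][v] += 1
                n_k[k] += 1

    # Point estimates of the document-topic and topic-word distributions.
    theta = [
        [(n_dk[d][k] + alpha) / (len(doc) + K * alpha) for k in range(K)]
        for d, doc in enumerate(docs)
    ]
    phi = [
        [(n_kw[k][v] + beta) / (n_k[k] + V * beta) for v in range(V)]
        for k in range(K)
    ]
    return theta, phi, vocab

# Toy, already-segmented reviews (hypothetical data).
docs = [
    ["room", "clean", "staff", "friendly"],
    ["screen", "battery", "slow", "keyboard"],
    ["room", "staff", "noisy", "breakfast"],
]
theta, phi, vocab = lda_gibbs(docs, num_topics=2, iters=50)
```

Each row of `theta` has only `num_topics` entries, which is the sense in which a topic model sidesteps the high dimensionality and sparsity of VSM vectors. In a pipeline like the one described, such rows would be the feature vectors handed to a classifier such as an SVM.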
[Degree-granting institution]: Shenyang University of Technology
[Degree level]: Master's
[Year degree awarded]: 2016
[Classification number]: TP391.1

