
Research on Fine-Grained Sentiment Analysis of Online Review Text

Published: 2018-02-03 08:43

  Keywords: fine-grained; sentiment ambiguity; sentiment lexicon; sentiment elements; CRFs. Source: Shandong Normal University, master's thesis, 2017. Type: degree thesis.


[Abstract]: With the explosive growth of online review text, reviews now carry a large amount of user sentiment information. Analyzing only the overall polarity of a review no longer meets users' needs, and finer-grained, attribute-level sentiment analysis is urgently required. In addition, the informal way users express themselves leads to low word segmentation accuracy, low accuracy in sentiment element extraction, and loss of implicit sentiment information, all of which need to be addressed. This thesis first analyzes two text preprocessing tasks, spam review filtering and Chinese word segmentation. It then extracts sentiment elements with a CRFs model, supplements implicit sentiment targets, and aggregates the results. Finally, it proposes an algorithm that analyzes the sentiment intensity of opposing opinions within each aggregated feature class. The research consists of four parts.

(1) Text preprocessing: spam reviews are identified by classification over constructed review features, and a user dictionary is built to improve Chinese word segmentation. Text classification, including subjective/objective classification, is first performed on the constructed review features. Spam opinion reviews are filtered out so that only genuine, valuable review text is kept for sentiment analysis, and the text is divided into sense groups to ease later semantic and sentiment aggregation. Chinese word segmentation uses the NLPIR segmentation system. A user dictionary is built from out-of-vocabulary words such as new words, Internet vocabulary, and domain terminology keywords. It corrects segmentation errors and so improves the accuracy of sentiment target extraction, and it also supplements the sentiment lexicon, reducing the loss of user sentiment information.

(2) Sentiment element extraction based on CRFs: the joint recognition of sentiment targets, sentiment words, and sentiment modifiers is recast as a structured sequence labeling task. A conditional random field model is used to recognize the sentiment elements jointly. Features are first selected to build the feature templates and the tag set, and the sentiment elements are then recognized jointly with CRFs. Training documents are built from product feature-opinion pairs formed by explicit sentiment target-sentiment word pairs and the tag set in the review corpus, and a naive Bayes classifier is used to identify implicit sentiment targets. Finally, sentiment targets are aggregated by means of word-sense codes, which alleviates the feature sparsity problem.

(3) An algorithm for analyzing the sentiment intensity of opposing opinions, based on contextual sentiment disambiguation. Sentiment-ambiguous words are first defined according to the dynamic polarity of sentiment words. Association rules are used to mine the collocation set of sentiment-ambiguous words, and after PMI-based pruning a collocation dictionary of sentiment-ambiguous words is built. The thesis then describes the Internet-vocabulary dictionary, the sentiment modifier dictionary, and the other dictionaries it constructs, and proposes a method for calculating the sentiment intensity of opposing opinions. Finally, a sentiment summary of opposing opinions is generated from the sentiment intensity, completing the fine-grained sentiment analysis. Experiments show that the dictionary construction and the sentiment intensity calculation method are effective.

(4) Design and implementation of a fine-grained sentiment analysis system for review text. The functional modules of the system cover the whole process of review collection, spam review filtering, Chinese word segmentation, sentiment element extraction, and fine-grained sentiment analysis, and the system finally presents the user with intuitive fine-grained results that include the intensity of opposing opinions.
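The PMI pruning step mentioned in part (3) of the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formula, counting window, or threshold used in the thesis, so everything here is an assumption: co-occurrence is counted per review, PMI is computed as log2(P(w, c) / (P(w) P(c))), and the function name `pmi_filter`, the threshold of 0.0, and the toy English tokens are all hypothetical.

```python
import math
from collections import Counter


def pmi_filter(reviews, ambiguous_words, threshold=0.0):
    """Keep (ambiguous word, collocate) pairs whose PMI exceeds `threshold`.

    reviews: list of token lists (already segmented).
    ambiguous_words: set of sentiment-ambiguous words (hypothetical input).
    Co-occurrence is counted once per review (an assumption, not the
    thesis's stated definition).
    """
    n = len(reviews)
    word_count = Counter()   # number of reviews containing each word
    pair_count = Counter()   # number of reviews containing both words
    for tokens in reviews:
        seen = set(tokens)
        word_count.update(seen)
        for w in seen & ambiguous_words:
            for c in seen - {w}:
                pair_count[(w, c)] += 1

    kept = {}
    for (w, c), n_wc in pair_count.items():
        # PMI = log2( P(w, c) / (P(w) * P(c)) ) with review-level probabilities
        pmi = math.log2(n_wc * n / (word_count[w] * word_count[c]))
        if pmi > threshold:
            kept[(w, c)] = pmi
    return kept


# Toy corpus: "high" is ambiguous (bad with "price", good with "quality").
toy_reviews = [
    ["the", "price", "high"],
    ["the", "quality", "high"],
    ["the", "screen", "clear"],
    ["the", "price", "high"],
]
collocations = pmi_filter(toy_reviews, {"high"})
```

In this toy corpus the word "the" appears in every review, so its PMI with "high" is zero and the pair is pruned, while the pairs with "price" and "quality" are kept. A real collocation dictionary would additionally record the polarity that the ambiguous word takes with each collocate.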
[Degree-granting institution]: Shandong Normal University
[Degree level]: Master's
[Year degree granted]: 2017
[Classification number]: TP391.1


Article ID: 1486932


Article link: http://sikaile.net/kejilunwen/ruanjiangongchenglunwen/1486932.html

