
Research and Implementation of a Conditional Random Field-Based Mining System for Short Online Reviews

Published: 2018-05-21 12:38

  Topics: review mining + sentiment analysis; Source: South China University of Technology, master's thesis, 2012


[Abstract]: With the development of the Web 2.0 era, the volume of information on the Internet keeps growing, and obtaining accurate information is becoming correspondingly harder. Information on the Internet falls into two main kinds: factual information and opinion information. Search engines let us retrieve factual information, but there are few ways to find out what people on the Internet think about a given thing. Review mining can therefore give Internet users a way to discover opinion information.

Review mining is a current research hotspot in natural language processing. Its main tasks are subjectivity identification (separating subjective from objective reviews) and polarity analysis (positive versus negative). Current research on review mining mostly targets the general domain, with poor results, while research on domain-specific review mining relies too heavily on manually built domain dictionaries. Compared with full-length review articles, short reviews are brief, sparse in content, strongly subjective, irregular in word formation, and strongly domain-dependent. In view of these characteristics, this thesis uses a conditional random field (CRF) model together with an automatically constructed domain dictionary to extract the evaluation objects and sentiment words of short reviews.

This thesis studies and implements a CRF-based mining system for short online reviews. The main work is as follows. First, compound words denoting feature objects are extracted from short reviews and combined with a semi-automatically built set of sentiment words to construct a custom domain dictionary. Second, a CRF model suited to the structural characteristics of review content is designed, with feature functions built around the regularities of review content, so that the CRF can accurately extract the feature objects and sentiment words of reviews. Third, a matching algorithm for feature objects and sentiment words is studied, and pairs of evaluation object and sentiment word are extracted from reviews. Fourth, the sentiment orientation of each sentiment word is identified.

The thesis applies the system to mining service evaluation information from restaurant reviews on a review website. The experimental results show that the CRF-based model can indeed effectively extract the feature objects and sentiment words of short reviews, and that adding the automatically constructed domain dictionary allows the model to be extended to reviews in other domains. Users can use the mining results to learn the valuable opinion information contained in all reviews on a topic.
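The abstract does not give the thesis's actual feature functions or matching algorithm, so the following is only a minimal sketch of how the second, third, and fourth steps might fit together. Everything in it is an assumption made for illustration: the English tokens, the tiny lexicons `FEATURE_WORDS` and `SENTIMENT_WORDS`, the tag set (`F` for feature object, `S` for sentiment word, `O` for other), and the nearest-neighbour pairing rule. The tag sequence is supplied by hand where a trained CRF would normally predict it.

```python
from typing import Dict, List, Tuple

# Hypothetical domain dictionary (illustrative only, not from the thesis):
# feature-object words, and sentiment words mapped to a polarity of +1 or -1.
FEATURE_WORDS = {"service", "waiter", "price"}
SENTIMENT_WORDS = {"friendly": 1, "slow": -1, "reasonable": 1}


def token_features(tokens: List[str], i: int) -> Dict[str, object]:
    """Observation features for token i, as a plain dict.

    Dictionary-membership flags stand in for the domain-dictionary
    knowledge; neighbouring words give the local context.
    """
    word = tokens[i]
    return {
        "word": word,
        "in_feature_lexicon": word in FEATURE_WORDS,
        "in_sentiment_lexicon": word in SENTIMENT_WORDS,
        "prev_word": tokens[i - 1] if i > 0 else "<BOS>",
        "next_word": tokens[i + 1] if i < len(tokens) - 1 else "<EOS>",
    }


def pair_and_score(tokens: List[str], labels: List[str]) -> List[Tuple[str, str, int]]:
    """Pair each sentiment word with the nearest feature object.

    Returns (feature_object, sentiment_word, polarity) triples. Polarity is
    looked up in the lexicon and defaults to 0 for unknown sentiment words.
    """
    feature_positions = [i for i, tag in enumerate(labels) if tag == "F"]
    pairs = []
    for i, tag in enumerate(labels):
        if tag != "S" or not feature_positions:
            continue
        # Assumed matching rule: smallest token distance wins.
        nearest = min(feature_positions, key=lambda j: abs(j - i))
        pairs.append((tokens[nearest], tokens[i], SENTIMENT_WORDS.get(tokens[i], 0)))
    return pairs


# Hand-labelled example; a trained CRF would predict `labels` from the features.
tokens = ["service", "slow", "but", "price", "reasonable"]
labels = ["F", "S", "O", "F", "S"]
features = [token_features(tokens, i) for i in range(len(tokens))]
pairs = pair_and_score(tokens, labels)
```

Per-token feature dicts of this shape can usually be passed to a CRF toolkit with little change, though the exact input format depends on the library. A distance-based pairing rule is the simplest possible choice and would likely need syntactic or punctuation cues to handle longer reviews.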
[Degree-granting institution]: South China University of Technology
[Degree level]: Master's
[Year degree awarded]: 2012
[Classification number]: TP391.3




Article link: http://sikaile.net/kejilunwen/sousuoyinqinglunwen/1919223.html

