Research on Opinion Target Extraction Methods in Opinion Mining
[Abstract]: Opinion mining, also known as sentiment analysis, is the automatic analysis of user review text to identify users' feelings, attitudes, and opinions about products, services, people, events, and topics; it has significant theoretical and practical value. Opinion mining can be coarse-grained or fine-grained. Coarse-grained opinion mining is relatively mature, but fine-grained opinion mining still has many open problems. Opinion target extraction is an important subtask of fine-grained opinion mining. It aims to extract fine-grained opinion targets from opinionated text, such as a product itself and its components, attributes, and features. Current opinion target extraction methods fall into two main categories: supervised and unsupervised. Supervised methods are mainly based on hidden Markov models and conditional random fields, while unsupervised methods are mainly based on topic models and syntactic rules. Recent studies have shown that unsupervised methods based on syntactic rules perform well, but they face several challenges. The first is how to implement opinion target extraction rules quickly. The second is how to automatically select high-quality rules from extraction rules of uneven quality. The third is how to use large amounts of unannotated review text to help opinion target extraction. This thesis proposes the following solutions to these challenges; to the best of our knowledge, they are proposed here for the first time.

(1) An opinion target extraction framework based on logic programming is proposed to implement extraction rules quickly. The logic programming language used is Answer Set Programming (ASP). First, the part-of-speech tags and syntactic dependencies of the words in a review sentence are expressed as ASP facts. Then the known opinion target extraction rules are translated into ASP rules. Finally, an existing ASP solver is used to evaluate the rules automatically. Experimental results show that the proposed method is both efficient and simple.

(2) Two automatic rule selection methods are proposed to select high-quality rules from a set of opinion target extraction rules of uneven quality. The first is based on a greedy algorithm and the second on local search (simulated annealing). Experimental results show that both methods can effectively select a subset of high-quality rules from the initial rule set and thereby obtain better results than the initial rule set.

(3) An opinion target recommendation method based on semantic similarity and relatedness is proposed to help opinion target extraction by using large amounts of unannotated review text. First, a large number of unannotated reviews from the Internet are used to learn the semantic similarity and relatedness between words. This knowledge, together with a small number of seed opinion targets, is then used to recommend opinion targets for a new domain. Experimental results show that the method can effectively use knowledge learned from other domains to recommend high-quality opinion targets for new domains.
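The fact-and-rule encoding described in (1) might look roughly like the sketch below. The predicate names `pos/2`, `dep/3`, `opinion_word/1` and `target/1`, the example sentence, and the single extraction rule are illustrative assumptions, not taken from the thesis. The rule is written in ASP syntax as a string; running it would require an ASP solver, which is not invoked here. A pure-Python stand-in shows what a solver would be expected to derive from that one rule.

```python
# Hypothetical encoding of one parsed sentence as ASP facts.
# Predicate names and the rule below are assumptions for illustration only.

def to_asp_facts(tokens, deps):
    """tokens: list of (word, pos_tag); deps: list of (relation, head, dependent).

    Simplification: words are assumed unique within the sentence and tags are
    assumed to be plain letters (e.g. NN, JJ), so lowercasing yields valid
    ASP constants.
    """
    facts = [f'pos("{word}", {tag.lower()}).' for word, tag in tokens]
    facts += [f'dep({rel}, "{head}", "{dependent}").' for rel, head, dependent in deps]
    return facts

# Illustrative rule in ASP syntax: a noun modified (amod) by a known opinion
# adjective is treated as an extracted target.
RULE = 'target(T) :- dep(amod, T, O), pos(T, nn), pos(O, jj), opinion_word(O).'

def apply_rule_in_python(tokens, deps, opinion_words):
    """Pure-Python stand-in for what a solver would derive from RULE."""
    tags = dict(tokens)
    return {
        head
        for rel, head, dependent in deps
        if rel == "amod"
        and tags.get(head) == "NN"
        and tags.get(dependent) == "JJ"
        and dependent in opinion_words
    }

# Toy input: "the phone has a great screen" (hand-written parse, not real parser output).
tokens = [("the", "DT"), ("phone", "NN"), ("has", "VBZ"),
          ("a", "DT"), ("great", "JJ"), ("screen", "NN")]
deps = [("det", "phone", "the"), ("nsubj", "has", "phone"),
        ("det", "screen", "a"), ("amod", "screen", "great"),
        ("dobj", "has", "screen")]

facts = to_asp_facts(tokens, deps)
program = "\n".join(facts + ['opinion_word("great").', RULE])
extracted = apply_rule_in_python(tokens, deps, {"great"})
```

Here `program` holds the text that could be handed to a solver, and `extracted` contains the target the single rule yields for this sentence. Adding a new extraction rule then amounts to adding one more line of ASP, which is the kind of fast rule implementation the abstract refers to.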
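The greedy variant of rule selection in (2) can be sketched as follows. The abstract does not state the objective function or the stopping criterion, so this sketch assumes that each rule's output on a small labeled set is known and that rules are added one at a time while the F1 score of the combined output keeps improving. The rule names and toy data are made up for illustration.

```python
def f1(extracted, gold):
    """F1 score of a set of extracted terms against a gold set."""
    true_positives = len(extracted & gold)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(extracted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)

def greedy_select(rule_outputs, gold):
    """Greedily pick rules whose addition most improves F1 (assumed objective).

    rule_outputs maps a rule name to the set of terms that rule extracts
    on the labeled data. Returns (selected rule names, final F1).
    """
    selected, covered, best = [], set(), 0.0
    remaining = dict(rule_outputs)
    while remaining:
        # Score every remaining rule by the F1 obtained after adding it.
        name, score = max(
            ((n, f1(covered | out, gold)) for n, out in remaining.items()),
            key=lambda pair: pair[1],
        )
        if score <= best:  # no remaining rule improves the result
            break
        selected.append(name)
        covered |= remaining.pop(name)
        best = score
    return selected, best

# Toy data (hypothetical): two precise rules and two noisy ones.
gold = {"screen", "battery", "camera", "price"}
rule_outputs = {
    "r1": {"screen", "battery"},
    "r2": {"camera", "thing", "lot"},
    "r3": {"price"},
    "r4": {"screen", "day", "time", "way"},
}

selected, best = greedy_select(rule_outputs, gold)
all_rules_f1 = f1(set().union(*rule_outputs.values()), gold)
```

On this toy data the selected subset scores higher than applying all four rules together, which mirrors the claim that a well-chosen subset can outperform the initial rule set. A simulated annealing variant would explore the same search space by randomly adding or removing rules and occasionally accepting a worse subset.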
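The recommendation idea in (3) can be illustrated with a minimal sketch that covers only the similarity half of the method. It assumes word vectors have already been learned from unannotated reviews; the three-dimensional vectors, the threshold, and the function names below are invented for illustration and do not come from the thesis.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def recommend(seeds, candidates, vectors, threshold=0.8):
    """Recommend candidates whose best similarity to any seed meets the threshold."""
    recommendations = []
    for candidate in candidates:
        if candidate in seeds or candidate not in vectors:
            continue
        score = max(
            (cosine(vectors[candidate], vectors[s]) for s in seeds if s in vectors),
            default=0.0,
        )
        if score >= threshold:
            recommendations.append((candidate, score))
    # Highest-scoring candidates first.
    return sorted(recommendations, key=lambda pair: -pair[1])

# Toy vectors (made up); real ones would be learned from a large review corpus.
vectors = {
    "screen": (0.9, 0.1, 0.0),
    "display": (0.8, 0.2, 0.1),
    "battery": (0.1, 0.9, 0.0),
    "charger": (0.2, 0.8, 0.1),
    "weekend": (0.0, 0.1, 0.9),
}

recs = recommend({"screen", "battery"}, ["display", "charger", "weekend"], vectors)
recommended_terms = {term for term, _ in recs}
```

With these toy vectors, terms close to a seed are recommended and an unrelated term falls below the threshold. The relatedness component mentioned in the abstract would need a separate signal, such as how often terms occur together as targets in other domains, which is not modeled here.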
[Degree-granting institution]: Southeast University
[Degree level]: Doctoral
[Year degree awarded]: 2016
[Classification number]: TP391.1
Related doctoral dissertations (top 1)
1 Liu Qian; Research on Opinion Target Extraction Methods in Opinion Mining [D]; Southeast University; 2016
Article link: http://sikaile.net/shoufeilunwen/xxkjbs/2324681.html