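Fine-Grained Attribute Extraction from Online User Reviews
Published: 2018-01-15 12:30
Keywords of this article: fine-grained attribute extraction from online user reviews. Source: Journal of the China Society for Scientific and Technical Information (《情报学报》), 2017, Issue 05. Article type: journal article
More related articles: attribute extraction; attribute clustering; deep learning; affinity propagation clustering; fine-grained attributes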
[Abstract]: As the volume of online reviews grows rapidly and their applications keep expanding, review mining has drawn sustained attention from the research community. Current review mining tasks place increasingly high demands on the completeness and granularity of the extracted attributes, yet most existing methods focus on extracting only the main attributes of the evaluated object. Discovering as many as possible of the attributes users care about, and expressing them in a fine-grained way, is therefore worthwhile. This paper proposes a fine-grained attribute extraction method that aims to extract product attributes comprehensively and quickly. First, candidate attribute words are built from high-frequency nouns. Next, vectors for the candidate attribute words are constructed through deep learning, and the candidates are clustered on that basis to obtain clustered candidate attribute word sets. Finally, noise is filtered out of the candidate sets to yield a fine-grained product attribute set. Experiments on review corpora from three domains (food and dining, mobile phones, and books) show that, compared with a seed-word-based method, an LDA-based method combined with manual effort, and a sentiment-word-based method, the proposed method discovers the attributes of the evaluated object more comprehensively and produces fine-grained attributes.
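The first two stages described in the abstract (high-frequency nouns as candidates, then clustering of candidate word vectors) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. The keywords name affinity propagation clustering, so the sketch uses it, with negative squared Euclidean distance as similarity and the median similarity as preference; these are common defaults, not settings taken from the paper. The function names, the `"n"` noun tag, the toy reviews, and the hand-written 2-D vectors (standing in for learned embeddings) are all hypothetical. The noise-filtering stage is omitted because the abstract does not say how it works.

```python
from collections import Counter
from statistics import median


def candidate_attributes(tagged_reviews, min_count=2):
    """Stage 1: keep nouns whose corpus frequency reaches min_count."""
    counts = Counter(word for review in tagged_reviews
                     for word, pos in review if pos == "n")
    return sorted(w for w, c in counts.items() if c >= min_count)


def affinity_propagation(vectors, damping=0.5, iterations=200):
    """Stage 2: cluster vectors; returns {exemplar index: [member indices]}."""
    n = len(vectors)
    # Similarity: negative squared Euclidean distance (assumed choice).
    S = [[-sum((a - b) ** 2 for a, b in zip(vectors[i], vectors[k]))
          for k in range(n)] for i in range(n)]
    # Preference (self-similarity): median of the off-diagonal similarities.
    pref = median(S[i][k] for i in range(n) for k in range(n) if i != k)
    for i in range(n):
        S[i][i] = pref
    R = [[0.0] * n for _ in range(n)]  # responsibilities
    A = [[0.0] * n for _ in range(n)]  # availabilities
    for _ in range(iterations):
        # r(i,k) = s(i,k) - max over k' != k of (a(i,k') + s(i,k'))
        for i in range(n):
            scores = [A[i][k] + S[i][k] for k in range(n)]
            for k in range(n):
                best_other = max(s for j, s in enumerate(scores) if j != k)
                R[i][k] = damping * R[i][k] + (1 - damping) * (S[i][k] - best_other)
        # a(i,k) = min(0, r(k,k) + sum of positive r(i',k), i' not in {i,k})
        # a(k,k) = sum of positive r(i',k), i' != k
        for k in range(n):
            pos = [max(0.0, R[i][k]) for i in range(n)]
            pos[k] = 0.0
            total = sum(pos)
            for i in range(n):
                new = total if i == k else min(0.0, R[k][k] + total - pos[i])
                A[i][k] = damping * A[i][k] + (1 - damping) * new
    exemplars = [k for k in range(n) if R[k][k] + A[k][k] > 0]
    if not exemplars:
        raise RuntimeError("no exemplar emerged; adjust damping or iterations")
    clusters = {k: [] for k in exemplars}
    for i in range(n):
        k = i if i in clusters else max(exemplars, key=lambda e: S[i][e])
        clusters[k].append(i)
    return clusters


# Toy POS-tagged reviews ("n" = noun, "a" = adjective); hypothetical data.
tagged_reviews = [
    [("screen", "n"), ("bright", "a"), ("battery", "n"), ("weak", "a")],
    [("display", "n"), ("resolution", "n"), ("sharp", "a")],
    [("battery", "n"), ("charging", "n"), ("slow", "a"), ("power", "n")],
    [("screen", "n"), ("display", "n"), ("resolution", "n"), ("box", "n")],
    [("charging", "n"), ("power", "n"), ("fine", "a")],
]
# Hand-written stand-ins for learned word vectors (assumption).
toy_vectors = {
    "screen": (1.0, 0.0), "display": (0.9, 0.1), "resolution": (0.8, 0.2),
    "battery": (0.0, 1.0), "charging": (0.1, 0.9), "power": (0.2, 0.7),
}

candidates = candidate_attributes(tagged_reviews)
clusters = affinity_propagation([toy_vectors[w] for w in candidates])
attribute_clusters = {
    candidates[k]: sorted(candidates[i] for i in members)
    for k, members in clusters.items()
}
print(attribute_clusters)
```

Affinity propagation does not need the number of clusters in advance, which suits a setting where the number of attributes per product is unknown; the preference value controls how many exemplars emerge, so a real corpus would likely need it tuned.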
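[Author affiliations]: Department of Information Management, Nanjing University of Science and Technology; Fujian Provincial Key Laboratory of Information Processing and Intelligent Control (Minjiang University); Jiangsu Key Laboratory of Data Engineering and Knowledge Service (Nanjing University)
[Funding]: National Social Science Fund of China project "Research on User-Based Knowledge Organization Patterns in Online Social Networks" (No. 14BTQ033); open project of the Fujian Provincial Key Laboratory of Information Processing and Intelligent Control (Minjiang University)
[Classification number]: G254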
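[Text snapshot]: 1 Introduction. E-commerce platforms and social media now host a large volume of user reviews. How to efficiently mine the information users are interested in from this mass of online reviews is an important research question in social media computing. Attribute extraction, one of the key tasks in review mining, has attracted the attention of many researchers [1-2]. In addition to correctly identifying the attributes users care about, the extracted attributes ...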
Article ID: 1428401
Article link: http://sikaile.net/tushudanganlunwen/1428401.html