Research on a Topic Phrase Mining Method Based on the PhraseLDA Model
Published: 2019-03-17 20:46
[Abstract]: [Purpose/Significance] Taking topic phrase identification as its research object, this paper proposes a topic phrase mining method based on the PhraseLDA model, offering an approach for quickly understanding text content and accurately extracting text topics. [Method/Process] Low-frequency words are given a quantitative definition, a method for calculating phrase importance is proposed, and the PhraseLDA topic model is then used to infer topic phrases. [Results/Conclusions] Experimental results show that the topic phrases mined by this method across multiple datasets are of high quality and exhibit strong topic coherence.
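[Author affiliations]: National Science Library, Chinese Academy of Sciences; University of Chinese Academy of Sciences; Wuhan Library, Chinese Academy of Sciences
[Funding]: One of the research outputs of the Chinese Academy of Sciences project "Construction of the Academy-wide Science and Technology Information Monitoring Center" (project no. 院1628-4)
[Classification number]: TP391.1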
Article ID: 2442686
Article link: http://sikaile.net/kejilunwen/ruanjiangongchenglunwen/2442686.html