Research on a Topic Phrase Mining Method Based on the PhraseLDA Model
Published: 2019-03-17 20:46
[Abstract]: [Purpose/Significance] Taking topic phrase identification as its research object, this paper proposes a topic phrase mining method based on the PhraseLDA model, offering an approach for quickly understanding text content and accurately extracting text topics. [Method/Process] Low-frequency words are given a quantitative definition, a reasonable method for calculating phrase importance is proposed, and the PhraseLDA topic model is then used to infer topic phrases. [Result/Conclusion] Experimental results show that the topic phrases mined by this method across multiple datasets are of high quality and exhibit strong topic coherence.
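[Author Affiliations]: National Science Library, Chinese Academy of Sciences; University of Chinese Academy of Sciences; Wuhan Library, Chinese Academy of Sciences
[Funding]: One of the research outputs of the Chinese Academy of Sciences project "Construction of the Academy-wide Science and Technology Information Monitoring Center" (project number: 院1628-4)
[Classification Number]: TP391.1
Article ID: 2442686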
Article link: http://sikaile.net/kejilunwen/ruanjiangongchenglunwen/2442686.html