A Weibo News Summarization Method Based on Matrix Factorization and Submodular Maximization</H1>
Published: 2018-09-12 19:51
[Abstract]: To address the main challenges of Chinese news summarization for Weibo, this paper proposes an automatic news summarization method that combines matrix factorization with submodular maximization. The method first uses an orthogonal matrix factorization model to obtain latent semantic vectors for the news text, which addresses the information sparsity of short texts and makes the projection directions approximately orthogonal to reduce redundancy. It then evaluates sets of news sentences in terms of relevance, diversity, and other aspects; the evaluation function consists of several monotone submodular functions and one non-submodular function that measures sentence dissimilarity. Finally, a greedy algorithm is designed to generate the final summary. Experiments on the NLPCC2015 dataset show that the method effectively improves the quality of automatic news summaries for Weibo, with ROUGE scores exceeding those of the other baseline systems.
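The greedy selection step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the actual evaluation functions, so everything below is an assumption made for illustration: the facility-location `coverage` term stands in for the monotone submodular part, the pairwise `dissimilarity` term stands in for the non-submodular part, and the names `greedy_summary`, `lam`, `lengths`, and `budget` are hypothetical. The input vectors are assumed to be latent semantic vectors that have already been computed (for example by a matrix factorization step).

```python
from itertools import combinations
from math import sqrt


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def coverage(selected, sim):
    """Facility-location coverage (assumed stand-in): monotone submodular."""
    if not selected:
        return 0.0
    return sum(max(sim[i][j] for j in selected) for i in range(len(sim)))


def dissimilarity(selected, sim):
    """Sum of pairwise (1 - similarity) over the selection; not submodular."""
    return sum(1.0 - sim[i][j] for i, j in combinations(sorted(selected), 2))


def score(selected, sim, lam):
    """Combined objective: coverage plus a weighted diversity term."""
    return coverage(selected, sim) + lam * dissimilarity(selected, sim)


def greedy_summary(vectors, lengths, budget, lam=0.5):
    """Greedily pick sentence indices under a total length budget.

    `lengths` must contain positive numbers. At each step the sentence with
    the largest positive gain per unit length is added; the loop stops when
    no remaining sentence fits the budget or improves the score.
    """
    n = len(vectors)
    sim = [[cosine(vectors[i], vectors[j]) for j in range(n)] for i in range(n)]
    selected, used = [], 0
    while True:
        base = score(selected, sim, lam)
        best, best_ratio = None, 0.0
        for k in range(n):
            if k in selected or used + lengths[k] > budget:
                continue
            gain = score(selected + [k], sim, lam) - base
            ratio = gain / lengths[k]
            if ratio > best_ratio:
                best, best_ratio = k, ratio
        if best is None:
            return selected
        selected.append(best)
        used += lengths[best]


# Sentences 0 and 1 are identical; sentence 2 covers a different topic.
picked = greedy_summary([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]], [1, 1, 1], budget=2)
print(picked)
```

In this toy input the second pick skips the duplicate sentence because it adds no coverage and no diversity, which is the redundancy-avoiding behavior that the relevance and diversity terms are meant to produce. Because the combined objective is not submodular, the usual approximation guarantee for greedy submodular maximization should not be assumed to hold for this sketch.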
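[Affiliation]: School of Computer Science, Wuhan University
[Funding]: Major Program of the National Social Science Foundation of China (11&ZD189); General Program of the National Natural Science Foundation of China (61373108)
[Classification number]: TP391.1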
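[Similar literature]
Related master's theses (top 1)
1 Gao Yaqi; Submodular optimization tracking algorithm based on discriminative feature regression [D]; Dalian University of Technology; 2016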
Article ID: 2240056
Article link: http://sikaile.net/kejilunwen/ruanjiangongchenglunwen/2240056.html