An Automatic Chinese Sentence-Group Segmentation Method Based on Coreference Resolution
Published: 2019-04-28 19:57
[Abstract]: Automatic segmentation of Chinese sentence groups divides a discourse into text segments that cover different topics, and it has important applications in information extraction, summary generation, discourse understanding, and many other fields. Coreference resolution is the process of identifying the antecedents and anaphors in a discourse and linking them; resolving different expressions of the same entity is one of the foundations of natural language understanding. Current work on sentence-group segmentation focuses on finding the boundaries between topics and makes little use of the text's own referential relations for language understanding, or it produces incorrect segmentations because references are ambiguous. To address these problems, an automatic sentence-group segmentation method based on coreference resolution is proposed. The method starts by resolving the references in the discourse: a multi-layer filtering coreference resolution method suited to Chinese yields coreference-chain information, which removes the problems of different nouns denoting the same entity and of pronouns with unclear referents. Combining the coreference-chain information with discourse connectives, a set of verification experiments was designed and carried out in which sentence-group segmentation is evaluated by an evaluation function J based on multiple discriminant analysis (MDA). The experimental results show that the proposed method segments sentence groups effectively, and the average statistical correct-segmentation measure Pμ improves by about 7%.
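The idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the form of the evaluation function J, so the ratio of between-segment to within-segment scatter used here is an assumed stand-in in the spirit of discriminant analysis. The toy document, the `CHAINS` mapping (mention to chain representative), and all function names are hypothetical, and discourse connectives are left out for brevity.

```python
from collections import Counter

def sentence_vectors(sentences, chains):
    """Bag-of-words vector per sentence, with every mention replaced by
    the representative of its coreference chain (hypothetical mapping)."""
    return [Counter(chains.get(tok, tok) for tok in sent) for sent in sentences]

def mean(vectors):
    """Centroid of a list of sparse vectors."""
    total = Counter()
    for v in vectors:
        total.update(v)
    return {term: count / len(vectors) for term, count in total.items()}

def sq_dist(a, b):
    """Squared Euclidean distance between two sparse vectors."""
    return sum((a.get(k, 0.0) - b.get(k, 0.0)) ** 2 for k in set(a) | set(b))

def criterion_j(vectors, boundary):
    """Assumed criterion: between-segment scatter divided by
    within-segment scatter for a split placed before index `boundary`."""
    segments = [vectors[:boundary], vectors[boundary:]]
    overall = mean(vectors)
    within = sum(sq_dist(v, mean(seg)) for seg in segments for v in seg)
    between = sum(len(seg) * sq_dist(mean(seg), overall) for seg in segments)
    return between / (within + 1e-9)  # small constant avoids division by zero

def best_boundary(sentences, chains):
    """Index of the single split that maximizes the assumed criterion."""
    vectors = sentence_vectors(sentences, chains)
    return max(range(1, len(vectors)), key=lambda b: criterion_j(vectors, b))

# Toy, pre-tokenized document: two sentences about a person, two about a company.
DOC = [
    ["Li_Ming", "joined", "lab"],
    ["he", "studies", "parsing"],
    ["Acme", "released", "phone"],
    ["it", "sells", "phone"],
]
# Hypothetical output of a coreference resolver: mention -> chain representative.
CHAINS = {"he": "Li_Ming", "it": "Acme"}

if __name__ == "__main__":
    print("best boundary:", best_boundary(DOC, CHAINS))
    print("J with chains:", criterion_j(sentence_vectors(DOC, CHAINS), 2))
    print("J without chains:", criterion_j(sentence_vectors(DOC, {}), 2))
```

In this toy case, replacing "he" and "it" with their chain representatives makes the sentences inside each topic share more terms, so the split between the second and third sentences scores higher than it does on the raw tokens. A real system would search over several boundaries at once and would also weight discourse connectives, as the abstract describes.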
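[Author affiliations]: School of Computer Science, Hangzhou Dianzi University; School of Software, Zhejiang University
[Funding]: National Natural Science Foundation of China (61202281, 61103101); Youth Fund of the Humanities and Social Sciences Research Program, Ministry of Education (10YJCZH052, 12YJCZH201)
[Classification number]: TP391.1
Article ID: 2467920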
Article link: http://sikaile.net/kejilunwen/ruanjiangongchenglunwen/2467920.html