Sentence Fusion for Complex Questions in Reading Comprehension
Published: 2019-07-22 10:33
[Abstract]: Reading comprehension is currently an active research topic in NLP. A good strategy for answering complex questions in reading comprehension must not only extract the answer sentences but also fuse them to generate the corresponding answer, yet most current research focuses on the former. This paper studies sentence fusion for complex question answering and proposes a sentence fusion method that balances the important information in the sentences, relevance to the question, and sentence fluency. The method proceeds in three steps: first, the parts to be fused are selected based on sentence splitting and word importance; then, information shared between sentences is merged based on word alignment; finally, the generated sentence is optimized by integer linear programming based on dependency relations, a bigram language model, and word importance. Tests on a dataset of reading comprehension questions from past college entrance examinations (Gaokao) show that the method achieves an F-score of 82.62% while better preserving the readability and informativeness of the results.
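The final optimization step can be illustrated with a toy sketch. The abstract does not give the paper's actual integer linear programming formulation, so everything below is an assumption made for illustration: the function name `compress`, the scoring (sum of word importance plus bigram scores between adjacent kept words), the length budget, and the rule that a kept word requires its dependency head to be kept. For brevity the 0/1 program is solved by exhaustive enumeration rather than by a real ILP solver, which is only practical for very short sentences.

```python
from itertools import product


def compress(words, heads, importance, bigram_score, max_len):
    """Choose which words to keep by brute force over 0/1 variables.

    words:        list of tokens
    heads:        heads[i] is the index of the dependency head of word i, -1 for the root
    importance:   importance[i] is a hypothetical importance weight of word i
    bigram_score: dict mapping (w1, w2) to a fluency score; missing pairs score 0
    max_len:      maximum number of words that may be kept
    """
    n = len(words)
    best_score, best_kept = float("-inf"), []
    for keep in product((0, 1), repeat=n):
        size = sum(keep)
        # Length constraint: keep at least one word and at most max_len words.
        if size == 0 or size > max_len:
            continue
        # Dependency constraint: a kept word requires its head to be kept.
        if any(keep[i] and heads[i] >= 0 and not keep[heads[i]] for i in range(n)):
            continue
        kept = [i for i in range(n) if keep[i]]
        # Objective: word importance plus bigram fluency of adjacent kept words.
        score = sum(importance[i] for i in kept)
        score += sum(
            bigram_score.get((words[a], words[b]), 0.0)
            for a, b in zip(kept, kept[1:])
        )
        if score > best_score:
            best_score, best_kept = score, kept
    return [words[i] for i in best_kept], best_score


# Hypothetical example data, not taken from the paper.
words = ["the", "method", "clearly", "improves", "readability"]
heads = [1, 3, 3, -1, 3]
importance = [0.1, 0.8, 0.2, 1.0, 0.9]
bigram_score = {
    ("the", "method"): 0.3,
    ("method", "improves"): 0.5,
    ("improves", "readability"): 0.5,
}
kept_words, score = compress(words, heads, importance, bigram_score, max_len=3)
print(kept_words)  # ['method', 'improves', 'readability']
```

In a realistic setting the same objective and constraints would be passed to an ILP solver, and the weights would come from the word-importance and language-model components the abstract describes.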
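[Affiliation]: School of Computer and Information Technology, Shanxi University; Key Laboratory of Computational Intelligence and Chinese Information Processing of the Ministry of Education, Shanxi University
[Funding]: National High Technology Research and Development Program of China (863 Program) (2015AA015407); National Natural Science Foundation of China, Young Scientists Fund (61100138, 61403238); Natural Science Foundation of Shanxi Province (2011011016-2, 2012021012-1); Shanxi Province Research Project for Returned Overseas Scholars (2013-022); Shanxi Province Higher Education Science and Technology Development Project (20121117); Shanxi Province 2012 Merit-Based Science and Technology Activities Program for Returned Overseas Scholars
[Classification number]: TP391.1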
Article ID: 2517566
Article link: http://sikaile.net/kejilunwen/ruanjiangongchenglunwen/2517566.html