Improved Attentional Seq2seq with Policy Gradient for Text Summarization
Published: 2023-04-10 18:59
With the explosive growth of digital information, text summarization technology has reached every corner of our lives. When we open the Toutiao or Tencent News apps, we often see headlines such as "BAT leads, with a market value of 800 billion…", "This year, China will complete eight major tasks", or "Get off the plane! The plane is about to explode! The plane has made an emergency landing in Russia". A glance at such a headline is enough to grasp the gist of the story without opening the article and reading it line by line. Automatic text summarization also has many other applications, such as news headline generation, scientific document summarization, search result snippet generation, and product review summarization. In an era of exploding online information, expressing the main content of a text in a few short sentences would undoubtedly help alleviate information overload.

The mainstream techniques for text summarization fall into three categories: compressive, extractive, and abstractive. Compressive summarization builds a summary by extracting and simplifying important sentences from the original text; extractive summarization selects existing sentences directly from the original text; abstractive summarization rewrites or reorganizes the original content into the final summary. Compressive and extractive summarization are therefore closely related, while abstractive summarization is closer to human reasoning. Traditional methods have focused mainly on extractive summarization, such as TextRank and PageRank; these methods can use the BM25 or TF-IDF algorithms to compute term frequency or semantic similarity and generate a summary. All…
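As an illustration of the extractive baseline described above, here is a minimal TextRank-style sketch in Python. It is only a sketch under stated assumptions: scikit-learn and networkx are assumed available, and the example sentences and the `num_sentences` parameter are illustrative, not taken from the thesis. Sentences become TF-IDF vectors, pairwise cosine similarity defines a graph, and PageRank scores each sentence:

```python
# Minimal TextRank-style extractive summarization sketch (illustrative only).
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity
import networkx as nx

def textrank_summary(sentences, num_sentences=2):
    """Rank sentences by PageRank over a TF-IDF cosine-similarity graph."""
    # Represent each sentence as a TF-IDF vector.
    tfidf = TfidfVectorizer().fit_transform(sentences)
    # Pairwise cosine similarity forms the edge weights of the graph.
    sim = cosine_similarity(tfidf)
    np.fill_diagonal(sim, 0.0)            # no self-loops
    graph = nx.from_numpy_array(sim)
    scores = nx.pagerank(graph)           # TextRank = PageRank on this graph
    # Keep the top-scoring sentences, restored to their original order.
    top = sorted(scores, key=scores.get, reverse=True)[:num_sentences]
    return " ".join(sentences[i] for i in sorted(top))

docs = [
    "Text summarization condenses a document into a short summary.",
    "Extractive methods select existing sentences from the source text.",
    "Abstractive methods rewrite or reorganize the content instead.",
    "TextRank scores sentences with PageRank over a similarity graph.",
]
print(textrank_summary(docs))
```

Because the graph is built from similarity alone, this kind of extractive baseline needs no training data, which is one reason such methods remained popular before neural seq2seq models.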
Pages: 60
Degree level: Master's
Table of contents:
Acknowledgements
Abstract
Chapter 1 Introduction
1.1 Definition of Text Summarization
1.2 Why We Need Text Summarization
1.3 Mainstream Methods for Text Summarization
1.3.1 Extractive Summarization
1.3.2 Compressive Summarization
1.3.3 Abstractive Summarization
1.4 Related Work
1.5 Challenges
Chapter 2 Basic Techniques
2.1 TextRank
2.2 Sequence to Sequence
2.3 The Evaluation Methods
Chapter 3 Improved Attentional Seq2seq with Policy Gradient for Text Summarization
3.1 Word Embedding
3.2 Add Dropout in LSTM Cells of Encoder
3.3 Use Soft Attention Mechanism in Decoder
3.4 Mini-Batch Gradient Descent
3.5 Use Beam Search Algorithm to Generate the Summary
3.6 Add the Policy Gradient in Attentional Seq2seq Model
3.7 Use Scheduled Sampling in Decoder
Chapter 4 Experiments
4.1 Experimental Environment
4.2 Experimental Design
4.3 Dataset
4.3.1 Data Visualization
4.3.2 Data Preprocessing
4.4 Analysis of the Mini-Batch Gradient Descent Method
4.5 Parameter Setting
4.6 Results
Chapter 5 Conclusion and Future Work
5.1 Conclusion
5.2 Future Work
References
Appendix A
Abstract (in Chinese)
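The title and Section 3.6 of the outline above indicate that a policy gradient is added to the attentional seq2seq model. As a hedged illustration of that idea only, the following minimal REINFORCE-style sketch in PyTorch samples a summary from a toy GRU decoder and weights the sampled sequence's log-probability by a sequence-level reward. The toy decoder, vocabulary size, start-token id, and unigram-overlap reward (a stand-in for ROUGE) are all assumptions for illustration, not the thesis's actual model:

```python
# Minimal REINFORCE-style policy-gradient sketch for a seq2seq decoder
# (toy model and reward; illustrative assumptions, not the thesis's setup).
import torch
import torch.nn as nn

VOCAB, HIDDEN, MAX_LEN = 50, 32, 10

embed = nn.Embedding(VOCAB, HIDDEN)
cell = nn.GRUCell(HIDDEN, HIDDEN)
out = nn.Linear(HIDDEN, VOCAB)

def sample_summary(h):
    """Sample tokens autoregressively, keeping their log-probabilities."""
    token = torch.tensor([0])                  # assumed <start> token id
    log_probs, tokens = [], []
    for _ in range(MAX_LEN):
        h = cell(embed(token), h)              # one decoder step
        dist = torch.distributions.Categorical(logits=out(h))
        token = dist.sample()                  # sample, don't argmax
        log_probs.append(dist.log_prob(token))
        tokens.append(token.item())
    return tokens, torch.stack(log_probs).sum()

def reward_fn(tokens, reference):
    """Toy sequence-level reward: unigram overlap (stand-in for ROUGE)."""
    return len(set(tokens) & set(reference)) / max(len(reference), 1)

h0 = torch.zeros(1, HIDDEN)                    # stand-in for encoder state
tokens, log_prob = sample_summary(h0)
loss = -reward_fn(tokens, [3, 7, 7, 12]) * log_prob   # REINFORCE objective
loss.backward()                                # gradients flow to the decoder
```

Because tokens are sampled rather than chosen greedily, the reward-weighted negative log-probability is an unbiased estimator of the gradient of the expected negative reward (the REINFORCE trick), which lets a non-differentiable metric such as ROUGE drive training.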
Article No.: 3788653
Link: http://sikaile.net/kejilunwen/ruanjiangongchenglunwen/3788653.html