

Deep Neural Network-Based Text Representation and Its Applications

Published: 2018-05-27 07:07

Topics: deep learning; language representation. Source: Harbin Institute of Technology, 2016 doctoral dissertation.


【Abstract】: In recent years, deep neural networks have been explored in depth on tasks such as image classification and speech recognition, achieving outstanding results and demonstrating excellent representation-learning ability. Text representation has long been a core problem in natural language processing (NLP), and the curse of dimensionality and data sparsity that afflict traditional text representations have become a bottleneck for improving performance on many NLP tasks. Learning text representations with deep neural networks has therefore emerged as a new research focus. However, because human language is flexible and variable and semantic information is complex and abstract, applying deep neural network models to text representation learning is especially difficult. This thesis studies deep-neural-network representation learning for text at different granularities and applies it to related tasks.

First, word-vector learning is studied. A word-embedding model based on verb-noun separation is proposed. The model introduces part-of-speech (POS) information into the embedding-learning process while preserving word-order information. Inspired by the verb-noun separation structure of the human brain, the model dynamically selects the top-layer network parameters according to the POS tags produced by a tagging tool, thereby realizing verb-noun separation in the model. Experimental comparisons with related embedding methods show that the model learns high-quality word vectors at relatively low time complexity; the nearest neighbors it produces for common words are more reasonable; and its performance on named-entity recognition and chunking is significantly better than that of the compared embeddings.

Second, sentence representation learning is studied. A sentence representation model based on deep convolutional neural networks is proposed. The model does not depend on a syntactic parse tree; it models a sentence with multiple stacked layers of convolution and max-pooling. Sentence matching is important for many NLP tasks: a good matching model must not only model the internal structure of each sentence reasonably, but also capture matching patterns at different levels between the sentences. Accordingly, two sentence-matching architectures based on deep convolutional neural networks are proposed. Architecture I first represents the two sentences separately with two convolutional networks and then matches them with a multilayer perceptron. Architecture II models the matching of the two sentences directly and then scores the matching representation with a multilayer perceptron. Neither architecture requires any prior knowledge, so both can be applied broadly to matching tasks of different kinds and in different languages. Experiments on three sentence-level matching tasks of different kinds, in three different languages, show that both proposed architectures outperform the compared models by a wide margin. Compared with Architecture I, Architecture II more effectively captures multi-level matching patterns between the two sentences and achieves excellent performance on all three tasks.

Third, phrase-pair selection in statistical machine translation is studied. A context-dependent convolutional neural network phrase-matching model is proposed. When selecting target phrase pairs, the model considers not only the semantic similarity between the source and target phrases but also the sentence context of the source phrase. To train the model effectively, context-dependent bilingual word embeddings are used to initialize it, and a "curriculum-style" learning algorithm is designed to train the model progressively from easy to hard. Experiments show that integrating the model's bilingual phrase-matching scores into a strong statistical machine translation system significantly improves translation performance, raising BLEU by 1.0%.

Fourth, automatic summary generation is studied. A relatively high-quality, large-scale Chinese short-text summarization dataset is constructed, containing more than 2.4 million summaries, together with a high-quality test set. An encoder-decoder architecture based on recurrent neural networks is used to learn summary generation automatically from the large-scale dataset, and two RNN-based summarization models are built. Model I encodes the source text with a recurrent neural network, takes its last state as the representation of the source passage, and decodes a summary from that representation with another recurrent neural network. Model II extends Model I by dynamically combining all encoder states into a context representation and passing the current context representation to the decoder RNN at each generation step. Both are generative models requiring no hand-crafted features. Experiments show that both models represent the source text reasonably well and generate highly informative summaries; in particular, the summaries generated by Model II are of significantly higher quality than those of Model I.

In summary, taking deep neural networks as the means and text representation as the object of study, this thesis investigates representation learning and its applications for text at different granularities in natural language, namely words, sentences, and passages. The proposed methods are applied to sequence labeling, sentence matching, machine translation, and automatic summarization, and achieve good results.
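The sentence encoder described in the abstract builds a fixed-size sentence vector from word vectors via convolution over word windows followed by max-pooling over time. A toy single-layer sketch in plain Python (filter weights, window width, and dimensions are illustrative assumptions, not taken from the thesis):

```python
def conv_maxpool(word_vectors, filters, width=2):
    """One convolution layer over `width`-word windows followed by
    max-pooling over time, yielding one value per filter."""
    n = len(word_vectors)
    sentence_vector = []
    for f in filters:
        acts = []
        for i in range(n - width + 1):
            # Concatenate the window's word vectors and apply the filter + ReLU.
            window = [x for vec in word_vectors[i:i + width] for x in vec]
            acts.append(max(0.0, sum(wi * xi for wi, xi in zip(f, window))))
        sentence_vector.append(max(acts))  # max over time positions
    return sentence_vector

# Toy example: 2-d word vectors, two filters over 2-word windows.
sent = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
vec = conv_maxpool(sent, filters=[[1.0, 0.0, 0.0, 1.0],
                                  [0.5, 0.5, 0.5, 0.5]])
```

In Architecture I, two such encoders would produce one vector per sentence and a multilayer perceptron would score the pair; stacking several convolution and pooling layers, as the thesis describes, lets the encoder capture longer-range structure without a parse tree.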
【學(xué)位授予單位】:哈爾濱工業(yè)大學(xué)
【學(xué)位級別】:博士
【學(xué)位授予年份】:2016
【分類號】:TP391.1;TP183

【參考文獻(xiàn)】

相關(guān)期刊論文 前1條

1 孫茂松;;基于互聯(lián)網(wǎng)自然標(biāo)注資源的自然語言處理[J];中文信息學(xué)報(bào);2011年06期

,

Article ID: 1940961


Link to this article: http://sikaile.net/shoufeilunwen/xxkjbs/1940961.html

