

Research on Improving Neural-Network-Based Language Models

Published: 2019-04-08 09:43
【Abstract】: In natural language processing, words and sentences are the primary units of study. A word is generally the smallest meaningful unit in text processing; a search engine, for example, typically segments a query into words before looking them up. A sentence is a unit of text one level above the word, and if no limit is placed on sentence length, a sentence may also be a paragraph or an entire document. Because words and sentences are the main units of text processing, learning good representations for them is especially important. Methods for learning word representations fall into two classes: word vectors trained with neural network models, and matrix-factorization-style methods such as LSA and LDA. For sentence representations, the main approaches are the TF-IDF vector space model, topic models that use a sentence's distribution over topics as its representation, and neural network language models that learn sentence representations without supervision.

The main work of this thesis covers the following aspects. First, for word vector models, a hierarchical softmax based on inverse-word-frequency Huffman coding is proposed. Neural network language models are usually accelerated with Huffman-coded hierarchical softmax or with negative sampling. This thesis argues that word2vec's scheme, in which frequent words receive short codes and rare words receive long codes, is suboptimal, and therefore proposes Huffman coding based on inverse word frequency. Next, the thesis studies position-based weight vectors and position-based weight factors, and uses both to improve the word2vec word vector model. Finally, it proposes a word vector model that shares the context representation with the target representation: the two usually correspond to separate vectors, but the experiments here show that sharing them yields better results. Second, for paragraph vectors, the thesis proposes the D-CBOW model to learn paragraph vectors and word vectors jointly. Unlike Quoc's model, which concatenates or averages, D-CBOW fuses the word vectors and the paragraph vector using a paragraph weight vector and position weight vectors. Third, using the algorithms above, the thesis designs and implements sentiment classification of paragraphs. Across multiple comparative experiments, the combination of position-based weight vectors and inverse-frequency coding outperforms Quoc's method on the task of judging the sentiment of IMDB movie reviews. The thesis also compares the sigmoid, tanh, and relu activation functions, and finds that relu performs best on the sentiment classification task.
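The inverse-frequency coding idea can be illustrated with a small sketch: building the Huffman tree from inverted counts gives frequent words the longer codes (deeper tree paths), the opposite of standard word2vec. The `huffman_codes` helper and the toy frequencies below are illustrative, not taken from the thesis.

```python
import heapq
import itertools

def huffman_codes(weights):
    """Build Huffman codes for {word: weight}; heavier words get shorter codes."""
    tie = itertools.count()  # tie-breaker so equal weights never compare dicts
    heap = [(w, next(tie), {word: ""}) for word, w in weights.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        w1, _, c1 = heapq.heappop(heap)
        w2, _, c2 = heapq.heappop(heap)
        merged = {word: "0" + code for word, code in c1.items()}
        merged.update({word: "1" + code for word, code in c2.items()})
        heapq.heappush(heap, (w1 + w2, next(tie), merged))
    return heap[0][2]

freqs = {"the": 1000, "model": 50, "softmax": 5}

# standard word2vec behaviour: frequent word -> short code
standard = huffman_codes(freqs)
# inverse-frequency variant: frequent word -> long code
inverse = huffman_codes({w: 1.0 / f for w, f in freqs.items()})
```

In hierarchical softmax the code length is the number of binary decisions taken per prediction, so this change shifts computation from rare words to frequent ones.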
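A minimal sketch of how a position weight vector changes the CBOW input, and of how D-CBOW might fold in a paragraph vector. The array names, the element-wise weighting, and the normalization are assumptions for illustration; the thesis's exact parameterization may differ.

```python
import numpy as np

rng = np.random.default_rng(0)
dim, window = 8, 2                          # embedding size, context radius

ctx = rng.normal(size=(2 * window, dim))    # embeddings of the 2*window context words
pos_w = rng.normal(size=(2 * window, dim))  # learned per-position weight vectors
para = rng.normal(size=dim)                 # paragraph vector
para_w = rng.normal(size=dim)               # paragraph weight vector

# plain CBOW input: unweighted average of context embeddings
h_cbow = ctx.mean(axis=0)
# position-weighted input: element-wise weight per position, then average
h_pos = (pos_w * ctx).mean(axis=0)
# D-CBOW-style fusion: weighted context words plus weighted paragraph vector
h_dcbow = (np.sum(pos_w * ctx, axis=0) + para_w * para) / (2 * window + 1)
```

Each `h_*` vector would then feed the (hierarchical) softmax that predicts the centre word; the weights let the model learn that, say, adjacent positions matter more than distant ones.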
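The activation comparison concerns three standard functions; the definitions below are a quick reference sketch and make no claim about the thesis's network architecture.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))  # saturates toward 0/1 for large |x|

def relu(x):
    return np.maximum(0.0, x)        # identity for x > 0: no positive-side saturation

x = np.array([-4.0, 0.0, 4.0])
s, t, r = sigmoid(x), np.tanh(x), relu(x)
# relu's unsaturated positive side keeps gradients from vanishing, one common
# explanation for results like the ones the abstract reports
```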
【Degree-granting institution】: Beijing University of Posts and Telecommunications
【Degree level】: Master's
【Year conferred】: 2016
【CLC number】: TP391.1; TP183

【Related literature】

Related journal articles (top 10)

1 Huang Xuanjing, Wu Lide, Guo Yikun, Liu Bingwei; Entropy of modern Chinese and probability estimation of sparse events in language models [J]; Acta Electronica Sinica; 2000(08)

2 Huang Shunzhen, Fang Ditang; An efficient decoding algorithm for phonetic-to-character conversion using a language model [J]; Journal of Shenzhen University; 2000(04)


Article ID: 2454481



Link to this article: http://sikaile.net/kejilunwen/ruanjiangongchenglunwen/2454481.html


