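Application of Deep Learning Algorithms to Tibetan Sentiment Analysis
Published: 2018-06-22 05:48
Topics: deep learning + sentiment analysis; Reference: Journal of Frontiers of Computer Science and Technology (《计算机科学与探索》), 2017, No. 07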
[Abstract]: Earlier approaches to Tibetan sentiment analysis ignored important information such as sentence structure and word order, which led to low accuracy. To address this, the recursive autoencoder algorithm from deep learning is introduced into Tibetan sentiment analysis to extract semantic and sentiment information at a deeper level. After Tibetan word segmentation, each word is represented by a word vector, so a Tibetan sentence becomes a matrix of word vectors. An unsupervised recursive autoencoder then vectorizes this matrix, and the resulting optimal sentence vector encoding incorporates important information such as semantics and word order. Using the Tibetan sentence vectors and their corresponding sentiment labels, an output-layer classifier is trained with supervision to predict the sentiment polarity of Tibetan sentences. The experimental section examines how vector dimension, reconstruction error coefficient, and corpus size affect accuracy, and analyzes the relationship between corpus size and model training time, noting that the number of sentences in the dataset can be reduced appropriately when training needs to finish quickly. The experiments show that, under the best parameter combination, the accuracy of the proposed algorithm is about 8.6% higher than that of the semantic space model, which performs relatively well among traditional machine learning algorithms.
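The pipeline summarized in the abstract (word vectors, recursive autoencoder, output-layer classifier) can be illustrated with a minimal forward-pass sketch. Everything concrete here is an assumption, not the authors' implementation: the function names are hypothetical, the word vectors are random stand-ins for real Tibetan word embeddings, the weights are untrained, adjacent pairs are merged greedily by lowest reconstruction error (one common recursive autoencoder formulation), and the reconstruction error coefficient `alpha` is assumed to weight the two loss terms as a convex combination. Training by backpropagation is omitted.

```python
import math
import random


def init_params(dim, n_classes, seed=0):
    """Randomly initialise encoder, decoder and output-layer weights (untrained)."""
    rng = random.Random(seed)
    scale = 1.0 / math.sqrt(dim)

    def matrix(rows, cols):
        return [[rng.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]

    return {
        "W_enc": matrix(dim, 2 * dim), "b_enc": [0.0] * dim,
        "W_dec": matrix(2 * dim, dim), "b_dec": [0.0] * (2 * dim),
        "W_out": matrix(n_classes, dim), "b_out": [0.0] * n_classes,
    }


def affine(W, x, b):
    """Compute W @ x + b for a list-of-lists matrix and list vectors."""
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]


def merge(left, right, p):
    """Encode two child vectors into one parent; return parent and reconstruction error."""
    children = left + right  # concatenation, so the order of the children matters
    parent = [math.tanh(z) for z in affine(p["W_enc"], children, p["b_enc"])]
    recon = affine(p["W_dec"], parent, p["b_dec"])
    error = 0.5 * sum((r - c) ** 2 for r, c in zip(recon, children))
    return parent, error


def encode_sentence(word_vectors, p):
    """Greedily merge adjacent vectors (assumed strategy) until one vector remains."""
    nodes = [list(v) for v in word_vectors]
    total_error = 0.0
    while len(nodes) > 1:
        candidates = [merge(nodes[i], nodes[i + 1], p) for i in range(len(nodes) - 1)]
        best = min(range(len(candidates)), key=lambda i: candidates[i][1])
        parent, error = candidates[best]
        nodes[best:best + 2] = [parent]  # replace the chosen pair with its parent
        total_error += error
    return nodes[0], total_error


def predict_proba(sentence_vector, p):
    """Softmax output layer over sentiment classes."""
    logits = affine(p["W_out"], sentence_vector, p["b_out"])
    top = max(logits)
    exps = [math.exp(z - top) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def combined_loss(rec_error, probs, label, alpha):
    """Assumed objective: alpha * reconstruction error + (1 - alpha) * cross-entropy."""
    return alpha * rec_error + (1.0 - alpha) * -math.log(probs[label])


# Toy usage: a "sentence" of 5 random 4-dimensional word vectors, 2 sentiment classes.
params = init_params(dim=4, n_classes=2)
rng = random.Random(1)
sentence = [[rng.gauss(0.0, 1.0) for _ in range(4)] for _ in range(5)]
vector, rec_error = encode_sentence(sentence, params)
probs = predict_proba(vector, params)
loss = combined_loss(rec_error, probs, label=1, alpha=0.2)
```

Because each merge concatenates the left and right child before encoding, the resulting sentence vector depends on word order, which is the property the abstract emphasizes over order-insensitive representations. The vector dimension (`dim`) and `alpha` correspond to two of the parameters whose effect on accuracy the paper reports studying.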
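[Author affiliations]: Tibetan Information Technology Research Center, Tibet University; School of Information Science and Technology, Southwest Jiaotong University
[Funding]: National Natural Science Foundation of China (61540060); National Soft Science Research Program (2013GXS4D150); Scientific Research Project of the Science and Technology Department of Tibet Autonomous Region
[Classification numbers]: TP18; TP391.1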
Article ID: 2051836
Article link: http://sikaile.net/kejilunwen/zidonghuakongzhilunwen/2051836.html