Research on Key Technologies of Sentiment Analysis for Restaurant Reviews
Published: 2018-06-18 03:02
Topics: recurrent neural network + LSTM; Source: Harbin Institute of Technology, master's thesis, 2017
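【Abstract】: With the development of the Internet and e-commerce, convenient applications such as online shopping and online food ordering have become part of everyday life, and the volume of reviews that people post on these platforms is growing exponentially. This large body of information has considerable research value. Analyzing the reviews to obtain consumers' polarity toward each opinion target can guide consumers' purchasing decisions and help merchants understand consumer needs and improve their products. This thesis studies two sentiment analysis subtasks in the restaurant review domain, opinion target extraction and polarity classification, and applies the best-performing methods in a restaurant review sentiment analysis system. The work is as follows. First, opinion target extraction is studied. An output-dependent bidirectional LSTM model is proposed. Building on the LSTM model, it uses two independent hidden layers to process the text in both directions, so that useful features in the preceding and following context are fully exploited. It also adds self-connections between output layers to exploit the dependencies within the output sequence, and it incorporates part-of-speech, syntactic, sentiment orientation, and named entity recognition features to improve performance. A conditional random field (CRF) method is also implemented, with improvements coming mainly from feature selection and combination. In addition, a BLSTM-CRF extraction method is implemented in which the BLSTM output vectors are fed directly into a CRF model to compute the best output label sequence. Second, polarity classification of opinion targets is studied. A polarity classification model based on bidirectional LSTM is proposed. It uses two BLSTM networks, BLSTML and BLSTMR, to collect the semantic information of the context preceding and following the opinion target, respectively. At each time step, the current word vector is concatenated with the opinion target vector before being fed into the model, so that the model can capture the semantic relationship between each word and the opinion target. The model achieves the best results among models of the same type. The thesis further proposes a boosting-based model fusion method that combines a support vector machine model and a random forest model: after one classifier is trained, the weights of the samples it misclassified are increased and the weights of the samples it classified correctly are decreased, and the final result is obtained by weighting each model's output according to its performance. This method combines the advantages of linear and nonlinear classification models. Finally, a sentiment analysis system for restaurant reviews is designed and implemented. It applies the output-dependent bidirectional LSTM extraction method and the bidirectional LSTM polarity classification method, which improves the accuracy of both opinion target extraction and polarity classification. The system presents the opinion targets and the proportion of each polarity as pie charts.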
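As a rough illustration of the input construction described for the polarity model, the sketch below splits a tokenized review into the context before and after the opinion target and concatenates each word vector with a target vector. The function name `build_context_inputs`, the toy two-dimensional embeddings, averaging the target's word vectors, and including the target words in both contexts are all assumptions made for illustration. The abstract does not specify these details, and the two BLSTM networks (BLSTML and BLSTMR) that would consume these sequences are omitted.

```python
from typing import Dict, List, Sequence, Tuple

Vector = List[float]


def average(vectors: Sequence[Vector]) -> Vector:
    """Element-wise mean of equally sized vectors."""
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]


def build_context_inputs(
    tokens: Sequence[str],
    target: Tuple[int, int],
    embeddings: Dict[str, Vector],
) -> Tuple[List[Vector], List[Vector]]:
    """Build per-time-step inputs for a target spanning tokens[start:end].

    Returns (left_inputs, right_inputs); each element is the word vector
    concatenated with the target vector.
    """
    start, end = target
    word_vectors = [embeddings[token] for token in tokens]
    # Assumption: the target vector is the mean of the target's word vectors.
    target_vector = average(word_vectors[start:end])
    # Assumption: both contexts include the target words themselves.
    left_inputs = [vec + target_vector for vec in word_vectors[:end]]
    right_inputs = [vec + target_vector for vec in word_vectors[start:]]
    return left_inputs, right_inputs


# Toy embeddings, hypothetical values chosen only to make the example readable.
toy_embeddings = {
    "the": [0.0, 0.0],
    "fish": [1.0, 0.0],
    "soup": [0.0, 1.0],
    "was": [0.5, 0.5],
    "great": [1.0, 1.0],
}
tokens = ["the", "fish", "soup", "was", "great"]
left_inputs, right_inputs = build_context_inputs(tokens, (1, 3), toy_embeddings)
```

Here the target is "fish soup", so the left sequence covers "the fish soup" and the right sequence covers "fish soup was great". Every input has twice the embedding dimension because the target vector is appended at each time step.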
[Abstract]: With the development of the Internet and electronic commerce, convenient applications such as online shopping and online food ordering are increasingly part of people's lives, and the reviews posted on these platforms are growing exponentially. The volume of this information is so large that it has considerable research value. Analyzing these reviews yields consumers' polarity toward each opinion target, which can guide consumer behavior and help merchants understand consumer demand and improve their products. This thesis studies two sentiment analysis subtasks in the restaurant review domain, opinion target extraction and polarity classification, and applies the best-performing methods in a restaurant review sentiment analysis system. Specifically, the contents are as follows. First, methods for opinion target extraction are studied. An output-dependent bidirectional LSTM model is proposed. Building on the LSTM model, it uses two independent hidden layers to process the text in both directions, so as to make full use of the features contained in the preceding and following context. Self-connections are added between the output layers to exploit the dependencies within the output sequence, and part-of-speech, syntactic, sentiment orientation, and named entity recognition features are added to improve the model. A conditional random field (CRF) method is also implemented, improved mainly through feature selection and combination. In addition, an opinion target extraction method based on BLSTM-CRF is implemented: the output vectors of the BLSTM are fed directly into the CRF model, which computes the optimal output label sequence. Second, polarity classification of opinion targets is studied. A polarity classification model based on bidirectional LSTM is proposed. Two BLSTM networks, BLSTML and BLSTMR, collect the semantic information of the context preceding and following the opinion target, respectively. At each time step, the current word vector is concatenated with the opinion target vector and fed into the model, so that the model can capture the semantic relationship between each word and the opinion target. The model achieves the best results among models of the same type. In addition, a boosting-based model fusion method is proposed that combines a support vector machine model and a random forest model. After one classifier is trained, the weights of the samples it misclassified are increased and the weights of the samples it classified correctly are decreased. The final result is obtained by weighting each model's output according to its performance. This method combines the advantages of linear and nonlinear classification models. Finally, a sentiment analysis system for restaurant reviews is designed and implemented. The output-dependent bidirectional LSTM extraction method and the bidirectional LSTM polarity classification method are applied in the system, which improves the accuracy of opinion target extraction and polarity classification. The system displays the opinion targets and the proportion of each polarity as pie charts.
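The reweighting and weighted combination described for the model fusion method resemble the classic AdaBoost scheme, which can be sketched as follows. This is a minimal illustration under stated assumptions, not the thesis's implementation: the abstract does not give the exact update rule, so the binary AdaBoost formula is swapped in, the function names `reweight` and `weighted_vote` are hypothetical, and the SVM and random forest predictions are hard-coded toy values rather than outputs of trained models.

```python
import math
from collections import defaultdict
from typing import List, Sequence, Tuple


def reweight(
    weights: Sequence[float], y_true: Sequence[str], y_pred: Sequence[str]
) -> Tuple[List[float], float]:
    """Raise weights of misclassified samples and lower the rest.

    Returns the normalized weights and the model's vote weight (alpha),
    using the binary AdaBoost formula (an assumption, see lead-in).
    """
    total = sum(weights)
    error = sum(w for w, t, p in zip(weights, y_true, y_pred) if t != p) / total
    error = min(max(error, 1e-10), 1.0 - 1e-10)  # avoid log(0) and division by zero
    alpha = 0.5 * math.log((1.0 - error) / error)
    updated = [
        w * math.exp(alpha if t != p else -alpha)
        for w, t, p in zip(weights, y_true, y_pred)
    ]
    norm = sum(updated)
    return [w / norm for w in updated], alpha


def weighted_vote(
    predictions: Sequence[Sequence[str]], alphas: Sequence[float]
) -> List[str]:
    """Combine per-model label predictions, weighting each model by its alpha."""
    fused = []
    for sample_predictions in zip(*predictions):
        scores = defaultdict(float)
        for label, alpha in zip(sample_predictions, alphas):
            scores[label] += alpha
        fused.append(max(scores, key=scores.get))
    return fused


y_true = ["pos", "neg", "pos", "neg"]
initial_weights = [0.25] * 4

# Toy predictions standing in for a trained SVM (one error, at index 2).
svm_pred = ["pos", "neg", "neg", "neg"]
weights_after_svm, alpha_svm = reweight(initial_weights, y_true, svm_pred)

# Toy predictions standing in for a random forest fit on the reweighted data.
rf_pred = ["pos", "pos", "pos", "neg"]
_, alpha_rf = reweight(weights_after_svm, y_true, rf_pred)

fused = weighted_vote([svm_pred, rf_pred], [alpha_svm, alpha_rf])
```

After the first round, the single sample the SVM got wrong carries half of the total weight, so the next model is pushed to classify it correctly. Polarity classification with a neutral class would need a multiclass variant of the vote weight, which this sketch does not cover.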
[Degree-granting institution]: Harbin Institute of Technology
[Degree level]: Master's
[Year degree awarded]: 2017
[Classification number]: TP391.1
Article No.: 2033735
Article link: http://sikaile.net/kejilunwen/ruanjiangongchenglunwen/2033735.html