

Research on Text Sentiment Prediction Based on Active Learning and Transfer Learning

Published: 2018-04-23 04:31

  Topics: active learning + transfer learning; Source: Shanxi University, master's thesis, 2016


[Abstract]: With the widespread adoption of emerging e-commerce platforms, users enjoy their convenience and also post opinions about products on forums. From these reviews, ordinary users can learn how a product performs and make rational purchasing choices, while producers can quickly grasp market trends and make sound marketing decisions. Opinion mining and sentiment analysis of product reviews is therefore an effective way to address these needs. Traditional supervised learning methods are mostly applied to static, single-domain data and require large amounts of labeled data, whereas transfer learning can use existing labeled data to learn a classification model, which addresses the shortage of labeled samples for the target training set. Because data from different domains or different periods differ to some degree, this thesis uses active learning to optimize the classification model and improve text sentiment prediction. The main research content is as follows:

(1) Problem analysis for text sentiment prediction. Based on the experimental corpus, the thesis analyzes the problems in current sentiment analysis research from three aspects: the limitations of traditional text representation, the diversity of language expression in review text, and the shift in what reviews focus on from one period to another. It proposes corresponding solutions.

(2) Cross-domain text sentiment prediction based on active learning and transfer learning. To address the diversity of language expression caused by domain differences in static cross-domain data, the thesis proposes a cross-domain text sentiment prediction method based on active learning and transfer learning. First, a classification model is trained on source-domain data, and target-domain texts with high confidence are selected as the initial seed samples of the classification model. During iteration, expert-labeled low-confidence texts and high-confidence texts are added to the training set together, which speeds up optimization of the target-domain classification model. A feature set is then extracted dynamically from the training set using a sentiment lexicon, extraction rules for evaluation-word collocations, and auxiliary feature words. Finally, the optimized classification model classifies the test set. Compared with Active-Dynamic, Active-Semi-Dynamic improves average precision by 2.75 percentage points, which indicates that adding high-confidence samples enriches the training samples and feature information and helps train the classification model. Compared with Active-BOW, Active-Semi-Dynamic improves average precision by 2.79 percentage points, which indicates that extracting sentiment words by combining a sentiment lexicon with dependency parsing characterizes the sentiment of a text more accurately and improves cross-domain sentiment prediction.

(3) Sentiment prediction for time-series reviews based on active learning and transfer learning. To address the shift in review focus caused by different review times in dynamic time-series data, the thesis proposes a time-series review sentiment prediction method based on active learning and transfer learning. Following the idea of transfer learning, labeled data from the previous period is used to obtain the initial labeled samples for the current period. During active learning, the SMOTE algorithm balances the training set, and the optimized classification model predicts the sentiment orientation of car reviews in the current period. Compared with UN_SMOTE, SMOTE improves average accuracy by 4.32 percentage points, which indicates that inserting new samples into the minority class while the classification model is being optimized balances the training corpus and improves sentiment prediction for car reviews. The method also handles sentiment prediction for mixed-class reviews.
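
The selection loop described in item (2) can be sketched roughly as follows. This is a minimal illustration, not the thesis's implementation: the abstract does not name the classifier or the confidence measure, so a tiny bag-of-words Naive Bayes model and its posterior probability stand in for them, and the sentiment-lexicon feature extraction is omitted. The names `train_nb`, `confidence`, `adapt`, and `oracle` (the expert annotator) are hypothetical, and keeping the source-domain data in the training set is an assumption.

```python
import math
from collections import Counter, defaultdict


def train_nb(docs, labels):
    """Fit a tiny multinomial Naive Bayes model on tokenized texts."""
    word_counts = defaultdict(Counter)
    vocab = set()
    for tokens, label in zip(docs, labels):
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return word_counts, Counter(labels), vocab


def confidence(model, tokens):
    """Return (predicted label, posterior probability of that label)."""
    word_counts, class_counts, vocab = model
    total = sum(class_counts.values())
    log_scores = {}
    for label, n in class_counts.items():
        # Add-one smoothing; the extra +1 reserves mass for unseen words.
        denom = sum(word_counts[label].values()) + len(vocab) + 1
        score = math.log(n / total)
        for t in tokens:
            score += math.log((word_counts[label][t] + 1) / denom)
        log_scores[label] = score
    top = max(log_scores.values())
    weights = {label: math.exp(s - top) for label, s in log_scores.items()}
    best = max(weights, key=weights.get)
    return best, weights[best] / sum(weights.values())


def adapt(source_docs, source_labels, pool, oracle, rounds=1, k=1):
    """Adapt a source-domain model to an unlabeled target-domain pool."""
    docs, labels = list(source_docs), list(source_labels)
    model = train_nb(docs, labels)

    # Seed step: pseudo-label the k target texts the source model trusts most.
    ranked = sorted(pool, key=lambda d: confidence(model, d)[1])
    seeds, pool = ranked[-k:], ranked[:-k]
    for doc in seeds:
        docs.append(doc)
        labels.append(confidence(model, doc)[0])

    for _ in range(rounds):
        if not pool:
            break
        model = train_nb(docs, labels)
        ranked = sorted(pool, key=lambda d: confidence(model, d)[1])
        uncertain, rest = ranked[:k], ranked[k:]
        confident = rest[-k:]
        pool = rest[:len(rest) - len(confident)]
        for doc in uncertain:  # low confidence: ask the expert
            docs.append(doc)
            labels.append(oracle(doc))
        for doc in confident:  # high confidence: trust the model's own label
            docs.append(doc)
            labels.append(confidence(model, doc)[0])

    return train_nb(docs, labels), docs, labels


# Toy data (invented for illustration): book reviews as source, gadget reviews as target.
source_docs = [["great", "story"], ["boring", "plot"],
               ["great", "characters"], ["boring", "ending"]]
source_labels = ["pos", "neg", "pos", "neg"]
target_pool = [["great", "great", "battery"], ["boring", "screen"],
               ["battery", "lasts"], ["screen", "cracked"]]
truth = {("great", "great", "battery"): "pos", ("boring", "screen"): "neg",
         ("battery", "lasts"): "pos", ("screen", "cracked"): "neg"}

model, docs, labels = adapt(source_docs, source_labels, target_pool,
                            oracle=lambda doc: truth[tuple(doc)])
print(labels[len(source_labels):])  # labels added from the target domain
```

Sending only the least confident texts to the expert keeps annotation cost low, while the pseudo-labeled confident texts add target-domain vocabulary at no labeling cost, which is the effect the abstract attributes to Active-Semi-Dynamic.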

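The class-balancing step in item (3) relies on SMOTE, which creates synthetic minority-class samples by interpolating between a minority sample and one of its nearest minority neighbours. The sketch below is a simplified pure-Python variant for two classes, assuming texts have already been turned into numeric feature vectors; the function names `smote` and `balance` and the parameters `k` and `seed` are illustrative and not taken from the thesis.

```python
import random


def smote(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating between minority neighbours."""
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority points by squared Euclidean distance to the base.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


def balance(X, y, minority_label, k=3, seed=0):
    """Oversample the minority class until both classes have equal size (binary case)."""
    minority = [x for x, label in zip(X, y) if label == minority_label]
    n_new = max((len(y) - len(minority)) - len(minority), 0)
    new_points = smote(minority, n_new, k=k, seed=seed)
    return X + new_points, y + [minority_label] * len(new_points)


# Toy feature vectors (invented): three negative reviews against five positive ones.
X = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0],
     [5.0, 5.0], [5.0, 6.0], [6.0, 5.0], [6.0, 6.0], [5.5, 5.5]]
y = ["neg", "neg", "neg", "pos", "pos", "pos", "pos", "pos"]
X_bal, y_bal = balance(X, y, minority_label="neg")
print(y_bal.count("neg"), y_bal.count("pos"))
```

Because every synthetic point lies on a segment between two real minority samples, the new samples stay inside the region the minority class already occupies instead of duplicating existing reviews.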
[Degree-granting institution]: Shanxi University
[Degree level]: Master's
[Year conferred]: 2016
[Classification number]: TP391.1



