

Research on Stance Analysis of Social Media Text Based on Deep Learning

Published: 2018-05-05 00:00

  Topic of this thesis: stance analysis + deep learning. Source: Harbin Institute of Technology, master's thesis, 2017


[Abstract]: With the rapid development of Internet technology and the spread of smart devices, more and more users express their stances and opinions on all kinds of events on social media platforms. Users' stances toward specific targets and events are of great value to the decision-making of businesses and government agencies. Traditional sentiment analysis only classifies the surface sentiment of a text as positive or negative, which makes it hard to uncover a user's stance on a specific event or topic. Research on topic-specific stance analysis of social media text therefore has significant scientific value and broad application prospects. Existing text stance analysis methods fall into two categories: machine learning methods based on feature engineering, and deep learning methods. Feature-engineering methods require constructing and selecting a large number of features, often demand considerable linguistic knowledge, and frequently suffer from feature sparsity caused by insufficient training samples. Deep learning methods tend to treat stance analysis as a plain text classification problem; they rarely incorporate the background knowledge carried by word embeddings of social media text and do not make effective use of the topic-specific information in stance analysis. To address these problems, this thesis uses social-text word embeddings as background knowledge, combines them with the attention mechanism of deep memory networks, and studies deep-learning-based stance analysis of social media text. It first studies a stance analysis method based on a convolutional neural network built on word embeddings pre-trained on large-scale social text. Experiments on the SemEval English stance analysis dataset and the NLPCC Chinese stance analysis dataset show that the method achieves an F-score of 0.6752 on the SemEval dataset and 0.7036 on the NLPCC dataset. Its performance on several sub-topics exceeds that of the best team in the evaluations, and its overall performance ranks second in both the English and the Chinese stance evaluation tasks. The analysis also finds that, compared with initialization schemes such as random assignment, adding word embeddings pre-trained on social media text effectively improves the model's stance analysis performance. To address the lack of effective use of topic-specific information in existing work, the thesis further proposes a stance analysis model that uses the attention mechanism of a deep memory network to assess the association between a specific topic and the components of a text. The model reads the word-embedding representations of the text and the topic, combines the memory and attention mechanisms of the deep memory network, stacks multiple network layers to learn multi-level text representations, and infers the stance the text holds toward the specific topic. Experiments show that the method reaches an average F-score of 0.6821 on the SemEval dataset, 0.39% higher than the best-performing transfer learning model in that evaluation, and an average F-score of 0.7140 on the NLPCC dataset, 0.34% higher than the best model in that evaluation. These results demonstrate the effectiveness of the proposed methods for stance analysis of social media text.
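The abstract describes the second model only in outline: topic and text are read as word embeddings, and several stacked layers attend over the text with respect to the topic. A minimal sketch of that general idea (multi-hop attention over a memory of word vectors) might look like the following. The function names, the dot-product scoring, the additive query update, and the toy vectors are all assumptions made for illustration; the thesis's actual layers, learned parameters, and classifier are not given in the abstract.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def attention_hop(memory, query):
    """One hop: weight each word vector in memory by its relevance to the query."""
    weights = softmax([dot(slot, query) for slot in memory])
    dim = len(query)
    summary = [sum(w * slot[i] for w, slot in zip(weights, memory))
               for i in range(dim)]
    return summary, weights

def multi_hop(memory, topic_vec, hops=3):
    """Stack several hops; each hop refines the topic-conditioned query.

    Assumption: the query is updated by simply adding the attended summary.
    A trained model would normally apply learned transformations here.
    """
    query = list(topic_vec)
    weights = []
    for _ in range(hops):
        summary, weights = attention_hop(memory, query)
        query = [q + s for q, s in zip(query, summary)]
    return query, weights

# Toy 2-dimensional "embeddings" (hypothetical values, not real data).
text_memory = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
topic = [1.0, 0.0]
representation, final_weights = multi_hop(text_memory, topic, hops=3)
```

In a full model the final representation would feed a classifier over the stance labels; here it is left as a plain vector, and `final_weights` shows which words the last hop attended to most strongly.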
[Degree-granting institution]: Harbin Institute of Technology
[Degree level]: Master's
[Year degree granted]: 2017
[Classification number]: TP391.1


Article ID: 1845251


Link to this article: http://sikaile.net/kejilunwen/ruanjiangongchenglunwen/1845251.html

