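Research on Methods for Extracting and Analyzing Opinion Sentences in Chinese-Vietnamese Bilingual News
Published: 2018-04-26 19:26
Topic of this thesis: cross-lingual + opinion analysis; Source: Kunming University of Science and Technology, master's thesis, 2017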
[Abstract]: Vietnam is one of China's important neighbors and has close political, economic, military, and cultural ties with China, so analyzing and tracking public-opinion trends in the news of both countries is important. However, the Internet holds a huge volume of news text, and analyzing and summarizing it manually is time-consuming and laborious. Methods that automatically analyze Chinese-Vietnamese bilingual news text are therefore of considerable significance and value. News text mainly consists of two parts: descriptions of objective facts that have already occurred, and subjective judgments of those facts. Words that represent objective facts are called news elements, such as person names, place names, and organization names. Words that represent subjective judgments are called sentiment words, such as "significance", "influence", and "praise". On this basis, the thesis integrates news-element association and sentiment association into a graph model, studies a graph-based method for extracting opinion sentences, then studies a method for generating difference summaries, and further analyzes the sentiment polarity of opinion sentences. The main work is as follows:
(1) Opinion sentence extraction for Chinese-Vietnamese bilingual news text based on element association and sentiment association. News in any language contains news elements and sentiment words. Building on this property, a method for extracting opinion sentences based on element association and sentiment association is proposed. First, sentences are analyzed for association according to the elements and sentiment information they contain, and a sentence association graph model is constructed.
Then, the edge weights of the graph model are computed from the element association strength and the sentiment association strength, and the graph model is solved to extract the opinion sentences.
(2) Difference opinion summary extraction for Chinese-Vietnamese bilingual news based on graph-model ranking. For the same event, Vietnamese news and Chinese news do not express entirely the same opinions, and differing opinions are more valuable than shared ones. To extract the differing opinions expressed in Chinese and Vietnamese news, and building on the first piece of work, a difference-opinion generation method based on an undirected graph model is studied. The method uses machine translation as a bridge between the two languages. First, the similarity between Chinese and Vietnamese news sentences is computed, and the bilingual sentences are filtered according to this similarity. Then, an undirected graph model is constructed with the filtered sentences as nodes: the weight of an edge between nodes of the same language is their similarity, and the weight of an edge between nodes of different languages is their degree of difference. Finally, node weights are computed from the edge weights with a random walk algorithm, and the highest-weighted sentences are extracted as difference summary sentences.
(3) Sentiment polarity classification of opinion sentences in Chinese-Vietnamese bilingual news based on a convolutional neural network. To further analyze the sentiment polarity of opinion sentences, a cross-lingual sentiment polarity classification method based on a convolutional neural network is studied. Compared with traditional methods, this method does not require building a sentiment lexicon or performing complex feature extraction.
To apply a convolutional neural network to sentiment polarity classification of Chinese-Vietnamese bilingual news sentences, large unlabeled Chinese and Vietnamese corpora are first collected and used to train Chinese word vectors and Vietnamese word vectors separately. Then, Chinese sentences are translated into Vietnamese by machine translation for processing, which addresses the scarcity of Vietnamese corpora. Finally, the Chinese word vectors and the Vietnamese word vectors of a sentence are fed into the convolutional neural network as separate channels for training, yielding a cross-lingual sentiment polarity classifier.
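The graph ranking step of the difference-summary method can be illustrated with a minimal sketch. The abstract does not specify the similarity measure, the definition of difference, or the exact random walk, so everything below is an assumption made for illustration: similarity is Jaccard overlap of tokens (taken to be already mapped into one language by machine translation), difference is 1 minus similarity, and the random walk is a PageRank-style iteration with a damping factor of 0.85. The names `jaccard` and `rank_difference_sentences` are hypothetical, and the similarity-based filtering step is omitted.

```python
from typing import List, Set, Tuple

Sentence = Tuple[str, List[str]]  # (language tag, tokens in a shared language)


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Token-overlap similarity in [0, 1]; an assumed stand-in measure."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def rank_difference_sentences(
    sentences: List[Sentence],
    damping: float = 0.85,   # assumed value, not taken from the thesis
    iterations: int = 50,
) -> List[float]:
    """Score sentences by a PageRank-style random walk on an undirected graph.

    Edge weight is the similarity for same-language pairs and
    (1 - similarity) for cross-language pairs, following the abstract's
    description; the concrete formulas are assumptions.
    """
    n = len(sentences)
    if n == 0:
        return []
    token_sets = [set(tokens) for _, tokens in sentences]

    # Build the symmetric weight matrix of the undirected graph.
    weights = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            sim = jaccard(token_sets[i], token_sets[j])
            same_language = sentences[i][0] == sentences[j][0]
            w = sim if same_language else 1.0 - sim
            weights[i][j] = weights[j][i] = w

    # Total edge weight at each node, used to normalize transitions.
    degree = [sum(row) for row in weights]

    scores = [1.0 / n] * n
    for _ in range(iterations):
        updated = []
        for i in range(n):
            incoming = sum(
                scores[j] * weights[j][i] / degree[j]
                for j in range(n)
                if degree[j] > 0
            )
            updated.append((1.0 - damping) / n + damping * incoming)
        scores = updated
    return scores


if __name__ == "__main__":
    example = [
        ("zh", ["trade", "talks", "welcome"]),
        ("zh", ["trade", "talks", "progress"]),
        ("vi", ["trade", "talks", "welcome"]),
        ("vi", ["border", "dispute", "concern"]),
    ]
    for (lang, tokens), score in zip(example, rank_difference_sentences(example)):
        print(lang, " ".join(tokens), round(score, 4))
```

In this toy input, the Vietnamese sentence that shares no tokens with either Chinese sentence receives heavy cross-language edges and therefore ranks above the Vietnamese sentence that merely repeats a Chinese one, which is the behavior the difference-weighted edges are meant to produce.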
[Degree-granting institution]: Kunming University of Science and Technology
[Degree level]: Master's
[Year degree granted]: 2017
[Classification number]: TP391.1
Article ID: 1807356
Article link: http://sikaile.net/kejilunwen/ruanjiangongchenglunwen/1807356.html