
Research on Methods for Chinese-Lao Bilingual Word Alignment and Dependency Treebank Construction

Published: 2018-05-01 03:18

  Topic: Chinese + Lao; Source: Kunming University of Science and Technology, 2017 master's thesis


[Abstract]: With the rapid development of technology and the economy and the steady deepening of cross-language communication, global interconnection has become an irresistible trend. Given the enormous volume of multilingual information on the Internet, which changes dynamically in real time, relying on human translation alone to process this data is unrealistic. The only viable solution is to make full use of machine translation to provide automatic translation services, and this has set off a wave of research in machine translation. Mutual communication and understanding through language is the foundation of economic and cultural exchange between countries, and China and Laos are no exception. In-depth study of the Chinese-Lao language pair can also lay the groundwork for building Chinese-Lao bilingual corpus resources.

In natural language processing, bilingual word alignment is a very important foundational task. It treats the relationship between mutually translated words in a bilingual parallel corpus as a link, and these alignment links provide valuable reference knowledge for machine translation. Bilingual word alignment provides basic support for many applications in natural language research, such as building dependency treebanks, compiling bilingual dictionaries, machine translation, and bilingual information extraction. In-depth study of automatic Chinese-Lao word alignment methods, together with the construction of a bilingual parallel corpus of a certain scale on that basis, plays a pivotal role in Chinese-Lao bilingual information processing.

By analyzing the similarities and differences in grammatical structure between Chinese and Lao, this thesis studies methods for automatic Chinese-Lao word alignment and a method for building a Lao dependency treebank from a Chinese-Lao word-aligned corpus. The distinctive contributions are as follows:

(1) The grammatical differences between Chinese and Lao are analyzed. The analysis finds that modifiers and head words appear in different orders in Chinese and Lao sentence structure. Starting from this property, a set of bilingual features is selected and used to constrain Chinese-Lao word alignment.

(2) Syntactic features are integrated into a statistical word alignment algorithm to constrain automatic Chinese-Lao word alignment. Chinese and Lao differ greatly in grammar and syntactic structure, which makes automatic Chinese-Lao word alignment difficult, so this thesis proposes an automatic Chinese-Lao word alignment method that fuses multiple syntactic features. It first analyzes and selects a number of syntactic features of the Chinese-Lao pair, then integrates these features into a model built on a log-linear framework and trained with the minimum error rate algorithm. The experiments use IBM Model 3 as the baseline for comparison, and the results show that the proposed method achieves good alignment quality and clearly outperforms the baseline.

(3) A method is proposed for building a Lao dependency treebank from a Chinese-Lao word-aligned corpus. The preliminary literature survey found that relatively little research on Lao has been done in China or abroad, that no large-scale Lao dependency treebank has been built, and that building one by hand is very difficult. Given a Chinese-Lao word-aligned parallel corpus, the method first runs dependency parsing on the Chinese sentences in the corpus. Then, taking the linguistic characteristics of Lao into account and working from dependency syntax rules, it maps the dependency relations of each Chinese sentence onto the corresponding Lao sentence through the word alignment links, finally producing the dependency tree of the Lao sentence. In the experiments, the method is compared with a traditional machine learning approach. The results show that it clearly improves accuracy and simplifies the manual annotation and collection work involved in building a Lao dependency treebank, saving considerable manpower and resources. It allows a Lao dependency treebank of reasonably good quality to be built quickly when Lao corpora are scarce.
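The projection step described in contribution (3) can be illustrated with a minimal sketch. The abstract does not give the thesis's actual mapping rules, so the code below assumes the simplest form of direct projection: each Chinese token is linked to at most one Lao token, and an arc is copied only when both its dependent and its head are aligned. The function name `project_dependencies`, the labels `ATT` and `HED`, and the toy two-word example are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Tuple

ROOT = 0  # index of the artificial root node (tokens are 1-based)


def project_dependencies(
    src_heads: Dict[int, Tuple[int, str]],
    alignment: List[Tuple[int, int]],
    tgt_len: int,
) -> Dict[int, Tuple[int, str]]:
    """Copy source-side dependency arcs onto the target sentence.

    src_heads: source token index -> (head index, relation label).
    alignment: (source index, target index) word alignment links.
    tgt_len:   number of tokens in the target sentence.
    Returns target token index -> (head index, relation label) for
    every arc that could be projected. Unaligned tokens are skipped.
    """
    # Assumption: keep only the first link per source and target token,
    # which reduces the alignment to a one-to-one mapping.
    src_to_tgt: Dict[int, int] = {}
    used_tgt = set()
    for s, t in alignment:
        if s not in src_to_tgt and t not in used_tgt and 1 <= t <= tgt_len:
            src_to_tgt[s] = t
            used_tgt.add(t)

    projected: Dict[int, Tuple[int, str]] = {}
    for s_dep, (s_head, label) in src_heads.items():
        t_dep = src_to_tgt.get(s_dep)
        if t_dep is None:
            continue  # dependent has no aligned target token
        if s_head == ROOT:
            projected[t_dep] = (ROOT, label)
        elif s_head in src_to_tgt:
            projected[t_dep] = (src_to_tgt[s_head], label)
    return projected


# Toy example: source order is modifier-head ("red flower"), while the
# target order is head-modifier ("flower red"), so the links cross.
chinese_heads = {1: (2, "ATT"), 2: (ROOT, "HED")}
links = [(1, 2), (2, 1)]
lao_heads = project_dependencies(chinese_heads, links, tgt_len=2)
print(lao_heads)
```

In the toy example the modifier arc survives the reordering because it is carried by the alignment links rather than by word position, which is why the modifier and head order difference noted in contribution (1) does not by itself break the projection. A practical system would also need rules for unaligned and many-to-one tokens, which this sketch leaves out.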

[Degree-granting institution]: Kunming University of Science and Technology
[Degree level]: Master's
[Year degree granted]: 2017
[Classification number]: TP391.1


Article ID: 1827496


Article link: http://sikaile.net/jingjilunwen/jiliangjingjilunwen/1827496.html

