Research and Implementation of Methods for Acquiring Technical Term Translations from Web Pages and Online Chinese-English Dictionaries
[Abstract]: Terms are the core concepts of a specific domain and carry rich domain information. Because technical terms are constantly being coined and revised, translating them has become one of the difficult problems in machine translation and information retrieval, and both statistical and rule-based approaches struggle with terminology. This thesis treats the Web as a corpus and studies how to obtain English translations of Chinese technical terms through Web mining and knowledge acquisition. The work helps solve the term translation problem and also benefits cross-language information retrieval, cross-language knowledge acquisition, and related tasks. The main contents are as follows:
1) The main problems, difficulties, and current state of research on Web-based translation acquisition are analyzed, and the shortcomings of previous studies are discussed. The basic workflow and approach for obtaining term translations from web pages are then presented.
2) Using Web-based information extraction and the principle of semantic prediction, query items are constructed by combining a term with its partial translation, and web pages related to the term's translation are retrieved from a search engine. This addresses the difficulty of obtaining bilingual co-occurrence data for term translations, and the resulting high-quality data lays a good foundation for the subsequent extraction step.
3) Using knowledge acquisition technology, term translations are extracted from web pages by combining semi-structured text analysis with statistical and rule-based information extraction. Extraction methods based on templates, dictionary patterns, and position patterns are proposed, which improve the accuracy of the results while maintaining recall.
4) To remove noise from the translation results, three verification methods are proposed: end analogy alignment, bilingual alignment, and word formation. Candidate translations are verified using a manually compiled bilingual term alignment corpus. Verification ensures the accuracy of the term translations and makes the system more practical and reliable.
5) Online Chinese-English dictionaries are used to assist in translating commonly used terms, which ensures translation accuracy and improves the efficiency of the translation acquisition system.
Experiments on terms from different fields show that the proposed method and system for acquiring term translations from web pages and online dictionaries achieve good accuracy, improve significantly on previous methods, take less time, and are highly practical.
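Steps 2) and 3) can be illustrated with a minimal sketch. It is not the thesis's implementation. The function names `build_query` and `extract_candidates`, the query format, and the single parenthesis-based position pattern are all assumptions made for illustration. The snippets are hard-coded stand-ins for search engine results, and no real search API is called.

```python
import re
from collections import Counter
from typing import Iterable


def build_query(term: str, partial_translation: str) -> str:
    """Hypothetical query format: the exact Chinese term plus a known
    partial English translation, to bias results toward bilingual pages."""
    return f'"{term}" {partial_translation}'


def extract_candidates(term: str, snippets: Iterable[str]) -> Counter:
    """Count English strings that appear in parentheses right after the term.

    This is one assumed position pattern, e.g. "term(english translation)",
    accepting both half-width and full-width brackets.
    """
    pattern = re.compile(
        re.escape(term) + r"\s*[(（]\s*([A-Za-z][A-Za-z \-]*[A-Za-z])\s*[)）]"
    )
    counts = Counter()
    for snippet in snippets:
        for match in pattern.finditer(snippet):
            # Normalize case so spelling variants are counted together.
            counts[match.group(1).lower()] += 1
    return counts


if __name__ == "__main__":
    # Stand-in snippets; a real system would fetch these from a search engine.
    snippets = [
        "機器翻譯(machine translation)是自然語言處理的分支",
        "關於機器翻譯（Machine Translation）的研究",
        "機器翻譯 (MT) 系統",
    ]
    print(build_query("機器翻譯", "translation"))
    print(extract_candidates("機器翻譯", snippets).most_common())
```

Ranking candidates by frequency is only a crude stand-in for the verification stage in step 4), which the abstract describes as alignment-based and word-formation-based checks.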
[Degree-granting institution]: Jiangsu University of Science and Technology
[Degree level]: Master's
[Year degree awarded]: 2012
[Classification number]: TP391.1