Design and Implementation of a Multilingual News Translation and Information Extraction System
Published: 2018-04-16 15:39
Topics: online news + web crawler; Source: Harbin Institute of Technology, 2017 master's thesis
[Abstract]: As economic globalization deepens, exchanges and cooperation between China and other countries are becoming increasingly frequent. In response, China has proposed and actively promoted the "Belt and Road" strategy to strengthen communication and cooperation with countries in the region. Driven by the Belt and Road policy, governments and enterprises of the participating countries have actively pursued exchanges and cooperation, deepening mutual understanding and promoting common development. At the same time, exchanges among the peoples of these countries have also become more frequent. Online news, as a new medium for recording and disseminating information, is timely, authentic, and broad in coverage, so more and more people use it as a window onto events abroad. However, language differences have become the biggest obstacle to communication among the countries and peoples of the region. In this critical period of promoting the Belt and Road strategy, demand for translating news reports, government documents, and other texts written in these countries' languages has grown sharply. Manual translation cannot meet the demand for large-scale text translation, whereas neural machine translation (NMT) is developing rapidly and has achieved very good results for languages such as English, German, Russian, and Chinese. Developing translation for online news is therefore particularly important. Meanwhile, the volume of news on the Internet is overwhelming, and readers urgently want a tool that helps them learn the most, and the most useful, news in the shortest time. To let users quickly grasp news reports from different countries, read them more easily, and judge their readability, crawler-based news acquisition and information extraction from the translated news content are also critical. To this end, this thesis presents the design and implementation of a multilingual news information extraction and translation system.
[Degree-granting institution]: Harbin Institute of Technology
[Degree level]: Master's
[Year degree conferred]: 2017
[Classification number]: TP391.2
Article ID: 1759593
Article link: http://sikaile.net/kejilunwen/ruanjiangongchenglunwen/1759593.html