Design and Implementation of a Multilingual News Translation and Information Extraction System
Topics: online news + web crawler; Source: Harbin Institute of Technology, master's thesis, 2017
[Abstract]: As economic globalization deepens, exchanges and cooperation between China and other countries are becoming increasingly frequent. In response, China has proposed and actively promoted the "Belt and Road" strategy to strengthen communication and cooperation with countries in the region. Driven by this initiative, governments and enterprises in these countries are actively engaging in exchanges and cooperation, deepening mutual understanding and promoting common development. At the same time, exchanges between the peoples of these countries are also growing more frequent. Online news, as a new medium for recording and disseminating information, is timely, authentic, and wide in coverage, so more and more people use it as a window onto what is happening abroad. However, the language barrier has become the biggest obstacle to communication among the countries and peoples of the region. In this critical period of promoting the "Belt and Road" strategy, demand for translating news reports, government documents, and other texts in these countries' languages has risen sharply. Manual translation cannot meet the existing demand for large-scale text translation, whereas neural machine translation (NMT) is developing rapidly and has achieved very good results for languages such as English, German, Russian, and Chinese. Developing a translation system for online news is therefore particularly important. Meanwhile, the volume of news on the Internet is overwhelming, and people urgently want a tool that helps them take in the most, and the most useful, news in the shortest time. To let users quickly understand news reports from different countries, read them conveniently, and judge whether a given article is worth reading, it is also essential to acquire news with a web crawler and to extract information from the news content once it has been translated. To this end, this thesis presents the design and implementation of a multilingual news information extraction and translation system.
[Degree-granting institution]: Harbin Institute of Technology
[Degree level]: Master's
[Year of degree conferral]: 2017
[Classification number]: TP391.2