Unsupervised Translation Disambiguation Based on Mining Bilingual Word Association from the Web
Published: 2019-05-12 17:38
【Abstract】: To alleviate the difficulty of acquiring disambiguation knowledge and the data sparseness problem in translation disambiguation, this paper proposes an unsupervised translation disambiguation method that mines bilingual word relatedness from the Web. The method extends the indirect association between bilingual words in a corpus to the Web and proposes a Web-based indirect association model for bilingual words. On this basis, it further proposes a disambiguation method based on Web-derived bilingual word relatedness: different queries are constructed, the page counts of the pages returned by a search engine are extracted, and pointwise mutual information is then used to compute the relatedness between words, which drives the disambiguation decision. The best performance of the method (P_mar = 0.464) exceeds that of TorMd, the best comparable unsupervised system on Task #5 of the international semantic evaluation SemEval-2007.
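The abstract names the ingredients (search-engine page counts and pointwise mutual information) but not the query construction or the exact scoring rule. The following is therefore only a minimal sketch of how page counts could feed a PMI-based choice between candidate translations. The function names, the `page_count` callable, the total page number, and all counts are hypothetical and are not taken from the paper; summing PMI over context words is an assumed decision rule.

```python
import math


def pmi_from_counts(count_x, count_y, count_xy, total_pages):
    """Estimate PMI(x, y) = log2(P(x, y) / (P(x) * P(y))) from page counts.

    Each probability is approximated as count / total_pages, which
    simplifies to log2(count_xy * total_pages / (count_x * count_y)).
    """
    if min(count_x, count_y, count_xy) <= 0:
        return float("-inf")  # no evidence that the two terms co-occur
    return math.log2(count_xy * total_pages / (count_x * count_y))


def choose_translation(context_words, candidates, page_count, total_pages):
    """Return the candidate with the highest summed PMI against the context.

    `page_count` is a hypothetical callable: page_count(term) or
    page_count(term_a, term_b) returns a hit count for the query.
    """
    def score(candidate):
        total = 0.0
        for word in context_words:
            pmi = pmi_from_counts(
                page_count(word),
                page_count(candidate),
                page_count(word, candidate),
                total_pages,
            )
            if not math.isinf(pmi):  # skip pairs with no co-occurrence
                total += pmi
        return total

    return max(candidates, key=score)


# Invented toy counts standing in for real search-engine results.
TOY_COUNTS = {
    frozenset(["吃"]): 20000,
    frozenset(["noodle"]): 8000,
    frozenset(["surface"]): 2000,
    frozenset(["吃", "noodle"]): 1600,
    frozenset(["吃", "surface"]): 20,
}


def toy_page_count(*terms):
    """Look up the invented hit count for a one- or two-term query."""
    return TOY_COUNTS.get(frozenset(terms), 0)


best = choose_translation(["吃"], ["noodle", "surface"], toy_page_count, 1_000_000)
print(best)  # noodle
```

In this toy setting the Chinese context word "吃" (eat) co-occurs with "noodle" far more often than chance would predict and with "surface" less often, so "noodle" is selected. Real page counts are noisy and the total number of indexed pages is unknown, so any practical use would need smoothing and a chosen estimate for `total_pages`.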
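【Affiliations】: Institute of Computational Linguistics, School of Information Science and Technology, Peking University; School of Computer Science and Technology, Harbin Institute of Technology
【Funding】: Supported by the 973 Program (2004CB318102), the National Natural Science Foundation of China (60903063), and the China Postdoctoral Science Foundation (20090450007)
【Classification No.】: TP391.1
Article ID: 2475567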
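【Similar Literature】
Related journal articles (top 10)
1 Liu Pengyuan; Zhao Tiejun; Unsupervised Translation Disambiguation Based on Mining Bilingual Word Association from the Web [J]; Chinese High Technology Letters (高技術通訊); 2010, No. 04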
Related doctoral dissertations (top 1)
1 Liu Pengyuan; Research on Unsupervised Translation Disambiguation Methods Based on Automatic Knowledge Acquisition [D]; Harbin Institute of Technology; 2008
Article link: http://sikaile.net/kejilunwen/sousuoyinqinglunwen/2475567.html