Chinese Named Entity Recognition Based on Position-Sensitive Embedding
Published: 2018-09-10 06:54
[Abstract]: In Chinese named entity recognition based on conditional random fields (CRF), the features learned by existing representation learning methods suffer from semantic representation bias, which introduces noise into recognition. To address this problem, a Chinese named entity recognition method based on position-sensitive Embedding is proposed. The method incorporates context position information into an existing Embedding model, applies multi-scale clustering to extract Embedding features at different granularities, and uses a CRF to recognize Chinese named entities. Experiments show that the learned features alleviate the semantic representation bias and further improve the performance of the existing system: compared with the traditional method, the F-score increases by 2.85%.
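The abstract names two steps without giving their details: making the Embedding contexts position-aware, and clustering the learned vectors at several scales to obtain discrete features for the CRF. The following is a minimal sketch of what such steps could look like, not the paper's implementation. It assumes that position information is encoded by tagging each context character with its relative offset, and that plain k-means is the clustering method; the abstract specifies neither. The function names, the feature format `emb_k{k}={id}`, and the toy vectors are all hypothetical.

```python
import random


def position_tagged_contexts(tokens, window=2):
    """Return (target, context@offset) pairs.

    Assumption: position sensitivity is modeled by treating the same
    context character at different relative offsets as distinct contexts.
    """
    pairs = []
    for i, target in enumerate(tokens):
        for off in range(-window, window + 1):
            j = i + off
            if off != 0 and 0 <= j < len(tokens):
                pairs.append((target, f"{tokens[j]}@{off:+d}"))
    return pairs


def kmeans(vectors, k, iters=20, seed=0):
    """Plain Lloyd's k-means; returns one cluster id per input vector."""
    rng = random.Random(seed)
    centers = [list(v) for v in rng.sample(vectors, k)]
    assign = [0] * len(vectors)
    for _ in range(iters):
        # Assignment step: nearest center by squared Euclidean distance.
        assign = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(v, centers[c])))
            for v in vectors
        ]
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [v for v, a in zip(vectors, assign) if a == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return assign


def multiscale_features(embeddings, scales=(2, 3)):
    """Map each token to one discrete cluster-id feature per scale."""
    tokens = list(embeddings)
    vectors = [embeddings[t] for t in tokens]
    feats = {t: [] for t in tokens}
    for k in scales:
        for t, cid in zip(tokens, kmeans(vectors, k)):
            feats[t].append(f"emb_k{k}={cid}")
    return feats


# Hypothetical toy vectors standing in for learned character Embeddings.
toy_embeddings = {
    "A1": [0.0, 0.0],
    "A2": [0.0, 0.1],
    "B1": [10.0, 10.0],
    "B2": [10.0, 10.1],
}
pairs = position_tagged_contexts(["武", "漢", "大", "學"], window=1)
features = multiscale_features(toy_embeddings, scales=(2, 3))
```

Each token ends up with one feature string per scale, so a small `k` gives coarse classes and a larger `k` gives finer ones. These strings could be added to the per-character feature set of a CRF tagger alongside the usual character and boundary features.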
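[Author affiliation]: School of Computer Science, Wuhan University
[Funding]: National Natural Science Foundation of China, Key Program (61133012); National Natural Science Foundation of China, General Program (61373108)
[Classification number]: TP391.1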
Article ID: 2233712
Article link: http://sikaile.net/kejilunwen/ruanjiangongchenglunwen/2233712.html