Event Sentence Extraction in the Financial Domain
Published: 2019-03-31 09:02
[Abstract]: Event sentence extraction is a core step in event extraction, and in the financial domain, company name recognition is both the focus and the main difficulty of event sentence extraction. To extract event sentences from financial text, the method first identifies company names using Internet search together with a library of listed company names: if an N-gram is a company name, the results of an Internet search for it contain many occurrences of words such as "company" (公司) and "group" (集團), and the N-gram closely matches some of the names in the company name library. Second, it combines four features (sentence position, presence of a company name, presence of a domain verb, and similarity to the title) into a weighted scoring expression. Finally, it selects the financial event sentence from the sentence set. Experiments on a data set show that the proposed method is feasible: company name recognition reaches a precision of 82.28% and a recall of 68.93%, and event sentence extraction reaches a precision of 66.83%.
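The two steps described in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the abstract gives neither the weight values nor the exact scoring formulas, so the weights, the equal 0.5/0.5 mix, the cue words, the use of `difflib.SequenceMatcher` as the similarity measure, and all function names here are hypothetical. Search-result snippets are passed in as plain strings rather than fetched from a real search engine.

```python
from difflib import SequenceMatcher

# English stand-ins for the cue words named in the abstract (公司, 集團).
CUE_WORDS = ("company", "group")
# Hypothetical feature weights; the abstract does not state the real ones.
WEIGHTS = {"position": 0.2, "company": 0.3, "verb": 0.3, "title": 0.2}


def company_name_score(candidate, snippets, name_library, cues=CUE_WORDS):
    """Score an N-gram as a company-name candidate (assumed formula).

    Mixes the share of search snippets containing a cue word with the
    best string-match ratio against the known company names.
    """
    if snippets:
        cue_share = sum(any(c in s for c in cues) for s in snippets) / len(snippets)
    else:
        cue_share = 0.0
    best_match = max(
        (SequenceMatcher(None, candidate, name).ratio() for name in name_library),
        default=0.0,
    )
    return 0.5 * cue_share + 0.5 * best_match


def score_sentence(sentence, index, total, title, companies, verbs, weights=WEIGHTS):
    """Weighted sum of the four features listed in the abstract."""
    features = {
        # Earlier sentences score higher; the first sentence scores 1.0.
        "position": 1.0 - index / total,
        "company": 1.0 if any(c in sentence for c in companies) else 0.0,
        "verb": 1.0 if any(v in sentence for v in verbs) else 0.0,
        "title": SequenceMatcher(None, sentence, title).ratio(),
    }
    return sum(weights[name] * features[name] for name in weights)


def pick_event_sentence(sentences, title, companies, verbs):
    """Return the highest-scoring sentence, or None for an empty input."""
    if not sentences:
        return None
    total = len(sentences)
    best = max(
        range(total),
        key=lambda i: score_sentence(sentences[i], i, total, title, companies, verbs),
    )
    return sentences[best]


if __name__ == "__main__":
    doc = [
        "Acme Corp acquires Beta Ltd for $2 billion.",
        "The weather was mild on Tuesday.",
        "Analysts expect further deals.",
    ]
    print(pick_event_sentence(doc, "Acme Corp acquires Beta Ltd",
                              {"Acme Corp", "Beta Ltd"}, {"acquires", "merges"}))
```

In a sketch like this, a candidate would be accepted as a company name when its score exceeds a tuned threshold, and the recognized names would then feed the `companies` argument of the sentence scorer.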
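[Author affiliations]: Beijing Key Laboratory of Internet Culture and Digital Dissemination Research, Beijing Information Science and Technology University; Beijing Advanced Innovation Center for Imaging Technology, Capital Normal University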
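[Funding]: 2014 National Social Science Fund of China commissioned project (14@ZH036); Beijing Advanced Innovation Center for Imaging Technology funded project (BAICIT-2016003); National Natural Science Foundation of China funded projects (61271304, 61671070)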
[Classification number]: TP391.1
Article ID: 2450760
Article link: http://sikaile.net/kejilunwen/ruanjiangongchenglunwen/2450760.html