Research on Semantic Error Detection Methods for Chinese Text

Published: 2018-03-20 01:35

  Topic: semantic errors  Entry point: knowledge base  Source: Chinese Journal of Computers (《計算機學報》), 2017, Issue 04  Paper type: journal article


[Abstract]: Detecting semantic errors has long been a hard problem in automatic error checking of Chinese text. To address semantic errors in Chinese text, this paper proposes a semantic error detection model based on a semantic collocation knowledge base and evidence theory. It discusses the construction of a three-layer semantic collocation knowledge base and a semantic error detection algorithm built on that knowledge base and evidence theory. The knowledge base is constructed in two main steps: (1) a set of word collocation rules is built from the content-word collocation frames in the Modern Chinese Content Word Collocation Dictionary (《現代漢語實詞搭配詞典》), word collocations are extracted from the training corpus, and the candidates are filtered by mutual information and co-occurrence frequency to form the word collocation knowledge base; (2) the sememe information of words is extracted from HowNet to generate word-sememe and sememe-sememe collocation knowledge bases, which are then filtered a second time by aggregation degree. On top of the three-layer knowledge base, the algorithm first searches the knowledge base top-down to identify semantic collocations that may be wrong. It then takes the mutual information (MI) and aggregation degree (PD) of each semantic collocation as evidence, establishes the belief assignment functions for the evidence statistically, and performs uncertainty reasoning by combining evidence conflict handling with a weighted-assignment D-S rule. This yields the semantic collocation association strength of the words, which is used to decide whether a semantic error is present. Experimental results show that the F-score of the proposed error detection model and algorithm is 14.02% higher than the best value reported in other literature.
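The two quantitative ingredients named in the abstract, mutual information for filtering collocations and D-S combination of evidence, can be illustrated with a minimal Python sketch. It makes several assumptions not taken from the paper: `pmi` uses the standard pointwise mutual information formula, `dempster_combine` implements the classical Dempster rule rather than the paper's weighted-assignment variant with conflict handling (which the abstract does not specify), and the frame {OK, ERR} and all mass values are invented for illustration.

```python
import math
from itertools import product


def pmi(pair_count, left_count, right_count, total):
    """Pointwise mutual information (in bits) of a word pair from raw counts."""
    p_xy = pair_count / total
    p_x = left_count / total
    p_y = right_count / total
    return math.log2(p_xy / (p_x * p_y))


def dempster_combine(m1, m2):
    """Combine two mass functions with the classical Dempster rule.

    Each mass function maps a frozenset (focal element) to its mass.
    Mass that falls on empty intersections is treated as conflict and
    removed by renormalisation.
    """
    combined = {}
    conflict = 0.0
    for (a, mass_a), (b, mass_b) in product(m1.items(), m2.items()):
        intersection = a & b
        if intersection:
            combined[intersection] = combined.get(intersection, 0.0) + mass_a * mass_b
        else:
            conflict += mass_a * mass_b
    if conflict >= 1.0:
        raise ValueError("evidence is totally conflicting; cannot normalise")
    return {focal: mass / (1.0 - conflict) for focal, mass in combined.items()}


# Hypothetical frame of discernment: the collocation is fine (OK) or wrong (ERR).
OK = frozenset({"OK"})
ERR = frozenset({"ERR"})
THETA = OK | ERR  # total ignorance

# Invented masses standing in for MI-based and PD-based evidence.
m_mi = {OK: 0.6, ERR: 0.1, THETA: 0.3}
m_pd = {OK: 0.5, ERR: 0.2, THETA: 0.3}

fused = dempster_combine(m_mi, m_pd)
print(fused)
```

In this toy setting a collocation would be flagged when the fused mass on `ERR` exceeds some threshold. How the paper derives the masses from MI and PD statistics, and how it weights conflicting evidence, is not described in the abstract.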
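[Author affiliation]: Institute of Intelligent Information Processing, Beijing Information Science and Technology University
[Funding]: Supported by the National Natural Science Foundation of China (61070119, 61370139) and the Beijing Municipal Universities Innovation Team Building and Teacher Career Development Program (IDHT20130519)
[CLC number]: TP391.1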
