

Research on Natural Language Question Answering Methods Based on Knowledge Bases

Published: 2018-04-16 03:24

Topics: knowledge base question answering + word vectors; Source: University of Science and Technology of China, master's thesis, 2017


[Abstract]: Knowledge base question answering (KBQA) means answering questions posed in natural language by using a structured knowledge base, and it is an important research direction in natural language processing. The main approaches fall into three categories: methods based on information extraction, methods based on semantic parsing, and methods based on vector space modeling. The key technologies include knowledge extraction and representation, semantic representation of user questions, and answer generation from the knowledge base. Because of factors such as the accuracy of question semantic representation and the scale of question-answer pair training data, the performance of current KBQA systems still has room for improvement. In addition, open-source, large-scale, open-domain Chinese knowledge bases are scarce, which constrains research on Chinese KBQA. This thesis addresses the KBQA task from three angles: question semantic representation, training data preparation, and Chinese knowledge base construction. Its main contributions are a word vector construction method for paraphrase question scoring in KBQA, a KBQA method that incorporates neural question generation, and a knowledge fusion method for Chinese knowledge base construction.

Traditional word vectors are obtained by unsupervised training that is independent of any specific task, so when they are used for paraphrase question scoring in KBQA they cannot reflect sentence-level semantic constraints. The thesis therefore proposes a word vector training method based on paraphrase knowledge constraints. The method introduces sentence-level semantic constraint information into word vector training. Without changing how sentence semantics are composed, it improves sentence-level semantic representation by optimizing the word-level semantic vectors, which in turn improves paraphrase question scoring and the answer accuracy of the KBQA system.

Existing KBQA methods based on vector space modeling depend on training data, and producing large-scale question-answer pairs by hand is difficult. To address this, the thesis introduces a question generation method based on an encoder-decoder neural network into KBQA system construction: a question generation model automatically generates questions from knowledge base triples, and the generated questions are used to train the KBQA model. Experimental results show that model-generated questions improve the accuracy of the KBQA system compared with questions generated from traditional templates.

Finally, the thesis presents a Chinese knowledge base construction method based on knowledge fusion. The method first extracts information from the infoboxes of Baidu Baike pages to build an initial knowledge base. It then applies entity alignment based on link-word information and attribute mapping based on the Jaccard coefficient to fuse the initial knowledge base with the existing Freebase knowledge base. Chinese knowledge bases were built for several domains, including people and geography, which verifies that the method can effectively extend a knowledge base on top of an existing ontology.
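The abstract names the Jaccard coefficient as the basis for attribute mapping but does not say which sets it is computed over. The following is only a minimal sketch under a stated assumption: each attribute is represented by the set of (entity, value) pairs it holds for entities that have already been aligned, and an infobox attribute is mapped to the Freebase property with the highest Jaccard score above a threshold. The function names, the threshold of 0.5, and the toy data are all hypothetical and are not taken from the thesis.

```python
from typing import Dict, Optional, Set, Tuple

Pair = Tuple[str, str]  # (aligned entity id, attribute value); an assumed representation


def jaccard(a: Set[Pair], b: Set[Pair]) -> float:
    """Jaccard coefficient |A ∩ B| / |A ∪ B|; returns 0.0 when both sets are empty."""
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def map_attributes(
    source: Dict[str, Set[Pair]],
    target: Dict[str, Set[Pair]],
    threshold: float = 0.5,  # hypothetical cutoff, not from the thesis
) -> Dict[str, Optional[str]]:
    """Map each source attribute to its best-scoring target attribute, or None."""
    mapping: Dict[str, Optional[str]] = {}
    for s_name, s_pairs in source.items():
        best_name: Optional[str] = None
        best_score = 0.0
        for t_name, t_pairs in target.items():
            score = jaccard(s_pairs, t_pairs)
            if score > best_score:
                best_name, best_score = t_name, score
        # Keep the match only if it clears the threshold.
        mapping[s_name] = best_name if best_score >= threshold else None
    return mapping


# Toy data for illustration only.
infobox_attrs = {
    "infobox:birthplace": {("e1", "Beijing"), ("e2", "Shanghai"), ("e3", "Hefei")},
    "infobox:occupation": {("e1", "actor"), ("e2", "writer")},
    "infobox:height": {("e1", "180cm")},
}
freebase_attrs = {
    "people.person.place_of_birth": {("e1", "Beijing"), ("e2", "Shanghai")},
    "people.person.profession": {("e1", "actor"), ("e2", "writer"), ("e3", "singer")},
}

mapping = map_attributes(infobox_attrs, freebase_attrs)
print(mapping)
```

In this toy run the birthplace and occupation attributes each share two of three pairs with one Freebase property, so they clear the assumed threshold, while the height attribute has no overlap and stays unmapped.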
[Degree-granting institution]: University of Science and Technology of China
[Degree level]: Master's
[Year degree granted]: 2017
[Classification number]: TP391.1

[Similar literature]

Related master's theses (top 1)

1 詹晨迪 (Zhan Chendi); Research on Natural Language Question Answering Methods Based on Knowledge Bases [D]; University of Science and Technology of China; 2017


Article ID: 1757091


Article link: http://sikaile.net/kejilunwen/ruanjiangongchenglunwen/1757091.html

