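Research and Implementation of a Question Answering System for Regional Information
Published: 2018-03-19 20:51
Topic: information retrieval. Focus: knowledge base. Source: Southwest Jiaotong University, master's thesis, 2013. Document type: degree thesis.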
[Abstract]: The Internet has become an important channel through which people obtain information, which has made search engine technology one of its most important technologies. Traditional search engines, however, cannot return accurate information to the user in a single step. Question answering (QA) systems, a newer form of information retrieval, can make up for many shortcomings of traditional search engines and have therefore attracted growing attention. This thesis studies and designs a QA system for a specific domain, covering the construction of a structured knowledge base, question analysis and understanding, and answer extraction, and finally implements a prototype QA system for regional information.

For knowledge base construction, a large amount of region-related information on the Internet was downloaded and organized, and information extraction techniques were used to build a structured knowledge base for regional information that supports simple retrieval of region-related information. A question-answer library that grows automatically from user behavior was also designed; it further supports fast and accurate retrieval by the QA system.

For question analysis and understanding, questions are analyzed with attribute tagging, pattern judgment, and related methods. The thesis also studies in depth the semantic similarity computation based on HowNet and handles the problem that words not registered in HowNet cannot take part in the computation, which improves precision and recall in semantic retrieval over the structured knowledge base of basic regional information. Based on experimental comparison, a HowNet-based sentence similarity algorithm was chosen for retrieval over the question-answer library.
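The abstract does not give the actual similarity formulas, so the following is only a minimal sketch of how a lexicon-based sentence similarity with a fallback for unregistered words might look. Everything in it is an assumption made for illustration: `TOY_LEXICON` is a hypothetical stand-in for HowNet (《知網》), the Jaccard overlap of concept labels replaces the real HowNet word similarity, and the character-overlap fallback for out-of-vocabulary words is not taken from the thesis.

```python
from typing import Dict, FrozenSet, List

# Hypothetical toy lexicon standing in for HowNet: word -> concept labels.
TOY_LEXICON: Dict[str, FrozenSet[str]] = {
    "hotel": frozenset({"building", "lodging"}),
    "inn": frozenset({"building", "lodging"}),
    "restaurant": frozenset({"building", "food"}),
    "address": frozenset({"location"}),
    "location": frozenset({"location"}),
}


def jaccard(a, b) -> float:
    """Jaccard overlap of two sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def word_similarity(w1: str, w2: str) -> float:
    """Similarity of two words in [0, 1]."""
    if w1 == w2:
        return 1.0
    if w1 in TOY_LEXICON and w2 in TOY_LEXICON:
        return jaccard(TOY_LEXICON[w1], TOY_LEXICON[w2])
    # Assumed fallback for words missing from the lexicon:
    # compare the sets of characters the two words contain.
    return jaccard(set(w1), set(w2))


def sentence_similarity(s1: List[str], s2: List[str]) -> float:
    """Symmetric average of each word's best match in the other sentence."""
    if not s1 or not s2:
        return 0.0

    def directed(src: List[str], dst: List[str]) -> float:
        return sum(max(word_similarity(w, v) for v in dst) for w in src) / len(src)

    return (directed(s1, s2) + directed(s2, s1)) / 2
```

In a question-answer library lookup, a new question would be tokenized, compared against each stored question with `sentence_similarity`, and matched to the stored entry with the highest score, presumably subject to some minimum threshold.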
For answer extraction, answers are retrieved from the knowledge base by extracting attribute blocks from the question and using those blocks as the search keys. Because a local database is always limited, while the Internet is a vast aggregate of information that can serve as a data source, the thesis designs an Internet-based answer extraction module and proposes a web answer extraction algorithm based on the vector space model. The module takes the characteristics of search engines and web documents into account, and experiments show that it achieves high accuracy.

Following the retrieval flow designed for the QA system, a prototype was implemented. It consists mainly of modules for question analysis, semantic similarity computation, knowledge base retrieval, question library management, and Internet retrieval. Google Maps is used to mark the geographic locations of retrieval results. For region-related information, the thesis implements the complete process from data collection and information structuring to semantic retrieval, achieving its intended goals.
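The abstract names the vector space model but does not describe the algorithm itself. As a rough sketch of the general idea only, the code below represents a question and each candidate snippet as raw term-frequency vectors and ranks the candidates by cosine similarity. The function names, the plain term-frequency weighting, and the pre-tokenized input are all assumptions for illustration and are not the thesis's actual method.

```python
import math
from collections import Counter
from typing import List, Tuple


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity of two term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_candidates(
    question_tokens: List[str], candidates: List[List[str]]
) -> List[Tuple[float, List[str]]]:
    """Score each tokenized candidate against the question, best first."""
    q_vec = Counter(question_tokens)
    scored = [(cosine(q_vec, Counter(c)), c) for c in candidates]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


# Hypothetical usage with already-tokenized text.
question = ["chengdu", "panda", "base", "address"]
snippets = [
    ["train", "schedule"],
    ["chengdu", "weather"],
    ["panda", "base", "address", "chengdu"],
]
ranked = rank_candidates(question, snippets)
```

A real module would presumably weight terms (for example with TF-IDF) and exploit search-result structure such as titles and snippets, which this sketch leaves out.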
[Degree-granting institution]: Southwest Jiaotong University
[Degree level]: Master's
[Year degree granted]: 2013
[Classification number]: TP391.3
Article ID: 1635945
Article link: http://sikaile.net/kejilunwen/sousuoyinqinglunwen/1635945.html