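Conceptual Analysis of Query Statements and Its Application in Retrieval
Published: 2018-03-31 21:19
Topic: query statements; Focus: conceptual analysis; Source: Shanghai Jiao Tong University, master's thesis, 2013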
[Abstract]: In recent years, with the development of computer technology and the spread of the Internet, online resources have grown exponentially. This gives us a vast store of information but also brings the problem of information explosion. Faced with such varied and complex web resources, how to obtain the information one needs from a massive volume, that is, how a retrieval system can return from a massive document collection the candidate documents that best match the user's need, has become a central concern.
Current information retrieval systems offer users only limited help: retrieval accuracy is low, and the large volume of returned information, rather than helping users, causes them considerable trouble. The crux of the problem is that most existing retrieval systems use "discrete" models such as the Boolean model, in which user needs and documents are represented as discrete, unrelated strings. The needs and documents thereby lose their conceptual integrity, and new noise is introduced. A feasible solution is to bring natural language understanding into retrieval and raise accuracy through deep semantic analysis. Specifically, semantic analysis is used to index both needs and documents, and the basic unit of indexing is no longer a string but a complete concept. This establishes relations between the concepts in the need and those in the documents.
This thesis studies the conceptual analysis of Chinese user needs, an indispensable component of a Chinese concept retrieval system. Need analysis is the first step of the retrieval process; its purpose is to recover the user's retrieval intent in order to guide subsequent retrieval. Need analysis is therefore the primary task of a retrieval system, and its quality directly affects the performance of the whole system. It differs considerably from the analysis of text documents: beyond representing the user's query statement as conceptual information, it must, more importantly, accurately capture the retrieval concept in the user's mind, working only from a vague expression of the user's need.
The thesis introduces a concept-based approach. At the conceptual level, it uses a semantic concept graph model to process Chinese query statements and convert them into semantic concept graphs: the keywords entered by the user are linked through the semantic relations among them into a graph whose intension is complete, so that semantic conceptual information is not lost at any point in the semantic retrieval process. The relevance of the returned web pages can then be measured against the complete conceptual intension of the user's need, which improves accuracy.
The thesis proposes a new method for the conceptual analysis of user needs, analyzing the user's real intent at the level of the intensional concept graph. In particular, when the need is expressed as an interrogative sentence, the method extracts the focus information of the query statement, substitutes it for the interrogative word in the sentence, and builds a concept graph that expresses the intensional semantics of the query. Approaching user needs from the intension of Chinese concepts, the method recovers the user's retrieval intent fairly completely and accurately and guides subsequent retrieval, thereby improving the accuracy of the retrieval system. This provides effective technical support for the development of new Chinese search engines.
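The abstract describes replacing the interrogative word of a question with its focus and linking the resulting concepts into a graph. The following is a minimal Python sketch of that idea, not the thesis's implementation. The names `INTERROGATIVE_FOCUS`, `ConceptGraph`, `replace_interrogative`, and `build_concept_graph`, the relation labels `agent` and `patient`, and the fixed lookup table are all assumptions made for illustration; the thesis extracts the focus from the query itself, and word segmentation and semantic relation analysis are assumed to happen upstream.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

# Hypothetical table mapping an interrogative word to the focus concept it
# asks about. This is an illustrative assumption, not taken from the thesis.
INTERROGATIVE_FOCUS = {
    "谁": "PERSON",      # "who"
    "哪里": "PLACE",     # "where"
    "什么时候": "TIME",  # "when"
}


@dataclass
class ConceptGraph:
    """Concept nodes plus labelled edges stored as (head, relation, tail)."""
    nodes: List[str] = field(default_factory=list)
    edges: List[Tuple[str, str, str]] = field(default_factory=list)


def replace_interrogative(tokens: List[str]) -> List[str]:
    """Replace each interrogative word with its focus concept; keep other tokens."""
    return [INTERROGATIVE_FOCUS.get(tok, tok) for tok in tokens]


def build_concept_graph(
    tokens: List[str],
    relations: List[Tuple[int, str, int]],
) -> ConceptGraph:
    """Build a graph from pre-segmented tokens.

    `relations` holds (head_index, relation_label, tail_index) triples that an
    upstream semantic analyzer is assumed to supply.
    """
    concepts = replace_interrogative(tokens)
    graph = ConceptGraph(nodes=list(concepts))
    for head, label, tail in relations:
        graph.edges.append((concepts[head], label, concepts[tail]))
    return graph


# Query "谁 发明 电话" ("who invented the telephone"), already segmented.
graph = build_concept_graph(
    ["谁", "发明", "电话"],
    [(1, "agent", 0), (1, "patient", 2)],
)
print(graph.nodes)
print(graph.edges)
```

Because the edges carry the relations between concepts, a retrieved page can be compared against the whole graph rather than against isolated keywords, which is the contrast with the "discrete" string models that the abstract draws.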
[Degree-granting institution]: Shanghai Jiao Tong University
[Degree level]: Master's
[Year degree awarded]: 2013
[Classification number]: TP391.3
Article ID: 1692485
Article link: http://sikaile.net/kejilunwen/sousuoyinqinglunwen/1692485.html