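Methods and Implementation of Personalized Information Retrieval Based on Language Models
Published: 2018-03-02 01:32
Keywords: information retrieval; language model; query expansion; user model. Source: Inner Mongolia University, master's thesis, 2013. Document type: degree thesis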
[Abstract]: With the rapid growth of the Internet, identifying a user's real intent and accurately locating the needed information within a vast and cluttered body of resources has become a prominent problem in information retrieval. Mature search engines already perform well on recall and response time, but their precision still rarely satisfies users.

The main purpose of information retrieval is to find, among many documents, those that meet the user's query need. Traditional query expansion focuses on expanding the original query but overlooks the many unnecessary terms that the expanded query then contains. These terms in turn reduce the accuracy of the expanded query, so it fails to express the user's intent faithfully. This thesis studies query expansion from the perspective of user personalization.

The thesis proposes two retrieval methods for personalization: a user query expansion model, and a stop-word list method that removes expansion terms. Both stem from query optimization: they either expand the user's query or delete terms from it. The user model expands a query using the topic areas that the individual user is involved in, and the expanded query can improve precision and recall for that user. The stop-word method examines the new queries produced by pseudo-relevance expansion of the original query and compiles query stop-word lists for different domains. Removing the listed terms from the new query and redistributing the probability they had been assigned increases the probability of the original query terms.

Building on language models and existing mature techniques, the thesis studies query expansion from a new angle, refines the query method through experiments, and uses a user interest model to improve users' retrieval results. It discusses query expansion methods in various retrieval models in detail, and validates the proposed user query expansion and the domain-specific stop-word lists experimentally.
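The two ideas in the abstract can be illustrated with a small sketch. It is not the thesis's implementation: it assumes unigram language models stored as dictionaries, a simple linear interpolation between the query model and a user interest model, and proportional renormalization after stop-word removal. The function names, the weight `lam`, the cutoff `top_k`, and the toy data are all hypothetical.

```python
from collections import Counter


def query_model(terms):
    """Maximum-likelihood unigram model of a query: P(w | Q)."""
    counts = Counter(terms)
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}


def expand_with_user_model(query_lm, user_lm, lam=0.7, top_k=5):
    """Expand a query model with the top_k terms of a user interest model.

    Assumed form: P'(w) = lam * P(w | Q) + (1 - lam) * P(w | U_top_k),
    where U_top_k is the user model restricted to its top_k terms
    and renormalized so the result still sums to 1.
    """
    ranked = sorted(user_lm.items(), key=lambda kv: kv[1], reverse=True)
    top_user = dict(ranked[:top_k])
    norm = sum(top_user.values())
    if norm == 0:
        # No usable user profile: fall back to the original query model.
        return dict(query_lm)
    vocab = set(query_lm) | set(top_user)
    return {
        w: lam * query_lm.get(w, 0.0) + (1 - lam) * top_user.get(w, 0.0) / norm
        for w in vocab
    }


def prune_stop_words(expanded_lm, stop_words, original_terms):
    """Drop expansion terms found in a domain stop-word list, then renormalize.

    Original query terms are never dropped. Renormalizing spreads the removed
    probability mass over the surviving terms, so the original query terms
    end up with a higher probability than before pruning.
    """
    kept = {
        w: p
        for w, p in expanded_lm.items()
        if w in original_terms or w not in stop_words
    }
    total = sum(kept.values())
    if total == 0:
        return {}
    return {w: p / total for w, p in kept.items()}


# Toy data (hypothetical): a two-term query and a tiny user interest model.
original = ["jaguar", "speed"]
user_lm = {"car": 0.5, "engine": 0.3, "the": 0.2}

expanded = expand_with_user_model(query_model(original), user_lm, lam=0.7, top_k=3)
pruned = prune_stop_words(expanded, stop_words={"the"}, original_terms=set(original))
```

In this toy run the stop word "the" enters the query through expansion and is then pruned, and its probability mass is shared among the remaining terms, including the original ones. A real system would estimate the user model from the user's topic areas and build the stop-word list per domain, as the abstract describes.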
[Degree-granting institution]: Inner Mongolia University
[Degree level]: Master's
[Year degree awarded]: 2013
[Classification number]: TP391.3
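[Citing literature]
Related master's theses (first 2)
1 Li Yunfei; Research on Dynamic Query Expansion Based on Query Logs [D]; Inner Mongolia University; 2016
2 Ding Kaichao; Research and Implementation of Virtual Domain Re-ranking Techniques in Information Retrieval [D]; Inner Mongolia University; 2014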
Article ID: 1554484
Article link: http://sikaile.net/kejilunwen/sousuoyinqinglunwen/1554484.html