Research and Design of a Web Page Ranking Algorithm Based on Session Search
Published: 2018-03-07 22:25
Topic: session search. Focus: web page retrieval. Source: Nanjing University, 2017 master's thesis. Document type: degree thesis.
[Abstract]: With the rapid development of Internet technology, the volume of resources on the Internet keeps growing, and search engines let users find the information they need within this huge body of resources. User satisfaction with retrieval depends on the web pages a search engine returns, and the core technology that determines those results is the engine's web page ranking algorithm. The mainstream ranking algorithms today are Google's PageRank and IBM's HITS. Both are designed mainly around the link relationships between web pages: if a page is linked to by many other pages, the search engine treats it as a higher-quality page and ranks it closer to the top. These algorithms do not take the interaction between the user and the search engine into account, so web page ranking still has considerable room for improvement, and ranking algorithms are now a main focus of search engine research. This thesis first introduces the main web page ranking algorithms used in current search engines, together with the MDP model. It then proposes QCM, a web page ranking algorithm based on user session search. QCM uses the syntactic edits between adjacent queries, the relationship between query changes, and previously retrieved documents to enhance session search, and it models session search as a Markov decision process (MDP). The effectiveness of the algorithm is verified by experiments, and an information retrieval prototype system is designed on top of QCM. In short, to address the shortcomings of existing web page ranking algorithms, the proposed algorithm pays closer attention to the interaction between the user and the search engine, tracks how the search terms change over the course of a session, and models those changes with an MDP. An experimental analysis of the algorithm's efficiency, together with designed validation experiments, indicates that QCM substantially improves ranking efficiency.
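The abstract does not spell out how the change between adjacent queries is computed. As a minimal sketch, assuming each query is a string of whitespace-separated terms, one could split every pair of adjacent queries in a session into kept, added, and removed terms. The function name `query_change`, the three labels, and the sample session are hypothetical illustrations, and this is not the thesis's QCM implementation.

```python
def query_change(prev_query: str, curr_query: str) -> dict:
    """Compare two adjacent session queries term by term.

    Assumption: terms are whitespace-separated and case-insensitive.
    Returns the terms kept from the previous query, the terms newly
    added, and the terms removed, each in original order.
    """
    prev_terms = prev_query.lower().split()
    curr_terms = curr_query.lower().split()
    prev_set, curr_set = set(prev_terms), set(curr_terms)
    return {
        "kept": [t for t in curr_terms if t in prev_set],
        "added": [t for t in curr_terms if t not in prev_set],
        "removed": [t for t in prev_terms if t not in curr_set],
    }


# Hypothetical session: each query refines the one before it.
session = ["hotel nanjing", "cheap hotel nanjing", "cheap hostel nanjing"]
for prev, curr in zip(session, session[1:]):
    print(f"{prev!r} -> {curr!r}: {query_change(prev, curr)}")
```

In an MDP formulation of session search, a change record like this could serve as the observed user action at each step; how such changes are then turned into term weights or rewards is left open here, since the abstract gives no details.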
[Degree-granting institution]: Nanjing University
[Degree level]: Master's
[Year degree granted]: 2017
[Classification number]: TP393.092; TP391.3
Article ID: 1581243
Article link: http://sikaile.net/guanlilunwen/ydhl/1581243.html