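Research on Improving the PageRank Algorithm Based on Time Feedback and Classification Techniques
Published: 2018-06-21 08:07
Topic: PageRank + search engine; Reference: Beijing University of Chemical Technology, 2013 master's thesis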
[Abstract]: In the current information age the Internet is developing rapidly and constantly produces large amounts of unordered information. Users who want to find the information they care about rely on search engines to return results quickly and accurately. This places higher demands on search technology, and the web page ranking algorithm naturally becomes the key problem in improving a search engine. Among the traditional ranking algorithms from the early days of search engines, PageRank and HITS are the two classical ones. Both are based on the link structure of web pages, both serve as the basis for search engine algorithm improvements in China and abroad, and a number of effective improved algorithms have been derived from them.
This thesis first describes the research background and significance of search engine ranking algorithms and the state of search engine development in China and abroad, then analyzes the working principles and techniques of search engines and the metrics used to evaluate search engine websites. It then analyzes the strengths and weaknesses of the traditional PageRank and HITS algorithms, which lays the foundation for the combined improvement of PageRank proposed here.
The main contribution is a further fusion of existing PageRank improvements: the thesis proposes a combined algorithm that joins web page classification with a PageRank variant that has a time feedback factor, and revises the formula for computing PR values accordingly. The improved algorithm is then implemented and verified. Comparing experimental results before and after the improvement shows that the improved algorithm raises the precision and recall of the search engine to a certain extent.
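For readers unfamiliar with the baseline that the thesis builds on, the following is a minimal sketch of the classical link-based PageRank iteration. It is not the thesis's improved algorithm: the abstract does not reproduce the revised PR formula, so no time feedback factor or classification weight is modelled here. The function name `pagerank`, the damping value 0.85, the convergence tolerance, and the uniform treatment of pages without outgoing links are all assumptions made for illustration.

```python
def pagerank(links, damping=0.85, tol=1e-10, max_iter=200):
    """Classical PageRank by power iteration (illustrative sketch only).

    links: dict mapping a page to the list of pages it links to.
    Returns a dict mapping each page to its PR value; the values sum to 1.
    """
    # Collect every page that appears as a source or as a link target.
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}

    for _ in range(max_iter):
        # Assumption: rank held by pages with no out-links is spread uniformly.
        dangling = sum(rank[p] for p in pages if not links.get(p))
        new_rank = {p: (1.0 - damping) / n + damping * dangling / n for p in pages}

        # Each page passes an equal share of its rank to every page it links to.
        for src, targets in links.items():
            if targets:
                share = damping * rank[src] / len(targets)
                for t in targets:
                    new_rank[t] += share

        delta = sum(abs(new_rank[p] - rank[p]) for p in pages)
        rank = new_rank
        if delta < tol:
            break
    return rank


if __name__ == "__main__":
    # Hypothetical four-page graph used only to exercise the function.
    example = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
    for page, score in sorted(pagerank(example).items()):
        print(page, round(score, 4))
```

An improvement of the kind the abstract describes would presumably change how rank is distributed or weighted per page, but the exact form would have to be taken from the thesis itself.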
[Degree-granting institution]: Beijing University of Chemical Technology
[Degree level]: Master's
[Year degree granted]: 2013
[Classification number]: TP391.3
Article ID: 2047844
Article link: http://sikaile.net/kejilunwen/sousuoyinqinglunwen/2047844.html