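Distribution Characteristics of User Behavior in Large-Scale Web Search Engine Systems and Their Implications
Published: 2018-03-06 17:39
Topic: World Wide Web. Focus: search engines. Source: Science in China, Series E: Technological Sciences, 2001, No. 04. Paper type: journal article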
[Abstract]: This paper statistically analyzes the distribution of user behavior in a large-scale search engine system. The results show that both user queries and URL clicks exhibit pronounced locality, and that the distribution of user queries follows a power law with good self-similarity. Based on these regularities, a query cache is designed and three cache replacement policies are compared: FIFO, LRU, and LFU with decay. The distribution of massive web page information is then examined from the perspective of user behavior. An analysis of variance relating page parameters (URL in-degree, mirror degree, and directory depth) to relevance as reflected by user behavior feedback is used to draw out the implications for optimizing the ranking algorithm of a search engine system.
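The cache comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the synthetic query stream, the function names, and in particular the decay scheme (multiplying every count by a fixed factor every `period` requests) are assumptions chosen for illustration, since the abstract does not specify how the decay is defined.

```python
import random
from collections import OrderedDict, deque


def zipf_stream(n_queries, vocab_size, exponent=1.0, seed=0):
    """Synthetic queries: id k is drawn with weight 1 / (k + 1) ** exponent."""
    rng = random.Random(seed)
    weights = [1.0 / (rank ** exponent) for rank in range(1, vocab_size + 1)]
    return rng.choices(range(vocab_size), weights=weights, k=n_queries)


def hit_ratio_fifo(stream, capacity):
    """FIFO: evict the entry that was inserted earliest; hits do not reorder."""
    cache, order, hits = set(), deque(), 0
    for q in stream:
        if q in cache:
            hits += 1
            continue
        if len(cache) >= capacity:
            cache.discard(order.popleft())
        cache.add(q)
        order.append(q)
    return hits / len(stream)


def hit_ratio_lru(stream, capacity):
    """LRU: evict the entry that was used least recently."""
    cache, hits = OrderedDict(), 0
    for q in stream:
        if q in cache:
            hits += 1
            cache.move_to_end(q)  # mark as most recently used
            continue
        if len(cache) >= capacity:
            cache.popitem(last=False)  # drop the least recently used entry
        cache[q] = True
    return hits / len(stream)


def hit_ratio_decayed_lfu(stream, capacity, decay=0.5, period=1000):
    """LFU with decay (assumed scheme): evict the lowest count, and
    multiply every count by `decay` after each `period` requests."""
    counts, hits = {}, 0
    for i, q in enumerate(stream, start=1):
        if q in counts:
            hits += 1
            counts[q] += 1.0
        else:
            if len(counts) >= capacity:
                victim = min(counts, key=counts.get)  # O(capacity) scan
                del counts[victim]
            counts[q] = 1.0
        if i % period == 0:
            for key in counts:
                counts[key] *= decay  # age old popularity
    return hits / len(stream)


if __name__ == "__main__":
    stream = zipf_stream(n_queries=50_000, vocab_size=10_000)
    for name, fn in [("FIFO", hit_ratio_fifo),
                     ("LRU", hit_ratio_lru),
                     ("LFU with decay", hit_ratio_decayed_lfu)]:
        print(f"{name}: hit ratio = {fn(stream, capacity=500):.3f}")
```

The relative ranking of the three policies depends on the stream and on the chosen parameters, so any numbers this sketch prints say nothing about the paper's measured results.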
[Author affiliation]: Network and Distributed Systems Laboratory, Department of Computer Science and Technology, Peking University, Beijing 100871 (listed for all four authors)
Article ID: 1575824
Article link: http://sikaile.net/kejilunwen/sousuoyinqinglunwen/1575824.html