Research on a Personalized Search Algorithm Based on a User Interest Model
[Abstract]: With the rapid growth of information on the Internet, search engines were developed to help people find the information relevant to them, a major milestone in the development of resource querying. As user demands grow, however, the shortcomings of traditional search engines, such as low retrieval accuracy and duplicate pages, have become increasingly apparent, and these engines can no longer meet users' needs. Personalization and intelligence have therefore become the trend in search engine development. This thesis studies search engine personalization in depth. The main contributions are as follows. First, building on a study of existing user interest models, a new algorithm for constructing a user interest model is proposed. Singular value decomposition (SVD) and the k-means clustering algorithm are used to cluster the user's browsing history, and the words it contains, at different levels, and two weighted interest trees are then created: a document class tree and a word class tree. The weight of each node in a tree represents the user's degree of interest in that class of documents or words. Experimental results show that the proposed user interest model substantially improves the accuracy of page interest classification. Second, an improvement is proposed to address the deficiencies of the vector space model: SVD is used to reduce the dimensionality of the vector space model. The resulting document-class matrix addresses the high dimensionality, sparsity, synonymy, and polysemy problems of the vector space model. Experimental results show that the improved vector space model classifies pages more accurately than the traditional vector space model. Finally, a new ranking algorithm is proposed to overcome the shortcomings of existing search engine ranking algorithms.
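The SVD and k-means steps described in the abstract can be illustrated with a minimal sketch. This is not the thesis's implementation: the toy term-document matrix, the function names `reduce_with_svd` and `kmeans`, the number of retained dimensions, and the farthest-point initialization are all assumptions made for illustration, and NumPy is assumed to be available.

```python
import numpy as np

def reduce_with_svd(term_doc, k):
    """Map each document to a k-dimensional latent space via truncated SVD."""
    # term_doc has shape (n_terms, n_docs).
    U, s, Vt = np.linalg.svd(term_doc, full_matrices=False)
    # Keep the k largest singular values; each returned row is one document.
    return (np.diag(s[:k]) @ Vt[:k]).T

def kmeans(points, n_clusters, n_iter=50):
    """Plain k-means (Lloyd's algorithm) with farthest-point initialization."""
    # Assumption: deterministic seeding so the sketch is reproducible.
    centers = [points[0]]
    while len(centers) < n_clusters:
        dist_to_nearest = np.min(
            [np.linalg.norm(points - c, axis=1) for c in centers], axis=0
        )
        centers.append(points[dist_to_nearest.argmax()])
    centers = np.array(centers)

    for _ in range(n_iter):
        # Assign every point to its nearest center.
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Recompute each center; an empty cluster keeps its old center.
        new_centers = np.array([
            points[labels == c].mean(axis=0) if np.any(labels == c) else centers[c]
            for c in range(n_clusters)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers

# Hypothetical browsing history: rows are terms, columns are visited pages.
# Pages 0 and 1 are about programming, pages 2 and 3 about sports.
term_doc = np.array([
    [3, 2, 0, 0],   # "python"
    [1, 2, 0, 0],   # "code"
    [0, 0, 2, 3],   # "football"
    [0, 0, 1, 1],   # "goal"
], dtype=float)

docs = reduce_with_svd(term_doc, k=2)       # shape (4, 2)
labels, centers = kmeans(docs, n_clusters=2)
print(labels)
```

In this toy example the two programming pages fall into one cluster and the two sports pages into the other. Clustering the words instead of the documents would use the term side of the decomposition (`U` scaled by the singular values) in the same way.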
Building on the user interest model proposed in this thesis, a naive Bayes classifier is used to classify both the documents retrieved by a traditional search engine and the words they contain. Each document is then scored according to the classification results, and the documents are sorted in descending order of score. Experimental results show that, under the same conditions, the proposed personalized ranking algorithm is more accurate than the personalized search algorithm based on a probabilistic model and better meets users' personalized needs.
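The classify-score-sort pipeline can be sketched as follows. The abstract does not give the scoring formula, so the rule used here (class posterior weighted by a per-class interest weight) is an assumption, as are the names `NaiveBayes`, `rank`, `interest`, and the toy data. The `interest` dictionary is a hypothetical stand-in for the node weights of the interest trees.

```python
import math
from collections import Counter, defaultdict

class NaiveBayes:
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for words, label in zip(docs, labels):
            self.word_counts[label].update(words)
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def posteriors(self, words):
        """Return P(class | words) for every class seen in training."""
        total_docs = sum(self.class_counts.values())
        log_scores = {}
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            logp = math.log(n_docs / total_docs)
            for w in words:
                if w in self.vocab:  # words never seen in training are ignored
                    logp += math.log((counts[w] + 1) / denom)
            log_scores[label] = logp
        # Normalize in log space for numerical stability.
        peak = max(log_scores.values())
        norm = sum(math.exp(v - peak) for v in log_scores.values())
        return {c: math.exp(v - peak) / norm for c, v in log_scores.items()}

def rank(results, model, interest):
    """Sort (name, words) results by interest-weighted class posterior."""
    def score(words):
        post = model.posteriors(words)
        # Assumed scoring rule: expected interest weight under the posterior.
        return sum(p * interest.get(c, 0.0) for c, p in post.items())
    return sorted(results, key=lambda r: score(r[1]), reverse=True)

# Hypothetical labeled browsing history used as training data.
history = [
    (["python", "code", "bug"], "programming"),
    (["python", "library", "code"], "programming"),
    (["football", "goal", "match"], "sports"),
]
model = NaiveBayes().fit([w for w, _ in history], [c for _, c in history])

# Hypothetical interest weights, standing in for interest-tree node weights.
interest = {"programming": 0.8, "sports": 0.2}

# Hypothetical results returned by a conventional search engine.
results = [
    ("sports-page", ["football", "match", "score"]),
    ("dev-page", ["python", "code", "tutorial"]),
]
ranked = rank(results, model, interest)
print([name for name, _ in ranked])
```

Because the user in this example is mostly interested in programming, the programming page is moved ahead of the sports page even though the search engine returned it second.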
[Degree-granting institution]: Taiyuan University of Science and Technology
[Degree level]: Master's
[Year degree granted]: 2013
[Classification number]: TP391.3
Article ID: 2466907
Article link: http://sikaile.net/kejilunwen/sousuoyinqinglunwen/2466907.html