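Research and Implementation of a Query Expansion Algorithm Based on a Text Clustering Search Engine
Published: 2018-04-24 19:34
Topics: search engine + text clustering; Source: Beijing Forestry University, master's thesis, 2012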
[Abstract]: The rise of the Internet has caused the volume of information to grow explosively, and search engines give people an effective tool for locating information within it. Information, however, is growing faster than anyone anticipated. In the face of this explosion, the query model of traditional general-purpose search engines can no longer meet users' needs. Organizing this vast ocean of varied information effectively and presenting it to users in a reasonable, useful way is a major challenge for search engines. Techniques such as data mining, pattern recognition, the Semantic Web, ontologies, and query expansion have been widely applied to address the challenges and problems search engines face. This thesis first reviews the development of search engines, the state of research in China and abroad, and the basic principles and shortcomings of traditional full-text search engines. It then describes the development and trends of query expansion, which is the focus of this work. Next, it presents in detail the proposed query expansion algorithm for a text clustering search engine, covering the strategy for choosing the clustering algorithm, the strategy for selecting expansion terms, and the similarity computation method. The algorithm includes several improvements motivated by the practical use of the text clustering search engine system implemented in this thesis, and it offers one solution to the deep query problem in text clustering search engines. Finally, the thesis describes the module design and database design of the prototype text clustering search engine and verifies the effectiveness of the proposed query expansion algorithm through experiments.
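The abstract names three ingredients (a clustering step, an expansion-term selection strategy, and a similarity measure) but does not give the algorithm itself. The following is only a minimal sketch of the general idea of cluster-based query expansion, not the thesis's method. It assumes that documents are already tokenized and already grouped into clusters, that terms are weighted with a smoothed TF-IDF, and that similarity is cosine similarity. All function names (`tfidf_vectors`, `cosine`, `centroid`, `expand_query`) and the parameter `k` are hypothetical.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return (vectors, idf): one sparse TF-IDF dict per tokenized document."""
    n = len(docs)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for doc in docs for term in set(doc))
    # Smoothed IDF (an assumption of this sketch, not taken from the thesis).
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append({t: (cnt / len(doc)) * idf[t] for t, cnt in tf.items()})
    return vectors, idf


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def centroid(vectors):
    """Average a list of sparse vectors into one cluster centroid."""
    total = Counter()
    for vec in vectors:
        for t, w in vec.items():
            total[t] += w
    return {t: w / len(vectors) for t, w in total.items()}


def expand_query(query_terms, clusters, k=2):
    """Append the k heaviest centroid terms of the most similar cluster.

    `clusters` is a non-empty list of clusters, each a list of tokenized
    documents. The cluster assignment is assumed to be given (hypothetical).
    """
    all_docs = [doc for cluster in clusters for doc in cluster]
    vectors, idf = tfidf_vectors(all_docs)

    # Rebuild one centroid per cluster from the flat vector list.
    centroids, start = [], 0
    for cluster in clusters:
        centroids.append(centroid(vectors[start:start + len(cluster)]))
        start += len(cluster)

    # Weight the query with the same scheme; unseen terms get weight 0.
    q_tf = Counter(query_terms)
    q_vec = {t: (c / len(query_terms)) * idf.get(t, 0.0) for t, c in q_tf.items()}

    best = max(centroids, key=lambda c: cosine(q_vec, c))
    # Rank candidate terms by weight, breaking ties alphabetically.
    candidates = sorted(
        ((w, t) for t, w in best.items() if t not in q_tf),
        key=lambda pair: (-pair[0], pair[1]),
    )
    return list(query_terms) + [t for _, t in candidates[:k]]


if __name__ == "__main__":
    toy_clusters = [
        [["jaguar", "car", "engine", "speed"], ["jaguar", "car", "dealer", "price"]],
        [["jaguar", "cat", "jungle", "prey"], ["jaguar", "cat", "habitat", "jungle"]],
    ]
    print(expand_query(["jaguar", "jungle"], toy_clusters, k=1))
```

In this toy data the ambiguous term "jaguar" appears in both clusters, so the second query term decides which cluster supplies the expansion terms. A real system would also need a clustering step and a stop-word filter, both of which are omitted here.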
[Degree-granting institution]: Beijing Forestry University
[Degree level]: Master's
[Year degree granted]: 2012
[Classification number]: TP391.3
Article ID: 1797934
Article link: http://sikaile.net/kejilunwen/sousuoyinqinglunwen/1797934.html