

Research and Implementation of Web Page Deduplication in Search Engine Systems (.pdf, full text)

Published: 2016-10-30 21:01

  Keywords: Research and Implementation of Web Page Deduplication in Search Engine Systems. Compiled and published by 筆耕文化傳播.


中南民族大學 (South-Central University for Nationalities)
Master's thesis: Research and Implementation of Web Page Deduplication in Search Engine Systems
Author: 范小源
Degree applied for: Master's
Major: Computer Application Technology
Advisor: 陸際光
Date: 20070520

Abstract

The rapid popularization and development of the Internet leaves people facing a sea of information, and it has become essential to extract the truly important information from it. The search engine, which here mainly refers to a full-text search system, is a tool that provides this function. However, the retrieval results of a search engine contain a large number of duplicated web pages, which mainly come from reproduction among websites. These repetitive pages not only occupy network bandwidth but also waste storage resources. Users do not want to see a pile of search results with the same or nearly the same content, and truly useful results are often drowned in this redundant information and cannot easily be discovered. Effective removal of duplicate web pages improves search accuracy and saves users time and energy, and it lets the search system itself save a lot of storage resources and work more efficiently.

This paper mainly studies the problem of removing duplicated web pages for search engines. At present, effective methods for removing duplicated web pages are still few, and most of them are implemented on the server side, meaning that duplicated pages are discarded while web pages are being collected. The commonly used methods are those based on identical URLs, on clustering, on feature codes, and on signatures. In the clustering-based method, a text is expressed as a vector in a vector space model, and various methods are then used to perform clustering or classification. In this method, calculating the angle between vectors has high computational complexity which will take up more proce
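To make the vector space comparison mentioned in the abstract concrete, here is a minimal sketch in Python. It is not the thesis's implementation: the whitespace tokenizer, the function names, and the 0.9 threshold are all assumptions chosen for illustration. It represents each text as a term-frequency vector and compares two texts by the cosine of the angle between their vectors.

```python
import math
from collections import Counter


def term_vector(text):
    """Map a text to a sparse term-frequency vector.

    Assumption: lowercased whitespace tokens stand in for real word
    segmentation, which a production system would need.
    """
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse term-frequency vectors."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text is treated as similar to nothing
    return dot / (norm_a * norm_b)


def is_near_duplicate(text_a, text_b, threshold=0.9):
    """Flag two texts as near-duplicates (threshold is a hypothetical value)."""
    return cosine_similarity(term_vector(text_a), term_vector(text_b)) >= threshold
```

Comparing every page against every other page this way takes a number of comparisons that grows quadratically with the size of the collection, which is one way to read the abstract's remark about high computational complexity.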





Article ID: 159480


Article link: http://sikaile.net/kejilunwen/sousuoyinqinglunwen/159480.html

