A Multi-Domain Deep Web Crawler Based on Optimal Queries

Published: 2018-03-17 17:36

Topic: deep  Entry point: Web  Source: 《计算机应用研究》 (Application Research of Computers), 2009, Issue 09  Paper type: journal article

[Abstract]: Deep Web content is retrieved by submitting query terms through a web page's search interface. General-purpose search engines crawl pages by following hyperlinks, so they cannot index deep Web data. To address this problem, the paper presents a deep Web crawler based on optimal queries: it generates optimal queries from clustered web pages, submits those queries automatically, and then indexes the query results. Experiments show that the system crawls multi-domain deep Web data automatically and efficiently.
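The abstract does not say how an "optimal query" is scored, so the following is only a minimal sketch of the general idea of deriving one query term per cluster of pages. The scoring rule (coverage within the cluster multiplied by a log-scaled rarity across clusters), the function name `best_query_per_cluster`, and the sample data are all assumptions made for illustration, not the paper's method.

```python
import math
from collections import Counter


def best_query_per_cluster(clusters):
    """Pick one query term per cluster (hypothetical scoring, not the paper's).

    clusters: dict mapping a cluster name to a list of documents,
              where each document is a list of tokens.
    Returns a dict mapping each non-empty cluster name to its top term.
    """
    doc_freq = {}             # per cluster: term -> number of docs containing it
    cluster_freq = Counter()  # term -> number of clusters containing it
    for name, docs in clusters.items():
        df = Counter()
        for doc in docs:
            df.update(set(doc))  # count each term once per document
        doc_freq[name] = df
        cluster_freq.update(df.keys())

    n_clusters = len(clusters)
    queries = {}
    for name, df in doc_freq.items():
        if not df:
            continue  # a cluster with no tokens yields no query
        n_docs = len(clusters[name])

        def score(term):
            # Assumed heuristic: a good query matches many pages in this
            # cluster (coverage) and few other clusters (specificity).
            coverage = df[term] / n_docs
            specificity = math.log((1 + n_clusters) / cluster_freq[term])
            return coverage * specificity

        # Highest score wins; ties are broken alphabetically.
        queries[name] = min(df, key=lambda t: (-score(t), t))
    return queries


# Made-up sample clusters for two domains.
sample_clusters = {
    "books": [
        ["python", "book", "price"],
        ["novel", "book", "author"],
        ["book", "price", "isbn"],
    ],
    "jobs": [
        ["engineer", "job", "salary"],
        ["job", "price", "salary"],
        ["job", "teacher"],
    ],
}

print(best_query_per_cluster(sample_clusters))
```

In a full crawler, each selected term would then be submitted through the corresponding search interface and the returned pages indexed; those steps depend on the target sites and are omitted here.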
[Affiliation]: College of Computer Science and Technology, Zhejiang University
[Funding]: Zhejiang Provincial Science and Technology Program (2007C23086)
[Classification number]: TP393.092

Article ID: 1625775


Article link: http://sikaile.net/kejilunwen/sousuoyinqinglunwen/1625775.html
