Keyword-Based Database Selection for the Deep Web

Published: 2018-07-29 09:20

[Abstract]: This paper proposes a keyword-based query method for the deep Web: users submit queries as keywords, and the method selects, online, the Web databases that reflect the query intent and provide high-quality results. The approach avoids crawling deep Web data, an operation that is costly and difficult, and it supports keyword queries over databases from multiple domains, so it can be integrated seamlessly with existing search engines. The paper focuses on keyword-based database selection and addresses the challenges of this problem in two ways: (1) it proposes a relevance model that measures the association between keywords and domain attributes, and designs a random-walk-based algorithm that discovers latent keyword-attribute associations from query logs; (2) it presents a new data sampling method and uses it in a sampling-based database-query relevance model, which in turn is used to solve the deep Web database selection problem. Experiments on a real Chinese deep Web data set show that the proposed method effectively selects the databases relevant to a keyword query and provides high-quality results.
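The abstract does not give the details of the random-walk algorithm, so the following is only a minimal sketch of the general idea, assuming a random walk with restart over a bipartite keyword-attribute graph built from query-log co-occurrences. The function names, the restart probability, and the toy log are all hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def build_graph(log):
    """Build an undirected, weighted keyword-attribute graph.

    `log` is an iterable of (keyword, attribute) pairs, a hypothetical
    stand-in for co-occurrences mined from a query log.
    """
    adj = defaultdict(lambda: defaultdict(float))
    for keyword, attribute in log:
        adj[("kw", keyword)][("attr", attribute)] += 1.0
        adj[("attr", attribute)][("kw", keyword)] += 1.0
    return adj


def random_walk_with_restart(adj, start, restart=0.15, iters=50):
    """Return approximate visiting probabilities for a walk restarted at `start`."""
    nodes = list(adj)
    score = {n: 0.0 for n in nodes}
    score[start] = 1.0
    for _ in range(iters):
        nxt = {n: 0.0 for n in nodes}
        for node, mass in score.items():
            total = sum(adj[node].values())
            # Spread the non-restarting mass over neighbours by edge weight.
            for neighbour, weight in adj[node].items():
                nxt[neighbour] += (1.0 - restart) * mass * weight / total
        # The restarting mass always returns to the start node.
        nxt[start] += restart
        score = nxt
    return score


# Toy log: "toyota" never co-occurs with "price", but both are linked
# through "brand" and "honda", so the walk assigns "price" a nonzero score.
toy_log = [
    ("toyota", "brand"),
    ("honda", "brand"),
    ("honda", "price"),
    ("tolstoy", "author"),
]
scores = random_walk_with_restart(build_graph(toy_log), ("kw", "toyota"))
for node, value in sorted(scores.items(), key=lambda item: -item[1]):
    print(node, round(value, 4))
```

In this toy graph the attribute `price` receives a positive score for the keyword `toyota` even though the pair never appears in the log, which is the kind of latent association the abstract refers to, while the disconnected attribute `author` receives none.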
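[Affiliation]: Department of Computer Science and Technology, Tsinghua University
[Funding]: Supported by the Key Program of the National Natural Science Foundation of China, "Basic Methods and Key Technologies in the Construction and Application of Infrastructure Supporting Chinese Web Research" (60833003)
[CLC number]: TP311.13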


Article ID: 2152227


Article link: http://sikaile.net/kejilunwen/sousuoyinqinglunwen/2152227.html
