A Deep Web Data Source Description Method Based on Domain Features and User Query Sampling

Published: 2018-07-20 19:58
[Abstract]: [Purpose/Significance] Data source description (also called data source summarization) is one of the key problems in Deep Web integrated retrieval: the quality of the description directly affects the retrieval efficiency and effectiveness of an integrated retrieval system. This paper proposes a data source description method based on domain features and user query sampling, intended as a reference for resource integration applications and research in non-cooperative environments. [Method/Process] The method is an offline sampling method for heterogeneous, non-cooperative data sources. By analyzing the data sources and the domain topic attributes used for querying, it builds in turn a domain feature word set, an initial feature word set, and a high-frequency feature word set, and finally obtains the data source description by query sampling with the high-frequency feature words. Drawing on the widely used CORI algorithm, the paper analyzes in depth how the relevance between a user query and a data source description is computed with an inference network, and on that basis designs an integrated retrieval system built on the Lemur toolkit, which verifies the effectiveness of the method. [Results/Conclusions] The proposed method performs well in both recall and precision. Compared with other methods, it has clear cost advantages and practical value for the automatic updating of sample data and for operations and maintenance management.
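The relevance computation mentioned in the abstract can be illustrated with the commonly published form of the CORI belief formula. The sketch below is not the paper's implementation and does not use the Lemur toolkit; the function names, the toy data source descriptions, and the choice to approximate a source's size by the sum of its sampled term frequencies are all assumptions made for illustration. The constants 50, 150 and the default belief 0.4 are the values usually quoted for CORI.

```python
import math
from collections import Counter


def cori_belief(df, cw, avg_cw, cf, num_sources, b=0.4):
    """Belief that one query term is satisfied by one data source.

    df: sampled documents in this source containing the term
    cw: size of this source's description (assumed: sum of term frequencies)
    avg_cw: mean description size over all sources
    cf: number of sources whose description contains the term
    b: default belief, conventionally 0.4
    """
    if df == 0 or cf == 0 or avg_cw == 0:
        return b  # the term gives no evidence for this source
    t = df / (df + 50 + 150 * cw / avg_cw)
    i = math.log((num_sources + 0.5) / cf) / math.log(num_sources + 1.0)
    return b + (1 - b) * t * i


def rank_sources(query_terms, descriptions):
    """Rank sources by the mean CORI belief over the query terms.

    descriptions: dict mapping a (hypothetical) source name to a Counter
    of term -> document frequency taken from that source's sample.
    """
    if not query_terms or not descriptions:
        return []
    num_sources = len(descriptions)
    cw = {name: sum(counts.values()) for name, counts in descriptions.items()}
    avg_cw = sum(cw.values()) / num_sources
    scores = {}
    for name, counts in descriptions.items():
        beliefs = []
        for term in query_terms:
            cf = sum(1 for c in descriptions.values() if c.get(term, 0) > 0)
            beliefs.append(
                cori_belief(counts.get(term, 0), cw[name], avg_cw, cf, num_sources)
            )
        scores[name] = sum(beliefs) / len(beliefs)
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Toy descriptions, as might be obtained by query sampling (made-up numbers).
toy_descriptions = {
    "source_a": Counter({"deep": 10, "web": 5}),
    "source_b": Counter({"music": 8}),
}
ranking = rank_sources(["deep"], toy_descriptions)
```

A source whose sampled description never contains a query term falls back to the default belief, so sources with matching high-frequency feature words rank above it.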
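[Affiliation]: National Science Library, Chinese Academy of Sciences; University of Chinese Academy of Sciences
[Funding]: One of the research outputs of the National Social Science Fund of China project "Research on Deep Integration and Disclosure of Resources Based on Open Access Academic Journals" (Project No. 16BTQ025)
[Classification number]: TP391.3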



Article link: http://sikaile.net/kejilunwen/ruanjiangongchenglunwen/2134606.html

