Research and Implementation of an RDF Data Storage and Query Method Based on a Structured Index

Published: 2018-04-20 22:24

Topic: RDF + HBase; Source: Beijing Jiaotong University, master's thesis, 2013

[Abstract]: With the development of the Internet and the Internet of Things, the volume of data on the network has grown explosively, placing new demands on data sharing and processing; increasingly complex semantic relationships must be processed and applied under big-data conditions. Methods for storing and querying large-scale RDF data, and for effectively supporting data processing such as data mining, are of theoretical and practical significance for data processing in many industries, including computing, manufacturing, and railways. Motivated by the needs of railway sensor applications, this thesis proposes an HBase-based, structured-index-oriented method for storing and querying RDF data.

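First, to meet the storage requirements of large-scale data, a structure-based RDF indexing method is proposed. An index graph is constructed by analyzing how the nodes in the data graph are connected, and this index is used to partition the data so that data sharing the same structure is stored together, which lowers the cost of queries and speeds them up.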
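The abstract does not specify how the index graph is derived, so the following is only a minimal sketch under a stated assumption: that "same structure" can be approximated by the set of predicates on a subject's outgoing edges. The function name `build_structure_index` and the sample triples are hypothetical and not taken from the thesis.

```python
from collections import defaultdict


def build_structure_index(triples):
    """Partition triples by a structural signature of their subject.

    Assumption (not from the thesis): two subjects have the "same
    structure" when they have the same set of outgoing predicates.
    """
    # Collect the outgoing predicates of every subject.
    out_preds = defaultdict(set)
    for s, p, o in triples:
        out_preds[s].add(p)

    # Subjects with an identical predicate set share one index node.
    index_nodes = defaultdict(list)
    for s, preds in out_preds.items():
        index_nodes[frozenset(preds)].append(s)

    # Each triple is stored in the partition of its subject's signature.
    partitions = defaultdict(list)
    for s, p, o in triples:
        partitions[frozenset(out_preds[s])].append((s, p, o))
    return dict(index_nodes), dict(partitions)


# Hypothetical railway-sensor triples, for illustration only.
triples = [
    ("sensor1", "locatedAt", "trackA"),
    ("sensor1", "measures", "temperature"),
    ("sensor2", "locatedAt", "trackB"),
    ("sensor2", "measures", "vibration"),
    ("trackA", "partOf", "line1"),
]
index_nodes, partitions = build_structure_index(triples)
```

With a signature of this kind, a query that asks for `locatedAt` and `measures` on the same subject only needs to touch the partition whose signature contains both predicates.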

Second, a scheme for storing RDF data in HBase is proposed. The data is partitioned according to the structured index, and the HBase storage structure is built from "predicate-subject-object" triples. A row-key encoding method is also proposed to handle multi-valued properties in RDF data, which narrows the range that must be scanned for the target data and improves query efficiency.
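The actual row-key encoding is not given in the abstract. As a rough illustration of why a predicate-subject-object key order helps, the sketch below uses an in-memory sorted table as a stand-in for HBase (no HBase client is involved). The key layout, the separator, the digest suffix used to keep multi-valued entries apart, and all names (`make_row_key`, `SortedTable`, `store`, `objects_of`) are assumptions made for this example.

```python
import bisect
import hashlib

SEP = "\x00"  # separator; assumed never to occur inside an RDF term


def make_row_key(partition_id, predicate, subject, obj):
    # A short digest of the object is appended so that several objects
    # for the same (subject, predicate) pair get distinct row keys.
    digest = hashlib.sha256(obj.encode("utf-8")).hexdigest()[:8]
    return SEP.join([partition_id, predicate, subject, digest])


class SortedTable:
    """Tiny in-memory stand-in for a table kept sorted by row key."""

    def __init__(self):
        self._keys = []
        self._values = {}

    def put(self, key, value):
        if key not in self._values:
            bisect.insort(self._keys, key)
        self._values[key] = value

    def prefix_scan(self, prefix):
        # Rows sharing a prefix are contiguous in sorted order.
        start = bisect.bisect_left(self._keys, prefix)
        for key in self._keys[start:]:
            if not key.startswith(prefix):
                break
            yield self._values[key]


def store(table, partition_id, s, p, o):
    table.put(make_row_key(partition_id, p, s, o), (s, p, o))


def objects_of(table, partition_id, s, p):
    # The trailing separator stops "sensor1" from matching "sensor10".
    prefix = SEP.join([partition_id, p, s]) + SEP
    return sorted(o for _, _, o in table.prefix_scan(prefix))
```

Because the partition and predicate come first in the key, a lookup for one predicate within one structural partition becomes a contiguous range scan rather than a full-table scan.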

Third, an RDF query method based on the structured index and on SPARQL statement reordering is proposed. A relevance score is computed from the binding relationships among the unknown variables in the different statements of a query and from the cost of executing each statement, and the SPARQL statements are reordered accordingly. The reordered statements are then evaluated in two layers, a structured-index lookup followed by a physical query, which improves query efficiency.
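The relevance formula itself is not stated in the abstract. The sketch below shows one simple way such a reordering could work, using a greedy heuristic chosen for illustration: prefer triple patterns that share a variable with those already chosen, and among them pick the one with the fewest still-unbound variables. The cost estimate and the names `variables`, `estimated_cost`, and `reorder` are assumptions, not the thesis's method.

```python
def variables(pattern):
    """Return the SPARQL variables (terms starting with '?') in a pattern."""
    return {term for term in pattern if term.startswith("?")}


def estimated_cost(pattern, bound):
    # Assumed cost model: every variable that is not yet bound
    # makes the pattern more expensive to evaluate.
    return len(variables(pattern) - bound)


def reorder(patterns):
    """Greedily order triple patterns for evaluation."""
    remaining = list(patterns)
    bound = set()
    ordered = []
    while remaining:
        def rank(pattern):
            # Patterns connected to already-bound variables come first;
            # before anything is bound, every pattern counts as connected.
            connected = not bound or bool(variables(pattern) & bound)
            return (not connected, estimated_cost(pattern, bound))

        best = min(remaining, key=rank)
        remaining.remove(best)
        ordered.append(best)
        bound |= variables(best)
    return ordered
```

For instance, given the patterns `("?s", "?p", "?o")`, `("?s", "measures", "temperature")`, and `("?s", "locatedAt", "?track")`, this heuristic evaluates the most selective pattern (one unbound variable) first and leaves the fully unbound pattern for last.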

Finally, the overall query efficiency of the system is verified experimentally in the railway sensor application scenario; it achieves better query efficiency than Sesame, a classic RDF data storage and retrieval system. The thesis contains 29 figures, 20 tables, and 40 references.

[Degree-granting institution]: Beijing Jiaotong University
[Degree level]: Master's
[Year degree awarded]: 2013
[Classification number]: TP333


Article ID: 1779721


Article link: http://sikaile.net/kejilunwen/jisuanjikexuelunwen/1779721.html
