Research on Large-Scale Semantic Data Storage and Query Technology
[Abstract]: The Semantic Web is now widely used in fields such as medicine, biology, and geographic information services. With the arrival of the big data era and the continuing growth of application systems, however, semantic data is growing at an astonishing rate. Traditional semantic data storage and management systems built on relational databases can no longer store and manage such large, fast-growing volumes of semantic data effectively, and traditional serial semantic query techniques adapt poorly to query processing at this scale. Against this background, using parallel computing to solve large-scale semantic data storage and query has become an active research topic in both academia and industry. Parallel computing techniques are closely tied to the application problem, however, and application problems differ in complexity and kind, so processing large-scale semantic data poses considerable technical challenges and calls for in-depth study of storage, query, and related aspects. To address these problems, this thesis starts from an analysis of the Resource Description Framework (RDF) and the RDF query language SPARQL (SPARQL Protocol and RDF Query Language), adopts the industry-standard OpenRDF Sesame semantic data processing framework, and proposes a large-scale distributed semantic data storage and query technique based on HBase and Redis. The technique uses a hybrid index to build a hierarchical storage architecture that improves semantic data query performance. On this basis, the thesis analyzes the processing flow of the SPARQL query engine and optimizes join queries in the query model by constructing a cost model.
The thesis also uses intermediate query result sets to optimize the query execution strategy, keeping semantic data queries efficient. To improve the reliability and availability of the query engine, it studies fault-tolerant and scalable designs for large-scale semantic data storage management and for the query engine. Finally, building on the storage architecture and the query optimization scheme, a prototype system for large-scale semantic data storage and query is designed and implemented. Experimental results show that the proposed approach to large-scale semantic data storage and query is effective. The research falls into two main parts. The first part surveys existing semantic data storage techniques, designs a large-scale semantic data storage model, proposes a hybrid-index storage method and a hierarchical storage architecture based on that model, and proposes fault-tolerance and scalability solutions for the storage architecture. The second part analyzes the execution flow of the semantic data query engine. For query model optimization, it proposes a join optimization algorithm based on selectivity estimation; for query strategy optimization, it proposes an adaptive batch query scheme.
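The abstract does not give the details of the join optimization algorithm, so the following is only a minimal sketch of the general idea of ordering the triple patterns of a SPARQL basic graph pattern by estimated selectivity. It is not the thesis's algorithm. The names `TriplePattern`, `estimate_cardinality`, and `order_joins`, the per-predicate statistics, and the fixed reduction factor for bound terms are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TriplePattern:
    """One SPARQL triple pattern; terms starting with '?' are variables."""
    subject: str
    predicate: str
    obj: str

    def variables(self):
        return {t for t in (self.subject, self.predicate, self.obj)
                if t.startswith("?")}


# Hypothetical factor: each bound subject or object is assumed to keep
# 1% of the matching triples. A real system would use collected statistics.
BOUND_TERM_FACTOR = 0.01


def estimate_cardinality(pattern, predicate_counts, total_triples):
    """Crude estimate of how many triples a pattern matches."""
    if pattern.predicate.startswith("?"):
        estimate = float(total_triples)
    else:
        estimate = float(predicate_counts.get(pattern.predicate, total_triples))
    for term in (pattern.subject, pattern.obj):
        if not term.startswith("?"):
            estimate *= BOUND_TERM_FACTOR
    return max(estimate, 1.0)


def order_joins(patterns, predicate_counts, total_triples):
    """Greedy ordering: start with the most selective pattern, then keep
    choosing the most selective pattern that shares a variable with the
    patterns already chosen, so that Cartesian products are avoided."""
    remaining = list(patterns)
    ordered, bound_vars = [], set()
    while remaining:
        connected = [p for p in remaining if p.variables() & bound_vars]
        candidates = connected or remaining
        best = min(candidates, key=lambda p: estimate_cardinality(
            p, predicate_counts, total_triples))
        ordered.append(best)
        bound_vars |= best.variables()
        remaining.remove(best)
    return ordered


# Made-up statistics and query, for illustration only.
predicate_counts = {"rdf:type": 1_000_000, "ex:advisor": 50_000,
                    "ex:name": 200_000}
query = [
    TriplePattern("?person", "rdf:type", "ex:Student"),
    TriplePattern("?person", "ex:advisor", "?prof"),
    TriplePattern("?prof", "ex:name", '"Alice"'),
]
plan = order_joins(query, predicate_counts, total_triples=5_000_000)
for step in plan:
    print(step)
```

Evaluating the most selective pattern first keeps intermediate result sets small, which is the usual motivation for selectivity-based join ordering. In a distributed setting, smaller intermediate results also reduce the data that has to be moved between nodes.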
[Degree-granting institution]: Nanjing University
[Degree level]: Master's
[Year degree conferred]: 2014
[Classification number]: TP311.13; TP333