Design and Implementation of a Large-Scale Remote Sensing Data Storage System Based on MongoDB and HDFS
Published: 2018-03-04 14:20
Topic: remote sensing data; Focus: metadata; Source: Zhejiang University, 2013 master's thesis; Type: degree thesis
[Abstract]: Remote sensing data is multi-standard, multi-type, multi-scale, multi-level, massive, and stored in a distributed fashion. As remote sensing and information processing technologies advance, new types and levels of remote sensing data keep emerging, and demand for remote sensing observation data is growing across all sectors of society. Government departments and research institutions in China have each built remote sensing image databases for their own industries and resource types. These databases are mutually heterogeneous, and together they form a distributed, heterogeneous, cross-department, cross-region group of remote sensing databases with diverse resource types, which greatly restricts the sharing and application of remote sensing data between departments. To meet the need for managing and sharing remote sensing data, we implemented a large-scale remote sensing data storage system based on MongoDB and HDFS. This thesis describes the detailed design and implementation of the system and focuses on two key technologies: integration of heterogeneous remote sensing metadata and efficient storage of massive remote sensing data.
Remote sensing metadata is multi-source, heterogeneous, and massive. To handle these characteristics, this thesis proposes a mapping-template-based method for integrating heterogeneous remote sensing metadata, which converts heterogeneous metadata into a unified format and stores it efficiently. The method also supports dynamic extension of metadata, so it can parse newly emerging types and formats of remote sensing metadata. This addresses the poor extensibility and compatibility of earlier metadata management systems, which hindered data sharing.
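The abstract does not describe the template format, so the following is only a minimal sketch of how a mapping-template approach might work, under stated assumptions: each source type has a template that maps unified field names to dotted paths in the source record, and new source types are supported by registering another template rather than changing code. The template names, field names, and helper functions here are all hypothetical and are not taken from the thesis.

```python
from typing import Any, Dict, Optional

# Hypothetical mapping templates: unified field name -> dotted path in the
# source metadata record. Neither the source types nor the fields come from
# the thesis; they only illustrate the idea.
TEMPLATES: Dict[str, Dict[str, str]] = {
    "source_a": {
        "scene_id": "PRODUCT.SCENE_ID",
        "acquired": "PRODUCT.DATE_ACQUIRED",
        "sensor": "IMAGE.SENSOR_ID",
    },
    "source_b": {
        "scene_id": "sceneId",
        "acquired": "imagingTime",
        "sensor": "sensorName",
    },
}


def get_path(record: Dict[str, Any], path: str) -> Optional[Any]:
    """Follow a dotted path through nested dicts; return None if it is missing."""
    node: Any = record
    for key in path.split("."):
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    return node


def register_template(source_type: str, template: Dict[str, str]) -> None:
    """Add or replace a template so a new metadata format can be parsed."""
    TEMPLATES[source_type] = dict(template)


def normalize(record: Dict[str, Any], source_type: str) -> Dict[str, Any]:
    """Convert one heterogeneous record into a unified, document-style dict."""
    if source_type not in TEMPLATES:
        raise ValueError(f"no mapping template registered for {source_type!r}")
    doc: Dict[str, Any] = {"source_type": source_type}
    for unified_name, path in TEMPLATES[source_type].items():
        doc[unified_name] = get_path(record, path)
    doc["raw"] = record  # keep the original record so no information is lost
    return doc


record_a = {
    "PRODUCT": {"SCENE_ID": "A-001", "DATE_ACQUIRED": "2013-01-01"},
    "IMAGE": {"SENSOR_ID": "TM"},
}
record_b = {"sceneId": "B-042", "imagingTime": "2013-02-03", "sensorName": "CCD1"}

unified_a = normalize(record_a, "source_a")
unified_b = normalize(record_b, "source_b")
print(unified_a["scene_id"], unified_b["scene_id"])
```

Because the result is a plain dictionary with a common set of top-level fields, it could be stored as one document in a schema-flexible store such as MongoDB. Whether the thesis stores the original record alongside the unified fields is not stated.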
Remote sensing metadata is heterogeneous, read-only, and massive, and consists of small files. Remote sensing image data is also read-only and massive, but a single image file is typically a large file on the order of gigabytes, and most image data is cold data that is accessed infrequently. The storage layer of the system therefore stores remote sensing metadata and remote sensing image data separately and is optimized for the characteristics of each. Metadata is stored in a MongoDB-based architecture, which provides efficient data storage together with high reliability and high scalability. Image data is stored in an HDFS-based distributed file storage architecture. To improve storage resource utilization, the system optimizes the multi-replica storage strategy of HDFS and provides a hybrid storage strategy based on file access frequency, which raises storage utilization while maintaining data reliability and access speed.
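The abstract gives no details of the hybrid storage strategy, so the sketch below only illustrates the general idea of choosing a replica count from access frequency. The threshold, the replica counts, and the function names are assumptions made for illustration. The thesis may use a different mechanism, and how it preserves reliability for cold data is not described. The code builds `hdfs dfs -setrep` command lines but does not run them.

```python
from typing import Dict, List

# Assumed policy parameters, not taken from the thesis.
HOT_THRESHOLD = 10   # accesses in the observation window that make a file "hot"
HOT_REPLICAS = 3     # the usual HDFS default replication factor
COLD_REPLICAS = 2    # fewer copies for rarely read files to save space


def choose_replication(access_count: int) -> int:
    """Pick a replica count from how often a file was accessed."""
    return HOT_REPLICAS if access_count >= HOT_THRESHOLD else COLD_REPLICAS


def plan_setrep(access_counts: Dict[str, int]) -> List[List[str]]:
    """Build one `hdfs dfs -setrep` argument list per file, sorted by path."""
    return [
        ["hdfs", "dfs", "-setrep", str(choose_replication(count)), path]
        for path, count in sorted(access_counts.items())
    ]


# Hypothetical access statistics: path -> access count in the window.
stats = {
    "/rs/images/scene_0001.tif": 25,
    "/rs/images/scene_0002.tif": 0,
}

commands = plan_setrep(stats)
for argv in commands:
    print(" ".join(argv))
```

Each argument list could be passed to `subprocess.run` on a machine with an HDFS client. Keeping planning separate from execution makes the policy easy to test without a cluster.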
[Degree-granting institution]: Zhejiang University
[Degree level]: Master's
[Year degree granted]: 2013
[Classification number]: TP751; TP333
Article ID: 1565994
Article link: http://sikaile.net/kejilunwen/jisuanjikexuelunwen/1565994.html