Research on the Redundancy Mechanism for Massive Remote Sensing Image Storage Based on HDFS
Published: 2018-08-01 11:46
[Abstract]: Massive remote sensing image data is almost always stored in distributed storage systems. In high-resolution data storage systems in particular, some form of data redundancy is required to ensure the security, completeness, and high availability of the data.

Traditional distributed file storage systems currently use three data redundancy techniques: full replication, disk arrays, and erasure coding. Full replication and disk arrays both increase the system's storage space requirements as they improve redundancy. Erasure coding avoids that excessive storage consumption, but it increases the system's I/O load. To address the shortcomings of these three approaches, this thesis combines full replication with erasure coding. Building on the open-source HDFS (Hadoop Distributed File System), it replaces the original HDFS redundancy mechanism with the improved one to resolve the conflict between storage space and I/O load, so that the system improves redundancy while maintaining I/O speed and greatly reducing its storage space requirements. The thesis focuses on data redundancy mechanisms suited to high-resolution remote sensing imagery and proposes an improved redundancy strategy. The main work and contributions are as follows.

1. Building on a study of storage management techniques and data redundancy mechanisms for massive remote sensing image data, the thesis examines the HDFS distributed file system and its redundancy mechanism, and analyzes in detail the replication and erasure coding techniques suited to storing massive remote sensing imagery.
2. Based on the replication and erasure coding redundancy mechanisms, the thesis proposes an improved "replication + coding" redundancy strategy for HDFS, and gives the file read and write workflows together with a scheme for managing the coded blocks that encoding produces in the system.
3. Experiments on the improved HDFS system verify the feasibility of the proposed scheme. The results show that the system greatly reduces its storage space requirements while maintaining I/O speed. The improved HDFS system has been successfully applied to the massive remote sensing image data storage system of the Gaofen (high-resolution) major special project (ERSI-DBMS).
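The storage trade-off described in the abstract can be made concrete with a little arithmetic. The sketch below is only an illustration, not the thesis's scheme: the function names, the replication factor of 3, and the (k = 6 data, m = 3 parity) code parameters are assumptions chosen for the example, and the "hybrid" layout (a number of full copies plus parity blocks) is one hypothetical way to combine replication with coding.

```python
from fractions import Fraction


def replication_overhead(replicas: int) -> Fraction:
    """Raw blocks stored per block of user data under full replication."""
    return Fraction(replicas)


def erasure_overhead(k: int, m: int) -> Fraction:
    """Raw blocks per user block for a systematic code with k data and m parity blocks."""
    return Fraction(k + m, k)


def hybrid_overhead(replicas: int, k: int, m: int) -> Fraction:
    """Hypothetical hybrid layout: `replicas` full copies of every data block,
    plus m parity blocks for each group of k data blocks."""
    return Fraction(replicas * k + m, k)


if __name__ == "__main__":
    # Illustrative parameters only; none of them come from the thesis.
    schemes = {
        "3x replication": replication_overhead(3),
        "erasure code (6 data + 3 parity)": erasure_overhead(6, 3),
        "hybrid (2 copies + 3 parity per 6 blocks)": hybrid_overhead(2, 6, 3),
    }
    for name, overhead in schemes.items():
        print(f"{name}: {float(overhead):.2f}x raw storage per byte of user data")
```

Under these assumed parameters, the hybrid layout sits between pure replication and pure erasure coding in storage cost. The code does not model I/O load, which is the other side of the trade-off the abstract describes.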
[Degree-granting institution]: Henan University
[Degree level]: Master's
[Year degree conferred]: 2013
[Classification number]: TP333; TP751
Article ID: 2157476
Article link: http://sikaile.net/kejilunwen/jisuanjikexuelunwen/2157476.html