Research and Application of RS Erasure Codes on Distributed Storage Systems

Published: 2018-11-11 10:32
[Abstract]: With the spread of computers and smart devices and the rapid development of Internet technology, data volumes are growing explosively and posing new challenges for storage systems. Keeping stored data safe has become an urgent problem for current storage systems. Distributed storage is now an effective way to handle large volumes of data, and such systems protect data in two ways: multi-replica technology and erasure coding. Replication is simple and easy to implement: the data is copied several times and the copies are stored separately, with three replicas being the most common policy. To obtain stronger protection, however, replication can only add more replicas, so the storage cost grows by multiples. To address this high storage cost, erasure codes, originally used in communication systems to cope with data loss during transmission, were introduced into storage systems. Erasure coding greatly reduces storage cost while providing data protection equal to or better than replication. It also introduces a new problem: recovering lost data consumes considerably more system resources and I/O operations. To address this problem, this thesis starts from RS (Reed-Solomon) erasure codes, analyzes their coding equations and characteristics, and, drawing on the advantages of array codes and LDPC codes, proposes an improved code based on RS erasure codes, called LRC.
The thesis defines LRC and analyzes its fault tolerance, builds a Markov model to analyze its reliability, and compares the construction matrices of its coding equations and the effect of varying its coding parameters. To apply the LRC coding idea, it takes the open-source distributed storage system HDFS as a platform and, after analyzing the HDFS system architecture and data placement policy, proposes a design for implementing LRC on HDFS covering three aspects: the data placement policy, the data reconstruction process, and the communication verification mechanism. Finally, three groups of comparative experiments show that, at similar storage cost, LRC cuts decoding time by nearly half compared with RS coding; that changing the coding parameters does not significantly change LRC's encoding and decoding performance, and that LRC offers more parameter combinations to choose from; and that coding equations based on the Cauchy matrix and on the Vandermonde matrix perform similarly.
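The trade-off described in the abstract (storage cost versus the I/O needed to repair a lost block) can be made concrete with a little arithmetic. The sketch below is illustrative only: the function names and the parameter choices (three replicas, RS with 6 data and 3 parity blocks, and a layout with 6 data blocks in 2 local groups plus 2 global parities) are assumptions made here, not values from the thesis. In particular, `local_parity_code` models a generic local-parity layout and may not match the thesis's own LRC construction.

```python
from fractions import Fraction


def replication(copies):
    """n-way replication: every block is stored `copies` times.

    Repairing a lost copy reads one surviving copy.
    """
    return {"overhead": Fraction(copies), "repair_reads": 1}


def reed_solomon(k, m):
    """RS code with k data blocks and m parity blocks.

    Any k of the k + m blocks suffice to rebuild the data, so a
    single lost block is repaired by reading k surviving blocks.
    """
    return {"overhead": Fraction(k + m, k), "repair_reads": k}


def local_parity_code(k, groups, global_parities):
    """Hypothetical local-parity layout (an assumption, not the thesis's code).

    The k data blocks are split into `groups` equal groups, each with
    one local parity; `global_parities` extra parities cover all data.
    A single lost data block is repaired from the other blocks of its
    group, i.e. (k / groups - 1) data blocks plus the local parity.
    """
    if k % groups != 0:
        raise ValueError("k must be divisible by the number of groups")
    group_size = k // groups
    total_blocks = k + groups + global_parities
    return {"overhead": Fraction(total_blocks, k), "repair_reads": group_size}


if __name__ == "__main__":
    schemes = {
        "3-way replication": replication(3),
        "RS(6 data, 3 parity)": reed_solomon(6, 3),
        "local parity (6 data, 2 groups, 2 global)": local_parity_code(6, 2, 2),
    }
    for name, s in schemes.items():
        print(f"{name}: overhead {float(s['overhead']):.2f}x, "
              f"single-block repair reads {s['repair_reads']} block(s)")
```

Under these assumed parameters, the local-parity layout stores slightly more than RS but reads half as many blocks to repair one lost data block. This only illustrates the direction of the trade-off; it does not reproduce the thesis's measurements.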
[Degree-granting institution]: Chengdu University of Technology
[Degree]: Master's
[Year conferred]: 2017
[Classification number]: TP333



Article link: http://sikaile.net/kejilunwen/jisuanjikexuelunwen/2324572.html

