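Research on Key Technologies for Data Maintenance in Erasure-Coded Storage
Published: 2018-05-08 14:39
Topic: distributed storage + erasure codes; Source: National University of Defense Technology (國防科學技術大學), master's thesis, 2013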
[Abstract]: With the rise of cloud storage, big data, and related technologies, data survivability is receiving growing attention, and data fault tolerance is the main means of ensuring it. Current fault-tolerance techniques rely on redundancy, chiefly replication and erasure coding. Erasure coding is far more storage-efficient than replication, but it incurs high data maintenance overhead. This thesis studies the data maintenance overhead of erasure-coded schemes and identifies two major problems: (1) the communication overhead of maintenance between nodes is too high; (2) the computational overhead of data recovery is too high. We propose methods to address both. First, from the perspective of data reads, the thesis proposes a passive detection and repair algorithm. Its main idea is to use the system's normal read bandwidth to check data; if the data needs repair, the data decoded during the normal read is cached locally. Data checked this way does not need to be checked, downloaded, and decoded again when it is repaired, which reduces communication and repair overhead. To the best of our knowledge, no similar method currently exists. Second, from the perspective of system reliability, the thesis proposes an adaptive detection algorithm that reduces data repair and communication overhead. Its main idea is to adjust the data maintenance frequency according to the system's reliability: a more reliable system is maintained less frequently. Compared with existing methods, the main difference is that the maintenance frequency changes dynamically with system reliability, which lowers the maintenance frequency while preserving data availability. Third, from the perspective of data classification, the thesis proposes a tolerance selection algorithm. Its main idea is to apply different maintenance frequencies to different data according to their access patterns (in real storage systems, data access patterns differ considerably [64]), lowering the maintenance frequency of infrequently accessed data and thereby further reducing communication and repair overhead. To the best of our knowledge, no erasure-coded storage system currently takes data classification into account during data maintenance, and there is no research on this topic. Finally, we implement a prototype system on top of an open-source distributed file system that uses erasure-code redundancy, and propose a layered implementation approach that greatly shortens the prototype's development time. The algorithms and the prototype are validated through simulation and real-world tests, and the results show that the proposed strategies are effective.
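The passive detection and repair idea in the abstract can be illustrated with a minimal sketch. This is not the thesis's implementation: the toy single-parity XOR code stands in for a real erasure code, and the names `Stripe`, `PassiveRepairClient`, `read`, and `repair` are hypothetical. It only shows the caching step: a normal read that detects a lost block keeps the decoded data locally, so the later repair needs no further downloads or decoding.

```python
from functools import reduce


def xor_blocks(blocks):
    """Bytewise XOR of equal-length byte blocks."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))


def encode(data_blocks):
    """Toy (k+1, k) code (an assumption): k data blocks plus one XOR parity block."""
    return list(data_blocks) + [xor_blocks(data_blocks)]


class Stripe:
    """One coded object; block i lives on hypothetical node i, None means lost."""

    def __init__(self, data_blocks):
        self.k = len(data_blocks)
        self.blocks = encode(data_blocks)
        self.downloads = 0  # counts blocks transferred over the network

    def fetch(self, i):
        if self.blocks[i] is None:
            return None
        self.downloads += 1
        return self.blocks[i]


class PassiveRepairClient:
    def __init__(self):
        self.cache = {}  # stripe id -> decoded data blocks awaiting repair

    def read(self, sid, stripe):
        """Normal read that doubles as a health check for the data blocks."""
        data = [stripe.fetch(i) for i in range(stripe.k)]
        missing = [i for i, block in enumerate(data) if block is None]
        if len(missing) > 1:
            raise IOError("single parity cannot recover more than one block")
        if missing:
            parity = stripe.fetch(stripe.k)
            if parity is None:
                raise IOError("data block and parity block both lost")
            survivors = [block for block in data if block is not None]
            data[missing[0]] = xor_blocks(survivors + [parity])
            # Repair is needed: keep the decoded data so repair can reuse it.
            self.cache[sid] = list(data)
        return b"".join(data)

    def repair(self, sid, stripe):
        """Rebuild lost blocks from the cached decode; returns False if no cache."""
        data = self.cache.pop(sid, None)
        if data is None:
            return False
        fresh = encode(data)
        for i, block in enumerate(stripe.blocks):
            if block is None:
                stripe.blocks[i] = fresh[i]
        return True


if __name__ == "__main__":
    stripe = Stripe([b"abcd", b"efgh", b"ijkl"])
    stripe.blocks[1] = None  # simulate a failed node
    client = PassiveRepairClient()
    print(client.read("s1", stripe))
    before = stripe.downloads
    client.repair("s1", stripe)
    print("extra downloads during repair:", stripe.downloads - before)
```

In this toy version a read touches only the data blocks, so a lost parity block goes unnoticed until a data block is also missing; a real system would need some other check for blocks that normal reads do not touch.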
[Degree-granting institution]: National University of Defense Technology
[Degree level]: Master's
[Year degree granted]: 2013
[Classification number]: TP333; TP309