Research and Design of Data Deduplication Technology in Virtual Tape Libraries
Topic: virtual tape library + data deduplication; Source: Southwest Jiaotong University, master's thesis, 2013
[Abstract]: As global informatization advances, society is becoming an information society. Government agencies and every industry depend more and more on information resources, information technology, and the information industry, and demand for storage space is growing rapidly. Backups often store large amounts of identical data and files, and these duplicates occupy a great deal of expensive disk space. The VTL (Virtual Tape Library) is widely used for data storage in government agencies and across industries because of its high backup performance, low failure rate, and high reliability. It is therefore imperative to study techniques that can remove duplicate data from a VTL.
This thesis first analyzes the state of virtual tape library and data deduplication technology in China and abroad, identifies the problems and shortcomings of existing deduplication techniques, and so establishes its starting point. It then studies the basic principles of data deduplication and implements a block-level deduplication system through four steps: detecting chunk boundaries in file data, computing chunk hash values, looking up chunk hash values, and storing hash values. To compensate for the "hash collision" problem of the MD5 hash algorithm in deduplication, separate chaining (the "zipper method") is used to optimize the hashing, which improves data safety. To make duplicate-chunk detection more efficient, the content-based data detection algorithm is improved. To speed up hash table lookups, a Bloom filter is used to optimize the hash table.
Finally, the system is tested and analyzed in an environment consisting of a virtual tape library and backup software. The results show that the improved CDC data detection algorithm achieves a higher deduplication rate than the FSP and SB algorithms, and that the deduplication system achieves a higher data reduction rate than general-purpose data compression software.
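The four steps named in the abstract (chunk detection, chunk hashing, hash lookup, hash storage) can be sketched as follows. This is a minimal illustration under stated assumptions, not the thesis's implementation: the Adler-32 window hash, the chunk-size parameters, the Bloom filter sizing, and every name (`cdc_chunks`, `BloomFilter`, `DedupStore`, `backup`, `restore`) are hypothetical choices made for the example. Each MD5 digest maps to a chain of full chunk contents, so two different chunks that share a digest are told apart by a byte comparison.

```python
import hashlib
import zlib

# All parameters below are assumed values chosen for illustration.
WINDOW = 16                      # bytes hashed at each candidate boundary
MASK = 0x3F                      # cut when the low 6 bits of the window hash are zero
MIN_CHUNK, MAX_CHUNK = 32, 256   # bounds on chunk size in bytes


def cdc_chunks(data: bytes) -> list:
    """Split data at content-defined boundaries (simple windowed hash)."""
    chunks, start = [], 0
    for pos in range(1, len(data)):
        size = pos - start
        window = data[max(0, pos - WINDOW):pos]
        at_boundary = (zlib.adler32(window) & MASK) == 0
        if size >= MAX_CHUNK or (size >= MIN_CHUNK and at_boundary):
            chunks.append(data[start:pos])
            start = pos
    if start < len(data):
        chunks.append(data[start:])
    return chunks


class BloomFilter:
    """Bit array with several hash positions per key; no false negatives."""

    def __init__(self, bits: int = 1 << 16, hashes: int = 4):
        self.bits, self.hashes = bits, hashes
        self.array = bytearray(bits // 8)

    def _positions(self, key: bytes):
        for seed in range(self.hashes):
            digest = hashlib.sha256(bytes([seed]) + key).digest()
            yield int.from_bytes(digest[:8], "big") % self.bits

    def add(self, key: bytes) -> None:
        for p in self._positions(key):
            self.array[p // 8] |= 1 << (p % 8)

    def might_contain(self, key: bytes) -> bool:
        return all(self.array[p // 8] & (1 << (p % 8))
                   for p in self._positions(key))


class DedupStore:
    """Chunk index keyed by MD5 digest, with a chain per digest for collisions."""

    def __init__(self):
        self.bloom = BloomFilter()
        self.index = {}          # digest -> list of distinct chunks (the chain)
        self.stored_bytes = 0

    def put(self, chunk: bytes):
        """Store the chunk if it is new; return (digest, slot) as its reference."""
        digest = hashlib.md5(chunk).digest()
        # The Bloom filter lets definitely-new digests skip the index lookup.
        if self.bloom.might_contain(digest):
            for slot, existing in enumerate(self.index.get(digest, [])):
                if existing == chunk:        # byte comparison guards against collisions
                    return digest, slot
        self.bloom.add(digest)
        chain = self.index.setdefault(digest, [])
        chain.append(chunk)
        self.stored_bytes += len(chunk)
        return digest, len(chain) - 1


def backup(store: DedupStore, data: bytes) -> list:
    """Return the list of chunk references (the recipe) for one backup."""
    return [store.put(chunk) for chunk in cdc_chunks(data)]


def restore(store: DedupStore, recipe: list) -> bytes:
    """Rebuild the original data from a recipe."""
    return b"".join(store.index[digest][slot] for digest, slot in recipe)
```

Backing up the same data twice with `backup` should leave `stored_bytes` unchanged after the second pass, because every chunk is already in the index. A production system would keep chunk contents on disk rather than in the in-memory chains used here.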
[Degree-granting institution]: Southwest Jiaotong University
[Degree level]: Master's
[Year degree conferred]: 2013
[Classification number]: TP333.35