Design and Implementation of an SSD-Oriented Data Deduplication Mechanism
Published: 2018-08-13 18:04
[Abstract]: The big data era has brought more disk read and write operations. Solid state disks (SSDs) offer better read/write performance than traditional hard disk drives (HDDs); in particular, thanks to their superior random read/write characteristics, SSDs can effectively improve the storage performance of computer systems. However, SSDs suffer from performance problems such as write amplification, and their limited number of erase cycles also limits their service life. How to reduce the number of writes to an SSD while preserving its excellent read/write performance has become a pressing problem.
Data deduplication reduces the storage of redundant data on disk. Based on an analysis of SSD structure and read/write characteristics, and in light of the current state of SSD research, an SSD-oriented deduplication mechanism is proposed to address these problems. Specifically: a write cache is introduced, which effectively addresses write amplification and the SSD lifetime problem; metadata and data are distinguished and stored separately in the SSD, with metadata written directly to the metadata storage area and application data written to the data storage area after deduplication, and metadata is given a higher residency priority in the write cache; a physical address allocation strategy based on bitmap lookup allocates physical addresses sequentially, balancing write operations across the SSD's internal pages; and in address translation, a virtual block address (VBA) is added to reduce operations on the address translation table during data migration.
A prototype SSD-oriented deduplication system is designed and implemented. The system is composed of functional modules and is highly extensible. The modules implement the mechanisms above (write cache, metadata/data differentiation, bitmap-based physical address allocation, virtual block address, and so on). Functional and performance tests were carried out on the prototype system. The results show that the duplicate-data deletion efficiency reaches 95%, and that performance improves by more than 60% after the write cache and metadata differentiation are introduced.
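The abstract gives no implementation details, so the following is only a minimal sketch of how deduplication and bitmap-based sequential address allocation could fit together. It is not the thesis's design: the class names `BitmapAllocator` and `DedupStore`, the use of SHA-1 fingerprints, and the wrap-around cursor are all assumptions made for illustration, and the write cache, metadata area, and VBA layer are left out.

```python
import hashlib


class BitmapAllocator:
    """Hypothetical allocator: hands out physical pages in sequential, wrap-around order."""

    def __init__(self, num_pages):
        self.used = [False] * num_pages  # one flag per physical page (the "bitmap")
        self.cursor = 0                  # position where the next search starts

    def allocate(self):
        n = len(self.used)
        for step in range(n):
            page = (self.cursor + step) % n
            if not self.used[page]:
                self.used[page] = True
                self.cursor = (page + 1) % n  # keep moving forward to spread writes
                return page
        raise RuntimeError("no free physical page")

    def free(self, page):
        self.used[page] = False


class DedupStore:
    """Hypothetical store: identical blocks share one physical page."""

    def __init__(self, num_pages):
        self.alloc = BitmapAllocator(num_pages)
        self.lba_to_page = {}     # logical block address -> physical page
        self.fp_to_page = {}      # fingerprint -> physical page
        self.page_fp = {}         # physical page -> fingerprint
        self.refcount = {}        # physical page -> number of LBAs pointing at it
        self.pages = {}           # physical page -> data (stand-in for flash)
        self.physical_writes = 0  # pages actually written

    def write(self, lba, data):
        fp = hashlib.sha1(data).digest()  # assumed fingerprint function
        page = self.fp_to_page.get(fp)
        if page is None:                  # new content: one physical write
            page = self.alloc.allocate()
            self.pages[page] = data
            self.fp_to_page[fp] = page
            self.page_fp[page] = fp
            self.refcount[page] = 0
            self.physical_writes += 1
        old = self.lba_to_page.get(lba)
        if old == page:                   # rewriting identical data: nothing to do
            return
        self.refcount[page] += 1
        self.lba_to_page[lba] = page
        if old is not None:
            self._release(old)

    def _release(self, page):
        self.refcount[page] -= 1
        if self.refcount[page] == 0:      # last reference gone: reclaim the page
            del self.refcount[page]
            del self.fp_to_page[self.page_fp.pop(page)]
            del self.pages[page]
            self.alloc.free(page)

    def read(self, lba):
        return self.pages[self.lba_to_page[lba]]
```

In this sketch, writing the same block to several logical addresses increments a reference count instead of consuming another page, and a freed page is not reused until the cursor wraps around to it, which is one simple way to keep writes from concentrating on low-numbered pages.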
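[Degree-granting institution]: Huazhong University of Science and Technology
[Degree level]: Master's
[Year degree granted]: 2013
[Classification number]: TP333
Article ID: 2181767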
Article link: http://sikaile.net/kejilunwen/jisuanjikexuelunwen/2181767.html