Research on Scaling Mechanisms for Fault-Tolerant Distributed Storage Systems
[Abstract]: Today's large-scale distributed storage systems rely on redundancy to maintain data availability. Redundancy is generated either by replication or by erasure coding. For the same fault tolerance, erasure coding requires far less storage overhead than replication, and it is used by a growing number of storage systems. At the same time, the rapid growth of data, together with users' rising demands on capacity and performance, often leaves existing storage systems short of storage capacity and bandwidth. When application requirements exceed system capabilities, storage resources must be added and part of the data migrated to the new storage devices to relieve the pressure. This process is known as storage system scaling (capacity expansion). Studying scaling mechanisms for erasure-coded distributed storage systems is therefore of great significance for cloud storage and for data storage in data centers. This thesis studies scaling mechanisms for distributed storage systems from three perspectives: system I/O overhead during scaling, user I/O requests during online scaling, and user access performance after scaling. The main research contents and contributions are as follows.

(1) Scaling of Cauchy Reed-Solomon (CRS) coded systems. As storage systems demand higher fault tolerance, CRS codes are becoming increasingly important. CRS coding is mainly used in distributed storage systems composed of many storage nodes connected over the Internet (e.g., CleverSafe, OceanStore). Scaling requires migrating part of the data to the new storage devices, and the parity must be updated at the same time.
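To make the parity-update cost concrete, the following is a minimal sketch, not the thesis's algorithm, of why an expanded Cauchy coding matrix can keep the old matrix as a submatrix so that existing parity is updated incrementally. It assumes the prime field GF(257) for readability (practical CRS codes work over GF(2^w) with XOR-based bit matrices) and a simplified case in which a stripe only gains new data blocks. The names `cauchy_matrix`, `encode`, and `expand_parity` are hypothetical.

```python
# Illustrative sketch only. Assumption: arithmetic in the prime field GF(257);
# real CRS implementations use GF(2^w) and bit matrices.
P = 257


def inv(a):
    """Multiplicative inverse modulo the prime P (Fermat's little theorem)."""
    return pow(a % P, P - 2, P)


def cauchy_matrix(m, k):
    """m x k Cauchy matrix with entries 1 / (x_i - y_j), x_i = i, y_j = m + j.

    Each entry depends only on (i, j, m), so the matrix for k columns is the
    left part of the matrix for any larger k (requires m + k < P).
    """
    return [[inv(i - (m + j)) for j in range(k)] for i in range(m)]


def encode(matrix, data):
    """Compute the m parity symbols of one stripe."""
    return [sum(c * d for c, d in zip(row, data)) % P for row in matrix]


def expand_parity(old_parity, new_matrix, old_k, new_data):
    """Update parity after adding columns, reading only the new data symbols."""
    return [
        (p + sum(row[old_k + t] * d for t, d in enumerate(new_data))) % P
        for p, row in zip(old_parity, new_matrix)
    ]


old_data, new_data = [10, 20, 30], [40, 50]
old_parity = encode(cauchy_matrix(2, 3), old_data)
updated = expand_parity(old_parity, cauchy_matrix(2, 5), 3, new_data)
```

Because encoding is linear, `updated` equals the parity obtained by re-encoding all five symbols from scratch, while only the two new symbols had to be read.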
The storage I/O and network bandwidth overhead caused by data migration and parity updates directly affects system performance during scaling. This thesis designs a three-stage optimized scaling algorithm for CRS-based distributed storage systems: the first stage designs the coding matrix used after scaling, the second stage designs the data migration scheme, and the third stage further optimizes data migration by using parity to decode part of the data. Theoretical analysis shows that, compared with the basic scaling algorithm, the three-stage algorithm effectively reduces system I/O and network bandwidth consumption during scaling of a CRS system. The algorithm was deployed in a real distributed file system and compared with the basic scaling algorithm, which verified its effectiveness and practicality under both single-threaded and multi-threaded architectures.

(2) Online scaling in real storage systems. Most upper-layer user applications require the system to provide 24x7 online service. When a storage system is scaled online, user I/O requests and migration I/O requests compete with each other, so the response times of both are affected. Existing scaling algorithms, however, seldom take user I/O requests into account at design time, so user and migration response times inevitably degrade during online scaling.
This thesis designs an online scaling optimization mechanism, Popularity-based Online Scaling (POS), that applies to a variety of scaling algorithms. POS exploits two characteristics of user access in real systems, data popularity and data locality. It divides the original storage space into multiple regions and records the popularity of each region, using access frequency as the main indicator, in order to reduce the impact of user access on migration performance. POS can be regarded as a plug-in that can be applied to a wide range of scaling algorithms to improve their online scaling performance. POS was deployed in a disk simulator and compared extensively with FastScale, an existing RAID-0 scaling algorithm. The experiments show that POS significantly improves user and migration response times during online scaling relative to the traditional scaling algorithm.

(3) Optimizing read and write performance after scaling. Storage system scaling must consider both performance during the scaling process and user read and write performance after scaling completes. On the one hand, the greater the system I/O overhead during scaling, the longer the scaling time window and the greater the impact on migration and user response times. On the other hand, normal user reads and writes must be served once scaling is over, so user access performance after scaling is equally important. Existing scaling algorithms, however, focus mainly on minimizing the amount of data migrated during scaling and do not optimize user read and write performance afterwards.
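The region-popularity bookkeeping that POS relies on in contribution (2) can be sketched roughly as follows. This is an illustrative assumption, not the thesis's implementation: the class `RegionHeat`, its methods, and in particular the migration order are hypothetical, and the abstract does not state whether hot or cold regions are migrated first, so the order is left as a parameter.

```python
from collections import Counter


class RegionHeat:
    """Tracks per-region access counts and hands out regions to migrate.

    Hypothetical helper: names and policy are assumptions for illustration.
    """

    def __init__(self, total_blocks, region_size):
        self.region_size = region_size
        self.num_regions = -(-total_blocks // region_size)  # ceiling division
        self.heat = Counter()   # region index -> number of user accesses
        self.migrated = set()   # regions already handed to the migrator

    def record_access(self, block):
        """Count one user I/O against the region that contains the block."""
        self.heat[block // self.region_size] += 1

    def next_region(self, hottest_first=True):
        """Return the next unmigrated region by heat, or None when done.

        Ties are broken in favor of the lowest region index.
        """
        pending = [r for r in range(self.num_regions) if r not in self.migrated]
        if not pending:
            return None
        pick = max if hottest_first else min
        region = pick(pending, key=lambda r: self.heat[r])
        self.migrated.add(region)
        return region


tracker = RegionHeat(total_blocks=100, region_size=25)
for block in (30, 31, 32, 80):
    tracker.record_access(block)
order = [tracker.next_region() for _ in range(5)]
```

Keeping the policy behind a single method is what would let such a tracker act as a plug-in in front of different scaling algorithms, which only need to ask for the next region.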
Because scaling changes the data layout of the system, it directly affects normal user access performance afterwards. This thesis therefore designs the data migration method with the post-scaling layout in mind, and proposes a new scaling algorithm for RAID-0, PostScale. PostScale achieves minimal data migration during scaling and, under that constraint, guarantees that consecutive data blocks are dispersed as widely as possible after scaling completes. This design shortens the scaling time window and lets user read and write requests after scaling exploit the maximum concurrent access performance of the storage system. Simulation results show that PostScale combines the advantages of the two traditional RAID-0 scaling algorithms, round-robin and FastScale: it greatly shortens the scaling time window compared with round-robin, and it effectively improves user read and write response times after scaling compared with FastScale. PostScale can be further extended to scaling RAID-5 systems and distributed storage systems based on Reed-Solomon coding, improving user access performance after scaling.
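The gap between round-robin scaling and minimal migration can be illustrated with a small calculation. The sketch below is not PostScale itself. It assumes a RAID-0 layout in which block b is stored on disk b mod n, counts how many blocks change disks when round-robin re-striping goes from n to n + m disks, and compares that with the lower bound m / (n + m) needed to fill the new disks evenly. The function names are hypothetical.

```python
from fractions import Fraction
from math import lcm  # Python 3.9+


def round_robin_moved_fraction(n, m):
    """Fraction of blocks whose disk changes when re-striping n -> n + m disks.

    Block b lives on disk b % n before and b % (n + m) after; the pattern
    repeats every lcm(n, n + m) blocks, so one period is enough.
    """
    period = lcm(n, n + m)
    moved = sum(1 for b in range(period) if b % n != b % (n + m))
    return Fraction(moved, period)


def minimum_moved_fraction(n, m):
    """Lower bound: each of the m new disks must receive 1 / (n + m) of the data."""
    return Fraction(m, n + m)


rr = round_robin_moved_fraction(4, 2)
lower = minimum_moved_fraction(4, 2)
```

For four disks scaled to six, round-robin moves two thirds of the blocks while the lower bound is one third, which is the kind of difference that lengthens the scaling time window.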
【Degree-granting institution】: University of Science and Technology of China
【Degree level】: Doctorate
【Year degree conferred】: 2016
【Classification number】: TP333
【Author】: Wu Si (吳思)
Article ID: 2341817
Article link: http://sikaile.net/shoufeilunwen/xxkjbs/2341817.html