Research on Scaling Mechanisms for Fault-Tolerant Distributed Storage Systems

Published: 2018-11-19 09:06
【Abstract】: Today's large-scale distributed storage systems rely on redundancy to keep data available. Redundancy is generated either by replication or by erasure coding. Because erasure codes provide the same fault tolerance as replication at a much lower storage overhead, a growing number of storage systems adopt them. At the same time, the rapid growth of data and users' rising demands on capacity and performance mean that deployed storage systems frequently run short of storage capacity and bandwidth. When application demand exceeds what the system can deliver, storage resources must be added and part of the data migrated to the new devices to relieve the pressure; this operation is called storage system scaling. Studying scaling mechanisms for erasure-coded distributed storage systems is therefore important for cloud storage and for data storage in data centers. This thesis studies scaling mechanisms for distributed storage systems along three dimensions: designing scaling algorithms for erasure-coded storage systems, scheduling user I/O requests against system I/O requests during online scaling, and optimizing user access performance after scaling. The main research content and contributions are as follows.
(1) Scaling of Cauchy Reed-Solomon (CRS) codes. As storage systems demand ever higher fault tolerance, scaling of CRS codes, which can tolerate an arbitrary number of failures, becomes increasingly important. CRS codes are mainly used in distributed storage systems built from many storage nodes and an interconnection network (for example, CleverSafe and OceanStore). Scaling requires migrating part of the data to the new storage devices and updating the parity. The storage I/O and network bandwidth consumed by data migration and parity updates directly affect system performance during scaling. This thesis studies scaling for CRS-coded distributed storage systems and designs a three-stage optimized scaling algorithm: the first stage designs the encoding matrix used after scaling, the second stage designs the data migration scheme used during scaling, and the third stage further optimizes migration by using parity to decode part of the data. Theoretical analysis shows that, relative to a basic scaling algorithm, each stage further reduces the system I/O and network transfer incurred while scaling a CRS system. The three-stage algorithm is deployed in a real distributed file system and compared experimentally against the basic scaling algorithm, confirming its effectiveness and practicality under both single-threaded and multi-threaded architectures.
(2) Online scaling. In real storage systems, most upper-layer user applications require 7x24 online service. When a storage system is scaled online, user I/O requests and migration I/O requests compete with each other, which inevitably affects the response times of both user requests and migration. Existing scaling algorithms were designed with little regard for user I/O, so the response times of both degrade during online scaling. To address this, the thesis designs Popularity-based Online Scaling (POS), an online scaling optimization mechanism that can be applied to many existing scaling algorithms. POS exploits two characteristics of user access in real systems, data popularity and data locality. It divides the original storage space into multiple regions and records the popularity of each region (measured mainly by access frequency), then changes the scaling order so that the most popular regions are migrated first. It further exploits data locality to serve user reads and writes better while reducing the impact of user access on migration performance. POS can be viewed as a plug-in layered on top of existing scaling algorithms to improve online scaling performance. POS is deployed in a real disk simulator and compared experimentally with FastScale, an existing RAID-0 scaling algorithm; the results confirm that POS significantly improves the response times of both user requests and migration during online scaling relative to the traditional scaling algorithm.
(3) Read and write performance after scaling. Storage system scaling must account for performance during scaling as well as user read and write performance after scaling completes. On the one hand, the larger the system I/O overhead during scaling, the longer the scaling time window and the greater the impact on migration and user response times during scaling. On the other hand, once scaling completes the system must serve normal user reads and writes, so post-scaling access performance matters as well. Existing scaling algorithms focus mainly on minimizing the amount of data migrated during scaling and do not consider optimizing user read and write performance afterwards. Because scaling changes the system's data layout, the scaling process directly determines normal user access performance after scaling. This thesis therefore starts from the scaling process and designs a better data migration method. Taking RAID-0 scaling as the example, it designs a new scaling algorithm, PostScale. PostScale minimizes the amount of data migrated during scaling and, under that constraint, guarantees that consecutive data blocks are spread as widely as possible across disks after scaling. With this design the scaling time window shrinks, and user read and write requests after scaling can exploit the maximum parallel access performance of the storage system. Simulation experiments show that PostScale has advantages over both traditional RAID-0 scaling algorithms, round-robin and FastScale: it greatly shortens the scaling time window relative to round-robin, and it effectively improves post-scaling user read and write response times relative to FastScale. PostScale can be further extended to RAID-5 scaling and to scaling of Reed-Solomon-coded distributed storage systems to improve user access performance after scaling.
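
The popularity-ordered migration idea described for POS in contribution (2) can be illustrated with a minimal sketch. This is not the thesis's implementation: the function names, the fixed region size, and the use of a plain access trace as the popularity signal are all assumptions made for illustration, and the locality handling mentioned in the abstract is not modeled.

```python
from collections import Counter


def region_of(block: int, region_size: int) -> int:
    """Map a block address to the index of the region that contains it."""
    return block // region_size


def migration_order(access_trace, num_blocks, region_size):
    """Return region indices ordered hottest first.

    Popularity is approximated by access frequency, as the abstract
    describes. Regions with no recorded accesses count as heat 0.
    (Hypothetical helper, not an API from the thesis.)
    """
    num_regions = (num_blocks + region_size - 1) // region_size
    heat = Counter(region_of(b, region_size) for b in access_trace)
    # Sort by descending heat; break ties by region index so the
    # resulting order is deterministic.
    return sorted(range(num_regions), key=lambda r: (-heat[r], r))


# Toy trace of accessed block addresses over a 12-block volume.
trace = [0, 1, 9, 9, 8, 9, 4, 8, 1, 9]
print(migration_order(trace, num_blocks=12, region_size=4))  # [2, 0, 1]
```

In this toy trace, region 2 (blocks 8 to 11) receives six accesses, region 0 three, and region 1 one, so an underlying scaling algorithm driven by this order would migrate region 2 first.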
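
To make the trade-off in contribution (3) concrete, the sketch below compares how much data a naive round-robin re-striping moves with the smallest amount any balanced layout must move. It does not implement PostScale or FastScale. It assumes equal-sized blocks, a round-robin layout before scaling (block `b` on disk `b mod m`), and a balance target in which the `n` new disks end up holding `n/(m+n)` of the data; the function names are hypothetical.

```python
from fractions import Fraction


def round_robin_moved(num_blocks: int, old_disks: int, new_disks: int) -> Fraction:
    """Fraction of blocks that change disk when the volume is simply
    re-striped round-robin over old_disks + new_disks disks."""
    total = old_disks + new_disks
    moved = sum(1 for b in range(num_blocks) if b % old_disks != b % total)
    return Fraction(moved, num_blocks)


def minimum_moved(old_disks: int, new_disks: int) -> Fraction:
    """Lower bound for a balanced layout: the new disks start empty and
    must receive their even share of the data."""
    return Fraction(new_disks, old_disks + new_disks)


# Scaling a 4-disk RAID-0 volume to 5 disks, over one full 20-block period.
print(round_robin_moved(20, 4, 1))  # 4/5
print(minimum_moved(4, 1))          # 1/5
```

Under these assumptions, re-striping moves four fifths of the blocks where one fifth would suffice, which is the gap that minimal-migration scaling algorithms aim to close.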
【Degree-granting institution】: University of Science and Technology of China
【Degree level】: Doctoral
【Year degree conferred】: 2016
【Classification number】: TP333

Related doctoral dissertations (top 9)

1 Wu Si (吳思); Research on Scaling Mechanisms for Fault-Tolerant Distributed Storage Systems [D]; University of Science and Technology of China; 2016


Article ID: 2341817



Link to this article: http://sikaile.net/shoufeilunwen/xxkjbs/2341817.html

