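Research on Replica Strategies for Distributed File Systems in Cloud Storage Environments
Keywords: cloud storage; replica strategy; load balancing; consistency. Source: University of Electronic Science and Technology of China, master's thesis, 2013. Document type: degree thesis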
[Abstract]: With the rapid development of the Internet, the volume of data generated on the network is growing explosively, and storing this data has become a hot research topic in computing. Traditional storage approaches such as storage area networks (SAN) and network-attached storage (NAS) see limited use because of bottlenecks in capacity and performance, high cost, and poor scalability. Cloud storage is built around a distributed file system, uses inexpensive hardware, and scales well. Replication improves the reliability, availability, and performance of the system, but it also raises problems of load balancing, network bandwidth overhead, and consistency.

This thesis analyzes typical current distributed file systems and, taking into account the workload characteristics of cloud storage environments, studies suitable replica strategies, focusing on access efficiency and performance, load balancing, and data consistency. Based on these requirements, it designs and implements the replica management module of a distributed file system. The main contributions are as follows:

(1) A replica placement strategy based on consistent hashing. Traditional distributed file systems store replica locations mainly on a central metadata server, which becomes the bottleneck of the system under heavy concurrent access. The placement strategy based on consistent hashing effectively solves the problems of file lookup, storage device expansion, and device failure. Introducing virtual node mapping greatly reduces the data migration caused by changes to the set of devices. Devices are also assigned a weight according to their storage capacity, which further improves load balancing.

(2) A replica adjustment strategy based on file heat. Starting from the number of requests for a file and combining it with server load, the strategy dynamically adjusts the number of replicas, improving the performance and efficiency of the system. It is complemented by a replica compression strategy that compresses replicas of files that have not been accessed for a long time, saving storage space while preserving data reliability and availability.

(3) A replica consistency strategy based on user requests. The strategy maintains replica consistency while taking care to avoid adding system overhead. In addition, files that go unaccessed for a long time could otherwise stay inconsistent, which would lower data reliability and cause replicas to lose their value as redundant backups. To prevent this, a periodic update mechanism brings all replica versions in the system to a consistent state when the system is idle.
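The placement idea in contribution (1) can be illustrated with a small sketch. This is not the thesis implementation: the class name `WeightedHashRing`, the `vnodes_per_weight` parameter, the use of MD5 as the ring hash, and the rule of walking clockwise to collect distinct devices are all assumptions made for illustration. It shows how virtual nodes and capacity weights might combine so that a client can compute replica locations without asking a metadata server.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    """Map a string to a ring position (MD5 is an arbitrary, assumed choice)."""
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)


class WeightedHashRing:
    """Consistent-hash ring with weighted virtual nodes (hypothetical sketch)."""

    def __init__(self, vnodes_per_weight: int = 100):
        self.vnodes_per_weight = vnodes_per_weight
        self._points = []  # sorted ring positions of all virtual nodes
        self._owner = {}   # ring position -> physical device name

    def add_device(self, name: str, weight: int = 1) -> None:
        # A larger weight yields proportionally more virtual nodes,
        # so the device receives a larger share of the key space.
        for i in range(weight * self.vnodes_per_weight):
            point = _hash(f"{name}#{i}")
            if point in self._owner:
                continue  # skip the (very unlikely) position collision
            self._owner[point] = name
            bisect.insort(self._points, point)

    def remove_device(self, name: str) -> None:
        # Drop every virtual node that belongs to the device.
        self._points = [p for p in self._points if self._owner[p] != name]
        self._owner = {p: d for p, d in self._owner.items() if d != name}

    def locate(self, file_id: str, replicas: int = 3) -> list:
        """Walk clockwise from the file's hash, collecting distinct devices."""
        if not self._points:
            return []
        start = bisect.bisect_right(self._points, _hash(file_id))
        chosen = []
        for offset in range(len(self._points)):
            point = self._points[(start + offset) % len(self._points)]
            device = self._owner[point]
            if device not in chosen:
                chosen.append(device)
                if len(chosen) == replicas:
                    break
        return chosen


ring = WeightedHashRing()
ring.add_device("dev-a", weight=1)
ring.add_device("dev-b", weight=1)
ring.add_device("dev-c", weight=2)  # assumed to have twice the capacity
placement = ring.locate("file-42", replicas=2)
```

In this sketch, adding a device can only move a file's primary replica onto the new device, never between existing devices, which is the property that keeps data migration small when the set of devices changes.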
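[Degree-granting institution]: University of Electronic Science and Technology of China
[Degree level]: Master's
[Year degree granted]: 2013
[Classification number]: TP333; TP316.4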
Article ID: 1506205
Article link: http://sikaile.net/kejilunwen/jisuanjikexuelunwen/1506205.html