Research on Load Balancing Strategies for Distributed File Systems
Topic: distributed file systems  Focus: DHT  Source: University of Electronic Science and Technology of China, 2014 master's thesis  Document type: degree thesis
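【Abstract】: Distributed file systems have evolved from the early network file systems into today's cloud storage, a new concept that extends and develops the idea of cloud computing. In a large, dynamic distributed file system, the central node becomes the performance bottleneck of the whole system as the number of files and the number of file accesses grow, so reducing the system's dependence on the central node is an important problem. There are also distributed file systems based on a distributed hash table (DHT), which have no central node and distribute files evenly across the system according to the DHT algorithm. However, as files are modified and hot spots appear, such a system can no longer stay load-balanced. Load balancing is an important problem in distributed file systems: the load state of the system affects cluster storage utilization and network throughput, and a balanced cluster can effectively avoid hot spots and improve response time. This thesis therefore studies load rebalancing in DHT-based distributed file systems in depth. Analysis of existing load rebalancing algorithms shows that, because each node obtains only local load information, data migration oscillates (jitter), and migration can also create additional heavily loaded nodes; both reduce the efficiency of load balancing. To address this, the thesis enlarges the number of randomly sampled nodes so that the computed estimate is closer to the actual value, and, when selecting a successor node, searches in both directions for a successor that satisfies the condition, which reduces redundant data migration. The algorithm effectively eliminates jitter and avoids creating additional heavily loaded nodes. Replica management and node selection are also important problems in distributed file systems. The file system's replica distribution strategy is incompatible with the DHT algorithm's file distribution strategy; the thesis redirects replica storage locations through soft links so that the two strategies remain compatible. Combined with the load rebalancing algorithm, it also proposes a node selection strategy that avoids the conflict in which several lightly loaded nodes select the same heavily loaded node at the same time during data migration. When the algorithm is applied, a space-filling curve maps each node's physical location to its logical location, and node capacity utilization is used to treat a heterogeneous cluster as a homogeneous one. In summary, the thesis proposes the ILR (Improved Load Rebalancing) algorithm. The load balancing algorithms are simulated with Matlab and a Chord simulator. Using the data generated by the simulation, ILR is compared with the existing load rebalancing algorithm in four respects: cumulative distribution function, node load state, number of data migrations, and number of load-information exchanges. The results confirm the effectiveness of the ILR algorithm.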
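The effect of enlarging the random sample can be illustrated with a small sketch. It is not the thesis's ILR code: the function names, the synthetic load distribution, and the sample sizes are all assumptions chosen for illustration. It only shows that an average-load estimate computed from more sampled nodes tends to sit closer to the true cluster average.

```python
import random
import statistics


def estimate_average_load(loads, sample_size, rng):
    """Estimate the cluster-wide average load from a random sample of nodes."""
    sample = rng.sample(loads, sample_size)  # sampling without replacement
    return sum(sample) / sample_size


def mean_abs_error(loads, sample_size, trials, rng):
    """Average absolute gap between the sampled estimate and the true mean."""
    true_avg = sum(loads) / len(loads)
    errors = [
        abs(estimate_average_load(loads, sample_size, rng) - true_avg)
        for _ in range(trials)
    ]
    return statistics.mean(errors)


if __name__ == "__main__":
    rng = random.Random(42)  # fixed seed so runs are repeatable
    # Hypothetical skewed cluster: most nodes lightly loaded, a few hot ones.
    loads = [rng.randint(0, 20) for _ in range(950)]
    loads += [rng.randint(200, 400) for _ in range(50)]
    for k in (5, 50, 500):
        print(k, round(mean_abs_error(loads, k, 200, rng), 2))
```

With a skewed load distribution like this one, a small sample often misses the hot nodes entirely, so a node relying on it may misjudge whether it is above or below the average. A poor estimate of this kind is the likely source of the back-and-forth migration that the abstract calls jitter.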
[Abstract]: Distributed file systems have developed from the original network file system into present-day cloud storage, a new concept that extends the idea of cloud computing. In a large, dynamic distributed file system, the central node becomes a performance bottleneck as the number of files and file accesses increases, so reducing dependence on the central node is an important issue. There are also distributed file systems based on a distributed hash table (DHT), in which there is no central node and files are distributed evenly according to the DHT algorithm. As files are modified and hot spots appear, however, the system cannot maintain a balanced load. Load balancing is an important issue in distributed file systems: the load state affects cluster storage utilization and network throughput, and a balanced cluster can effectively avoid hot spots and improve response speed. This thesis studies the load rebalancing problem of DHT-based distributed file systems in depth. In existing load rebalancing algorithms, the locality of load information causes jitter during data migration and produces extra high-load nodes, which lowers the efficiency of load balancing. The thesis therefore enlarges the number of random sample nodes so that the estimated value is closer to the actual value, and extends the search for a qualifying successor node in both directions to reduce redundant data migration. The algorithm effectively resolves jitter and avoids extra high-load nodes. Replica management and node selection are also important problems. The replica distribution strategy of the file system is not compatible with the file distribution strategy of the DHT algorithm, so replica locations are redirected through soft links to keep the two compatible. A node selection strategy combined with the load rebalancing algorithm prevents multiple low-load nodes from choosing the same high-load node at the same time during data migration. A space-filling curve is used to match the physical and logical locations of nodes, and node capacity utilization is used to convert a heterogeneous cluster into a homogeneous one. On this basis, an ILR (Improved Load Rebalancing) algorithm is proposed. The load balancing algorithms are simulated with Matlab and a Chord simulation program, and ILR is compared with the existing load rebalancing algorithm in four respects: cumulative distribution function, node load state, number of data migrations, and number of load-information exchanges. The results verify the effectiveness of the ILR algorithm.
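One way to read the step that converts a heterogeneous cluster into a homogeneous one is to compare nodes by the fraction of capacity in use rather than by raw bytes stored. The sketch below follows that reading under stated assumptions: the `Node` class, the `classify` function, and the `tolerance` band are hypothetical names and parameters, not taken from the thesis.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    capacity: float  # total storage, hypothetical unit (e.g. GB)
    used: float      # storage currently in use, same unit

    @property
    def utilization(self) -> float:
        """Fraction of this node's capacity that is in use."""
        return self.used / self.capacity


def classify(nodes, tolerance=0.1):
    """Split nodes into heavy and light relative to cluster-wide utilization.

    `tolerance` is an assumed band around the target: nodes within it are
    treated as balanced and left alone.
    """
    target = sum(n.used for n in nodes) / sum(n.capacity for n in nodes)
    heavy = [n.name for n in nodes if n.utilization > target * (1 + tolerance)]
    light = [n.name for n in nodes if n.utilization < target * (1 - tolerance)]
    return target, heavy, light


if __name__ == "__main__":
    cluster = [Node("A", 100, 90), Node("B", 1000, 300), Node("C", 500, 250)]
    print(classify(cluster))
```

In this example node B holds the most data in absolute terms yet is classed as light, because its large capacity leaves it well below the cluster-wide utilization. Judging by utilization lets nodes of different sizes be handled by the same rebalancing rule.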
【Degree-granting institution】: University of Electronic Science and Technology of China
【Degree level】: Master's
【Year degree granted】: 2014
【Classification number】: TP333
Article ID: 1619797
Article link: http://sikaile.net/kejilunwen/jisuanjikexuelunwen/1619797.html