Research and Optimization of a Cloud-Storage-Based Distributed File System
Published: 2018-05-30 21:09
Topic: cloud storage + HDFS; Source: Xidian University, master's thesis, 2013
[Abstract]: With the rapid development of the Internet, the global volume of data grows exponentially every year, which has made cloud computing a focus of current research and application. Cloud storage, the underlying service of cloud computing, is a distributed file system with a complex architecture. Because it offers a flexible structure, high response efficiency, and convenient management, it has become the preferred answer worldwide to explosive data growth. The Hadoop Distributed File System (HDFS), currently the most popular distributed file system based on cloud storage, is open source, inexpensive, highly fault tolerant, and highly scalable, and it holds a pivotal position in the cloud storage field. However, because of limitations in its structure and performance, HDFS also suffers from a single point of failure, high access latency for concurrent users, and insufficient load balancing.

Building on a systematic and comprehensive survey of the current state and characteristics of distributed storage systems, this thesis analyzes the advantages and disadvantages of several commonly used distributed storage architectures and designs a partially peer-to-peer multi-Namenode system architecture. By adding multiple, partially peer Namenodes to the metadata server layer, the architecture changes the single-point dependence on the master node that characterizes centralized storage systems such as HDFS, and it reduces both the waiting delay of concurrent users and the average memory occupancy of the metadata servers. The thesis also studies commonly used load balancing methods in depth. To address the insufficient load balancing of HDFS storage servers, it establishes a disk utilization model and a service blocking rate model, and designs an adaptive feedback load balancing algorithm based on the proposed architecture. Performance analysis and simulation experiments indicate that the designed algorithm offers some improvement over the load balancing algorithm in HDFS in both system performance and load uniformity.
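The abstract names two inputs to the load balancing algorithm (disk utilization and service blocking rate) and describes it as adaptive and feedback-driven, but gives no formulas. The following Python sketch is therefore only an illustration of how such a scheme might be structured, not the thesis's algorithm. The class `DataNode`, the functions `load_score`, `pick_target`, and `adapt_weights`, the linear weighted score, the blocking rate defined as rejected requests over received requests, and the spread-based weight update are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class DataNode:
    """Hypothetical per-node statistics reported in one feedback window."""
    name: str
    disk_used: float    # bytes currently used
    disk_total: float   # bytes of capacity
    rejected: int       # requests rejected or timed out in the window
    received: int       # requests received in the window

    @property
    def disk_utilization(self) -> float:
        return self.disk_used / self.disk_total

    @property
    def blocking_rate(self) -> float:
        # Assumed definition: fraction of requests that could not be served.
        return self.rejected / self.received if self.received else 0.0


def load_score(node: DataNode, w_disk: float, w_block: float) -> float:
    """Assumed linear combination of the two metrics; lower means less loaded."""
    return w_disk * node.disk_utilization + w_block * node.blocking_rate


def pick_target(nodes, w_disk=0.5, w_block=0.5) -> DataNode:
    """Choose the node with the lowest combined load score."""
    return min(nodes, key=lambda n: load_score(n, w_disk, w_block))


def adapt_weights(nodes, w_disk, step=0.1):
    """Feedback step (assumed rule): shift weight toward the metric that is
    currently more unevenly distributed across the nodes."""
    disk = [n.disk_utilization for n in nodes]
    block = [n.blocking_rate for n in nodes]
    disk_spread = max(disk) - min(disk)
    block_spread = max(block) - min(block)
    if disk_spread > block_spread:
        w_disk = min(1.0, w_disk + step)
    elif block_spread > disk_spread:
        w_disk = max(0.0, w_disk - step)
    return w_disk, 1.0 - w_disk


nodes = [
    DataNode("dn-a", disk_used=90, disk_total=100, rejected=0, received=100),
    DataNode("dn-b", disk_used=20, disk_total=100, rejected=10, received=100),
    DataNode("dn-c", disk_used=50, disk_total=100, rejected=50, received=100),
]
target = pick_target(nodes)
new_w_disk, new_w_block = adapt_weights(nodes, w_disk=0.5)
```

With these made-up numbers, `pick_target` selects `dn-b`, and because disk utilization is spread more widely than blocking rate, the feedback step raises the disk weight for the next window. A real system would need measured statistics and a validated model in place of the assumed linear score.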
[Degree-granting institution]: Xidian University
[Degree level]: Master's
[Year degree awarded]: 2013
[Classification number]: TP333
Article ID: 1956823
Article link: http://sikaile.net/kejilunwen/jisuanjikexuelunwen/1956823.html