Optimization and Implementation of an HDFS-Based Cloud Storage Platform
Published: 2018-10-11 12:21
[Abstract]: Cloud computing is an active research topic, and cloud storage, which grew out of cloud computing, has become among the most popular research fields in China and abroad. The Hadoop file system HDFS, an open-source implementation of the Google File System, has become a standard reference model in industry for studying cloud computing and cloud storage and for building cloud applications and cloud services. The existing HDFS architecture nonetheless has shortcomings, typically including poor support for small files and a single NameNode that can easily become the performance bottleneck of the whole cluster.
Building on a study of the existing HDFS, this thesis proposes a solution to each problem. For the small-file problem, it introduces a user metadata space so that small files in HDFS are merged and stored as large files. For the single-NameNode performance bottleneck, it proposes a multi-NameNode solution based on MongoDB. Experimental results show that the proposed schemes not only extend the namespace of the HDFS cluster but also improve the concurrent read and write speed of HDFS.
Beyond optimizing the existing HDFS architecture, this thesis also builds a cloud storage system on top of it that supports file upload, download, sharing, and browsing. The system can also monitor the current HDFS cluster; the monitoring information includes cluster capacity, cluster block information, and the load and CPU usage of individual nodes. The implementation of this cloud storage system offers exploratory and guiding value for related HDFS-based applications.
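The abstract does not describe how the user metadata space is laid out, so the following is only a minimal sketch of the general idea of merging small files into one large file. It assumes an in-memory blob in place of an HDFS file and a per-user dictionary in place of the metadata space; the class `PackedStore` and its `put`/`get` methods are hypothetical names, not the thesis's design.

```python
import io
from dataclasses import dataclass, field


@dataclass
class PackedStore:
    """Packs many small files into one large blob, indexed per user.

    Hypothetical illustration: `blob` stands in for one large HDFS file,
    and `index` stands in for a per-user metadata space.
    """
    blob: io.BytesIO = field(default_factory=io.BytesIO)
    # user -> {file name -> (offset, length)} inside the blob
    index: dict = field(default_factory=dict)

    def put(self, user: str, name: str, data: bytes) -> None:
        # Append the small file at the end of the blob and record where it sits.
        offset = self.blob.seek(0, io.SEEK_END)
        self.blob.write(data)
        self.index.setdefault(user, {})[name] = (offset, len(data))

    def get(self, user: str, name: str) -> bytes:
        # Look up the byte range in the user's index, then read only that range.
        offset, length = self.index[user][name]
        self.blob.seek(offset)
        return self.blob.read(length)


store = PackedStore()
store.put("alice", "a.txt", b"hello")
store.put("alice", "b.txt", b"world!")
store.put("bob", "a.txt", b"x")
print(store.get("alice", "b.txt"))
```

In a layout like this, the file system tracks one large file instead of one entry per small file, and the per-file bookkeeping moves into the index, which is the kind of pressure on NameNode metadata that the small-file problem refers to.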
[Degree-granting institution]: South China University of Technology
[Degree level]: Master's
[Year degree granted]: 2012
[Classification number]: TP333
Article ID: 2264175
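[Citing literature]
Related master's theses (top 5)
1 Liang Xinghui (梁興輝). Research on Data Replica Technology in Cloud Storage Environments [D]. Nanjing University of Posts and Telecommunications, 2013.
2 Ma Cheng (馬騁). Design and Implementation of a Cloud Storage-Based Version Control System [D]. Dalian University of Technology, 2013.
3 Qian Jinjin (錢進進). Research and Implementation of Secure Storage Technology for Private Clouds [D]. Guangdong University of Technology, 2013.
4 Liu Zhen (劉振). Research on a Cloud Storage-Based Smart Grid Health Status Information Platform [D]. Wuhan University of Technology, 2013.
5 Shi Xingang (史新剛). Design and Optimization of a Cloud Storage System for Massive Numbers of Users [D]. East China Normal University, 2013.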
Article link: http://sikaile.net/kejilunwen/jisuanjikexuelunwen/2264175.html