Design and Implementation of a Hadoop-Based Distributed File Storage Service Platform
Published: 2018-01-19 06:15
Keywords: distributed file storage; Hadoop; redundant backup; quality awareness; cloud storage. Source: Zhejiang University, 2012 master's thesis. Document type: degree thesis
[Abstract]: With the rapid growth of Internet applications, the volume of information and data on the Internet is increasing explosively. How to organize and store such large-scale data efficiently and securely, while keeping application cost as low as possible, has drawn growing attention from academia and industry in China and abroad. Today, whether in the Internet at large, in the intranets of medium-sized enterprises, or in small local area networks, there are large amounts of idle storage resources that are both high-performing and inexpensive. Making full use of these idle, inexpensive resources to build a trustworthy, high-quality, large-scale storage pool is an effective way to address this problem.
Distributed file systems offer a way to make effective use of dispersed storage resources. However, traditional distributed file storage systems, such as HDFS in the Hadoop project, run on clusters whose nodes have similar performance and whose network environment is highly stable. Deploying such a system directly in a network whose conditions change dynamically and whose storage nodes join and leave freely therefore leads to problems such as low space utilization, poor adaptability to network dynamics, and low storage-node trustworthiness. Building on the open-source Hadoop system, this thesis studies a generalized distributed file storage service model suited to wide area networks, and designs and implements QDFS, a distributed file storage service platform based on an efficient redundant backup strategy and quality-of-service awareness. The work yields the following results:
(1) The distributed file storage system is built in a dynamic network environment, making full use of the inexpensive computing resources available there and reducing the total cost of ownership of the storage service system.
(2) A redundant backup mechanism based on recovery volumes is proposed, which greatly reduces the storage space needed for redundant file information and lowers the cost of maintaining files.
(3) A tree-structured storage system model based on hierarchical name nodes is established, removing the bottleneck that different clusters could not share a single distributed system.
(4) A file access client is designed, solving the problem of running the Hadoop client in a Windows environment.
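The abstract does not say which coding scheme the recovery volumes use, so the following is only a minimal sketch under a stated assumption: a single XOR parity block stands in for a recovery volume. The function names (`make_recovery_volume`, `recover_missing`, `storage_overhead`) and the sample blocks are hypothetical and are not taken from QDFS. The sketch illustrates why recovery-style redundancy can need less space than keeping full copies of every block.

```python
from functools import reduce


def xor_blocks(blocks):
    """Return the byte-wise XOR of equally sized byte blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def make_recovery_volume(data_blocks):
    """Build one parity block (assumed stand-in for a recovery volume)."""
    return xor_blocks(data_blocks)


def recover_missing(surviving_blocks, recovery_volume):
    """Rebuild the single missing data block from the survivors plus parity."""
    return xor_blocks(list(surviving_blocks) + [recovery_volume])


def storage_overhead(k, m):
    """Extra storage, as a fraction of the original size, for k data + m recovery blocks."""
    return m / k


# Hypothetical example: four equally sized blocks, one of which is lost.
blocks = [b"node", b"data", b"qdfs", b"hdfs"]
parity = make_recovery_volume(blocks)
survivors = blocks[:2] + blocks[3:]          # block at index 2 is unavailable
rebuilt = recover_missing(survivors, parity)
```

With four data blocks and one parity block the extra storage is 25% of the original size, whereas keeping three full replicas (the HDFS default replication factor) costs 200% extra. The trade-off is that a single XOR parity block tolerates only one lost block per group; a scheme that survives more losses would need more recovery blocks and a stronger code.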
[Degree-granting institution]: Zhejiang University
[Degree level]: Master's
[Year degree granted]: 2012
[Classification number]: TP393.09; TP333
Article ID: 1442990
Article link: http://sikaile.net/kejilunwen/jisuanjikexuelunwen/1442990.html