Research on MeteCloud Resource Storage and Data Processing Based on Hadoop
Published: 2018-01-16 23:34
Source: Nanjing University of Information Science and Technology, master's thesis, 2013. Document type: degree thesis
Keywords: Hadoop; MeteCloud; Hive; HBase; meteorological daily-value data
[Abstract]: At present, meteorological departments at every level run their own independent business and storage systems, so meteorological data cannot be centrally managed or shared efficiently. The emergence and rapid development of cloud computing offers a solution to this problem.
Building on an analysis of the theoretical models underlying cloud platforms, this thesis takes the daily-value climate data files of China's surface international exchange stations (1951 to 2012) as its research object. The main work is as follows:
(1) It analyzes the open-source cloud platform Hadoop: the architecture and read/write data flow of its distributed file system HDFS; the data processing flow of the MapReduce computing model; the architecture and table-creation process of the distributed database HBase; and the architecture of the data warehouse Hive, together with how it stores and queries data.
(2) It proposes the architecture of the meteorological cloud platform MeteCloud (Meteorological Cloud) and its cluster deployment process. The MeteCloud architecture comprises a hardware layer, a platform layer, an application layer, and a user layer. The architecture adopts Facebook's AvatarNode mechanism to remove the single point of failure at the metadata node (NameNode), and the thesis analyzes how AvatarNode works and its run cycle.
(3) It studies how static meteorological daily-value data files are transferred into storage on the MeteCloud platform, covering HiveDaily, the process for loading daily-value files into Hive, and HBaseDaily, the corresponding process for HBase. It also proposes MRHBaseDaily (MapReduce-based HBaseDaily), an optimized HBase transfer process built on the MapReduce programming model, to improve HBase transfer efficiency.
(4) It studies MapReduce-based processing of meteorological daily-value data. The thesis analyzes the SMT (Statistics of Maximum of Temperature) process as conventionally run on a local file system, and proposes MRSMT (MapReduce-based SMT), a MapReduce-based statistics process for daily-value data.
A MeteCloud platform was built in the laboratory and used to transfer and process the daily-value data files. The results show that MeteCloud stores and processes meteorological daily-value data efficiently, and that the optimized HBase storage process and the MRSMT process improve transfer and data processing efficiency.
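The abstract does not give the record layout or the code behind the MapReduce-based maximum temperature statistics (MRSMT), so the following is only a minimal sketch of the general map, shuffle, and reduce pattern such a job could follow. It runs locally in plain Python rather than on Hadoop. The five-field record layout (station, year, month, day, maximum temperature in tenths of a degree Celsius), the missing-value sentinel 32766, the function names, and the sample values are all assumptions made for illustration, not details taken from the thesis.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple

Key = Tuple[str, int]  # (station id, year)

MISSING = 32766  # assumed sentinel marking a missing observation


def map_record(line: str) -> Iterator[Tuple[Key, int]]:
    """Map step: emit ((station, year), max_temp) for one valid record."""
    fields = line.split()
    if len(fields) != 5:
        return  # skip malformed lines
    station, year, _month, _day, tmax = fields
    try:
        key = (station, int(year))
        value = int(tmax)
    except ValueError:
        return  # skip records with non-numeric fields
    if value != MISSING:
        yield key, value


def shuffle(pairs: Iterable[Tuple[Key, int]]) -> Dict[Key, List[int]]:
    """Shuffle step: group all emitted values by key."""
    groups: Dict[Key, List[int]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_max(key: Key, values: List[int]) -> Tuple[Key, int]:
    """Reduce step: keep the highest temperature seen for one key."""
    return key, max(values)


def run_job(lines: Iterable[str]) -> Dict[Key, int]:
    """Run map, shuffle, and reduce in sequence over the input lines."""
    mapped = (pair for line in lines for pair in map_record(line))
    grouped = shuffle(mapped)
    return dict(reduce_max(key, values) for key, values in grouped.items())


# Made-up sample records in the assumed layout.
SAMPLE = [
    "58238 1951 7 14 352",
    "58238 1951 7 15 367",
    "58238 1952 1 3 32766",
    "58238 1952 8 2 341",
    "54511 1951 7 14 329",
    "bad line",
]

if __name__ == "__main__":
    for (station, year), tmax in sorted(run_job(SAMPLE).items()):
        print(station, year, tmax / 10.0)
```

On a real cluster the map and reduce functions would run in parallel across HDFS blocks and the framework would perform the shuffle; here the three steps are chained in one process only to make the data flow visible.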
[Degree-granting institution]: Nanjing University of Information Science and Technology
[Degree level]: Master's
[Year degree granted]: 2013
[Classification number]: TP333
Article ID: 1435355
Article link: http://sikaile.net/kejilunwen/jisuanjikexuelunwen/1435355.html