Research on a Heterogeneous Virtual Machine Platform Supporting Hadoop Configuration
Published: 2018-09-12 12:29
[Abstract]: With the development of cloud computing, data centers of all sizes have appeared, and they often run a variety of virtual machine management platforms (such as Eucalyptus, OpenNebula, and OpenStack) for entirely different application scenarios. Each platform demands its own operations and development skills and experience, and server resources cannot be shared dynamically between platforms, which degrades the performance of elastic services. Differences in machine configuration within a platform also affect the cloud applications running on top of it. Hadoop is one of the cloud applications widely used for data-intensive computing, and correctly setting the configurable parameters of its MapReduce framework has a non-negligible effect on computing performance. On a heterogeneous Hadoop cluster, however, users can usually only keep the default configuration or configure parameters by hand from experience; because the tuning space is large, this often produces misconfigurations that degrade performance.
To address the diversity of virtual machine platforms, this thesis designs and implements a heterogeneous virtual machine management platform. Without changing the structure of the existing management platforms, it provides unified management and control of the mainstream virtual machine management platforms and balanced allocation of virtual resources. It also provides an extensible adaptation-layer interface and driver components to support other heterogeneous virtual machine provisioning and management platforms.
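As an illustration of what an extensible adaptation layer with per-platform drivers might look like, the following Python sketch defines a driver interface and a manager that places each new VM on the platform with the most free capacity. All names (VMPlatformDriver, InMemoryDriver, UnifiedManager) and the most-free-capacity placement rule are assumptions made for illustration; the abstract does not specify the actual interface or balancing policy, and the in-memory driver merely stands in for real calls to Eucalyptus, OpenNebula, or OpenStack.

```python
from abc import ABC, abstractmethod


class VMPlatformDriver(ABC):
    """Hypothetical adapter-layer interface: one driver per underlying platform."""

    name: str

    @abstractmethod
    def free_capacity(self) -> int:
        """Return how many more VMs this platform can currently host."""

    @abstractmethod
    def start_vm(self, image: str) -> str:
        """Start a VM from an image and return its identifier."""


class InMemoryDriver(VMPlatformDriver):
    """Stand-in driver that only counts VMs; a real driver would call the platform's API."""

    def __init__(self, name: str, capacity: int):
        self.name = name
        self.capacity = capacity
        self.vms = []

    def free_capacity(self) -> int:
        return self.capacity - len(self.vms)

    def start_vm(self, image: str) -> str:
        if self.free_capacity() <= 0:
            raise RuntimeError(f"{self.name} has no free capacity")
        vm_id = f"{self.name}-{len(self.vms)}"
        self.vms.append((vm_id, image))
        return vm_id


class UnifiedManager:
    """Single entry point that dispatches requests to the registered drivers."""

    def __init__(self):
        self.drivers = {}

    def register(self, driver: VMPlatformDriver) -> None:
        # Supporting a new platform only requires registering another driver.
        self.drivers[driver.name] = driver

    def start_vm(self, image: str) -> str:
        if not self.drivers:
            raise RuntimeError("no platform drivers registered")
        # Assumed balancing rule: pick the platform with the most free capacity.
        driver = max(self.drivers.values(), key=lambda d: d.free_capacity())
        return driver.start_vm(image)


manager = UnifiedManager()
manager.register(InMemoryDriver("openstack", capacity=3))
manager.register(InMemoryDriver("opennebula", capacity=3))
vm_ids = [manager.start_vm("hadoop-node") for _ in range(4)]
```

Because the manager depends only on the abstract interface, the existing platforms need no structural changes; each one is wrapped by its own driver.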
To address Hadoop applications running on heterogeneous virtual machine platforms, this thesis proposes an online method for automatically configuring MapReduce parameters based on reinforcement learning. Offline learning builds a coarse-grained initial policy; online learning then configures parameters at fine granularity according to that policy and iteratively updates the Q-value table by trial and error so that the configuration approaches the optimum. Experimental results show that the method effectively improves Hadoop performance and converges quickly, so that the machine resources running MapReduce tasks are fully used and task running time is shortened.
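The trial-and-error Q-table update can be sketched with plain tabular Q-learning in Python. This is a minimal sketch under stated assumptions, not the thesis's algorithm: the single tunable parameter, its value grid, the simulated_runtime cost model (fastest at 200), and all hyperparameters are hypothetical, and the offline/online two-phase split described in the abstract is not modeled.

```python
import random

# Hypothetical grid for one tunable parameter (for example, a buffer size in MB).
GRID = [50, 100, 150, 200, 250, 300]
ACTIONS = (-1, 0, 1)  # move down one grid step, stay, move up one grid step


def simulated_runtime(value):
    """Stand-in for running a MapReduce job; assumed to be fastest at 200."""
    return 60.0 + 0.01 * (value - 200) ** 2


def step(state, action):
    """Apply an action, clamp to the grid, and return (next_state, reward)."""
    nxt = min(max(state + action, 0), len(GRID) - 1)
    return nxt, -simulated_runtime(GRID[nxt])  # shorter runtime = higher reward


def q_learning(episodes=500, steps=20, alpha=0.5, gamma=0.8, epsilon=0.2, seed=0):
    rng = random.Random(seed)
    q = [[0.0] * len(ACTIONS) for _ in GRID]  # Q-table: one row per configuration
    for _ in range(episodes):
        state = rng.randrange(len(GRID))
        for _ in range(steps):
            # Epsilon-greedy trial and error: explore sometimes, otherwise exploit.
            if rng.random() < epsilon:
                a = rng.randrange(len(ACTIONS))
            else:
                a = max(range(len(ACTIONS)), key=lambda i: q[state][i])
            nxt, reward = step(state, ACTIONS[a])
            # Standard Q-learning update toward reward + discounted best next value.
            q[state][a] += alpha * (reward + gamma * max(q[nxt]) - q[state][a])
            state = nxt
    return q


def greedy_config(q, start=0, max_steps=20):
    """Follow the learned greedy policy from a starting configuration."""
    state = start
    for _ in range(max_steps):
        a = max(range(len(ACTIONS)), key=lambda i: q[state][i])
        nxt, _ = step(state, ACTIONS[a])
        if nxt == state:
            break
        state = nxt
    return GRID[state]


q_table = q_learning()
best_value = greedy_config(q_table)
```

In a real setting the reward would come from measured job runtimes rather than a formula, which is why each trial is expensive and why a good initial policy matters.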
[Degree-granting institution]: Central South University
[Degree level]: Master's
[Year degree granted]: 2013
[Classification number]: TP302
Article ID: 2239011
Article link: http://sikaile.net/kejilunwen/jisuanjikexuelunwen/2239011.html