Research on Load Balancing for the Hadoop Cloud Computing Platform
Published: 2018-04-16 14:32
Topic: Big Data; Source: Hebei University of Engineering, 2014 master's thesis
[Abstract]: Load balancing is critical in a Hadoop cluster: a sound load-balancing strategy improves cluster performance and the user experience. The task scheduling policy and scheduling method of a Hadoop cluster strongly affect how load is distributed, yet the schedulers currently used in Hadoop clusters do not take load balance into account. This thesis studies cluster load from the perspective of task scheduling. Accounting for load balance during task scheduling is more useful than adjusting the cluster after its load has already become severely imbalanced.

The thesis describes in detail cloud computing, the MapReduce distributed computing framework, and Hadoop, the open-source Java implementation of HDFS and MapReduce. It focuses on how Hadoop executes jobs and on the scheduling algorithms commonly used in Hadoop today: FIFO, Capacity Scheduler, and Fair Scheduler. Making full use of the Heartbeat messages with which a TaskTracker requests new tasks, it proposes a scheduling method based on dynamic feedback, Dynamic Feedback Load Balance (DFLB). DFLB collects information while jobs execute and feeds it back to the JobTracker, which uses it when assigning new tasks; execution information is collected again as those tasks run, forming a closed loop of collection, feedback, use, and collection. The thesis gives a mathematical definition of the cluster's load-balance state, which provides a basis for judging whether a given task assignment is reasonable. To keep scheduling fair across jobs, it also introduces a dynamic job priority.

After analyzing the DFLB workflow, the author built a Hadoop cluster to validate the approach and compared the experimental results. The results show that DFLB scheduling brings the cluster load to a balanced state and that the average job response time improves somewhat over Hadoop's built-in schedulers. The results reflect the effect of load balancing on resource utilization and job parallelism, and the work makes meaningful progress in load-balancing research for the Hadoop cloud computing platform.
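The abstract names the parts of DFLB (heartbeat feedback, a load-balance measure, a dynamic job priority) but does not give their formulas. The sketch below illustrates one way such a closed loop could fit together; it is not the thesis's method. Everything in it is an assumption made for illustration: the population standard deviation of node loads as the balance measure, a linear aging rule for priority, the rule of assigning only to nodes at or below the mean load, and the names `Node`, `Job`, `FeedbackScheduler`, `imbalance`, and `task_cost`, none of which are Hadoop APIs.

```python
from dataclasses import dataclass
from statistics import pstdev


@dataclass
class Node:
    """Hypothetical view of one TaskTracker (not a Hadoop class)."""
    name: str
    load: float = 0.0  # last reported load, 0.0 (idle) to 1.0 (saturated)


@dataclass
class Job:
    """Hypothetical job record kept by the scheduler."""
    name: str
    base_priority: float
    pending_tasks: int
    wait_rounds: int = 0  # heartbeats seen since this job last got a task

    def dynamic_priority(self, aging: float = 0.5) -> float:
        # Assumed aging rule: priority rises linearly while the job waits.
        return self.base_priority + aging * self.wait_rounds


def imbalance(nodes) -> float:
    """Assumed balance measure: population standard deviation of node loads."""
    return pstdev([n.load for n in nodes])


class FeedbackScheduler:
    """Hypothetical stand-in for the JobTracker side of the feedback loop."""

    def __init__(self, nodes, jobs, task_cost: float = 0.25):
        self.nodes = {n.name: n for n in nodes}
        self.jobs = list(jobs)
        self.task_cost = task_cost  # assumed load added by one new task

    def heartbeat(self, node_name: str, reported_load: float):
        # Collect and feed back: record the load the node just reported.
        node = self.nodes[node_name]
        node.load = reported_load

        # Use: assign work only if this node is at or below the mean load.
        mean = sum(n.load for n in self.nodes.values()) / len(self.nodes)
        runnable = [j for j in self.jobs if j.pending_tasks > 0]
        chosen = None
        if runnable and node.load <= mean:
            chosen = max(runnable, key=lambda j: j.dynamic_priority())
            chosen.pending_tasks -= 1
            chosen.wait_rounds = 0
            # Estimate the new load until the next heartbeat corrects it.
            node.load += self.task_cost

        # Every runnable job that was passed over ages by one round.
        for job in runnable:
            if job is not chosen:
                job.wait_rounds += 1
        return chosen.name if chosen else None


if __name__ == "__main__":
    nodes = [Node("A", 0.8), Node("B", 0.2)]
    jobs = [Job("j1", 1.0, 2), Job("j2", 2.0, 1)]
    sched = FeedbackScheduler(nodes, jobs)
    print(sched.heartbeat("A", 0.8))  # None: A is above the mean load
    print(sched.heartbeat("B", 0.2))  # j2: highest dynamic priority
```

Each heartbeat both updates the scheduler's view of the cluster and triggers the next assignment decision, which is what closes the loop. The estimated load written after an assignment is overwritten by the node's next report, so estimation errors do not accumulate.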
[Degree-granting institution]: Hebei University of Engineering
[Degree level]: Master's
[Year degree granted]: 2014
[Classification number]: TP393.09
Article ID: 1759347
Article link: http://sikaile.net/guanlilunwen/ydhl/1759347.html