
Establishment and Optimization of a Distributed JobTracker Node Model in Cloud Computing

Published: 2019-03-15 17:18
[Abstract]: Cloud computing is described as the fourth revolution in the IT industry, following large-scale computers, personal computers, and the Internet; Google was the first to define and develop it. Hadoop, the open-source model of cloud computing, is a Java-based open-source distributed computing platform that analyzes and processes big data by running distributed, data-intensive applications. Its single points of failure create a performance bottleneck. For the single NameNode in the HDFS storage architecture, Hadoop 2.0 introduced a multi-node high-availability scheme, but it gave no corresponding solution for the single JobTracker node. This thesis builds a distributed JobTracker node model to address single-JobTracker failure in the traditional computing architecture, so that job failures caused by the failure of a single JobTracker node are avoided automatically. The main work and contributions are as follows. After a thorough analysis of earlier improvements to the single JobTracker node model and earlier tuning of scheduling and load-balancing algorithms, the thesis first studies the shortest-path algorithm Dijkstra, the web-page ranking algorithm PageRank, and the web-page de-duplication algorithm Bloom Filter, and on that basis establishes the distributed JobTracker node model. Dijkstra's algorithm is used to optimize many-to-many communication among the nodes of the model, with the aim of balanced communication between the multiple JobTracker nodes and the task nodes. Second, job scheduling is optimized on the basis of PageRank. Finally, a Counting Bloom Filter is used to improve the number of task replicas on each node, thereby optimizing node load in the distributed JobTracker model.
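The abstract names the Counting Bloom Filter but does not say how it is wired into replica management, so the following is only a generic sketch of the data structure itself, not the thesis's implementation. The class name, the default sizes, and the SHA-256-based hashing are all assumptions made for illustration. The point it shows is the property that distinguishes a counting filter from a plain Bloom filter: each slot holds a counter rather than a bit, so an entry (for example, a record that a replica exists) can be removed again.

```python
import hashlib


class CountingBloomFilter:
    """Bloom filter with integer counters instead of bits, so items can be removed.

    Illustrative only: names and parameters are hypothetical, not from the thesis.
    """

    def __init__(self, size=1024, num_hashes=3):
        self.size = size
        self.num_hashes = num_hashes
        self.counters = [0] * size

    def _positions(self, item):
        # Derive num_hashes slot indexes by salting a SHA-256 digest with the hash index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, item):
        for pos in self._positions(item):
            self.counters[pos] += 1

    def remove(self, item):
        # Decrement only if the item appears present, so counters never go negative.
        if item in self:
            for pos in self._positions(item):
                self.counters[pos] -= 1

    def __contains__(self, item):
        # May report false positives, never false negatives for items still stored.
        return all(self.counters[pos] > 0 for pos in self._positions(item))


# Hypothetical usage: track whether a replica identifier has been recorded.
cbf = CountingBloomFilter(size=256, num_hashes=3)
cbf.add("block-0001")
seen_after_add = "block-0001" in cbf
cbf.remove("block-0001")
seen_after_remove = "block-0001" in cbf
```

As with any Bloom filter, membership answers can be false positives, so a real system would treat a hit as "probably recorded" and confirm against authoritative metadata.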
After analyzing the communication scheme of the distributed JobTracker node model and the related scheduling optimizations, the thesis builds a small experimental Hadoop cluster to verify the results. The experiments show that, when a node in the cluster goes down, the distributed JobTracker node model is more reliable than the single JobTracker node model, and the Dijkstra-based communication scheme selects a JobTracker node more quickly. For the improved job scheduling algorithm, when the submitted jobs depend on one another, the PageRank-based algorithm further improves the overall processing time of the jobs. For the improved load-balancing algorithm, cluster load is optimized from the perspective of replica storage load, which raises the storage-space utilization for duplicate data replicas. Finally, the experiments compare the overall performance of the clusters. Because the optimizations under the distributed JobTracker node model mainly target specific kinds of jobs, overall job-processing performance is lower than that of the original cluster. However, when a JobTracker node goes down, the model improves the safety and reliability of the cluster, and it is valuable for job processing in such special scenarios.
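The abstract does not spell out how PageRank is applied to jobs with dependencies. One plausible reading, offered here purely as an assumption, is to treat each job as a node, draw an edge from a job to every job it depends on, and use the resulting rank as a priority among jobs that are ready to run, so that jobs many others wait on are started first. The function, the job names, and the graph below are hypothetical.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Plain power-iteration PageRank.

    links maps each node to the list of nodes it points to.
    """
    nodes = set(links) | {t for targets in links.values() for t in targets}
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for node in nodes:
            targets = links.get(node, [])
            if targets:
                # Split this node's damped rank evenly among the nodes it points to.
                share = damping * rank[node] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Dangling node: spread its damped rank evenly over all nodes.
                share = damping * rank[node] / n
                for other in nodes:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical dependency graph: each job points to the jobs it depends on.
depends_on = {
    "report": ["join"],
    "join": ["clean_a", "clean_b"],
    "clean_a": [],
    "clean_b": [],
    "adhoc": [],  # independent job that nothing else waits on
}
ranks = pagerank(depends_on)

# Jobs with no unfinished dependencies are ready; prefer the highest-ranked one.
ready = [job for job, deps in depends_on.items() if not deps]
next_job = max(ready, key=ranks.get)
```

The rank is used only to order jobs that are already runnable. It is not a topological order in itself, so dependency checks still decide which jobs are eligible.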
[Degree-granting institution]: Hebei University of Engineering
[Degree level]: Master's
[Year degree awarded]: 2016
[Classification number]: TP393.09


Article ID: 2440828


Link to this article: http://sikaile.net/guanlilunwen/ydhl/2440828.html

