Research on Reliable Scheduling Methods for Primary-Backup Tasks in Distributed Environments

Published: 2018-10-24 19:23
[Abstract]: With the development of computing and network technology, data centers and computing centers built on distributed computing systems, which are themselves based on distributed and parallel computing, are widely used in industry, commerce, science and technology, and the military. These applications decompose large, complex computing tasks into subtasks that are processed in parallel, then merge the partial results into a final result. Throughout decomposition and computation, an effective task scheduling mechanism is a key factor in the performance and efficiency of a distributed computing system; a poor scheduling method can severely reduce the system's computing capacity, lower parallel efficiency, and even fail to deliver the benefit that parallel computing should provide. Task scheduling has therefore long been a core topic in distributed, grid, and cloud computing systems and an active research area. However, as distributed systems grow in scale and computing power, system stability and reliability have become decisive for whether parallel applications can run to completion. For example, in supercomputers and large-scale clusters such as Tianhe-2 and Google's data centers, complex upper-layer applications and very high power consumption make the system highly prone to failure. A complete reliability assurance mechanism is therefore especially important, and designing highly reliable scheduling algorithms at the scheduling stage is one of its important means. Starting from the goal of "guaranteeing performance while improving reliability", this thesis studies how to ensure the reliability of distributed computing systems together with efficient use of computing resources. It divides tasks into real-time periodic tasks and non-real-time tasks and, using primary-backup (main/sub-version) scheduling, achieves scheduling strategies that are both highly reliable and high-performing. The specific contributions are as follows.
(1) For the reliable scheduling of real-time tasks in distributed computing systems, a scheduling algorithm based on the reliability cost of computing nodes and communication links (DRCAMD) is proposed. By setting weights, the method adjusts the system's weighted objective function and balances users' differing requirements for scheduling performance and reliability. In addition, for real-time tasks with dependencies, the thesis proposes a schedulability analysis method that does not take into account the various overlapping states of primary and backup tasks. Experimental results show that, under given failure probabilities for computing nodes and communication links, the algorithm has certain advantages in reliability and performance.
(2) For the reliable scheduling of mixed-criticality tasks, a two-stage reliable scheduling algorithm (MCRSS) and an accompanying schedulability analysis method are proposed, based on the primary-backup scheduling strategy combined with the handling of task criticality levels. The first stage schedules the mixed-criticality tasks in priority order and uses backup overlapping to reduce the system overhead introduced by replicating backup tasks. The second stage performs schedulability analysis on the tasks assigned to each target processor and upgrades tasks that cannot meet the schedulability requirement until their deadlines can be met. Simulation experiments show that MCRSS handles the reliable scheduling of tasks at different criticality levels effectively while keeping the distributed computing system flexible and performant.
(3) For the scheduling of DAG tasks with precedence dependencies, a scheduling algorithm based on the earliest finish time of backup tasks (EFTBT) is proposed. By analyzing the scheduling state of the primary task, the method derives, for each case, the earliest finish time of the backup task and the constraints on its target processor, and proves that these constraints are sound. The method achieves good scheduling performance while guaranteeing reliable scheduling. In addition, for scientific workflow applications in which multiple DAG tasks are scheduled at the same time, a layering-based multi-DAG scheduling strategy (MDDL) is proposed to address the problem that unfair scheduling leaves later DAG tasks unschedulable. Experimental results show that both algorithms improve scheduling performance compared with classical algorithms.
(4) For the heterogeneous and dynamic nature of large-scale distributed computing systems, a reliable scheduling strategy for dependent DAG tasks is proposed, based on an analysis of node and link failure characteristics. Building on EFTBT, the strategy provides a communication model and a backup execution policy that better match practical application requirements, and establishes a method for analyzing the failure characteristics of distributed computing systems. On this basis, a fault-tolerant scheduling algorithm based on a communication contention model (RAPA) is proposed. Experimental results show that RAPA achieves better performance and reliability than HEFT and EFTBT.
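The weighted trade-off between scheduling performance and reliability mentioned in item (1) can be illustrated with a minimal sketch. This is not the thesis's DRCAMD algorithm, whose details the abstract does not give. The `Processor` fields, the `cost` formula (a weighted sum of finish time and a reliability cost taken as failure rate times execution time), and the weight `alpha` are all assumptions made for illustration; link reliability and deadlines are ignored.

```python
from dataclasses import dataclass


@dataclass
class Processor:
    """Hypothetical processor model (all fields are illustrative assumptions)."""
    name: str
    speed: float          # work units processed per time unit
    failure_rate: float   # expected failures per time unit
    ready_time: float = 0.0  # time at which the processor becomes free


def cost(proc: Processor, work: float, alpha: float) -> float:
    """Weighted objective: alpha favours early finish, (1 - alpha) favours reliability."""
    exec_time = work / proc.speed
    finish_time = proc.ready_time + exec_time
    reliability_cost = proc.failure_rate * exec_time
    return alpha * finish_time + (1.0 - alpha) * reliability_cost


def place_primary_backup(procs: list[Processor], work: float, alpha: float) -> tuple[str, str]:
    """Pick two different processors: the cheapest for the primary copy,
    the next cheapest for the backup copy, so one node failure cannot hit both."""
    if len(procs) < 2:
        raise ValueError("primary-backup placement needs at least two processors")
    ranked = sorted(procs, key=lambda p: cost(p, work, alpha))
    return ranked[0].name, ranked[1].name


if __name__ == "__main__":
    procs = [
        Processor("fast", speed=2.0, failure_rate=0.5),
        Processor("slow", speed=1.0, failure_rate=0.01),
        Processor("mid", speed=1.5, failure_rate=0.1),
    ]
    # alpha = 1.0 optimises finish time only; alpha = 0.0 optimises reliability only.
    print(place_primary_backup(procs, work=6.0, alpha=1.0))
    print(place_primary_backup(procs, work=6.0, alpha=0.0))
```

Moving `alpha` between 0 and 1 shifts the placement from the most reliable processors toward the fastest ones, which is the kind of user-tunable balance the abstract describes. A real scheduler would also need to normalize the two terms, since finish time and reliability cost are on different scales.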
[Degree-granting institution]: Harbin Institute of Technology
[Degree level]: Doctorate
[Year degree conferred]: 2016
[Classification number]: TP338.8

[Similar literature]

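Related journal articles

1 Peng Riguang; Li Renfa; Liu Yan; Chen Yu; Li Lang; Online Task Scheduling Algorithm for Dynamically Reconfigurable Systems-on-Chip [J]; Computer Engineering; 2010, No. 05

2 Liao Lei; How to Start and Stop One Task from Another Task under Windows [J]; Modern Computer; 1996, No. 04

3 Li Tao; Yang Yulu; Research on Algorithms for Reconfigurable Resource Management and Hardware Task Placement [J]; Journal of Computer Research and Development; 2008, No. 02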

Related doctoral dissertations

1 Jing Weipeng; Research on Reliable Scheduling Methods for Primary-Backup Tasks in Distributed Environments [D]; Harbin Institute of Technology; 2016

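Related master's theses

1 Li Cheng; Research on Task Scheduling Management in Embedded MPSoC Systems [D]; Zhejiang University; 2010

2 Wang Yu; Research on DVS-Based Energy-Saving Scheduling Strategies for Multi-core Periodic Tasks [D]; Wuhan University of Technology; 2013

3 Zhang Tiejun; Research on Task-Level Data Processing Based on Multi-core CPUs and Its Performance Testing on a Cluster Platform [D]; Chongqing University; 2011

4 Li Jun; Research and Optimization of MapReduce Performance Based on Block Aggregation [D]; Beijing Jiaotong University; 2014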
