若干MapReduce排序問題的算法研究
發(fā)布時間:2019-07-07 20:59
【摘要】:MapReduce是Google提出的一種編程模型,一個處理和生成大數(shù)據(jù)集的相關(guān)實現(xiàn)。本文主要研究了Map Reduce平行機排序問題,對極小化最大完工時間(makespan),論文研究了m臺同類機離線和兩臺機在線排序模型。對極大化最小完工時間,文中研究了同型機離線排序模型。全文共分五章。第一章簡要地介紹了排序問題的基本知識以及MapReduce的相關(guān)知識。第二章主要研究m臺同類機離線MapReduce排序問題,在這里map任務可以平行加工然而reduce任務不能平行加工。目標是極小化最大完工時間,分別考慮reduce任務可中斷和reduce任務不可中斷兩種情況。對可中斷的reduce任務,我們設(shè)計了一個近似比為2-(?)的算法,其中s1≥s2≥…≥2sm,是機器σi的加工速度;對不可中斷的reduce任務,給出了一個近似比為2+(?)的算法。第三章主要研究兩臺同類機在線MapReduce排序問題,目標是極小化最大完工時間。首先對reduce任務可中斷和reduce任務不可中斷兩種情況,分別給出了問題的下界。對可中斷的reduce任務,給出了競爭比為(?)的最優(yōu)算法,其中s是兩臺機器的速度之比;對不中斷的reduce任務,我們證明類似LS算法是最優(yōu)的且它的競爭比為(?)和(?)。第四章主要研究離線MapReduce同型機排序問題,目標是極大化最小完工時間。對可中斷的reduce任務,給出了3臺機的最優(yōu)算法;對不可中斷的reduce任務,考慮2臺機、3臺機、任意m臺機的情況,分別給出了最壞情況界為4/3、3/2、m的近似算法且都是緊的。最后,我們對相關(guān)問題的后續(xù)研究作了討論。第五章總結(jié)全文并提出相關(guān)問題進一步的研究方向。
[Abstract]:MapReduce is a programming model proposed by Google, a related implementation of processing and generating large data sets. In this paper, the scheduling problem of Map Reduce parallel machines is studied, and the offline and two online sorting models of m similar machines are studied for minimizing the maximum completion time (makespan),. For maximizing the minimum completion time, the offline sorting model of the same machine is studied in this paper. The full text is divided into five chapters. The first chapter briefly introduces the basic knowledge of scheduling problem and the related knowledge of MapReduce. In the second chapter, we mainly study the off-line MapReduce scheduling problem of m similar machines, where map tasks can be processed in parallel, but reduce tasks can not be processed in parallel. The goal is to minimize the maximum completion time, considering the interruptible reduce task and the uninterruptible reduce task respectively. For interruptible reduce tasks, we design an approximate ratio of 2-(?) Where S1 鈮,
本文編號:2511438
[Abstract]:MapReduce is a programming model proposed by Google, a related implementation of processing and generating large data sets. In this paper, the scheduling problem of Map Reduce parallel machines is studied, and the offline and two online sorting models of m similar machines are studied for minimizing the maximum completion time (makespan),. For maximizing the minimum completion time, the offline sorting model of the same machine is studied in this paper. The full text is divided into five chapters. The first chapter briefly introduces the basic knowledge of scheduling problem and the related knowledge of MapReduce. In the second chapter, we mainly study the off-line MapReduce scheduling problem of m similar machines, where map tasks can be processed in parallel, but reduce tasks can not be processed in parallel. The goal is to minimize the maximum completion time, considering the interruptible reduce task and the uninterruptible reduce task respectively. For interruptible reduce tasks, we design an approximate ratio of 2-(?) Where S1 鈮,
本文編號:2511438
本文鏈接:http://sikaile.net/shoufeilunwen/benkebiyelunwen/2511438.html
最近更新
教材專著