A Progressive Filling Partitioning and Mapping Algorithm for Spark Based on Allocation Fitness Degree
Published: 2019-04-21 22:31
[Abstract]: This paper analyzes the job execution mechanism of Spark, builds an execution efficiency model and a Shuffle process model, defines the allocation fitness degree (AFD), and states the optimization objective of the algorithm. Solving from the model definitions, it designs the progressive filling partitioning and mapping algorithm (PFPM), which uses extended partitioning and progressive filling mapping to build a data allocation scheme matched to the computing capacity of each Reducer. This reduces the synchronization delay of the Shuffle process and improves cluster computing efficiency. Experiments show that the algorithm makes data allocation in the Shuffle process more reasonable and improves the job execution efficiency of the parallel computing framework Spark.
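[Author affiliations]: School of Software, Xinjiang University; School of Statistics and Information, Xinjiang University of Finance and Economics
[Funding]: National Natural Science Foundation of China (No.61262088, No.61462079, No.61562078, No.61363083, No.61562086); Natural Science Foundation of Xinjiang Uygur Autonomous Region (No.2017D01A20); Xinjiang Uygur Autonomous Region University Scientific Research Program (No.XJED2016S106); Doctoral Research Start-up Fund of Xinjiang University of Finance and Economics (No.2015BS007)
[CLC number]: TP311.13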
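The abstract describes matching the amount of shuffle data each Reducer receives to its computing capacity so that Reducers finish at about the same time. The listing does not give the AFD formula or the PFPM steps, so the following is only a minimal sketch of that general idea, not the paper's algorithm: a greedy, capacity-weighted assignment that uses "load divided by capacity" as a stand-in for a fitness measure. The function names `fill_by_capacity` and `finish_times`, the reducer labels, and the numeric capacities are all hypothetical.

```python
from typing import Dict, List


def fill_by_capacity(partition_sizes: List[float],
                     capacities: Dict[str, float]) -> Dict[str, List[int]]:
    """Greedy capacity-weighted assignment (illustrative; NOT the paper's PFPM)."""
    if not capacities or any(c <= 0 for c in capacities.values()):
        raise ValueError("every reducer needs a positive capacity")
    assignment: Dict[str, List[int]] = {r: [] for r in capacities}
    load = {r: 0.0 for r in capacities}
    # Place the largest partitions first so smaller ones can even out the tail.
    order = sorted(range(len(partition_sizes)),
                   key=lambda i: partition_sizes[i], reverse=True)
    for i in order:
        # Assumed stand-in for fitness: estimated finish time = load / capacity.
        best = min(capacities,
                   key=lambda r: (load[r] + partition_sizes[i]) / capacities[r])
        assignment[best].append(i)
        load[best] += partition_sizes[i]
    return assignment


def finish_times(partition_sizes: List[float],
                 capacities: Dict[str, float],
                 assignment: Dict[str, List[int]]) -> Dict[str, float]:
    """Estimated finish time of each reducer under the assumed linear cost model."""
    return {r: sum(partition_sizes[i] for i in parts) / capacities[r]
            for r, parts in assignment.items()}


if __name__ == "__main__":
    sizes = [4.0] * 6
    caps = {"fast": 2.0, "slow": 1.0}
    plan = fill_by_capacity(sizes, caps)
    print(plan, finish_times(sizes, caps, plan))
```

With six equal partitions and one reducer assumed to be twice as fast as the other, the sketch gives four partitions to the fast reducer and two to the slow one, so both are estimated to finish at the same time and neither waits on the other at the synchronization point.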
Article ID: 2462617
Article link: http://sikaile.net/kejilunwen/ruanjiangongchenglunwen/2462617.html