Research on Task Partitioning and Scheduling Methods for Novel PIM Heterogeneous Systems
Published: 2018-03-20 02:03
Topic: PIM  Focus: memory wall  Source: Hefei University of Technology, 2017 master's thesis  Document type: degree thesis
[Abstract]: As computing systems enter the big-data era, Processing in Memory (PIM) is regarded as a revolutionary new architecture for alleviating the memory wall. The core idea of PIM, also called Near Data Computing (NDC), is to couple storage components tightly with computing resources, thereby removing the memory-bandwidth bottleneck and the overhead of moving data between the processor and memory. A system that contains both a PIM structure and a host processor is a new kind of heterogeneous parallel computing architecture. In recent years researchers have repeatedly proposed PIM structures that combine emerging memory or 3D integration technologies, but general-purpose task scheduling methods suited to this class of heterogeneous platforms have received little study. To address the hardware redundancy and lack of generality of existing customized PIM structures, this thesis proposes a formal model, based on task graph analysis, that quantifies the performance and energy consumption of the PIM+CPU heterogeneous parallel architecture, and presents, for the first time, an application partitioning and mapping framework for that architecture. In this framework an application is partitioned into a set of subtasks, and the proposed execution-unit mapping algorithm PEFT (PIM-oriented Earliest-Finish-Time) schedules each subtask onto a suitable execution unit (CPU or PIM), so that the processor and the PIM structure execute subtasks in parallel and the performance gain from introducing PIM is maximized. After producing an optimal subtask scheduling order, PEFT selects for each subtask the processing unit that yields the minimum finish time. The evaluation uses data-intensive machine learning applications as the benchmark set and a real 3D DRAM product, HMC-2.0, as the memory device. Experimental results show that, compared with a traditional computing architecture, the proposed application partitioning and mapping framework reduces application execution time by 46% on average, significantly improving system performance.
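The abstract does not give the details of PEFT, so the following is only a minimal sketch of the general idea it describes: order the subtasks of a task graph, then place each one on whichever unit (CPU or PIM) finishes it earliest. It uses a generic HEFT-style upward rank for the ordering, which is an assumption and not necessarily the thesis's method. The task names, run times, and transfer costs are all hypothetical.

```python
from functools import lru_cache

UNITS = ("CPU", "PIM")

# Hypothetical task graph: per-unit run times, successors, and the transfer
# cost paid on an edge when its two endpoints run on different units.
COST = {
    "load":   {"CPU": 4, "PIM": 2},
    "gemm":   {"CPU": 3, "PIM": 6},
    "reduce": {"CPU": 5, "PIM": 2},
    "merge":  {"CPU": 2, "PIM": 4},
}
SUCC = {"load": ["gemm", "reduce"], "gemm": ["merge"], "reduce": ["merge"], "merge": []}
COMM = {("load", "gemm"): 1, ("load", "reduce"): 1,
        ("gemm", "merge"): 1, ("reduce", "merge"): 1}


def schedule(cost, succ, comm):
    """Return {task: (unit, start, finish)} using earliest-finish-time placement."""
    # Build the predecessor lists from the successor lists.
    pred = {t: [] for t in cost}
    for parent, children in succ.items():
        for child in children:
            pred[child].append(parent)

    @lru_cache(maxsize=None)
    def rank(task):
        # Upward rank: average run time plus the longest path to an exit task.
        avg = sum(cost[task].values()) / len(cost[task])
        tail = max((comm.get((task, s), 0) + rank(s) for s in succ.get(task, ())),
                   default=0)
        return avg + tail

    # With positive run times, descending rank is a valid topological order.
    order = sorted(cost, key=rank, reverse=True)

    free_at = {u: 0 for u in UNITS}   # time at which each unit becomes idle
    placed = {}
    for task in order:
        best = None
        for unit in UNITS:
            # Inputs are ready once every predecessor has finished and, if it
            # ran on the other unit, its data has been transferred.
            ready = max((placed[p][2] + (0 if placed[p][0] == unit else comm.get((p, task), 0))
                         for p in pred[task]), default=0)
            start = max(ready, free_at[unit])
            finish = start + cost[task][unit]
            if best is None or finish < best[2]:
                best = (unit, start, finish)
        placed[task] = best
        free_at[best[0]] = best[2]
    return placed


plan = schedule(COST, SUCC, COMM)
makespan = max(finish for _, _, finish in plan.values())
cpu_only = sum(c["CPU"] for c in COST.values())   # serial run on the CPU alone
for task, (unit, start, finish) in plan.items():
    print(f"{task:7s} -> {unit} [{start}, {finish}]")
print("makespan:", makespan, "CPU-only:", cpu_only)
```

In this toy graph the memory-bound tasks land on the PIM unit and the compute-bound ones on the CPU, which shortens the overall finish time relative to running everything serially on the CPU. The numbers are illustrative only and are unrelated to the 46% result reported in the thesis.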
[Degree-granting institution]: Hefei University of Technology
[Degree level]: Master's
[Year degree awarded]: 2017
[Classification number]: TP333
Article ID: 1636994
Article link: http://sikaile.net/kejilunwen/jisuanjikexuelunwen/1636994.html