
Program Parallelization Methods for Multi-Core Systems

Published: 2018-05-25 10:15

Topic: multiprocessor system-on-chip + parallel programming; Source: Zhejiang University, master's thesis, 2012


[Abstract]: Driven by shrinking integrated-circuit process technology and by application demand, high-performance microprocessors integrate multiple processor units on a single chip to raise performance and reduce power consumption. Making full use of the hardware resources of a multi-core system, however, requires solving the problem of parallel programming for applications, and that problem is very complex: 1) most programmers have long used serial programming models and languages; 2) a large body of existing applications and algorithm descriptions is written in serial languages; 3) given the variety of applications and multiprocessor architectures, a general-purpose parallel programming model is very hard to find. As a result, researchers still lack a good method for parallelizing applications.

To address these difficulties, we propose a parallelization method for C programs in which the programmer directs and tools assist, so that the parallelism of the source program is exploited semi-automatically. The method draws on the characteristics of streaming applications and focuses on multi-level loop structures to uncover parallelism hidden in the application, yielding a strategy that suits both task parallelism and pipeline parallelism.

The proposed method has four steps. 1) Analyze the source code, combining profiling driven by runs on typical data with dependence analysis, to obtain the application's execution cost, conditional-branch outcomes, function call relationships, storage footprint, the variables each function accesses, and their dependences, and from these build a task dependence graph model. 2) Transform the task dependence graph, eliminating control dependences and redundant inter-iteration data dependences and aggregating cyclic dependences, to produce a directed acyclic graph suitable for scheduling. 3) Schedule the tasks: each task is assigned a two-level priority, and task mapping takes full account of the mutual exclusivity of branch tasks. A heuristic approach with feedback optimization refines the result step by step to arrive at a parallelization scheme. 4) Encapsulate the tasks, generate executable code, and evaluate performance on a multi-core platform.

We use a T264 decoder and an AES encryption program as test subjects and evaluate them on 2-core, 4-core, and 8-core hardware platforms. On the 8-core platform, the method achieves speedups of 5.12 for AES and 5.62 for T264 relative to a single-core processor. The experimental results demonstrate that the method is effective and scales well.

The method builds on earlier parallelization research and adopts a heuristic framework, while adding features of its own: 1) It defines a task dependence graph (TDG) model, which is applicable to any application and effectively represents both the control and the data dependences in a program. 2) For task graphs with intricate dependences, it proposes transformation rules that differ from the traditional aggregation-elimination method; distinct techniques are applied to branches, loops, and inter-iteration dependences, which effectively removes cyclic dependences. 3) It combines dynamic and static priorities. The dynamic priority ensures that tasks on the critical path are always handled first, while the static priority weighs a task's computational cost together with its storage footprint. The task-to-processor mapping takes minimal execution time as its optimization goal and also accounts for branch selection. 4) It studies not only task parallelism but also pipeline parallelism techniques for multi-level loop structures.
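The abstract's second step turns a task dependence graph that may contain cycles into a directed acyclic graph. The thesis states that its own transformation rules differ from traditional aggregation, and the abstract does not spell them out, so the following is only a minimal sketch of the generic idea of aggregating cyclic dependences: each strongly connected component is collapsed into one super-task using Tarjan's algorithm. The function name `condense_cycles` and the task names are hypothetical.

```python
from collections import defaultdict

def condense_cycles(edges):
    """Merge every dependence cycle into one super-task.

    edges: iterable of (producer, consumer) task pairs.
    Returns (component_of, dag_edges), where component_of maps each task
    to the frozenset of tasks it was merged with, and dag_edges holds the
    dependences that remain between different components.
    """
    edges = list(edges)
    graph = defaultdict(list)
    nodes = set()
    for src, dst in edges:
        graph[src].append(dst)
        nodes.update((src, dst))

    index, low = {}, {}          # Tarjan discovery index and low-link
    stack, on_stack = [], set()
    component_of = {}
    counter = [0]

    def visit(v):
        index[v] = low[v] = counter[0]
        counter[0] += 1
        stack.append(v)
        on_stack.add(v)
        for w in graph[v]:
            if w not in index:
                visit(w)
                low[v] = min(low[v], low[w])
            elif w in on_stack:
                low[v] = min(low[v], index[w])
        if low[v] == index[v]:   # v is the root of a component
            members = []
            while True:
                w = stack.pop()
                on_stack.discard(w)
                members.append(w)
                if w == v:
                    break
            label = frozenset(members)
            for w in members:
                component_of[w] = label

    for v in sorted(nodes):
        if v not in index:
            visit(v)

    dag_edges = {
        (component_of[s], component_of[d])
        for s, d in edges
        if component_of[s] != component_of[d]
    }
    return component_of, dag_edges

# Hypothetical decoder-like task graph with a decode <-> filter cycle.
edges = [("read", "decode"), ("decode", "filter"),
         ("filter", "decode"), ("filter", "write")]
component_of, dag_edges = condense_cycles(edges)
```

In this example `decode` and `filter` end up in the same component, so the scheduler sees three tasks in a chain instead of a graph with a cycle. The sketch uses recursion and is meant for small graphs only.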
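The two-level priority described in the abstract can be illustrated with a small list scheduler. This is a sketch under stated assumptions, not the thesis's algorithm: the first-level key is the remaining critical-path length (bottom level), standing in for the dynamic priority; the second-level key prefers larger cost and then smaller memory footprint, standing in for the static priority. Communication cost, branch mutual exclusivity, and the feedback-refinement loop are all omitted, and `list_schedule` and its parameters are hypothetical names.

```python
from functools import lru_cache

def list_schedule(cost, memory, succ, num_procs):
    """Greedy list scheduling of a DAG onto num_procs identical cores.

    cost[t]   : execution time of task t (assumed known from profiling)
    memory[t] : storage footprint of task t (used only as a tie-breaker)
    succ[t]   : tasks that depend on t
    Returns {task: (processor, start, finish)}.
    """
    tasks = list(cost)
    pred = {t: [] for t in tasks}
    for t in tasks:
        for s in succ.get(t, []):
            pred[s].append(t)

    @lru_cache(maxsize=None)
    def bottom_level(t):
        # Longest path from t to any exit task, including t's own cost.
        return cost[t] + max((bottom_level(s) for s in succ.get(t, [])),
                             default=0)

    def priority(t):
        # Level 1: critical path first. Level 2: cost, then memory.
        return (-bottom_level(t), -cost[t], memory[t], t)

    proc_free = [0] * num_procs              # time each core becomes idle
    remaining = {t: len(pred[t]) for t in tasks}
    ready = [t for t in tasks if remaining[t] == 0]
    schedule = {}

    while ready:
        ready.sort(key=priority)
        t = ready.pop(0)
        data_ready = max((schedule[p][2] for p in pred[t]), default=0)
        # Choose the core on which the task can start (and finish) earliest.
        proc = min(range(num_procs),
                   key=lambda i: max(proc_free[i], data_ready))
        start = max(proc_free[proc], data_ready)
        finish = start + cost[t]
        schedule[t] = (proc, start, finish)
        proc_free[proc] = finish
        for s in succ.get(t, []):
            remaining[s] -= 1
            if remaining[s] == 0:
                ready.append(s)
    return schedule

# Hypothetical diamond-shaped task graph: A feeds B and C, which feed D.
cost = {"A": 2, "B": 3, "C": 1, "D": 2}
memory = {"A": 4, "B": 8, "C": 2, "D": 4}
succ = {"A": ["B", "C"], "B": ["D"], "C": ["D"]}
schedule = list_schedule(cost, memory, succ, num_procs=2)
makespan = max(finish for _, _, finish in schedule.values())
```

With two cores, `B` and `C` run side by side after `A`, so the makespan equals the critical path A, B, D rather than the sum of all task costs.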
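The reported 8-core speedups can be put in perspective with parallel efficiency, defined as speedup divided by core count. The snippet below only does that arithmetic on the two figures given in the abstract; the helper name `parallel_efficiency` is hypothetical, and the abstract gives no 2-core or 4-core figures, so none are shown.

```python
def parallel_efficiency(speedup, cores):
    """Fraction of ideal linear speedup actually achieved."""
    if cores <= 0:
        raise ValueError("cores must be positive")
    return speedup / cores

# 8-core speedups relative to one core, as reported in the abstract.
reported = {"AES": 5.12, "T264": 5.62}
efficiency = {name: parallel_efficiency(s, 8) for name, s in reported.items()}

for name, value in efficiency.items():
    print(f"{name}: {value:.1%} of linear speedup on 8 cores")
```

The two programs reach roughly 64% and 70% of ideal linear speedup on eight cores.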
[Degree-granting institution]: Zhejiang University
[Degree level]: Master's
[Year degree conferred]: 2012
[Classification number]: TP311.1; TP332


Article ID: 1933060


Article link: http://sikaile.net/kejilunwen/jisuanjikexuelunwen/1933060.html

