
Research and Implementation of the Control Core of the ESCA High-Performance Processor

Published: 2018-01-09 20:20

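  Keywords: Research and Implementation of the Control Core of the ESCA High-Performance Processor  Source: Huazhong University of Science and Technology, 2012 master's thesis  Document type: degree thesis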


  Related topics: high-performance computing, hybrid computing, multi-core coprocessor, ESCA, control core, explicit memory access mechanism, hardware/software co-verification


[Abstract]: Hybrid computing architectures use heterogeneous processors to exploit the architectural strengths of each processor type, optimizing control-intensive and compute-intensive tasks separately and accelerating applications cooperatively. This approach has become one of the major trends in high-performance computing architecture. Based on the hybrid computing idea, our project team designed ESCA (Engineering and Scientific Computing Accelerator), a high-performance multi-core processor for engineering and scientific computing and for multimedia applications. ESCA works as a coprocessor that accelerates the compute-intensive tasks of an application, and it achieves high performance through SIMD, vector, and sub-word techniques.

The ESCA processor consists of two parts, a control core and a computing array. This thesis focuses on the key techniques of the control core and their implementation.

The thesis first introduces the relevant models from the perspective of the ESCA system, then describes the key architectural elements of the ESCA processor: its instruction set, hardware framework, and memory organization. On this basis, it determines the specific functional responsibilities of the control core and defines its microarchitecture. The control core instruction set uses hierarchical encoding and extends the control instructions to support special control flows. An explicit memory access mechanism is proposed to optimize large-scale, regular data transfers. The hardware implementation is organized around the pipeline and aims for a trade-off between performance and overhead. The complex control flow of the control core is verified with a hardware/software co-verification method on a purpose-built hybrid verification platform, whose automated flow greatly shortens the verification cycle.

The final ESCA design was implemented as a silicon prototype with an operating frequency of 250 MHz and a total area of 17676582.00 μm², of which the control core occupies 3107821.56 μm², a hardware overhead ratio of 17.58%. Using DGEMM as the benchmark, the explicit memory access mechanism was evaluated: the hidden memory access latency reaches 56% of the total running time, and a speedup of 1.5x is obtained. This shows that the mechanism effectively compensates for the speed gap between computation and memory access and improves the computing efficiency of the system.
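The area figures in the abstract can be checked with a few lines of arithmetic, and the idea behind hiding memory access latency can be illustrated with a toy timing model. The sketch below is only an illustration: the two area constants are taken from the abstract, while the function names, the double-buffering model, and the per-tile costs are assumptions introduced here and do not describe the thesis's actual mechanism or measurements.

```python
# Area figures quoted in the abstract, in square micrometres.
TOTAL_AREA_UM2 = 17_676_582.00
CONTROL_CORE_AREA_UM2 = 3_107_821.56


def overhead_ratio(part: float, whole: float) -> float:
    """Return part/whole as a percentage."""
    return 100.0 * part / whole


def serial_time(compute: float, transfer: float, tiles: int) -> float:
    """Toy model: each tile is transferred, then computed, with no overlap."""
    return tiles * (compute + transfer)


def overlapped_time(compute: float, transfer: float, tiles: int) -> float:
    """Toy double-buffering model (an assumption, not the thesis's design).

    The transfer of tile i+1 overlaps the computation of tile i, so only
    the first transfer and the last computation are fully exposed; every
    step in between costs the slower of the two activities.
    """
    return transfer + (tiles - 1) * max(compute, transfer) + compute


if __name__ == "__main__":
    share = overhead_ratio(CONTROL_CORE_AREA_UM2, TOTAL_AREA_UM2)
    print(f"control core share of total area: {share:.2f}%")

    # Hypothetical per-tile costs in arbitrary time units, NOT thesis data.
    compute, transfer, tiles = 2.0, 1.0, 100
    speedup = serial_time(compute, transfer, tiles) / overlapped_time(
        compute, transfer, tiles
    )
    print(f"toy speedup from overlapping transfers: {speedup:.2f}x")
```

In this model the benefit of overlap depends on the ratio of transfer time to compute time: the closer the two are, the larger the fraction of the run time that can be hidden.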
[Degree-granting institution]: Huazhong University of Science and Technology
[Degree level]: Master's
[Year degree awarded]: 2012
[Classification number]: TP332


Article ID: 1402505


Article link: http://sikaile.net/kejilunwen/jisuanjikexuelunwen/1402505.html

