Implementation of a High-Performance Hardware Accelerator

Published: 2018-10-18 20:32
[Abstract]: In modern complex digital signal processing, algorithm complexity and data volume keep growing, and general-purpose processors can no longer meet the high-speed, real-time processing requirements of some specific applications. Heterogeneous multi-core systems assign different computing tasks to different processor cores for parallel processing, which accelerates task execution and provides a more efficient and flexible processing mechanism that serves a variety of applications. Hardware accelerators raise the speed of scientific computation in application-specific settings, so heterogeneous multi-core architectures with integrated hardware accelerators have emerged. Some multi-core processors integrate dedicated accelerator cores to speed up particular applications, but their flexibility is limited. Applying reconfigurable technology to hardware accelerators can make up for the shortcomings of general-purpose and software computing in performance and flexibility, and offers a better solution for complex high-speed signal processing. Following this trend, this thesis studies reconfigurable computing technology, hardware accelerators, and the integration of hardware accelerators into heterogeneous multi-core systems. The work covers three aspects: (1) Based on the requirements of the target application, the characteristics of high-density computing applications and of matrix algorithms are analyzed to identify matrix computation methods that are highly parallel and can effectively improve system performance. Matrix-operation algorithms are then optimized for the target application and platform, yielding a mixed-granularity parallel matrix inversion method with in-place replacement. (2) Building on the optimized algorithm and structure, a hardware architecture for the optimized matrix inversion algorithm is proposed, and a reconfigurable high-performance hardware accelerator for heterogeneous multi-core systems is designed. The accelerator mainly targets matrix operations in high-density computing, in particular matrix inversion: it efficiently inverts arbitrary single-precision floating-point real matrices of order 2n up to order 128. (3) The accelerator is experimentally verified and its performance analyzed on a Xilinx V6 FPGA, its integration into a heterogeneous multi-core system is described, and the results confirm that the designed accelerator achieves high performance.
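
The abstract names an in-place matrix inversion method but does not spell out the algorithm. As a minimal sketch of the general idea only, the following assumes Gauss-Jordan elimination with partial pivoting, which may differ from the scheme used in the thesis. The inverse overwrites the input matrix, so no second n×n buffer is needed. The names `invert_in_place` and `matmul` and the singularity threshold are hypothetical, and the code runs sequentially in software: it does not model the mixed-granularity parallelism or the hardware architecture.

```python
def invert_in_place(a, tol=1e-12):
    """Invert the square matrix `a` (list of lists of floats) in place.

    Illustrative Gauss-Jordan elimination with partial pivoting; this is an
    assumed algorithm, not necessarily the one used in the thesis.
    `tol` is an arbitrary absolute threshold for detecting singularity.
    """
    n = len(a)
    pivots = []  # pivots[k] = row that was swapped with row k at step k
    for k in range(n):
        # Partial pivoting: pick the largest entry in column k, rows k..n-1.
        p = max(range(k, n), key=lambda i: abs(a[i][k]))
        if abs(a[p][k]) < tol:
            raise ValueError("matrix is singular or nearly singular")
        a[k], a[p] = a[p], a[k]
        pivots.append(p)

        # Column k of the (implicit) identity is stored where column k of
        # the input used to be, so the pivot slot starts as 1 before scaling.
        piv = a[k][k]
        a[k][k] = 1.0
        for j in range(n):
            a[k][j] /= piv

        # Eliminate column k from every other row, again reusing the slot.
        for i in range(n):
            if i != k:
                f = a[i][k]
                a[i][k] = 0.0
                for j in range(n):
                    a[i][j] -= f * a[k][j]

    # Row swaps on the input correspond to column swaps on the inverse;
    # undo them in reverse order.
    for k in reversed(range(n)):
        p = pivots[k]
        if p != k:
            for row in a:
                row[k], row[p] = row[p], row[k]
    return a


def matmul(x, y):
    """Plain matrix product, used only to check the result."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))] for i in range(len(x))]


if __name__ == "__main__":
    m = [[1.0, 12.0, 3.0, 1.0],
         [10.0, 1.0, 2.0, 3.0],
         [3.0, 1.0, 4.0, 20.0],
         [2.0, 3.0, 15.0, 4.0]]
    original = [row[:] for row in m]
    invert_in_place(m)
    product = matmul(original, m)
    err = max(abs(product[i][j] - (1.0 if i == j else 0.0))
              for i in range(4) for j in range(4))
    print("max deviation from identity:", err)
```

Python floats are double precision, whereas the accelerator described above works in single precision, so the numerical error of this sketch is not representative of the hardware.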
[Degree-granting institution]: Hefei University of Technology
[Degree level]: Master's
[Year degree granted]: 2017
[Classification number]: TP332
