
Design of a Low-Power Vector Arithmetic Unit and Research on the Reduction Network for the YHFT-Matrix DSP

Published: 2018-02-07 12:12

  Keywords: digital signal processor; vector operation technology; low power; arithmetic logic unit; divider; reduction network; logic verification. Source: National University of Defense Technology, master's thesis, 2012. Document type: degree thesis.


[Abstract]: A digital signal processor (DSP) is an embedded microprocessor particularly well suited to digital signal processing computation. As DSPs are widely applied in high-end fields such as communications and multimedia processing, the demands on their performance keep rising, so the research and design of high-performance DSPs has considerable scientific and practical value. This thesis builds on the development of the "YHFT-Matrix DSP", which targets software-defined radio, and aims to study and design a vector arithmetic unit and a reduction network that meet the demanding requirements of the YHFT-Matrix DSP.

The thesis studies the structural characteristics of DSPs and the implementation of vector operation techniques, and surveys how such techniques have been applied internationally in DSPs for 3G and 4G wireless communication. It outlines the architecture of the YHFT-Matrix DSP and the characteristics of its vector arithmetic units and vector data interaction network. It points out that the design of the vector arithmetic units must incorporate low-power techniques and that the vector data interaction network must be flexible and easy to use. Based on developer feedback, it also summarizes the functions of the existing arithmetic units that merit improvement.

Low-power design methodology and RTL-level low-power design techniques are applied to the design of the vector arithmetic units. A variable-width vector processing unit (VPU) is implemented with clock gating. The application requirements and related instructions of the fixed-point SIMD IALU are analyzed, and a low-power fixed-point SIMD IALU is designed and implemented around a carry-select SIMD adder combined with operand isolation. A fixed-point divider is designed using a radix-4 division algorithm based on a separated radix, combined with a low-power state assignment technique. The divider supports signed and unsigned division, has an 8/16/32-bit SISD/SIMD data path, and can work in either a fixed execution-cycle mode or a variable execution-cycle mode; the two modes suit the vector processing unit (VPU) and the scalar processing unit (SPU), respectively.

Taking matrix multiplication as an example, the thesis compares software and hardware implementations of reduction. The results show that the hardware implementation clearly accelerates the algorithm at the cost of additional area. In the design of the fixed-point reduction network, a reduction tree model is introduced to achieve complete, even grouping of the network, and loop programming of the network is supported by implicitly auto-incrementing the designated target VPE. The implementation of floating-point reduction is also studied. Because floating-point arithmetic units carry a very large hardware area overhead, the thesis argues that a floating-point reduction network should combine software and hardware. Based on the grouping modes of the fixed-point reduction network in the YHFT-Matrix DSP, it proposes an implementation scheme for a reduction network that supports floating-point mixed operations: the SPU configures the type of floating-point reduction operation, operands are moved through a dedicated shuffle network, and the floating-point units in the vector arithmetic unit perform the computation, completing the floating-point reduction.

The thesis describes the logic function verification flow of the YHFT-Matrix DSP and presents module-level testbenches for the arithmetic components, written in Verilog and Perl. The three implemented arithmetic components were synthesized with the DC synthesis tool in a TSMC 65nm process. Synthesis results and a performance comparison are given, showing that all three components meet the design requirement of a 700MHz operating frequency. The simulation testing of the 4-core YHFT-QMBase chip and the performance evaluation of a single core are also described.
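The grouped reduction described in the abstract can be illustrated in software. The following is a minimal sketch, not the thesis's hardware design: it models the lanes as a Python list, reduces them pairwise level by level as a reduction tree would, and splits the lanes into equal-sized groups that are reduced independently. The lane count of 16, the group sizes, and the names `tree_reduce` and `grouped_reduce` are assumptions made for illustration only.

```python
from operator import add
from typing import Callable, List, Sequence, Tuple


def tree_reduce(values: Sequence[int],
                op: Callable[[int, int], int] = add) -> Tuple[int, int]:
    """Pairwise tree reduction.

    Returns (result, levels), where levels is the number of tree levels
    traversed. All operations within one level are independent of each other.
    """
    level = list(values)
    if not level:
        raise ValueError("cannot reduce an empty sequence")
    levels = 0
    while len(level) > 1:
        # Combine neighbouring pairs; an odd element is carried forward.
        nxt = [op(level[i], level[i + 1]) for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:
            nxt.append(level[-1])
        level = nxt
        levels += 1
    return level[0], levels


def grouped_reduce(lanes: Sequence[int], group_size: int,
                   op: Callable[[int, int], int] = add) -> List[int]:
    """Split the lanes into equal groups and tree-reduce each group."""
    if group_size <= 0 or len(lanes) % group_size:
        raise ValueError("group_size must evenly divide the number of lanes")
    return [tree_reduce(lanes[i:i + group_size], op)[0]
            for i in range(0, len(lanes), group_size)]


if __name__ == "__main__":
    lanes = list(range(16))          # hypothetical 16-lane vector
    print(tree_reduce(lanes))        # full reduction and its level count
    print(grouped_reduce(lanes, 4))  # four independent groups of four lanes
```

A sequential software loop over n lanes needs n - 1 dependent operations, whereas the tree needs only ceil(log2 n) dependent levels. That gap is the kind of speedup a hardware reduction network buys in exchange for extra area.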
[Degree-granting institution]: National University of Defense Technology
[Degree level]: Master's
[Year degree awarded]: 2012
[Classification number]: TP332;TN402




Article ID: 1494349


Article link: http://sikaile.net/kejilunwen/jisuanjikexuelunwen/1494349.html

