
Design, Verification, and Optimization of the M-DSP Fixed-Point Arithmetic Unit and Shuffle Unit

Published: 2018-10-09 21:48
[Abstract]: As data volumes grow in aerospace, communications, medicine, and other fields, and as demand for real-time information processing rises, high-performance DSP (Digital Signal Processing) has become a research focus at home and abroad. M-DSP is an independently developed 32-bit high-performance DSP built on an 11-issue very long instruction word (VLIW) architecture. It offers strong parallel computing capability and reaches a clock frequency of 1 GHz in a 40 nm process. Working on the M-DSP development platform, this thesis completes the design, optimization, and verification of the IALU unit and the shuffle unit. The main contributions are as follows.

First, based on the design requirements of M-DSP, the instruction set and microarchitecture of the IALU unit were designed, and two IALU designs with different strengths were implemented. One is a discrete-adder IALU structure built around a Kogge-Stone tree; it has better timing and lends itself to fine-grained power control through gating, but its area is larger. The other is an IALU structure based on a two-level carry-lookahead adder, in which most IALU instructions are implemented by reusing the adder; its area is smaller, but its structure is complex and its timing is comparatively worse. Given the design needs of M-DSP, the first scheme was adopted.

Second, conventional shuffle instructions require a Load instruction to load the shuffle pattern in advance, which occupies too many system register resources and takes many execution cycles. To overcome these problems, this thesis designs an efficient shuffle unit that separates configuration from execution and has dedicated shuffle pattern address registers and a shuffle pattern memory.

Third, a complete and detailed verification plan was designed around the characteristics of the IALU unit and the shuffle unit. Verification relies mainly on simulation and proceeds from the module level to the system level. Module-level verification covers function points, ATEC, and random-number tests; system-level verification covers global signals and instruction combinations. Coverage statistics were collected and analyzed to eliminate verification blind spots. In addition, formal verification was used to check that the post-synthesis netlist is consistent with the RTL code.

Fourth, timing of the IALU unit and the shuffle unit was optimized using tree-structured selection, logic optimization, and pipelining, and RTL-level power was optimized using clock gating, logic restructuring, operand isolation, and state encoding optimization. Finally, the IALU unit and the shuffle unit were synthesized with the Design Compiler synthesis tool in a 40 nm CMOS process. The IALU critical path delay is 400 ps with a total area of 7004.2372 μm²; the shuffle unit critical path delay is 430 ps with a total area of 151811.721 μm². The results show that performance and area meet the design requirements of M-DSP.
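The abstract names a Kogge-Stone tree as the core of the adopted IALU adder but gives no implementation details. As a rough illustration of the general technique only (not the thesis's RTL), the sketch below models a 32-bit Kogge-Stone parallel-prefix adder in Python. The function name `kogge_stone_add`, the default `width`, and the zero carry-in are assumptions made for this example.

```python
def kogge_stone_add(a: int, b: int, width: int = 32) -> tuple[int, int]:
    """Software model of a Kogge-Stone adder (hypothetical helper, carry-in = 0).

    Returns (sum, carry_out) for two unsigned `width`-bit operands.
    """
    mask = (1 << width) - 1
    a &= mask
    b &= mask

    g = a & b        # generate: bit i creates a carry on its own
    p = a ^ b        # propagate: bit i passes an incoming carry along
    half_sum = p     # kept for the final sum stage

    # Parallel-prefix stage: log2(width) levels, doubling the span each time.
    # Every bit position is combined with the group ending `dist` bits below.
    dist = 1
    while dist < width:
        g = (g | (p & (g << dist))) & mask
        p = (p & (p << dist)) & mask
        dist <<= 1

    # g[i] is now the carry out of bits 0..i, so the carry into bit i
    # is g[i - 1], which a one-bit left shift lines up with position i.
    carries = (g << 1) & mask
    total = half_sum ^ carries
    carry_out = (g >> (width - 1)) & 1
    return total, carry_out
```

For 32-bit operands the prefix loop runs five levels, which is the logarithmic depth that gives this adder family its short critical path; the price is the dense wiring between levels, consistent with the larger area the abstract reports for this scheme.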
[Degree-granting institution]: National University of Defense Technology
[Degree level]: Master's
[Year degree granted]: 2015
[Classification number]: TP332

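Related master's theses (top 2)

1 Wang Feng; Design, Verification, and Optimization of the M-DSP Fixed-Point Arithmetic Unit and Shuffle Unit [D]; National University of Defense Technology; 2015

2 Peng Hao; Design and Implementation of the X-DSP 64-bit SIMD Bit-Processing Unit and Shuffle Unit [D]; National University of Defense Technology; 2013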


