Design, Verification, and Optimization of the X-DSP Multiplication Unit
Published: 2018-12-27 06:43
[Abstract]: X-DSP is an independently forward-designed 32-bit high-performance digital signal processor that supports both floating-point and fixed-point operations. It adopts a very long instruction word (VLIW) architecture and single instruction, multiple data (SIMD) technology. The multiplication unit is one of the four functional execution units of the CPU core. Following the X-DSP design requirements, this thesis develops a high-performance SIMD multiplication unit that supports fixed-point and floating-point multiplication and meets the DSP's needs for parallel computation, high precision, and real-time data processing. The main contributions are as follows:
1. Design of the multiplication unit. The instructions of the multiplication unit are analyzed first. Based on that analysis, the structures of the fixed-point and floating-point multipliers are designed. The logic design of the SIMD multiplication unit is then implemented with a multi-data-stream multiplication matrix algorithm, a Wallace tree structure, and a carry-lookahead adder.
2. Timing optimization of the multiplication unit. Logic synthesis is run on the multiplication unit to identify the critical path. The functional modules on the critical path are then redesigned. Finally, the timing of the whole unit is optimized at the logic-structure and algorithm level and at the code level. After optimization, in a 45 nm CMOS process and with area, power, and the other metrics still meeting the design requirements, the critical path delay is reduced by 190 ps, timing performance improves by 22.4%, and the number of registers is reduced by 18.3%.
3. Functional verification of the multiplication unit. Both simulation-based verification and FPGA-based verification are used. The key to simulation-based verification is test vector development; test vectors are developed at the module level and the system level using a functional-coverage approach. Module-level verification derives test vectors from the function each module implements. System-level verification is divided into pipeline verification and arithmetic-function verification. Finally, the multiplication unit is verified on an FPGA.
In a 45 nm CMOS process, place-and-route results show that under worst-case conditions the multiplication unit reaches a clock frequency of 1 GHz, with a dynamic power of 12.6686 mW, a static power of 4.5032 mW, and an area of 202718.88 μm², fully meeting the X-DSP design targets.
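As an illustration of the Wallace tree idea named in the abstract, the following is a minimal behavioral sketch in Python. It is not the thesis's design: the function names, the 16-bit default width, and the unsigned-only operands are assumptions made for illustration, and SIMD lane partitioning and floating-point handling are not modeled. Partial products are reduced level by level with 3:2 carry-save compressors until two rows remain, and one final addition (the role a carry-lookahead adder would play in hardware) produces the product.

```python
from __future__ import annotations


def partial_products(a: int, b: int, width: int) -> list[int]:
    """Return one shifted copy of `a` for every set bit of `b` (unsigned)."""
    mask = (1 << width) - 1
    a &= mask
    b &= mask
    return [a << i for i in range(width) if (b >> i) & 1]


def compress_3_to_2(x: int, y: int, z: int) -> tuple[int, int]:
    """Carry-save adder: three rows in, a sum row and a carry row out.

    The two outputs always satisfy sum_row + carry_row == x + y + z.
    """
    sum_row = x ^ y ^ z
    carry_row = ((x & y) | (x & z) | (y & z)) << 1
    return sum_row, carry_row


def wallace_reduce(rows: list[int]) -> tuple[int, int]:
    """Reduce the rows level by level until at most two remain."""
    rows = list(rows)
    while len(rows) > 2:
        next_rows: list[int] = []
        full = len(rows) - len(rows) % 3  # rows that form complete groups of 3
        for i in range(0, full, 3):
            next_rows.extend(compress_3_to_2(rows[i], rows[i + 1], rows[i + 2]))
        next_rows.extend(rows[full:])  # leftover rows pass to the next level
        rows = next_rows
    rows += [0] * (2 - len(rows))  # pad so there are always two rows
    return rows[0], rows[1]


def multiply_unsigned(a: int, b: int, width: int = 16) -> int:
    """Multiply two unsigned `width`-bit values via carry-save reduction."""
    sum_row, carry_row = wallace_reduce(partial_products(a, b, width))
    # This single carry-propagating add stands in for the final adder stage.
    return sum_row + carry_row
```

The reduction levels contain no carry propagation, which is why this structure is attractive for shortening a multiplier's critical path; only the last addition propagates carries across the full width.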
[Degree-granting institution]: National University of Defense Technology
[Degree level]: Master's
[Year degree conferred]: 2013
[Classification number]: TP332
Article ID: 2392651
Article link: http://sikaile.net/kejilunwen/jisuanjikexuelunwen/2392651.html