Research on the Architecture and VLSI Implementation of an Energy-Efficient Hybrid Floating-Point FFT Hardware Accelerator
Source: Fudan University, master's thesis, 2014. Document type: degree thesis
Keywords: fast Fourier transform (FFT); energy-efficient circuits; low-cost circuits; hybrid floating point
[Abstract]: The fast Fourier transform (FFT) is one of the most widely used algorithms in digital signal processing and has long been a research focus in the field. Today the FFT is a key processing block in many emerging applications, such as handheld mobile communication systems based on orthogonal frequency-division multiplexing (OFDM) and biomedical electronic signal processing platforms. These applications share a notable requirement: the whole system must consume very little power in order to extend the product's operating time. They also require good adaptability, so that the system delivers satisfactory results for different input signals. An FFT hardware accelerator must therefore be energy-efficient, low-cost, and highly flexible while guaranteeing a given signal-to-quantization-noise ratio (SQNR) at its output.
To meet these requirements, this thesis optimizes the design and implementation of an FFT hardware accelerator at both the algorithm level and the circuit level. At the algorithm level, the thesis surveys the data representation formats commonly used in FFT hardware, including fixed-point formats, floating-point formats, and methods based on fixed-point scaling. Building on these formats, it proposes a hybrid floating-point method with dynamic bias adjustment. The method combines the exponent field of a floating-point format with the fraction field of a fixed-point format, and lets the real and imaginary parts of a complex number share a single exponent field. This reduces hardware cost and power consumption while preserving data precision. In addition, dynamic bias adjustment changes the representable data range during the computation according to the input signal, which raises the overall SQNR. This mechanism gives the accelerator both flexibility and high-precision output. As a result, an FFT hardware accelerator using the hybrid floating-point method with dynamic bias adjustment achieves a relatively high SQNR with a relatively small data word length, which lowers power consumption and cost.
At the circuit level, the accelerator uses a single-memory architecture to reduce hardware overhead. Several techniques are applied in the datapath to lower power consumption and improve SQNR. First, the thesis analyzes and reduces the floating-point normalization operations required in the butterfly computation, from 15 operations to 4. Second, it analyzes and narrows the datapath bit widths required in the butterfly computation; with a fraction width of 9 bits, the intermediate processing width can be reduced by up to 6 bits. Third, it adopts the Trounding rounding strategy, which reduces quantization error as far as possible without adding much hardware overhead.
Finally, the thesis considers FFT hardware accelerator design based on low-voltage memory. It first outlines the types and causes of memory faults, then describes a method for analyzing and simulating the memory fault rate at a given voltage, and then gives the relationship between the fault rate, the voltage, and the circuit frequency. Based on this relationship, it derives the SQNR of the FFT hardware accelerator at a given memory voltage and the corresponding power savings.
The proposed FFT hardware accelerator computes transforms of 64 to 8192 points. With a data word length of 3+2*9 bits, a memory voltage of 0.7 V, and the SMIC 65 nm process, the accelerator runs at 400 MHz, occupies 0.482 mm², and consumes 35.3 mW. The SQNR is 41.6 dB for the 64-point transform and 35.8 dB for the 8192-point transform.
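As an illustration only, the shared-exponent idea in the abstract can be sketched in a few lines of Python. This is not the thesis's implementation. It assumes that the 3+2*9-bit word means a 3-bit exponent shared by two 9-bit signed fractions (real and imaginary), it uses plain round-half-up instead of the Trounding strategy, and it does not model dynamic bias adjustment. The exponent offset and all names (`encode`, `decode`, `fixed_point`, `sqnr_db`) are hypothetical choices made for the example.

```python
import cmath
import math

EXP_BITS = 3    # assumed width of the shared exponent field
FRAC_BITS = 9   # assumed width of each signed fraction field (real, imag)

E_MAX = (1 << EXP_BITS) - 1          # largest exponent code
M_MAX = (1 << (FRAC_BITS - 1)) - 1   # largest signed fraction code
M_MIN = -(1 << (FRAC_BITS - 1))      # smallest signed fraction code


def step(e):
    """Weight of one fraction LSB for exponent code e (e = E_MAX spans [-1, 1))."""
    return 2.0 ** (e - E_MAX - (FRAC_BITS - 1))


def round_clip(x):
    """Round half up (an assumption, not Trounding), then saturate."""
    return max(M_MIN, min(M_MAX, math.floor(x + 0.5)))


def encode(z):
    """Pick the smallest shared exponent that fits both parts, then quantize."""
    peak = max(abs(z.real), abs(z.imag))
    e = 0
    while e < E_MAX and peak > M_MAX * step(e):
        e += 1
    return e, round_clip(z.real / step(e)), round_clip(z.imag / step(e))


def decode(e, m_re, m_im):
    """Rebuild the complex value from the shared exponent and two fractions."""
    return complex(m_re, m_im) * step(e)


def fixed_point(z):
    """Fixed-point reference: same fraction width, exponent pinned at E_MAX."""
    s = step(E_MAX)
    return complex(round_clip(z.real / s), round_clip(z.imag / s)) * s


def sqnr_db(ref, approx):
    """Signal-to-quantization-noise ratio in dB over two equal-length sequences."""
    signal = sum(abs(r) ** 2 for r in ref)
    noise = sum(abs(r - a) ** 2 for r, a in zip(ref, approx))
    return math.inf if noise == 0 else 10.0 * math.log10(signal / noise)


if __name__ == "__main__":
    n = 64
    # Low-amplitude complex tone: a case where a fixed scale wastes most bits.
    x = [0.01 * cmath.exp(2j * math.pi * 3 * k / n) for k in range(n)]
    hybrid = [decode(*encode(z)) for z in x]
    fixed = [fixed_point(z) for z in x]
    print("shared-exponent SQNR (dB):", sqnr_db(x, hybrid))
    print("fixed-point SQNR (dB):   ", sqnr_db(x, fixed))
```

The sketch only quantizes input samples; it does not run an FFT, so its SQNR values are not comparable with the figures reported in the abstract. It shows why a small shared exponent helps: for a low-amplitude signal the step size shrinks with the exponent, whereas the fixed-point reference keeps a step sized for full-scale inputs.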
[Degree-granting institution]: Fudan University
[Degree level]: Master's
[Year degree granted]: 2014
[Classification number]: TN911.72; TN47
[Citation]: 薄一帆. Research on the Architecture and VLSI Implementation of an Energy-Efficient Hybrid Floating-Point FFT Hardware Accelerator [D]. Fudan University, 2014.
Article link: http://sikaile.net/kejilunwen/wltx/1439788.html