基于CUDA的FFT并行計(jì)算研究
[Abstract]:Discrete Fourier transform (DFT) is an important mathematical transformation commonly used in digital signal processing system. The feasibility, complexity and efficiency of the algorithm are the important factors that affect the calculation results. In recent years, GPU is developing at a speed that greatly exceeds Moore's Law. The single precision floating-point processing capability and external memory bandwidth of mainstream GPUs are obviously superior to those of CPU in the same period. General computing based on graphics hardware GPU is becoming a research hotspot in parallel field. In particular, NVIDIA introduced the CUDA unified computing equipment architecture in 2007, which has been greatly improved in programming, optimization and so on, greatly enhanced the GPU's general computing capability .CUDA does not need to use graphics API, using C language to develop. Make it easier for developers to transition from CPU programming mode to GPU programming mode. With the development of GPU's programmable ability, parallel processing ability and application scope, GPU has developed into a highly parallel, multithreaded, multi-core processor. Using the parallel processing capability of GPU, heterogeneous parallel computing systems characterized by CPU-GPU hybrid acceleration will become the mainstream of high-performance computing in the future. In this paper, the hardware architecture and programming model of CUDA are analyzed. Based on the analysis of the current situation of GPU general computing, the method of CUDA programming is put forward. Then the basic principle of Fast Fourier transform (FFT) is discussed in depth, and the implementation process and related properties of 2-FFT algorithm in time domain are introduced in detail. According to the characteristics of high parallel division and control of FFT algorithm, combined with CUDA programming model and implementation mechanism, The parallel algorithm of fast Fourier transform is designed with C-like language of CUDA. The improved algorithm adopts the CPU GPU heterogeneous model, and introduces GPU into the calculation, which allows GPU to undertake the addition and multiplication of the complex number and the large scale computation in the program. The traditional serial algorithm for the fast Fourier transform of N-point sequence needs three-layer cycle, and the time complexity is O (Nlog2N). The improved algorithm uses thread-level organization structure to realize parallel operation of independent N / 2 butterfly operations in the same level, and the original three-layer loop can be completed by two-layer loop. The time complexity becomes O (N), which can accelerate and optimize the fast Fourier transform (FFT). Finally, the paper builds up the CUDA experimental running environment, realizes the traditional fast Fourier algorithm running on CPU, and the improved algorithm running on GPU. At the same time, the program code of FFTW function library and CUFFT function library are also called. Through the analysis of experimental data, the superiority of fast Fourier algorithm using CUDA architecture is proved, and the advantage of GPU in dealing with a large amount of data is verified.
【學(xué)位授予單位】:湖南大學(xué)
【學(xué)位級(jí)別】:碩士
【學(xué)位授予年份】:2012
【分類(lèi)號(hào)】:TP338.6
【參考文獻(xiàn)】
相關(guān)期刊論文 前8條
1 李偉;孫進(jìn)平;王俊;李少洪;;一種基于FPGA的超高速32k點(diǎn)FFT處理器[J];北京航空航天大學(xué)學(xué)報(bào);2007年12期
2 韓博;周秉鋒;;GPGPU性能模型及應(yīng)用實(shí)例分析[J];計(jì)算機(jī)輔助設(shè)計(jì)與圖形學(xué)學(xué)報(bào);2009年09期
3 趙麗麗;張盛兵;張萌;姚濤;;基于CUDA的高速FFT計(jì)算[J];計(jì)算機(jī)應(yīng)用研究;2011年04期
4 王芳;張學(xué)鋒;程增會(huì);;快速傅立葉變換中的一種倒位序生成法[J];計(jì)算機(jī)應(yīng)用與軟件;2011年02期
5 林一松;楊學(xué)軍;唐滔;王桂彬;徐新海;;一種基于并行度分析模型的GPU功耗優(yōu)化技術(shù)[J];計(jì)算機(jī)學(xué)報(bào);2011年04期
6 朱林;王志凌;黃天戍;;基于DSP并行系統(tǒng)的FFT算法實(shí)現(xiàn)[J];武漢理工大學(xué)學(xué)報(bào);2009年20期
7 董惠;衛(wèi)銘斐;江麗;曾俊;;基于FPGA的FFT處理器的設(shè)計(jì)與仿真[J];微電子學(xué)與計(jì)算機(jī);2008年11期
8 王潤(rùn)澤;王穎;楊棟毅;;大規(guī)模FFT并行計(jì)算中二維SRAM的設(shè)計(jì)[J];中國(guó)科學(xué)院研究生院學(xué)報(bào);2008年01期
本文編號(hào):2137456
本文鏈接:http://sikaile.net/kejilunwen/jisuanjikexuelunwen/2137456.html