

Design and Implementation of an FPGA-Based Acceleration Scheme for Matrix Singular Value Decomposition

Published: 2018-05-25 17:26

  Topic: singular value decomposition + field-programmable gate array (FPGA); Source: Beijing Jiaotong University, master's thesis, 2017


[Abstract]: Singular value decomposition (SVD) is an important part of numerical computation. It plays a vital role in massive MIMO in wireless communications, in feature extraction and principal component analysis in image processing, in data compression and semantic indexing in machine learning, and in data correlation analysis in big data. SVD is a matrix decomposition with relatively high computational complexity. As data scales keep growing, both massive MIMO in communications and image processing and data mining applications, whose matrix dimensions and data volumes are larger still, demand ever faster SVD computation. Accelerating matrix SVD therefore has high research and application value.

This thesis focuses on SVD based on the one-sided Jacobi method, a rotation-based method with high relative accuracy and fast decomposition that is well suited to parallelization and large-scale matrix computation. For Jacobi algorithms, the rotation transformation and the ordering of column pairs are decisive for decomposition speed. The thesis studies different column-pair indexing schemes and applies two sequence-generation schemes, the cyclic sequence and the ring sequence, to hardware design. The ring-sequence ordering not only lends itself to parallel implementation but also yields the singular values in sorted order and helps the algorithm converge faster.

For real-time, low-latency requirements, the thesis proposes a hardware architecture for the one-sided Jacobi algorithm with cyclic-sequence ordering based on on-chip memory. Compared with MATLAB and GPU implementations of the same algorithm, it achieves a clear speedup while maintaining comparable numerical accuracy. Building on this, a parallel hardware accelerator based on on-chip memory and ring-sequence ordering is designed and implemented; its measured speedup over the cyclic-sequence design is 2.95 times. Next, for large-scale, high-throughput applications such as image processing and data mining, the thesis addresses limited on-chip memory capacity and hardware design complexity by proposing a parallel architecture for the one-sided Jacobi algorithm based on off-chip memory and ring-sequence ordering. Based on the relationship between performance and resources, it also proposes a strategy for balancing the two in the parallel hardware design.
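The one-sided Jacobi iteration and the role of column-pair ordering described in the abstract can be sketched in software as follows. This is a minimal pure-Python illustration, not the thesis's hardware design: the function names (`cyclic_pairs`, `round_robin_rounds`, `one_sided_jacobi_svd`), the tolerance, and the sweep limit are all assumptions made for this example. The rotation formula is the standard Hestenes one-sided Jacobi update. The round-robin schedule is only a stand-in for a parallel ordering in which each round contains disjoint column pairs; it is not claimed to be the ring sequence used in the thesis.

```python
import math


def cyclic_pairs(n):
    """Row-cyclic ordering: (0,1), (0,2), ..., (n-2,n-1), one pair at a time."""
    return [(i, j) for i in range(n - 1) for j in range(i + 1, n)]


def round_robin_rounds(n):
    """Round-robin ordering for even n: n-1 rounds of n/2 disjoint pairs.

    Pairs within a round share no column, so they could be rotated in parallel.
    Illustrative stand-in only; NOT the thesis's ring sequence.
    """
    idx = list(range(n))
    rounds = []
    for _ in range(n - 1):
        rounds.append([tuple(sorted((idx[k], idx[n - 1 - k]))) for k in range(n // 2)])
        idx = [idx[0]] + [idx[-1]] + idx[1:-1]  # keep idx[0] fixed, rotate the rest
    return rounds


def one_sided_jacobi_svd(a, pairs=None, tol=1e-12, max_sweeps=60):
    """Return (U, sigma, V) with A = U diag(sigma) V^T for a list-of-lists A (m >= n).

    Singular values are returned unsorted. tol and max_sweeps are assumed defaults.
    """
    m, n = len(a), len(a[0])
    w = [row[:] for row in a]  # working copy; its columns converge to U diag(sigma)
    v = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    pairs = pairs if pairs is not None else cyclic_pairs(n)
    for _ in range(max_sweeps):
        rotated = False
        for p, q in pairs:
            alpha = sum(w[k][p] ** 2 for k in range(m))
            beta = sum(w[k][q] ** 2 for k in range(m))
            gamma = sum(w[k][p] * w[k][q] for k in range(m))
            if abs(gamma) <= tol * math.sqrt(alpha * beta):
                continue  # columns p and q are already numerically orthogonal
            rotated = True
            # Plane rotation that makes columns p and q orthogonal.
            zeta = (beta - alpha) / (2.0 * gamma)
            t = math.copysign(1.0, zeta) / (abs(zeta) + math.sqrt(1.0 + zeta * zeta))
            c = 1.0 / math.sqrt(1.0 + t * t)
            s = c * t
            # Apply the same rotation to the working matrix and to V.
            for mat, rows in ((w, m), (v, n)):
                for k in range(rows):
                    x, y = mat[k][p], mat[k][q]
                    mat[k][p] = c * x - s * y
                    mat[k][q] = s * x + c * y
        if not rotated:
            break  # a full sweep without rotations: converged
    sigma = [math.sqrt(sum(w[k][j] ** 2 for k in range(m))) for j in range(n)]
    u = [[w[k][j] / sigma[j] if sigma[j] > 0 else 0.0 for j in range(n)] for k in range(m)]
    return u, sigma, v
```

To try a different ordering, flatten the rounds into one sweep, for example `pairs = [p for r in round_robin_rounds(4) for p in r]`, and pass it as the `pairs` argument. In a hardware or multi-threaded setting the pairs inside one round could be processed concurrently because they touch different columns; this sketch simply runs them one after another.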
[Degree-granting institution]: Beijing Jiaotong University
[Degree level]: Master's
[Year degree granted]: 2017
[Classification number]: TP301.6; TN791




Article ID: 1934053


Article link: http://sikaile.net/kejilunwen/dianzigongchenglunwen/1934053.html

