GPU Parallel Implementation and Optimization of SAR Target Recognition Methods

Published: 2018-04-01 18:32

Topic: GPU　Focus: Jacobi　Source: University of Electronic Science and Technology of China, master's thesis, 2017

[Abstract]: SAR target recognition has become an active research topic in recent years, and its results are widely applied in both military and civilian fields. With the development of high-resolution SAR imaging, the resolution and data volume of SAR images have grown rapidly. Target recognition algorithms based on serial CPU computation can no longer meet the real-time processing requirements of high-resolution SAR target recognition software, and their computational cost is too high. General-purpose computing on the GPU (Graphics Processing Unit), which has emerged in recent years, offers high computing power and memory bandwidth, together with low development cost and a short development cycle. Research on GPU-based parallel target recognition algorithms is therefore an important step toward building target recognition software systems that process data in real time. This thesis first discusses the GPU architecture and the CUDA programming model, and divides the target recognition algorithm into a feature extraction part and a classifier part. It then describes in detail how the computing tasks of each part are decomposed for parallel execution and how each task is implemented with CUDA, and finally applies a series of optimizations to the CUDA programs to maximize the speedup. The work is organized as follows:
(1) The CUDA programming model, memory model and programming language are analyzed. The basic principles and implementations of three mature feature extraction techniques (principal component analysis, non-negative matrix factorization and linear discriminant analysis) and of the support vector machine (SVM) classifier are then studied, providing the theoretical and technical basis for the parallel analysis of the target recognition algorithm.
(2) The computing tasks of the feature extraction methods and the classifier are studied, and the computation is split up and restructured for parallel execution. Matrix multiplication, the Jacobi iteration method for matrix eigenvalues, reduction, and the construction of the between-class and within-class scatter matrices, as used in the three feature extraction methods, are each analyzed for parallelism and adapted to the GPU. The computation and parallelism of the SMO algorithm are then analyzed, and the SVM is ported to CUDA. Finally, using the public MSTAR database, the running times of the target recognition algorithm on the CPU and on the GPU are measured experimentally and compared to demonstrate the speedup from GPU parallel computing.
(3) Drawing on general evaluation methods and optimization strategies for CUDA programs, the factors that limit the speed of the CUDA programs in the target recognition algorithm are analyzed in depth, and the algorithm is optimized in three respects: communication, memory access and instruction flow. Experiments show that the GPU-based parallel target recognition algorithm achieves a 25-30x performance improvement after optimization.
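As an illustration of the Jacobi eigenvalue computation named in the abstract, the following is a minimal serial sketch in pure Python of the cyclic Jacobi rotation method for a real symmetric matrix. It is not the thesis's CUDA implementation: the function name `jacobi_eigenvalues` and the parameters `tol` and `max_sweeps` are assumptions made for this example. Rotations that act on disjoint index pairs touch different rows and columns, which is the property a parallel version would typically exploit.

```python
import math


def jacobi_eigenvalues(a, tol=1e-12, max_sweeps=50):
    """Eigenvalues of a real symmetric matrix via cyclic Jacobi rotations.

    `a` is a list of rows; `tol` and `max_sweeps` are illustrative defaults.
    """
    n = len(a)
    a = [row[:] for row in a]  # work on a copy, leave the input untouched
    for _ in range(max_sweeps):
        # Sum of squares of the entries above the diagonal.
        off = sum(a[p][q] ** 2 for p in range(n) for q in range(p + 1, n))
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue  # already zero, nothing to rotate
                # Choose the smaller rotation angle that zeroes a[p][q].
                tau = (a[q][q] - a[p][p]) / (2.0 * a[p][q])
                t = math.copysign(1.0, tau) / (abs(tau) + math.hypot(1.0, tau))
                c = 1.0 / math.sqrt(1.0 + t * t)
                s = t * c
                # Apply A <- J^T A J: first update columns p and q ...
                for k in range(n):
                    akp, akq = a[k][p], a[k][q]
                    a[k][p] = c * akp - s * akq
                    a[k][q] = s * akp + c * akq
                # ... then rows p and q.
                for k in range(n):
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k] = c * apk - s * aqk
                    a[q][k] = s * apk + c * aqk
    # After convergence the diagonal holds the eigenvalue approximations.
    return sorted(a[i][i] for i in range(n))
```

For example, `jacobi_eigenvalues([[2.0, 1.0], [1.0, 2.0]])` returns values close to 1 and 3, the eigenvalues of that matrix.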
[Degree-granting institution]: University of Electronic Science and Technology of China
[Degree level]: Master's
[Year degree granted]: 2017
[Classification number]: TN958


Article ID: 1696752


Link to this article: http://sikaile.net/kejilunwen/xinxigongchenglunwen/1696752.html
