Bellman-Ford算法性能可移植的GPU并行優(yōu)化
發(fā)布時(shí)間:2018-05-02 10:30
本文選題:計(jì)算機(jī)軟件 + Bellman-Ford算法 ; 參考:《吉林大學(xué)學(xué)報(bào)(工學(xué)版)》2015年05期
【摘要】:提出了一種面向GPU的性能可移植的并行歸約求極值優(yōu)化算法和全局訪存優(yōu)化算法,對(duì)Bellman-Ford算法進(jìn)行并行化改造,以解決不同類型GPU設(shè)備上都存在的并行粒度不足和全局內(nèi)存訪問(wèn)不連續(xù)等問(wèn)題。實(shí)驗(yàn)結(jié)果表明:本文的優(yōu)化算法在NVIDIA和AMD的多款GPU設(shè)備上都取得了很好的效果,經(jīng)本文算法優(yōu)化后的程序性能較原始GPU并行版本提升3~6倍。
[Abstract]:In this paper, a parallel reduction extremum optimization algorithm and a global memory access optimization algorithm for GPU are proposed, which can transform the Bellman-Ford algorithm into parallelization. In order to solve the problems of parallel granularity deficiency and global memory access discontinuity on different types of GPU devices. The experimental results show that the proposed optimization algorithm has achieved good results on both NVIDIA and AMD GPU devices, and the performance of the program optimized by the proposed algorithm is 3 times higher than that of the original GPU parallel version.
【作者單位】: 吉林大學(xué)計(jì)算機(jī)科學(xué)與技術(shù)學(xué)院;中信證券有限公司;中國(guó)科學(xué)院計(jì)算技術(shù)研究所;
【基金】:吉林省重大科技攻關(guān)項(xiàng)目(20130206052GX) “863”國(guó)家高技術(shù)研究發(fā)展計(jì)劃項(xiàng)目(2012AA010902) “973”國(guó)家重點(diǎn)基礎(chǔ)研究計(jì)劃項(xiàng)目(2011CB302500)
【分類號(hào)】:TP332;TP301.6
【相似文獻(xiàn)】
相關(guān)期刊論文 前3條
1 徐俊波;王慧強(qiáng);馮光升;呂宏武;田蘇梅;;Bellman動(dòng)態(tài)規(guī)劃的服務(wù)恢復(fù)方法[J];哈爾濱工程大學(xué)學(xué)報(bào);2011年06期
2 劉立平;陳s,
本文編號(hào):1833516
本文鏈接:http://sikaile.net/kejilunwen/jisuanjikexuelunwen/1833516.html
最近更新
教材專著