Efficient Heterogeneous Acceleration of the Smoothed Particle Hydrodynamics Method

Published: 2018-02-25 17:33

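  Keywords: CPU-GPU coupled computing; hotspot acceleration; full GPU acceleration; peer-to-peer cooperation; particle simulation; smoothed particle hydrodynamics; petaPar. Source: Chinese Journal of Computers, 2017, Issue 09. Type: journal article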


[Abstract]: At present, nearly all GPU-accelerated implementations of the smoothed particle hydrodynamics (SPH) method are based on the simplified Euler governing equations. GPU implementations of the full Navier-Stokes equations are very few, and their accounts of the difficulties, optimization strategies, and achieved speedups are vague. In addition, the way the CPU and GPU cooperate strongly affects the overall efficiency of a heterogeneous platform, so GPU acceleration models merit further study. The goal of this paper is to efficiently accelerate petaPar, an in-house SPH application based on the Navier-Stokes equations, on heterogeneous platforms. The paper first analyzes the computational characteristics of the Euler and Navier-Stokes equations from their mathematical formulations and summarizes the difficulties the Navier-Stokes equations face in GPU acceleration. The Euler equations involve only simple scalar and vector computations, so they yield a typical compute-intensive, lightweight kernel that suits the GPU well. The full Navier-Stokes equations, by contrast, involve complex material constitutive models and a large amount of tensor computation, and must contend with the problems that a large kernel brings on the GPU, such as memory-access pressure, insufficient cache, low occupancy, and register spilling. The paper optimizes the particle-interaction kernel of the Navier-Stokes equations by reducing the number of particle attributes, moving operations out of the interaction kernel into the particle-update step, exploiting particle reuse, and maximizing GPU occupancy; the implementation is described in Section 5.1. The paper also surveys three GPU acceleration models (hotspot acceleration, full GPU acceleration, and peer-to-peer cooperation), analyzes their development effort, range of application, and theoretical speedup, and examines in depth the communication optimization strategy for the peer-to-peer cooperative model. Because communication particles are distributed non-contiguously, extracting, inserting, and deleting them on the GPU amounts to parallel operations on non-contiguous memory, which can severely degrade CPU-GPU synchronization; the related literature does not address this problem. We solve it by improving the particle indexing rule: when particles are sorted, the sort considers not only the grid cell number but also the grid cell type; the implementation is described in Section 5.2.3. The three GPU acceleration models were implemented and analyzed for both the Euler and the Navier-Stokes equations. The test results show that, under the three models, the Euler equations achieved speedups of 8x, 33x, and 36x, and the Navier-Stokes equations achieved speedups of 6x, 15x, and 20x, respectively. In both cases full GPU acceleration exceeded the theoretical upper bound on the speedup of hotspot acceleration, and peer-to-peer cooperation improved further on full GPU acceleration. For the Navier-Stokes equations in particular, the kernel optimization strategies and the peer-to-peer cooperative model together delivered an overall 20x speedup on the heterogeneous platform. The peer-to-peer cooperative version of the Navier-Stokes equations, which has the widest range of application and the best speedup, was subjected to strong and weak scalability tests on 6 and 1024 heterogeneous compute nodes of the Titan supercomputer, obtaining parallel efficiencies of 67.1% and 75.2%, respectively.
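The indexing rule mentioned in the abstract (sort by grid cell type as well as grid cell number) can be illustrated with a minimal sketch. The abstract does not give the actual rule from Section 5.2.3, so everything here is an assumption made for illustration: the two cell types `INTERIOR` and `COMMUNICATION`, the `Particle` fields, and the choice of cell type as the primary sort key. The sketch is plain CPU-side Python, not the petaPar GPU implementation.

```python
from dataclasses import dataclass

# Hypothetical cell types (not taken from the paper).
INTERIOR = 0        # particles in these cells are never exchanged
COMMUNICATION = 1   # particles in these cells are exchanged between CPU and GPU


@dataclass
class Particle:
    pid: int        # particle identifier
    cell_id: int    # number of the grid cell containing the particle
    cell_type: int  # INTERIOR or COMMUNICATION


def sort_key(p: Particle) -> tuple:
    # Assumed rule: cell type is the primary key, cell number the secondary key.
    return (p.cell_type, p.cell_id)


def sort_particles(particles):
    """Sort particles and return (ordered list, index of first communication particle)."""
    ordered = sorted(particles, key=sort_key)
    first_comm = next(
        (i for i, p in enumerate(ordered) if p.cell_type == COMMUNICATION),
        len(ordered),  # no communication particles at all
    )
    return ordered, first_comm


particles = [
    Particle(0, cell_id=7, cell_type=COMMUNICATION),
    Particle(1, cell_id=2, cell_type=INTERIOR),
    Particle(2, cell_id=0, cell_type=COMMUNICATION),
    Particle(3, cell_id=5, cell_type=INTERIOR),
    Particle(4, cell_id=2, cell_type=INTERIOR),
]

ordered, first_comm = sort_particles(particles)
# All communication particles now sit in one contiguous block at the end.
comm_block = ordered[first_comm:]
```

Under these assumptions, sorting by cell number alone would scatter the communication particles through the array, whereas the two-level key gathers them into a single contiguous range that can be copied or resized as one block.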
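[Author affiliations]: Institute of Computing Technology, Chinese Academy of Sciences; Institute of Software, Chinese Academy of Sciences; Software Center for High Performance Numerical Simulation, China Academy of Engineering Physics
[Funding]: National Natural Science Foundation of China (11472274, 11072241, 11111140020, 91130026); "Director's Fund" of Oak Ridge National Laboratory / U.S. National Center for Computational Sciences (MAT028, CSC153)
[Classification number]: O35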
