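Irregular Memory Access Optimization in Astrophysical Clustering Research
Published: 2018-01-01 05:33
Keywords of this article: Irregular Memory Access Optimization in Astrophysical Clustering Research. Source: Journal of Frontiers of Computer Science and Technology (《計算機科學與探索》), 2017, Issue 01. Paper type: journal article
More related articles: astrophysical clustering; irregular memory access optimization; data pre-sorting; parallel computing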
[Abstract]: The HGGF (halo-based galaxy group finder) algorithm identifies galaxy groups based on dark matter halos and plays a vital role in research on the large-scale structure and evolution of the universe. As data volumes grow, the algorithm urgently needs to be optimized to shorten its running time. Analysis shows that the time spent in the algorithm's hotspot is severely affected by irregular memory access. A data pre-sorting method is therefore proposed, tailored to the structure of the algorithm and its irregular memory access model, and the way this method affects the memory access process is analyzed. On this basis, data alignment and loop fission further improve memory access efficiency, while load balancing and privatization of mutex-protected variables improve OpenMP parallel efficiency. The HGGF application is ultimately accelerated 11.6-fold using 12 threads, and also achieves better scalability. There are three main contributions: (1) the irregular memory access problem of the HGGF algorithm is analyzed; (2) a data pre-sorting method is proposed and analyzed; (3) data alignment, loop fission, load balancing, and privatization of mutex-protected variables are used to improve the parallel performance of the HGGF application.
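The data pre-sorting idea in the abstract can be illustrated with a small sketch. This is not the paper's implementation: the listing does not state the sort key or data layout, so the uniform grid, the function names (`cell_index`, `presort`, `access_jump`, `gather_sum`), and the per-cell `table` are all assumptions made for illustration. The sketch reorders points by the grid cell they fall in, so that lookups into a cell-indexed table proceed in non-decreasing index order instead of jumping around, and it uses the summed index jump as a crude proxy for how scattered the accesses are.

```python
import random


def cell_index(x, y, z, box, n):
    """Map a position in [0, box] to a flattened index on an n*n*n grid (assumed layout)."""
    scale = n / box
    ix = min(int(x * scale), n - 1)  # clamp so x == box stays inside the grid
    iy = min(int(y * scale), n - 1)
    iz = min(int(z * scale), n - 1)
    return (ix * n + iy) * n + iz


def presort(points, box, n):
    """Return the points reordered by grid cell, plus the matching cell indices."""
    keyed = sorted(((cell_index(*p, box, n), p) for p in points),
                   key=lambda kp: kp[0])
    ordered = [p for _, p in keyed]
    cells = [c for c, _ in keyed]
    return ordered, cells


def access_jump(cells):
    """Sum of |cells[i+1] - cells[i]|: a rough proxy for access irregularity."""
    return sum(abs(b - a) for a, b in zip(cells, cells[1:]))


def gather_sum(cells, table):
    """Irregular (indexed) read of a per-cell table, as in a gather loop."""
    return sum(table[c] for c in cells)


# Hypothetical test data: random points in a cubic box and an integer per-cell table.
rng = random.Random(0)
box, n = 100.0, 16
points = [(rng.uniform(0, box), rng.uniform(0, box), rng.uniform(0, box))
          for _ in range(2000)]
table = [rng.randrange(1000) for _ in range(n ** 3)]

unsorted_cells = [cell_index(*p, box, n) for p in points]
ordered_points, sorted_cells = presort(points, box, n)

print("index jump before pre-sorting:", access_jump(unsorted_cells))
print("index jump after pre-sorting: ", access_jump(sorted_cells))
```

Pre-sorting changes only the order of the accesses, not their result: `gather_sum` returns the same value for both orderings, while the index jump after sorting shrinks to the distance between the smallest and largest cell index. In compiled code this kind of ordering is what lets consecutive iterations reuse cache lines; the Python version only demonstrates the access pattern, not the speedup.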
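[Author affiliation]: High Performance Computing Center, Shanghai Jiao Tong University; NVIDIA
[Funding]: National High Technology Research and Development Program of China (863 Program); Japan Society for the Promotion of Science project
[Classification number]: P14
[Text snapshot]: 1 Introduction. As astronomical observing capability continues to improve, the volume of astronomical data is growing rapidly, and the total number of observed galaxies has reached the order of 10^9. Faced with the massive data produced by large digital sky surveys, the ability to extract the required content from the data quickly and accurately directly affects the development and progress of astronomical research. In particular, how to connect observable galaxies with the theoretically predicted, invisible dark matter halos is …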
Article ID: 1363161
Article link: http://sikaile.net/kejilunwen/tianwen/1363161.html