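Parallel Optimization of Numerical Simulation Based on the Bond Fluctuation Model
Topic: parallel computing. Focus: polymer surface adsorption. Source: Shandong University, master's thesis, 2013. Document type: degree thesis.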
[Abstract]: As the demand for large-scale computing keeps growing, parallel computing technology has developed steadily, and every year TOP500 publishes a ranking of the world's 500 fastest high-performance computers. The mainstream architectural trend in high-performance computing is to build clusters from shared-memory, multi-core blade nodes, and the Shandong University high-performance cluster uses this architecture. While parallel hardware advances rapidly, practical parallel applications lag far behind, and the measured performance of applications is far below the theoretical peak. Exploiting the characteristics of parallel computers to improve the parallel performance of applications has therefore become a research topic in parallel computing. One example is optimizing the parallel simulation of polymer surface adsorption so that it runs on a cluster with higher parallel performance and a shorter computing cycle.
The bond fluctuation model is a classical dynamic model for the numerical simulation of polymer surface adsorption. Its computational cost is so high that the simulation time on a single PC is unacceptable. A parallel Monte Carlo sampling method implemented with MPI computes different samples on additional PCs (or cores) and reduces the data at the end. With 480 cores computing 960 samples, a serial computation that would take 9 years completes in 5 days. The finest granularity of this scheme is one sample per task, but when the simulated polymer chain has a large molecular weight, even a single independent sample takes a long time to compute. On top of the MPI parallelization, the parallel granularity can therefore be refined further through domain decomposition, and the resulting loop iterations are easy to parallelize with OpenMP compiler directives. Whereas MPI relies on inter-process communication, OpenMP is based on multithreading and makes better use of the shared memory of a blade node. Parallelizing the application directly with OpenMP gives a first working parallel version on a blade node, but achieving high parallel efficiency requires further testing, analysis, and tuning to arrive at the most reasonable use of hardware resources.
This thesis builds on the MPI parallel programming framework for polymer surface adsorption on a high-performance cluster. The work has two parts: first, studying OpenMP programming techniques and parallelizing the application's hotspot modules; second, studying OpenMP optimization techniques and designing a parallel optimization scheme for the polymer surface adsorption application. All OpenMP tuning was carried out on a four-socket, eight-core blade, and the test results show that the optimization scheme effectively improves real parallel performance. The software-engineering optimization method adopted here can provide methods, experience, and a reference for future OpenMP tuning of cluster applications on a single node. Specifically, the main contributions of this thesis are as follows:
1. Performance tests and analysis of the MPI-parallel polymer surface adsorption simulation on the high-performance cluster are presented, verifying that the results satisfy Gustafson's law.
2. For long chains, where the gain in parallel performance reaches a bottleneck, the thesis divides the long chain into segments and uses multiple threads to simulate the bond fluctuation motion within each segment, in place of simulating the motion of the whole chain. The parallel interface MC_Bond_Fluc is implemented with OpenMP, and its performance is tested on a four-socket, eight-core blade.
3. An OpenMP optimization scheme is designed for the bond-fluctuation-model simulation of polymer surface adsorption. Following a software optimization methodology, it applies load balancing, reduced parallel overhead, sensible memory use, and improved cache hit rates as incremental optimizations. Testing identifies the best-performing OpenMP configuration, which is given as the optimal OpenMP parallel scheme for polymer surface adsorption.
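The sample-level decomposition described in the abstract (independent Monte Carlo samples spread over workers, followed by a reduction) can be sketched as below. This is a minimal illustration, not the thesis code: `run_sample` is a hypothetical stand-in (a plain 2D lattice random walk, not the bond fluctuation model), the round-robin assignment is an assumption, and a Python thread pool stands in for MPI ranks, so it shows the structure of the computation rather than its performance.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def run_sample(seed, steps=200):
    """Hypothetical stand-in for one independent sample.

    Runs a 2D lattice random walk and returns its squared end-to-end
    distance. It is NOT the bond fluctuation model.
    """
    rng = random.Random(seed)  # per-sample generator keeps samples independent
    x = y = 0
    for _ in range(steps):
        dx, dy = rng.choice(((1, 0), (-1, 0), (0, 1), (0, -1)))
        x += dx
        y += dy
    return x * x + y * y


def samples_for_worker(rank, n_workers, n_samples):
    """Round-robin assignment of sample indices to a worker (assumed scheme)."""
    return range(rank, n_samples, n_workers)


def worker_partial_sum(rank, n_workers, n_samples):
    """Each worker computes only its own samples and returns a partial sum."""
    return sum(run_sample(seed)
               for seed in samples_for_worker(rank, n_workers, n_samples))


def parallel_mean(n_workers, n_samples):
    """Combine the partial sums (the reduction step) into a sample mean."""
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = pool.map(
            lambda rank: worker_partial_sum(rank, n_workers, n_samples),
            range(n_workers),
        )
        total = sum(partials)
    return total / n_samples
```

With 480 workers and 960 samples, `samples_for_worker` gives each worker two samples, which matches the one-sample-per-task granularity limit mentioned in the abstract: once every worker holds a single sample, further speedup has to come from parallelism inside a sample.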
[Degree-granting institution]: Shandong University
[Degree level]: Master's
[Year degree granted]: 2013
[Classification number]: TP338.6
Article ID: 1607195
Article link: http://sikaile.net/kejilunwen/jisuanjikexuelunwen/1607195.html