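Structure Prediction of Alumina Clusters Based on Graphics Card (GPU) Acceleration
Published: 2018-06-12 04:44
Topic: graphics card computing + potential energy function; Reference: Anhui University, master's thesis, 2015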
[Abstract]: The graphics card, in full the display interface card (GPU), has been one of the basic components of a computer ever since the electronic computer was invented. In the early days of computing, graphics cards were weak and served only to display graphics in the host machine. After nearly half a century of development their performance has advanced by leaps and bounds, and thanks to their special architecture the floating-point and parallel computing power of a modern graphics card is tens or even hundreds of times that of a central processing unit (CPU). In traditional scientific computing, a single CPU has limited computing power, so calculations are slow; running many CPUs in parallel raises the speed, but large computer clusters consume too many resources. The graphics card makes up for exactly this shortcoming, and it will therefore occupy a very important position in future scientific computing. In this thesis we use GPU acceleration to compute the energy and gradient of alumina clusters, measure the speedup of the graphics card over a conventional CPU, and apply GPU acceleration to the global optimization of alumina cluster structures. The main contents are as follows.

1. GPU-accelerated calculation of the energy and gradient. We treat the alumina cluster as a rigid model whose potential energy function has four terms. Evaluating these terms on the graphics card yields very high speedups. We designed three acceleration strategies (one-dimensional operation, block operation and two-dimensional operation) and two precisions (single and double). In single precision, the peak speedups of the one-dimensional, block and two-dimensional operations are 220, 240 and 77, respectively; in double precision they are 103, 107 and 35. For small clusters, the two-dimensional operation has a decisive advantage in speedup, while the speedups of the one-dimensional and block operations are both very small. For medium-sized clusters, the two-dimensional operation can no longer perform the calculation; the speedup of the block operation begins to rise markedly, far exceeds that of the one-dimensional operation, and reaches its peak earlier. For large clusters, the speedups of the one-dimensional and block operations saturate, with the peak speedup of the block operation slightly higher than that of the one-dimensional operation.

2. Application of GPU acceleration to the optimization of the electronic structure of alumina clusters. Using GPU acceleration combined with a genetic algorithm, we predicted the structures of (Al2O3)n (n = 1-15) clusters and analyzed their structural features. Because the clusters are small and high precision is required, we used the two-dimensional double-precision operation, whose peak speedup is about 35. The conclusions are as follows. For n = 1-3, the global optimum structures are cage-like, cage-like and teapot-shaped, respectively. For n = 4 and 5, both are highly symmetric cages. For n = 6, the structure tends to be disordered. For n = 7-9, the structures are again highly symmetric large cages. For n = 10, the structure again tends to be disordered. The (Al2O3)n (n = 1-10) structures reported in the earlier literature were all reproduced by our method. In addition, we predicted the structures for n = 11-15, which have not been reported before. For n = 11, the optimum structure resembles that of n = 7. For n = 12, the structure is disordered, although some atoms overlap when it is viewed from particular angles. For n = 13 and 14, the optimum structures have Cs and D2 symmetry, respectively. For n = 15, the structure is disordered.
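The abstract does not give the four terms of the potential or say how the one-dimensional, block and two-dimensional operations map work onto GPU threads. As a rough illustration only, the sketch below assumes that "one-dimensional" means one work item per atom (with an inner loop over partners) and "two-dimensional" means one work item per atom pair. It uses NumPy on the CPU as a stand-in for a GPU, and a hypothetical two-term pair potential (Coulomb plus an exponential repulsion). The names `A`, `RHO`, `pair_energy`, `pair_denergy_dr`, `energy_gradient_1d` and `energy_gradient_2d` and all parameter values are placeholders, not taken from the thesis; the block operation is not shown.

```python
import numpy as np

# Hypothetical parameters of an exponential repulsion term (NOT the thesis values).
A = 1000.0
RHO = 0.3

def pair_energy(r, qq):
    """Assumed pair potential: Coulomb term plus exponential repulsion."""
    return qq / r + A * np.exp(-r / RHO)

def pair_denergy_dr(r, qq):
    """Derivative of pair_energy with respect to the distance r."""
    return -qq / r**2 - (A / RHO) * np.exp(-r / RHO)

def energy_gradient_1d(pos, q):
    """Assumed 'one-dimensional' mapping: one work item per atom i."""
    n = len(pos)
    energy = 0.0
    grad = np.zeros_like(pos)
    for i in range(n):              # on a GPU, each i would be one thread
        for j in range(n):          # inner loop over interaction partners
            if i == j:
                continue
            d = pos[i] - pos[j]
            r = np.linalg.norm(d)
            qq = q[i] * q[j]
            energy += 0.5 * pair_energy(r, qq)   # each pair is visited twice
            grad[i] += pair_denergy_dr(r, qq) * d / r
    return energy, grad

def energy_gradient_2d(pos, q):
    """Assumed 'two-dimensional' mapping: one work item per (i, j) pair."""
    d = pos[:, None, :] - pos[None, :, :]    # (n, n, 3) displacement vectors
    r = np.linalg.norm(d, axis=-1)           # (n, n) distance matrix
    np.fill_diagonal(r, np.inf)              # self-interaction terms become 0
    qq = q[:, None] * q[None, :]
    energy = 0.5 * np.sum(pair_energy(r, qq))
    # Sum the per-pair force contributions over j for every atom i.
    grad = np.sum((pair_denergy_dr(r, qq) / r)[:, :, None] * d, axis=1)
    return energy, grad
```

Under these assumptions both functions return the same energy and gradient. The pairwise version stores an n-by-n array of intermediate values, so its memory and work-item count grow quadratically with cluster size; that is one possible reason, not stated in the abstract, why a per-pair scheme could suit small clusters and become impractical for larger ones.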
[Degree-granting institution]: Anhui University
[Degree level]: Master's
[Year degree granted]: 2015
[Classification number]: TQ133.1
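[Similar literature]
Related master's theses: 1 entry
1 Zhang Qiyao; Structure Prediction of Alumina Clusters Based on Graphics Card (GPU) Acceleration [D]; Anhui University; 2015
Article ID: 2008420
Article link: http://sikaile.net/kejilunwen/huaxuehuagong/2008420.html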