
GPGPU Architecture Research and Performance Analysis

Published: 2018-06-24 14:17

Topic: GPGPU + Fermi; Reference: Jilin University, 2017 master's thesis

[Abstract]: Over the past decade and more, GPU processing performance has grown rapidly. GPUs differ substantially from CPUs in structure: a GPU devotes more of its transistors to computation, whereas a CPU devotes more of them to control logic. Because the two are designed for different purposes, they play different roles. GPUs have since moved quickly from graphics processing into general-purpose computing, opening a new field called GPGPU (General-Purpose Computing on the Graphics Processing Unit). A GPGPU is designed to process parallel tasks, so studying parallel computing models is worthwhile. Although classical parallel computing models such as PRAM, BSP, and LogP were proposed many years ago, studying them gives a deeper understanding of GPGPU architecture.

Since the GPGPU concept was introduced, much research has focused on using its computing power to greatly speed up the solution of particular problems. This application-oriented focus arises mainly because the detailed chip structure, pipeline, and memory design are trade secrets, so this material is hard to obtain for research. NVIDIA and AMD are the two main GPGPU manufacturers. Of the two, NVIDIA's official documentation is more detailed and its CUDA toolkit is more complete, so this thesis focuses on NVIDIA chips and uses the open-source GPGPU-Sim simulator to simulate NVIDIA GPUs. The thesis compares several parallel computing models, including PRAM, BSP, and LogP, examining their core ideas and the similarities and differences among their parameters, and briefly surveys the current state of GPU research.

The thesis then presents a new model, NKGPGPU, which specifies in detail the hardware structure, the logical structure of tasks, the code structure, and the mappings among them. NKGPGPU consists of five sub-models: the hardware structure sub-model, the task structure sub-model, the task organization sub-model, the task execution sub-model, and the task scheduling sub-model. The hardware structure sub-model describes the main components of an NKGPGPU chip. The task organization sub-model describes the code structure suited to NKGPGPU and the mapping between code and tasks, along with a model of the launch relationships between tasks. The task execution sub-model describes the mapping between code and hardware. The task scheduling sub-model describes the mapping between the task topology and the hardware structure. The thesis also gives a performance analysis model consistent with the proposed NKGPGPU.

For the three main factors that affect GPGPU performance (the pipeline, shared memory, and global memory), the thesis reports detailed experiments with different numbers of threads. The pipeline experiments measure how execution cycles differ across instruction types and use those differences to infer the relationship between instructions and the pipeline. Shared memory and global memory are studied in a similar way, by measuring the completion cycles of consecutive memory access instructions. The proposed NKGPGPU enriches the theoretical modeling of GPGPUs and gives GPGPU hardware engineers and software programmers a basis for improvement, and the experimental methods and ideas developed with GPGPU-Sim can serve as a foundation for further GPGPU research.
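The abstract says the thesis compares PRAM, BSP, and LogP by their parameters but does not show that comparison. As a rough illustration only, the sketch below encodes the standard textbook parameters and cost formulas for the three models. All class and function names are hypothetical, and the numeric values in the usage lines are arbitrary; none of this is taken from the thesis.

```python
import math
from dataclasses import dataclass


def pram_steps(n_ops: int, p: int) -> int:
    """PRAM has unit-cost shared memory and no communication parameters.

    Returns the minimum number of steps needed to run n_ops independent
    unit-cost operations on p processors.
    """
    return math.ceil(n_ops / p)


@dataclass(frozen=True)
class BSPMachine:
    """BSP parameters: p processors, g cost per word sent, l barrier cost."""
    p: int
    g: float
    l: float

    def superstep_cost(self, w: float, h: int) -> float:
        # w: largest local computation on any processor in the superstep
        # h: largest number of words any processor sends or receives
        return w + h * self.g + self.l


@dataclass(frozen=True)
class LogPMachine:
    """LogP parameters: L latency, o overhead, g gap, P processors."""
    L: float
    o: float
    g: float
    P: int

    def message_time(self) -> float:
        # Send overhead + network latency + receive overhead.
        return self.o + self.L + self.o

    def send_k_messages_time(self, k: int) -> float:
        # Successive sends from one processor are spaced by max(g, o);
        # the last message still needs a full end-to-end message time.
        return (k - 1) * max(self.g, self.o) + self.message_time()


if __name__ == "__main__":
    # Arbitrary illustrative values, not measurements.
    bsp = BSPMachine(p=4, g=2.0, l=10.0)
    logp = LogPMachine(L=6.0, o=2.0, g=4.0, P=8)
    print(pram_steps(10, 4))
    print(bsp.superstep_cost(w=100.0, h=5))
    print(logp.send_k_messages_time(3))
```

The contrast is visible in the signatures: PRAM charges nothing for communication, BSP charges per word plus a barrier per superstep, and LogP charges per message through latency, overhead, and gap.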
[Degree-granting institution]: Jilin University
[Degree level]: Master's
[Year conferred]: 2017
[Classification number]: TP391.41; TP332

Citation: Xing Qianli (邢千里). GPGPU Architecture Research and Performance Analysis [D]. Jilin University, 2017.


Article ID: 2061805


Article link: http://sikaile.net/kejilunwen/jisuanjikexuelunwen/2061805.html
