Prototype Design and Implementation of an FPGA-Based Parallel Acceleration Experiment Platform

Published: 2018-03-16 01:20

  Topic: PCI  Focus: Express  Source: Shandong University, 2013 master's thesis  Type: degree thesis


[Abstract]: In recent years, with the emergence of new concepts such as the Internet of Things and advances in computer technology, embedded systems have been developing at an unprecedented pace, and new kinds of embedded devices keep appearing. These devices place ever higher demands on intelligence and real-time performance, so the amount of computation they require keeps growing. However, traditional embedded processors are limited in performance and clock frequency, and a single processor can, to a large extent, no longer meet these demands. Using multiple embedded processors to raise processing speed would greatly increase power consumption, which is also unsuitable for energy-constrained embedded devices. Under these circumstances, a heterogeneous architecture that pairs a Field Programmable Gate Array (FPGA) with an embedded processor has become one of the ideal solutions to this problem.

Many FPGA-based parallel acceleration models now exist, and using an FPGA as a coprocessor to accelerate specific algorithms in parallel is a hot topic in academic research. Usually, however, once an algorithm has been accelerated on an FPGA, the speedup is estimated through simulation and analysis, without actual board-level testing. The main reason is that testing an algorithm requires large volumes of fast data exchange with the main controller, and a platform for such data exchange is still lacking. A parallel acceleration experiment platform capable of high-speed data exchange is therefore urgently needed for board-level testing of acceleration results.

This thesis designs a prototype of such a parallel acceleration experiment platform. To meet the data exchange speed requirement, the platform exchanges data with the main controller over the PCI Express bus and uses DMA transfers to speed up data transmission. The thesis presents the overall design of the platform and the steps and methods of its implementation. Following a top-down modular design approach, the platform is divided into a PCI Express endpoint controller module, a PCI Express transaction layer packet processing and DMA control module, a memory controller module, a parallel acceleration experiment module, and an interface module between the parallel acceleration module and the memory controller. As the core of the platform, the PCI Express transaction layer packet processing and DMA control module has complex logic and many submodules, so the thesis concentrates on its detailed design and implementation. It is divided into the following submodules, each implemented separately: a sending unit, a receiving unit, a DMA controller, a read request encapsulator, a transmit data arbitration and preparation module, a receive data distribution module, an interface module between the DMA controller and the memory controller, and an interface between the DMA controller and the parallel acceleration module. The design and implementation of the other modules are also described. Then, taking a sorting algorithm as an example, the thesis describes the implementation of a parallel sorting accelerator and, on that basis, designs and implements the parallel acceleration module, completing the design and implementation of the whole platform. Finally, the platform is tested, and data are reported on its actual resource usage, maximum exchange speed, and actual acceleration. The experiments show that the platform meets the requirements of parallel acceleration experiments and can be used for board-level testing and experimentation with parallel algorithm acceleration.
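The abstract does not say which sorting algorithm the parallel sorting accelerator uses. As a rough illustration of why sorting suits FPGA acceleration, the following is a minimal software model of a bitonic sorting network, which is an assumption chosen here only because all compare-exchange operations within one stage touch disjoint elements and could therefore map to comparators working in parallel in hardware. The function names `bitonic_stages` and `bitonic_sort` are hypothetical and do not come from the thesis.

```python
def bitonic_stages(n):
    """Yield the stages of a bitonic sorting network for n inputs.

    Each stage is a list of (i, j, ascending) compare-exchange operations.
    Within a stage every index appears at most once, so in hardware the
    operations of one stage could run at the same time.
    n must be a power of two.
    """
    if n <= 0 or n & (n - 1) != 0:
        raise ValueError("n must be a positive power of two")
    k = 2
    while k <= n:            # size of the bitonic sequences being merged
        j = k // 2
        while j >= 1:        # distance between compared elements
            stage = []
            for i in range(n):
                partner = i ^ j
                if partner > i:
                    ascending = (i & k) == 0
                    stage.append((i, partner, ascending))
            yield stage
            j //= 2
        k *= 2


def bitonic_sort(values):
    """Sort a power-of-two-length sequence; return (sorted_list, stage_count)."""
    data = list(values)
    stage_count = 0
    for stage in bitonic_stages(len(data)):
        # Software runs these one by one; a hardware network would not need to.
        for i, j, ascending in stage:
            if (data[i] > data[j]) == ascending:
                data[i], data[j] = data[j], data[i]
        stage_count += 1
    return data, stage_count


if __name__ == "__main__":
    result, stages = bitonic_sort([7, 3, 6, 1, 8, 2, 5, 4])
    print(result, stages)
```

For 8 inputs the network has 6 stages of 4 comparators each, so a fully parallel implementation would need 6 comparator delays where this sequential model performs 24 compare-exchange steps. How close a real design comes to that depends on resource limits and on the data exchange speed, which is the quantity the platform described above is meant to measure at board level.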
[Degree-granting institution]: Shandong University
[Degree level]: Master's
[Year degree conferred]: 2013
[Classification number]: TP368.1; TP338.6



Article link: http://sikaile.net/kejilunwen/jisuanjikexuelunwen/1617699.html

