Research on Loop Pipelining Optimization Techniques in Reconfigurable Compilation
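Author: Guo Zhenhua. Topic: reconfigurable computing. Focus: reconfigurable compilation. Source: Harbin Engineering University, doctoral dissertation, 2016. Document type: degree thesis.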
[Abstract]: With advances in semiconductor technology, reconfigurable computing architectures, which compute across both time and space, have broken through the limitations of the von Neumann architecture. Combining the efficiency of application-specific integrated circuits (ASICs) with the flexibility of general-purpose processors, reconfigurable computing is widely used in important fields such as high-performance computing, digital signal processing, and network and information security. Its potential commercial and technical value is increasingly recognized, and it has become another mainstream computing paradigm. For general-purpose computing, reconfigurable architectures built on heterogeneous GPP+FPGA platforms outperform traditional general-purpose processors in energy consumption, storage, and performance, which makes reconfigurable computing an important research direction for future computing. Research on reconfigurable computing for general-purpose workloads is still at an early stage: many results have been obtained, but many problems still require in-depth study. One important factor limiting the practical adoption of reconfigurable computing systems is the immaturity of the software ecosystem. Because compiler technology is also not constrained by semiconductor manufacturing processes or related hardware technology, reconfigurable compilers for these systems have become a worldwide research focus. An analysis of how reconfigurable computing systems accelerate general-purpose applications in hardware shows that a topic of current interest is improving how the reconfigurable compiler automatically maps loop structures in applications onto parallel, pipelined hardware acceleration units. Building on previous work, this thesis studies automatic mapping and optimization techniques for the three main functional modules of a loop program: the computation unit, the control unit, and the memory unit. The specific contributions are as follows.

(1) When existing reconfigurable compilers automatically map the computation unit of a loop program to pipelined execution, they usually partition the pipeline directly, without considering the actual hardware delay of basic operation instructions on the FPGA, so the resulting partition is suboptimal. To address this, the thesis designs an automatic pipeline partitioning algorithm based on hardware delay characteristics. Drawing on the hardware delay of basic operation instructions when a loop program runs on an FPGA, it builds a hardware delay characteristic library for basic instructions and, using instruction delays as weights, merges and optimizes pipeline stages to partition the pipeline automatically. Experimental results show that the algorithm effectively reduces the number of pipeline stages, which lowers the hardware resource overhead caused by pipeline partitioning and reduces the number of clock cycles needed for a single iteration of the computation unit.

(2) In existing reconfigurable compilers, the initiation interval between iterations of a pipelined loop is controlled by directive statements. This approach can only produce a fixed initiation interval, cannot fully improve the performance of pipelined loop execution, and limits the compiler's level of automation. To address this, the thesis designs a method for automatically analyzing and optimizing the initiation interval between loop pipeline iterations. It builds an information model of the inter-iteration initiation interval, adopts a non-fixed initiation interval control strategy to analyze the interval automatically, and optimizes the interval using pipeline scheduling techniques. Experimental results show that the non-fixed initiation interval control strategy effectively reduces the waiting delay between iterations during pipelined execution, and that the automatic analysis algorithm effectively improves the compiler's level of automation.

(3) Many parallel memory structures have already been proposed for reconfigurable computing systems. To improve the parallelism and reuse of data accesses, they usually trade space for time, but there is still room for improvement in both resource overhead and performance. To address this, the thesis designs an automatic mapping method for a parameterized parallel memory structure. For applications with affine-like array subscripts, it designs a parameterized parallel memory architecture, constructs the memory-access data dependence graph of the loop program through an automatic generation algorithm, computes the parameters of the parallel memory structure template, and implements automatic mapping and generation of the parallel memory architecture in the reconfigurable compiler. Experimental results show that this memory architecture fully exploits the data parallelism and reuse in loops and, compared with existing schemes, improves the performance of pipelined loop execution while using fewer hardware resources.

Finally, the thesis combines these results. It applies the delay-based automatic pipeline partitioning algorithm, the method for automatic analysis and optimization of the inter-iteration initiation interval, and the automatic mapping method for the parameterized parallel memory structure to the automatic generation of the computation unit, control unit, and memory unit of loop programs, respectively, and builds an automatic loop pipelining mapping framework for reconfigurable compilers. Experimental results show that the approach raises the automation level of the reconfigurable compiler while effectively improving the performance of pipelined loop execution on reconfigurable computing systems, and that it is feasible.
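To make the idea behind contribution (1) concrete, the following is a minimal sketch of delay-weighted pipeline partitioning. It is not the thesis's algorithm, whose details the abstract does not give. It is a simple chaining-aware ASAP pass: operations are visited in dependency order and packed into the current stage as long as the accumulated combinational delay fits within one clock period. The delay table `DELAY_NS`, the function `partition_pipeline`, and all delay values are hypothetical and chosen only for illustration.

```python
from graphlib import TopologicalSorter  # Python 3.9+

# Hypothetical per-operation delay library in nanoseconds.
# The values are illustrative, not measured on any FPGA.
DELAY_NS = {"load": 2.0, "add": 1.5, "mul": 4.0, "shift": 0.5, "store": 2.0}


def partition_pipeline(ops, deps, clock_ns):
    """Assign each operation of a loop body to a pipeline stage.

    ops:      maps operation name -> operation kind (a key of DELAY_NS)
    deps:     maps operation name -> set of predecessor operation names
    clock_ns: target clock period in nanoseconds
    Returns a dict mapping operation name -> stage index (starting at 0).
    """
    graph = {name: set(deps.get(name, ())) for name in ops}
    stage = {}   # stage index chosen for each operation
    finish = {}  # time within its stage at which each operation's result is ready

    # static_order() yields every operation after all of its predecessors.
    for name in TopologicalSorter(graph).static_order():
        delay = DELAY_NS[ops[name]]
        if delay > clock_ns:
            raise ValueError(f"{name} cannot fit in one clock period")
        preds = graph[name]
        # Earliest legal stage is the latest stage among the predecessors.
        s = max((stage[p] for p in preds), default=0)
        # Inputs from earlier stages are registered, so only same-stage
        # predecessors contribute combinational delay.
        arrival = max((finish[p] for p in preds if stage[p] == s), default=0.0)
        if arrival + delay <= clock_ns:
            stage[name], finish[name] = s, arrival + delay  # chain into stage s
        else:
            stage[name], finish[name] = s + 1, delay        # open a new stage
    return stage


# Loop body of the form: out[i] = ((in[i] + c) * d) >> 1
ops = {"ld": "load", "add": "add", "mul": "mul", "sh": "shift", "st": "store"}
deps = {"add": {"ld"}, "mul": {"add"}, "sh": {"mul"}, "st": {"sh"}}
stages = partition_pipeline(ops, deps, clock_ns=5.0)
num_stages = max(stages.values()) + 1
```

With these assumed delays and a 5 ns clock, the five operations fit into three stages, whereas placing one operation per stage would need five. Fewer stages means fewer pipeline registers and fewer cycles per iteration, which is the kind of saving the abstract reports.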
[Degree-granting institution]: Harbin Engineering University
[Degree level]: Doctoral
[Year conferred]: 2016
[Classification number]: TP314
Article ID: 1628759
Article link: http://sikaile.net/shoufeilunwen/xxkjbs/1628759.html