Research and Implementation of Global Instruction Scheduling Techniques for the YHFT-Matrix Compiler

Published: 2018-03-19 10:57

Topic: compilers　Focus: global instruction scheduling　Source: National University of Defense Technology, 2013 master's thesis　Type: degree thesis

[Abstract]: The Matrix DSP is a high-performance DSP processor with independent intellectual property rights, developed by the Institute of Microelectronics at the School of Computer Science, National University of Defense Technology. Its strong data-computation capability makes it suitable for fields such as software base stations for wireless communication and underwater acoustic computing. Promoting the processor requires a correct, high-performance compiler system. For the Matrix compiler to perform well, its optimizations must be done well, and optimizations targeted at the Matrix architecture are especially effective. Based on the characteristics of the Matrix architecture, this thesis proposes several optimizations suited to the Matrix compiler. Some of them have already been implemented in the compiler and adapted to the architecture, which substantially improves its optimization performance. The optimizations presented and implemented in this thesis are as follows:

(1) Global instruction scheduling based on selective scheduling. The Matrix processor is a VLIW DSP that can issue 10 instructions simultaneously, so instruction-level parallelism is the key to exploiting its performance. Global instruction scheduling lets the compiler achieve better instruction-level parallelism. Building on GCC's selective scheduling, the Matrix compiler implements a correct selective scheduling algorithm, and the version adapted to the Matrix architecture gives more pronounced gains.

(2) If-conversion. If-conversion turns a control-flow graph into a data-flow graph, converting control dependences into data dependences, which benefits later optimizations, particularly those related to instruction scheduling. The Matrix processor supports fully predicated execution, so adding if-conversion to the Matrix compiler makes better use of the architecture to exploit the processor's performance. Building on GCC's if-conversion, the Matrix compiler implements the same if-conversion cases as GCC and adds new convertible cases motivated by specific applications. If-conversion further improves the compiler's performance, and the newly added cases greatly improve the execution efficiency of some specific applications.

(3) Branch delay scheduling. Every branch, jump, and function call instruction in the Matrix instruction set has four delay slots. If a program leaves these slots unfilled, the pipeline idles and hardware resources are wasted. Building on GCC's branch delay scheduling, the Matrix compiler correctly implements branch delay scheduling, and the algorithm adapted to the Matrix architecture schedules better and fills the delay slots more fully.
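To make the delay-slot point concrete, the following is a minimal Python sketch of filling four delay slots from the instructions that precede a branch. It is an illustration under stated assumptions, not the thesis's algorithm or GCC's delay-branch pass: the `Insn` type, the `independent` check, the `fill_delay_slots` function, and the assembly-like mnemonics and register names are all hypothetical and do not come from the Matrix instruction set. Only the slot count of four is taken from the abstract. The sketch fills slots only from code before the branch and moves only a contiguous run of trailing instructions, which is a deliberately conservative choice.

```python
from dataclasses import dataclass

DELAY_SLOTS = 4  # from the abstract: branches, jumps and calls have four delay slots


@dataclass
class Insn:
    text: str                        # assembly-like text (hypothetical syntax)
    reads: frozenset = frozenset()   # registers the instruction reads
    writes: frozenset = frozenset()  # registers the instruction writes


def insn(text, reads=(), writes=()):
    """Small helper so call sites can pass plain sets or tuples."""
    return Insn(text, frozenset(reads), frozenset(writes))


NOP = insn("nop")


def independent(candidate, branch):
    """True if `candidate` may run after the branch issues without changing results."""
    return not (candidate.writes & branch.reads     # branch needs this value first
                or candidate.reads & branch.writes  # e.g. a call writing a link register
                or candidate.writes & branch.writes)


def fill_delay_slots(block, slots=DELAY_SLOTS):
    """Move trailing independent instructions of `block` into the branch's delay slots.

    `block` is straight-line code whose last element is the branch. Only a
    contiguous suffix is moved, so every other instruction keeps its relative
    order; slots that cannot be filled are padded with NOPs.
    """
    *body, branch = block
    moved = []
    while body and len(moved) < slots and independent(body[-1], branch):
        moved.insert(0, body.pop())
    return body + [branch] + moved + [NOP] * (slots - len(moved))


block = [
    insn("load  r1, [r10]", reads={"r10"}, writes={"r1"}),
    insn("cmp   p0, r1, 0", reads={"r1"}, writes={"p0"}),
    insn("add   r2, r3, r4", reads={"r3", "r4"}, writes={"r2"}),
    insn("store [r11], r2", reads={"r11", "r2"}),
    insn("br    p0, target", reads={"p0"}),
]

scheduled = fill_delay_slots(block)
for i in scheduled:
    print(i.text)
```

In this example the `add` and `store` move into the first two delay slots, while the `cmp` stays put because the branch reads the predicate it writes; the remaining two slots are padded with `nop`. A production filler would also consider instructions from the branch target or the fall-through path, which this sketch does not attempt.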
[Degree-granting institution]: National University of Defense Technology
[Degree level]: Master's
[Year conferred]: 2013
[Classification number]: TP314; TP332


Article No.: 1633968

Article link: http://sikaile.net/falvlunwen/zhishichanquanfa/1633968.html
