
Design and Implementation of the X-DSP 64-bit SIMD Bit-Processing Unit and Shuffle Unit

Published: 2018-04-04 12:59

  Topic: bit processing. Focus: shuffle. Source: National University of Defense Technology, master's thesis, 2013


[Abstract]: The digital signal processor (DSP) is an emerging field that spans many disciplines and is widely applied in many areas. With the arrival of the 21st century, society has entered the digital age, and DSP is at the core of this digital revolution.

X-DSP is an independently designed, high-performance 64-bit SIMD DSP. It uses VLIW technology, can issue 11 instructions per cycle, and has a design clock frequency of 1.25 GHz. Based on the performance requirements of X-DSP, and following an in-depth study of current mainstream DSP architectures and instruction sets, this thesis completes the design and implementation of the 64-bit bit-processing (Bit-Processing, BP) unit and the shuffle unit. The main contents are as follows:

1. The 64-bit SIMD BP unit of X-DSP is designed and implemented. As one of the functional units of the X-DSP core's arithmetic unit, it mainly executes shift instructions, bit-processing instructions, and pack/unpack instructions. Thanks to its SIMD structure, it can complete two 32-bit data operations in one cycle, providing full support for data-level parallelism in programs.

2. The 64-bit shuffle unit serves as a vector data exchange network, used mainly for data exchange among the VPEs of the vector arithmetic unit. After studying the shuffle instruction designs of several current mainstream chips, this thesis designs its own 64-bit shuffle instructions and shuffle circuit structure. The unit stores shuffle patterns in a dedicated SRAM, so that a running application does not consume critical system resources such as the register file or memory access bandwidth for them, which improves execution efficiency.

3. BP and shuffle are verified by simulation at three levels: module level, unit level, and SPE/VPE level. At the module level, SVA formal verification is also applied to ensure the functional correctness of the design. At the unit level, coverage of the corresponding modules is obtained by applying test stimuli to each unit individually. The performance of the shuffle unit is also evaluated. The results show that, at the same shuffle granularity, the shuffle-pattern representation efficiency of the X-DSP shuffle pattern memory is 0.88 and 0.75, respectively, the highest among the shuffle units compared.

Finally, the BP unit and the shuffle unit are each synthesized with Synopsys Design Compiler. The results show that the bit-processing unit has a total area of 48513.7819 μm², a critical path delay of 0.42 ns, and a power consumption of 28.1785 mW; the shuffle unit has a total area of 662016.8 μm², a critical path delay of 0.44 ns, and a power consumption of 179.6060 mW. Both meet the 1.25 GHz performance target of X-DSP.
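The ideas in the abstract can be illustrated with a small software model. The following is only a sketch under stated assumptions, not the thesis's instruction set or circuit: the function names, the four-lane vector width, and the contents of `PATTERN_MEMORY` are hypothetical. It shows a 64-bit word treated as two independent 32-bit lanes, a shuffle selected by an index into a separate pattern table, and the timing arithmetic behind the 1.25 GHz claim (a clock period of 1 / 1.25 GHz = 0.8 ns).

```python
MASK32 = 0xFFFFFFFF


def simd_shl32(word64: int, shift: int) -> int:
    """Shift the two 32-bit lanes of a 64-bit word left independently.

    Bits shifted out of one lane are dropped and never spill into the
    other lane. `shift` is assumed to be in the range 0..31.
    """
    lo = ((word64 & MASK32) << shift) & MASK32
    hi = (((word64 >> 32) & MASK32) << shift) & MASK32
    return (hi << 32) | lo


def shuffle(lanes: list, pattern: list) -> list:
    """Output lane i takes its value from input lane pattern[i]."""
    return [lanes[src] for src in pattern]


# Hypothetical pattern memory: patterns are stored once in their own table
# and selected by a small index, so they do not occupy data registers.
PATTERN_MEMORY = {
    0: [0, 1, 2, 3],  # identity
    1: [3, 2, 1, 0],  # reverse lane order
    2: [1, 0, 3, 2],  # swap adjacent lanes
}


def shuffle_by_index(lanes: list, pattern_id: int) -> list:
    """Shuffle using a pattern looked up in the (hypothetical) pattern memory."""
    return shuffle(lanes, PATTERN_MEMORY[pattern_id])


def meets_timing(critical_path_ns: float, freq_ghz: float) -> bool:
    """True if the critical path fits in one clock period (1 / f, in ns)."""
    return critical_path_ns <= 1.0 / freq_ghz
```

For example, `meets_timing(0.42, 1.25)` and `meets_timing(0.44, 1.25)` compare the reported critical path delays against the 0.8 ns period. This simple check ignores clock uncertainty and setup margins, which a real synthesis flow would account for.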
[Degree-granting institution]: National University of Defense Technology
[Degree level]: Master's
[Year degree granted]: 2013
[Classification number]: TP332

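[References]

Related journal papers (1)

1 Wan Jianghua; Liu Sheng; Zhou Feng; Wang Yaohua; Chen Shuming. A programmable shuffle unit with an efficient shuffle pattern memory [J]. Journal of National University of Defense Technology, 2011, No. 06.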


