
Physical Design of the YHFT-DX Level-2 Cache in a 65nm Process

Published: 2018-03-17 23:07

  Topic: level-2 Cache. Focus: physical design. Source: National University of Defense Technology, 2012 master's thesis. Document type: degree thesis.


[Abstract]: YHFT-DX is a high-performance DSP (Digital Signal Processor) chip designed in a 65nm process and required to reach a design target of 800MHz under worst-case process conditions. As the central hub of the chip's memory path, the level-2 Cache is critical to the design. This thesis studies physical design optimization techniques for the level-2 Cache in the YHFT-DX prototype chip and final chip. The main contents are as follows:

1) The level-2 Cache uses a banked structure: the 1MB memory is divided into 16 Banks, and each Bank is built from four SRAM_CELL basic modules. The circuit design of the SRAM_CELL module is studied; its placement structure, routing method, and decoding circuit are optimized; and the layout of the module is planned.

2) The physical design of the Bank macro module in the prototype chip is studied. Several methods, including in-place optimization, are used to reach timing closure, and useful skew is applied to optimize the timing of long interconnect paths. The placement of the Banks within the full chip is studied, the critical paths and performance trade-offs of different placements are analyzed, and an inverted-U placement of the Banks is finally chosen for the prototype chip.

3) The LRU (Least Recently Used) module is designed in full custom using a 13T memory cell with a reset terminal, which is compared with other memory cells in area, performance, noise margin, and power consumption. Read and write circuits are designed, with the read circuit implemented entirely in combinational logic. Delay and power are simulated: the full-custom design reduces critical path delay by 218ps, improves timing performance by 29%, and eliminates the LRU-related timing violations in the chip.

4) The level-2 Cache of the final chip is physically designed with a hierarchical design methodology. The Bank placement is readjusted, and the I/O port positions are optimized according to the interconnections with other modules in the full chip, improving cross-module path timing by 80. A power network is designed that keeps the voltage drop within 3% while ensuring sufficient supply. Clock tree synthesis uses a balanced binary-tree topology, and several methods are used to optimize clock skew and noise. The effect of crosstalk on signal delay is analyzed, and crosstalk prevention and repair methods are studied, effectively improving the noise immunity of the design.

5) In the final chip the hierarchy of the memory is reorganized: two Banks are merged into a single Bank2 module, the long interconnects to far-end memory cells are planned, and a double-width, double-spacing routing rule is adopted. This improves critical path delay by 30 and effectively resolves the timing violations caused by long interconnects in the level-2 Cache.

Finally, compared with the prototype chip, the level-2 Cache delay of the final chip is reduced by 90 and performance is improved by 6.7%, reaching the 800MHz design target under worst-case 65nm process conditions.
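The capacity and timing figures quoted in the abstract can be related with a little arithmetic. The sketch below is illustrative only: it assumes the 1MB is split evenly across the 16 Banks and evenly across the four SRAM_CELL modules in each Bank, which the abstract does not state explicitly, and all function and constant names are hypothetical.

```python
# Figures quoted in the abstract.
L2_CAPACITY_BYTES = 1 * 1024 * 1024  # 1MB level-2 Cache
NUM_BANKS = 16                       # Banks in the level-2 Cache
SRAM_CELLS_PER_BANK = 4              # SRAM_CELL modules per Bank
TARGET_FREQ_MHZ = 800                # worst-case design target
LRU_DELAY_SAVED_PS = 218             # critical path reduction from the full-custom LRU


def bank_capacity_bytes(total=L2_CAPACITY_BYTES, banks=NUM_BANKS):
    """Capacity of one Bank, assuming an even split (an assumption)."""
    return total // banks


def sram_cell_capacity_bytes(per_bank=SRAM_CELLS_PER_BANK):
    """Capacity of one SRAM_CELL module, assuming an even split (an assumption)."""
    return bank_capacity_bytes() // per_bank


def clock_period_ps(freq_mhz=TARGET_FREQ_MHZ):
    """Clock period in picoseconds for a frequency given in MHz."""
    return 1_000_000 / freq_mhz


def share_of_cycle(delay_ps, freq_mhz=TARGET_FREQ_MHZ):
    """Fraction of one clock period that a given delay represents."""
    return delay_ps / clock_period_ps(freq_mhz)


if __name__ == "__main__":
    print("Bank capacity (KB):", bank_capacity_bytes() // 1024)
    print("SRAM_CELL capacity (KB):", sram_cell_capacity_bytes() // 1024)
    print("Clock period at 800MHz (ps):", clock_period_ps())
    print("218ps as a share of the cycle:", share_of_cycle(LRU_DELAY_SAVED_PS))
```

Under these assumptions each Bank holds 64KB and each SRAM_CELL module 16KB, and the 800MHz target corresponds to a 1250ps cycle, which gives a sense of scale for the 218ps saved on the LRU critical path.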
[Degree-granting institution]: National University of Defense Technology
[Degree level]: Master's
[Year degree granted]: 2012
[Classification number]: TP333





Article link: http://sikaile.net/kejilunwen/jisuanjikexuelunwen/1626891.html

