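Physical Design and Optimization of a Multiplication Unit in a 40 nm Process
Published: 2018-06-29 20:33
Topics: multiplication unit + floorplan optimization; Source: National University of Defense Technology, master's thesis, 2015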
[Abstract]: The speed of the multiplication unit directly affects the performance of the whole CPU core datapath, and the physical design and implementation of a high-performance, low-power multiplication unit is one of the current difficulties. Given the chip's design cost and overall performance, the physical design of the multiplication unit has to be completed within a limited area. This leads to large clock-network skew and high overall density, which in turn degrade the timing and power of the design. To address these problems, this thesis takes the performance optimization of the multiplication unit in the X-DSP CPU core as its background and studies the physical design flow in detail from four angles: improving timing, reducing power, timing analysis, and equivalence verification. It also describes the main methods and techniques used. The main work is as follows:
1) Floorplanning is an important step in physical design, and its quality strongly affects design performance. Using a hierarchical physical design method, the top-level CPU datapath is floorplanned. Based on the connectivity between modules, two floorplans are derived and compared, and the improved floorplan yields a 9% timing improvement. The floorplan of the multiplication unit is then determined from the CPU datapath floorplan. After a detailed analysis of the unit's hierarchy and several iterations, the improved floorplan yields a 5% timing improvement.
2) For the clock tree, reducing clock latency and clock skew is the primary goal of the clock network. The multiplication unit initially has a clock skew of 47.2 ps and a clock latency of 304.7 ps. Controlling the clock driver cells, the maximum fanout, and the maximum number of levels reduces the skew to 35.5 ps and the latency to 260.6~296.1 ps. Additionally controlling the clock routing reduces the skew to 27.9 ps and the latency to 232.3~260.2 ps.
3) Chip power has become a performance metric as important as chip speed and chip area. Automated clock-gating insertion reduces dynamic power by 62.7%. Adjusting constraints, lowering density, reducing cell drive-strength multiples, and swapping in multi-threshold cells reduce static power by 9%.
4) For static timing analysis, because single-mode single-corner analysis is insufficient, multi-mode multi-corner timing analysis is discussed from three angles: the composition of corners, the analysis modes, and the analysis flow. Multi-mode multi-corner timing analysis is performed on the multiplication unit, and the ice tool is used to optimize timing until the constraints are met.
5) Formal verification methods are studied on the multiplication unit. Formal verification is performed with the Formality tool, the clock-gating and scan-chain problems encountered during verification are resolved, and the verification finally succeeds.
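The clock skew (deviation) and clock latency (delay) figures quoted in the abstract can be turned into relative reductions with a few lines of arithmetic. The sketch below is only an illustration: the picosecond values are copied from the abstract, while the stage labels and the helper name `reduction_pct` are assumptions introduced here and do not come from the thesis. Because the optimized latency is reported as a range, each stage is compared against the initial value at both ends of the range.

```python
# Illustrative arithmetic on the figures quoted in the abstract (all in ps).
# Stage labels and the helper name are hypothetical, not taken from the thesis.

INITIAL_SKEW_PS = 47.2
INITIAL_LATENCY_PS = 304.7

# (stage label, skew, (latency low, latency high))
STAGES = [
    ("driver cell / fanout / level control", 35.5, (260.6, 296.1)),
    ("plus clock routing control", 27.9, (232.3, 260.2)),
]


def reduction_pct(before: float, after: float) -> float:
    """Relative reduction of `after` with respect to `before`, in percent."""
    return (before - after) / before * 100.0


for label, skew, (lat_low, lat_high) in STAGES:
    skew_gain = reduction_pct(INITIAL_SKEW_PS, skew)
    # The smallest gain corresponds to the high end of the latency range.
    lat_gain_min = reduction_pct(INITIAL_LATENCY_PS, lat_high)
    lat_gain_max = reduction_pct(INITIAL_LATENCY_PS, lat_low)
    print(f"{label}: skew -{skew_gain:.1f}%, "
          f"latency -{lat_gain_min:.1f}% to -{lat_gain_max:.1f}%")
```

Read this way, the two clock-tree steps together remove roughly two fifths of the initial skew, with the latency improvement depending on which end of the reported range is taken.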
[Degree-granting institution]: National University of Defense Technology
[Degree level]: Master's
[Year degree granted]: 2015
[Classification number]: TP332
Article ID: 2083292
Article link: http://sikaile.net/kejilunwen/jisuanjikexuelunwen/2083292.html