

Design and Implementation of a Floating-Point Fused Multiply-Add Unit in a High-Performance Microprocessor

Published: 2018-09-08 09:28
[Abstract]: The floating-point fused multiply-add (FMA) unit is one of the core arithmetic units of a high-performance microprocessor and has a large influence on the processor's overall floating-point performance. The FMA operation is algorithmically complex, has a long logic delay and a large gate count, is hard to verify, and takes a long time to design. Research on high-performance FMA units therefore has broad application value and practical significance.

This thesis studies design and optimization techniques for a high-performance FMA unit. The work is part of the major national project "High-Performance X Processor", and its results are applied directly in engineering practice. Based on a single-datapath FMA algorithm, with no exception-interrupt or software co-processing (SWA) mechanism, and targeting high frequency, small area, and compatibility with the IEEE 754 standard, the thesis designs an FMA unit that supports denormalized numbers, signed zeros, infinities, and NaNs as both inputs and outputs. The main work and results are:

1. High-performance FMA units and their key techniques are surveyed, and on that basis the FMA unit of the High-Performance X Processor is designed and implemented.
2. A carry-correction structure for the multiplier array is proposed, and a main adder based on the EAC structure is designed. Together these reduce the number of logic levels in the FMA and increase its execution speed.
3. A simple LZA structure that supports denormalized numbers is designed using control of the maximum normalization shift amount and a flexible one-bit normalization correction. The exact-infinity and NaN datapaths are merged into the aligned-addend datapath, and denormalized operand handling is folded into the normal normalized data flow, so that the mantissa-processing datapath is shared as much as possible.
4. The whole design is modeled as a pipelined RTL implementation in the Verilog hardware description language. It passes test sets that include IEEE 754 standard test vectors, special operands, corner-case data, and a large number of random vectors, which verifies the correctness of the design.

Finally, the FMA unit is synthesized and optimized in a 40 nm bulk-silicon CMOS process. Under worst-case process conditions it reaches a frequency of 2.5 GHz with an area of 56735.9 μm², which meets the design requirements of the X processor.
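As a minimal illustration of what "fused" means in the abstract above, the sketch below compares a multiply-add with a single final rounding against the ordinary two-step computation, which rounds the product first. It is not taken from the thesis: the function names `fma_reference` and `unfused` are hypothetical, the exact intermediate is emulated with Python's `fractions.Fraction`, and only finite inputs are handled (infinities, NaNs, and the sign of an exactly zero result, all of which the thesis hardware supports, are outside this sketch).

```python
from fractions import Fraction


def fma_reference(a: float, b: float, c: float) -> float:
    """Software reference for a*b + c with one rounding (finite inputs only).

    Assumption: this is an illustrative model, not the thesis datapath.
    Fraction holds the exact rational value of a*b + c, and float() then
    rounds that exact value once to the nearest double.
    """
    exact = Fraction(a) * Fraction(b) + Fraction(c)
    return float(exact)


def unfused(a: float, b: float, c: float) -> float:
    """Ordinary multiply then add: the product is rounded before the add."""
    return a * b + c


# Inputs chosen so that the exact product 1 - 2**-60 is not a double.
a = 1.0 + 2.0**-30
b = 1.0 - 2.0**-30
c = -1.0

fused_result = fma_reference(a, b, c)    # keeps the -2**-60 term
unfused_result = unfused(a, b, c)        # product rounds to 1.0 first

print("fused  :", fused_result)
print("unfused:", unfused_result)
```

With these inputs the two-step version loses the low-order term when the product is rounded to 1.0, while the single-rounding version retains it. A reference model of this kind is one way to generate expected values for random test vectors, although the thesis does not state how its vectors were checked.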
[Degree-granting institution]: National University of Defense Technology
[Degree level]: Master's
[Year degree awarded]: 2013
[Classification number]: TP332




Article ID: 2230126


Article link: http://sikaile.net/kejilunwen/jisuanjikexuelunwen/2230126.html

