
Design of a High-Performance Double-Precision Floating-Point Divider Based on the Goldschmidt Algorithm

Posted: 2018-04-28 18:42

  Topic: floating-point divider + Goldschmidt algorithm; Source: Journal of Computer Applications (《計算機應用》), 2015, No. 07


[Abstract]: To address the complex computation and long latency typical of double-precision floating-point division, a method is proposed for designing a high-performance double-precision floating-point divider, based on the Goldschmidt algorithm, that supports the IEEE-754 standard. First, the division procedure of the Goldschmidt algorithm and the error introduced by its iterations are analyzed. Then, a method for controlling that error is proposed. Next, an area-efficient dual lookup table method is used to determine the initial iteration value, and the iteration unit uses a parallel multiplier structure to speed up each iteration. Finally, the pipeline stages are partitioned appropriately and the iteration process is controlled so that floating-point divisions can execute in a pipelined fashion, which further raises the divider's operating rate. Experimental results show that, in a 40 nm process, the double-precision floating-point divider with a 14-bit initial iteration value and a pipelined structure has a synthesized cell area of 84,902.2618 μm² and can run at 2.2 GHz. Compared with a pipelined structure that uses an 8-bit initial iteration value, operation speed is 32.73% higher and area is 5.05% larger. A single double-precision floating-point division has a latency of 12 clock cycles; under pipelined execution, the average latency per division is 3 clock cycles. Compared with double-precision floating-point dividers based on the SRT algorithm in other processors, data throughput is 3 to 7 times higher; compared with double-precision floating-point dividers based on the Goldschmidt algorithm in other processors, data throughput is 2 to 3 times higher.
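The iteration the abstract refers to can be illustrated in software. The following is a minimal sketch of textbook Goldschmidt division, not the paper's hardware design: the function name `goldschmidt_divide`, its parameters, and the rounded-reciprocal seed (a stand-in for the dual lookup table, whose contents the abstract does not give) are all assumptions made for illustration. It assumes the divisor is already normalized into [1, 2), as an IEEE-754 significand would be.

```python
def goldschmidt_divide(n, d, seed_bits=14, iterations=2):
    """Approximate n / d for a divisor d normalized into [1, 2)."""
    if not 1.0 <= d < 2.0:
        raise ValueError("d must be normalized into [1, 2)")

    # Hypothetical seed: the reciprocal of d rounded to seed_bits
    # fractional bits. This stands in for a lookup-table initial value.
    scale = 1 << seed_bits
    f = round(scale / d) / scale

    # Scale numerator and denominator by the same factor, so the
    # quotient num / den is unchanged while den moves close to 1.
    num, den = n * f, d * f

    for _ in range(iterations):
        # If den = 1 - e, then f = 1 + e and the new den = 1 - e**2,
        # so the distance of den from 1 is squared on every pass.
        f = 2.0 - den
        num *= f
        den *= f

    # den is now approximately 1, so num approximates n / d.
    return num
```

Because the error is squared on each pass, a seed accurate to roughly 14 bits reaches double-precision range in two passes of this sketch, whereas a seed accurate to roughly 8 bits needs three. That is consistent with the speed advantage the abstract reports for the 14-bit initial value, although the abstract does not state the iteration counts used in the actual design.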
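[Author affiliation]: College of Computer, National University of Defense Technology
[Funding]: Hunan Provincial Key Discipline Construction Project (434515000008); Aeronautical Science Foundation of China (2013zc88003); National Natural Science Foundation of China (61402499)
[Classification number]: TP332.22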

