Research on Key Techniques for Conditional Branch Instruction Processing in Processors

Published: 2018-01-10 10:10

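  Title: Research on Key Techniques for Conditional Branch Instruction Processing in Processors. Source: Zhejiang University, doctoral dissertation, 2013. Document type: degree thesis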


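  Related keywords: conditional branch; branch prediction; misprediction peak period; dynamic adaptive filtering mechanism; multi-level buffering; adaptive branch prediction granularity; loop acceleration; PC skip-level transfer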


[Abstract]: As applications demand ever higher processor performance, techniques such as superscalar issue, very deep pipelines, and speculative execution have been adopted in processor design to increase instruction-level parallelism. Conditional branch instructions combine conditional execution with control of program flow, and therefore reduce that parallelism. Efficient handling of conditional branches is thus a prerequisite for these techniques to reach their potential. This dissertation studies several key performance-oriented techniques for conditional branch processing. The main research contents and contributions are as follows.

1. Branch prediction based on dynamic inversion of prediction polarity. By studying the temporal locality of branch mispredictions, the dissertation proposes a branch prediction method based on dynamic inversion of prediction polarity. The method dynamically monitors the raw misprediction rate (before any polarity inversion), identifies misprediction peak periods in which that rate exceeds a threshold, and inverts the prediction polarity during those periods. The final misprediction rate after inversion stays low, which improves overall prediction accuracy. The method addresses problems such as branch jitter that traditional prediction methods based on branch aliasing cannot solve.

2. Branch prediction based on multi-level filtering. By studying the spatial locality of branch mispredictions, the dissertation proposes a branch prediction method based on multi-level filtering. The prediction space is divided into multiple levels and the misprediction rate of each level is monitored dynamically. The small number of misprediction-prone branches concentrated in each level are filtered into the next level for targeted handling, which lowers the misprediction rate of each level and improves overall prediction accuracy. The method addresses the low resource utilization of traditional multi-path prediction methods, in which every path must handle all conditional branches.

3. Parallel branch prediction based on multi-level buffering and on adaptive prediction granularity. The dissertation first proposes a parallel branch prediction method based on multi-level buffering. It accesses the predictor during cycles in which no branch is present, prefetches the prediction information for upcoming branches, and caches it. When several conditional branches appear at the same time, the corresponding prediction information is selected from the cache and assigned to each branch. This method achieves high-accuracy parallel branch prediction at fetch widths of 8 or less. The dissertation then proposes a parallel branch prediction method based on adaptive prediction granularity. According to the fetch width and branch behavior, it adaptively packs several conditional branches into an instruction packet, uses the packet as the unit of prediction, and maintains history information per packet. This method achieves high-accuracy parallel branch prediction at any fetch width.

4. Loop acceleration based on decode buffer reuse and PC skip-level transfer. Targeting the characteristics of loop bodies, the dissertation proposes a loop acceleration method based on decode buffer reuse and PC skip-level transfer. PC skip-level transfer makes a multi-entry decode buffer feasible. The buffer is then reused: during loop execution, loop-body instructions are sent to the execution units directly from the buffer, which accelerates the loop. A self-loop wide-issue technique further addresses factors that limit loop performance, such as the placement of loop-body instructions, the transition between loop iterations, and cache bit-width limits.

The key techniques proposed in this dissertation have theoretical and practical value for improving the performance of conditional branch instruction processing.
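
The first contribution (dynamic inversion of prediction polarity) can be illustrated with a small software model. The abstract does not describe the hardware mechanism in detail, so the following is only a minimal sketch under stated assumptions: the base predictor is an ordinary table of 2-bit saturating counters, the raw misprediction rate is measured over a sliding window of recent branches, and the class names, the window size of 16, and the threshold of 0.5 are all hypothetical choices made for illustration, not values taken from the dissertation.

```python
from collections import deque


class BimodalPredictor:
    """Assumed base predictor: 2-bit saturating counters indexed by low PC bits."""

    def __init__(self, index_bits=10):
        self.mask = (1 << index_bits) - 1
        self.counters = [1] * (1 << index_bits)  # 1 = weakly not-taken

    def predict(self, pc):
        return self.counters[pc & self.mask] >= 2  # True means "taken"

    def update(self, pc, taken):
        i = pc & self.mask
        if taken:
            self.counters[i] = min(3, self.counters[i] + 1)
        else:
            self.counters[i] = max(0, self.counters[i] - 1)


class PolarityInvertingPredictor:
    """Hypothetical wrapper: inverts the raw prediction during peak periods."""

    def __init__(self, base, window=16, threshold=0.5):
        self.base = base
        self.window = window          # assumed monitoring window (branches)
        self.threshold = threshold    # assumed raw misprediction-rate threshold
        self.recent = deque(maxlen=window)  # 1 = raw prediction was wrong

    def in_peak(self):
        # A peak period is only declared once the window is full.
        if len(self.recent) < self.window:
            return False
        return sum(self.recent) / self.window > self.threshold

    def predict(self, pc):
        raw = self.base.predict(pc)
        return (not raw) if self.in_peak() else raw

    def update(self, pc, taken):
        # Monitor the RAW (pre-inversion) prediction, then train the base.
        raw = self.base.predict(pc)
        self.recent.append(int(raw != taken))
        self.base.update(pc, taken)


def count_mispredictions(predictor, trace):
    """Run a trace of (pc, taken) pairs and count wrong final predictions."""
    misses = 0
    for pc, taken in trace:
        if predictor.predict(pc) != taken:
            misses += 1
        predictor.update(pc, taken)
    return misses


if __name__ == "__main__":
    # One branch that strictly alternates taken / not-taken, starting with taken.
    trace = [(0x40, i % 2 == 0) for i in range(200)]
    print(count_mispredictions(BimodalPredictor(), trace))
    print(count_mispredictions(PolarityInvertingPredictor(BimodalPredictor()), trace))
```

On this alternating trace the 2-bit counter is wrong on every branch, so the raw predictor mispredicts all 200 branches, while the wrapper mispredicts only the 16 branches needed to fill the monitoring window. A threshold of 0.5 is the natural break-even point in this model: inversion turns a raw misprediction rate r into 1 - r, which is an improvement only when r exceeds 0.5.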
[Degree-granting institution]: Zhejiang University
[Degree level]: Doctorate
[Year degree conferred]: 2013
[Classification number]: TP332

