Research on Key Techniques for Conditional Branch Instruction Processing in Processors
Published: 2018-01-10 10:10
Source: Zhejiang University, doctoral dissertation, 2013. Document type: degree thesis
Keywords: conditional branch; branch prediction; misprediction peak period; dynamic adaptive filtering mechanism; multi-level buffering; adaptive branch prediction granularity; loop acceleration; level-skipping PC transfer
[Abstract]: As applications demand ever higher processor performance, techniques such as superscalar issue, very deep pipelines, and speculative execution have been adopted in processor design to raise instruction parallelism. Conditional branch instructions combine conditional execution with program flow control, and this dual nature hurts parallelism. Efficient handling of conditional branches is therefore a prerequisite for these techniques to reach their potential. This dissertation studies several key performance-oriented techniques for conditional branch processing. The main research contents and contributions are as follows.

1. Branch prediction based on dynamic inversion of prediction polarity. By studying the temporal locality of branch mispredictions, the dissertation proposes a branch prediction method based on dynamic inversion of prediction polarity. The method dynamically monitors the raw misprediction rate, measured before any polarity inversion, and identifies misprediction peak periods in which that rate exceeds a threshold. During a peak period the prediction polarity is inverted, so that the final misprediction rate after inversion stays low and overall prediction accuracy improves. The method addresses problems such as branch jitter that conventional aliasing-based prediction methods cannot resolve.

2. Branch prediction based on multi-level filtering. By studying the spatial locality of branch mispredictions, the dissertation proposes a branch prediction method based on multi-level filtering. The prediction space is divided into several levels and the misprediction rate of each level is monitored dynamically. The few misprediction-prone branches that are concentrated in a level are filtered down to the next level for targeted handling, which lowers the misprediction rate of each level and raises overall accuracy. The method addresses the low resource utilization of conventional multi-path prediction, in which every path must handle all conditional branches.

3. Parallel branch prediction based on multi-level buffering and on adaptive prediction granularity. The dissertation first proposes a parallel branch prediction method based on multi-level buffering. The predictor is accessed during cycles that are free of branches, and prediction information for upcoming branches is prefetched and buffered. When several conditional branches appear in the same cycle, the matching prediction information is selected from the buffer and assigned to each branch. This method achieves high-accuracy parallel branch prediction at fetch bandwidths of 8 or less. The dissertation then proposes a parallel branch prediction method based on adaptive prediction granularity. According to the fetch bandwidth and branch behavior, several conditional branches are adaptively grouped into an instruction packet. The packet serves as the prediction granularity, and history information is maintained per packet. This method achieves high-accuracy parallel branch prediction at any fetch bandwidth.

4. Loop acceleration based on decode buffer reuse and level-skipping PC transfer. Targeting the characteristics of loop bodies, the dissertation proposes a loop acceleration method based on decode buffer reuse and level-skipping PC transfer. Level-skipping PC transfer makes a multi-entry decode buffer feasible. The buffer is then reused: during loop execution, loop-body instructions are sent from the buffer to the execution units, which accelerates the loop. A self-loop wide-issue technique further addresses factors that limit loop performance, such as the distribution of loop-body instructions, the transition between loop iterations, and cache width limits.

The key techniques proposed in this dissertation have both theoretical significance and practical value for improving the performance of conditional branch instruction processing.
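The polarity-inversion idea in contribution 1 can be illustrated with a small simulation. The following Python sketch is not the dissertation's design, which the abstract does not specify. The base predictor (a table of 2-bit saturating counters), the sliding window of 16 outcomes, the 0.75 threshold, and all class and function names are assumptions chosen only for illustration. What it keeps from the abstract is the core rule: monitor the raw misprediction rate before inversion, and invert the prediction while that rate is above a threshold.

```python
from collections import deque


class PolarityInvertingPredictor:
    """Hypothetical sketch: 2-bit counters plus threshold-based polarity inversion."""

    def __init__(self, table_bits=10, window=16, threshold=0.75):
        self.mask = (1 << table_bits) - 1
        # Assumed base predictor: 2-bit saturating counters, initially weakly not-taken.
        self.counters = [1] * (1 << table_bits)
        # Sliding window of raw (pre-inversion) results: 1 = raw prediction was wrong.
        self.window = deque(maxlen=window)
        self.threshold = threshold

    def raw_predict(self, pc):
        """Prediction of the base predictor, before any polarity inversion."""
        return self.counters[pc & self.mask] >= 2

    def in_peak(self):
        """True when the raw misprediction rate over a full window exceeds the threshold."""
        if len(self.window) < self.window.maxlen:
            return False
        return sum(self.window) / len(self.window) > self.threshold

    def predict(self, pc):
        """Final prediction: the raw prediction, inverted during a misprediction peak."""
        raw = self.raw_predict(pc)
        return (not raw) if self.in_peak() else raw

    def update(self, pc, taken):
        """Record whether the raw prediction was wrong, then train the counter."""
        self.window.append(int(self.raw_predict(pc) != taken))
        i = pc & self.mask
        if taken:
            self.counters[i] = min(3, self.counters[i] + 1)
        else:
            self.counters[i] = max(0, self.counters[i] - 1)


def run(outcomes, pc=0x40):
    """Replay one branch's outcomes; return (raw mispredictions, final mispredictions)."""
    p = PolarityInvertingPredictor()
    raw_wrong = final_wrong = 0
    for taken in outcomes:
        raw_wrong += p.raw_predict(pc) != taken
        final_wrong += p.predict(pc) != taken
        p.update(pc, taken)
    return raw_wrong, final_wrong


if __name__ == "__main__":
    print(run([True, False] * 50))  # strictly alternating branch
    print(run([True] * 100))        # always-taken branch
```

With these assumed parameters, a strictly alternating branch defeats the 2-bit counter on every outcome, so `run([True, False] * 50)` returns `(100, 16)`: the 16 final mispredictions occur while the window fills, after which the inverted predictions are all correct. An always-taken branch never triggers inversion, and `run([True] * 100)` returns `(1, 1)`.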
[Degree-granting institution]: Zhejiang University
[Degree level]: Doctorate
[Year degree granted]: 2013
[Classification number]: TP332
Article ID: 1404848
Article link: http://sikaile.net/kejilunwen/jisuanjikexuelunwen/1404848.html