Research on L2 Cache Architecture and Resource Management Techniques for Chip Multiprocessors
Published: 2018-08-11 19:42
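【Abstract】: The speed gap between processors and memory keeps widening, so organizing and using on-chip cache resources effectively to reduce off-chip memory accesses is critical to processor performance. As multi-core processors become widespread and semiconductor processes advance, chips will integrate more cores, which puts greater pressure on L2 cache design. Current mainstream multi-core processors adopt either a shared or a private L2 cache with LRU replacement. However, a purely shared or purely private design cannot balance capacity against access latency effectively. A shared cache uses resources efficiently, but global wire delay slows its accesses; a private cache achieves faster accesses through data replication, but its limited capacity causes more misses. In addition, depending on set associativity, the application and other factors, the performance gap between LRU and the theoretically optimal replacement policy keeps growing. To address these problems, this thesis studies the organization and management of L2 cache resources in multi-core processors, proposes a variable-associativity hybrid cache architecture model based on a global replacement policy, studies dynamic capacity partitioning and set-balancing mechanisms driven by changes in memory access demand, and provides low-power and scalability optimizations. The contributions of the thesis are as follows:
1. CMP-VH, a variable-associativity hybrid cache architecture for CMPs, is proposed. CMP-VH organizes the L2 cache as an optimized private/shared structure: tags are private, while the data array is partly private and partly shared. CMP-VH performs global replacement based on the reuse information of data blocks and supports inter-core capacity partitioning to adapt to changes in the memory access demands of different applications. An 8-core chip multiprocessor platform was built with the Simics simulator. Simulations with SPLASH parallel workloads show that, at the same total capacity, the average L2 miss rate of CMP-VH is close to that of a conventional shared cache and about 23.37% lower than that of a conventional private cache.
2. VH-PAD, a capacity partitioning technique based on dynamic allocation of data entries, is proposed. VH-PAD allocates resources according to each core's capacity demand in three phases: initialization, repartitioning and rollback. The initialization phase gives every core the same amount of resources. The repartitioning phase estimates capacity demand from how saturated the currently allocated capacity is and uses that estimate to guide partitioning. The rollback phase decides, based on the capacity currently occupied, whether to undo the repartitioning step. VH-PAD adjusts inter-core capacity by controlling the dynamic allocation of shared data entries. Experiments with the PARSEC benchmarks on the Simics-based platform show that, at the same total capacity, the average L2 miss rate under VH-PAD is about 41.33% lower than that of a conventional private cache.
3. VH-PS, a capacity partitioning technique based on probabilistic control, is proposed. VH-PS allocates resources according to each core's resource utilization and uses probabilities to control how strongly each core competes for shared resources, thereby partitioning capacity among cores. VH-PS provides a performance monitoring mechanism that estimates the miss-rate gain each core would obtain from a given increase in capacity, and on that basis assigns each core a probability level for using shared resources. Raising the probability level of cores with a large miss-rate gain and lowering that of cores with a small gain reduces the total miss rate. The probability control in VH-PS can be implemented with pseudo-random numbers or with a PSR ratio. Experiments with the PARSEC benchmarks on the Simics-based platform show that, at the same total capacity and relative to a conventional private cache, the average L2 miss rate is about 46.78% lower for VH-PS implemented with pseudo-random numbers and about 43.05% lower for VH-PS implemented with the PSR ratio.
4. A set-balancing management technique based on tag-set saturation is proposed. Because the private tag arrays in CMP-VH limit the maximum set associativity and the maximum usable capacity, the thesis proposes two tag-set balancing mechanisms, intra-core and inter-core. Replacements in CMP-VH are divided into tag-dominated and data-dominated replacements, and the number of tag-dominated replacements is used to estimate how saturated each set is. Highly saturated sets are allowed to use resources in corresponding low-saturation sets within the same core or in other cores. Experiments with the PARSEC benchmarks on the Simics-based platform show that, at the same total capacity and relative to the baseline CMP-VH, intra-core set balancing lowers the average L2 miss rate by about 11.04% and inter-core set balancing lowers it by about 18.94%.
5. HV-Way Cache, a heterogeneous variable-associativity cache, and CMP-VHR, a heterogeneous variable-associativity hybrid cache architecture model, are proposed. HV-Way Cache uses heterogeneous tag arrays to optimize the V-Way Cache structure, reducing area, power and other overheads. In addition, to meet the low-power and scalability requirements of future many-core processors, heterogeneous tag arrays and a reconfigurable data array are used to build the heterogeneous variable-associativity hybrid cache model, which supports power optimization according to application demand. Experimental results show that HV-Way Cache achieves large reductions in area, power and other overheads with a small performance loss.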
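The set-balancing idea in contribution 4 can be illustrated with a minimal sketch. It only models the bookkeeping described in the abstract: count tag-dominated replacements per tag set, treat a set as saturated once the count reaches a threshold, and let a saturated set borrow from the least saturated set. The class name `TagSetMonitor`, the `threshold` value and the donor-selection rule are assumptions made for illustration; the abstract does not specify how the thesis implements them.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class TagSetMonitor:
    """Hypothetical per-core monitor of tag-set saturation (illustrative only)."""
    num_sets: int
    threshold: int = 4  # assumed saturation threshold, not taken from the thesis
    tag_replacements: List[int] = field(init=False)

    def __post_init__(self) -> None:
        # One counter of tag-dominated replacements per tag set.
        self.tag_replacements = [0] * self.num_sets

    def record_replacement(self, set_index: int, tag_dominated: bool) -> None:
        # Only tag-dominated replacements count toward saturation;
        # data-dominated replacements are ignored by this estimate.
        if tag_dominated:
            self.tag_replacements[set_index] += 1

    def is_saturated(self, set_index: int) -> bool:
        return self.tag_replacements[set_index] >= self.threshold

    def pick_donor(self, set_index: int) -> Optional[int]:
        """Return the least saturated other set, or None if no borrowing applies."""
        if not self.is_saturated(set_index):
            return None
        candidates = [
            i for i in range(self.num_sets)
            if i != set_index and not self.is_saturated(i)
        ]
        if not candidates:
            return None
        # Ties are broken by the lowest set index (an arbitrary choice here).
        return min(candidates, key=lambda i: (self.tag_replacements[i], i))
```

The same structure could model the inter-core variant by keeping one monitor per core and searching for donors across monitors; that extension is likewise an assumption rather than a description of the thesis design.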
[Abstract]: As multi-core processors and semiconductor technology develop, more cores will be integrated on a chip, which puts greater pressure on the design of the L2 cache architecture. Current mainstream multi-core processors use shared or private L2 caches with an LRU replacement policy. However, a purely shared or purely private cache cannot balance capacity against access latency effectively. In addition, the performance gap between LRU and the theoretically optimal replacement policy keeps growing, depending on set associativity and the application. To address these problems, this thesis studies the organization and management of L2 cache resources in multi-core processors. It proposes a variable-associativity hybrid cache architecture model based on a global replacement policy, studies dynamic capacity partitioning and set-balancing mechanisms driven by changes in memory access demand, and provides low-power and scalability optimizations.
1. CMP-VH, a variable-associativity hybrid cache architecture for CMPs, is proposed. It organizes the L2 cache as an optimized private/shared structure in which tags are private and the data array is partly private and partly shared. An 8-core chip multiprocessor platform was built with the Simics simulator. Simulations with SPLASH parallel workloads show that, at the same total capacity, the average L2 miss rate of CMP-VH is close to that of a conventional shared cache and about 23.37% lower than that of a conventional private cache.
2. VH-PAD, a capacity partitioning technique based on dynamic allocation of data entries, is proposed. VH-PAD allocates resources according to the capacity demand of each core in three phases: initialization, repartitioning and rollback. It adjusts inter-core capacity by controlling the dynamic allocation of shared data entries. Experiments with the PARSEC benchmarks on the Simics-based platform show that, at the same total capacity, the average L2 miss rate under VH-PAD is about 41.33% lower than that of a conventional private cache.
3. VH-PS, a capacity partitioning technique based on probabilistic control, is proposed. VH-PS allocates resources according to the resource utilization of each core and uses probabilities to control how strongly each core competes for shared resources. It provides a performance monitoring mechanism that estimates the miss-rate gain of each core after a given increase in capacity. The probability control can be implemented with pseudo-random numbers or with a PSR ratio. Experiments with the PARSEC benchmarks on the Simics-based platform show that, at the same total capacity and relative to a conventional private cache, the average L2 miss rate is about 46.78% lower for VH-PS with pseudo-random numbers and about 43.05% lower for VH-PS with the PSR ratio.
4. A set-balancing management technique based on tag-set saturation is proposed. Because the private tag arrays in CMP-VH limit the maximum set associativity and the maximum usable capacity, the thesis proposes two tag-set balancing mechanisms, intra-core and inter-core. The saturation of each set is estimated, and highly saturated sets are allowed to use resources in low-saturation sets within the same core or in other cores. Experiments with the PARSEC benchmarks on the Simics-based platform show that, at the same total capacity and relative to the baseline CMP-VH, the intra-core mechanism lowers the average L2 miss rate by about 11.04% and the inter-core mechanism lowers it by about 18.94%.
5. HV-Way Cache, a heterogeneous variable-associativity cache, and CMP-VHR, a heterogeneous variable-associativity hybrid cache architecture model, are proposed. HV-Way Cache uses heterogeneous tag arrays to optimize the V-Way Cache structure and reduce area and power overhead. Heterogeneous tag arrays and a reconfigurable data array are used to build the hybrid model, which supports power optimization according to application demand. Experimental results show that HV-Way Cache achieves a large reduction in area and power consumption with a small performance loss.
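The probability-level adjustment described for VH-PS in item 3 can be sketched as follows. This is a simplified illustration, not the thesis implementation: the level table `LEVELS`, the one-step raise/lower rule and the function names are assumptions, and only the pseudo-random variant of probability control is shown. The estimated miss-rate gains are taken as given inputs, since the abstract does not describe how the monitor computes them.

```python
import random
from typing import List

# Assumed probability levels for using shared resources (illustrative values).
LEVELS = [0.125, 0.25, 0.5, 1.0]


def adjust_levels(levels: List[int], gains: List[float]) -> List[int]:
    """Raise the level of the core with the largest estimated miss-rate gain
    and lower the level of the core with the smallest gain, by one step each."""
    new_levels = list(levels)
    best = max(range(len(gains)), key=lambda core: gains[core])
    worst = min(range(len(gains)), key=lambda core: gains[core])
    if best != worst:
        new_levels[best] = min(new_levels[best] + 1, len(LEVELS) - 1)
        new_levels[worst] = max(new_levels[worst] - 1, 0)
    return new_levels


def may_use_shared(level: int, rng: random.Random) -> bool:
    """Pseudo-random probability control: allow a core to claim a shared
    data entry with the probability attached to its current level."""
    return rng.random() < LEVELS[level]
```

For example, with four cores all at level 1 and estimated gains of 0.30, 0.05, 0.10 and 0.20, `adjust_levels` moves the first core up one level and the second core down one level, leaving the others unchanged.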
【Degree-granting institution】: National University of Defense Technology
【Degree level】: Doctoral
【Year degree granted】: 2012
【Classification number】: TP332
Article ID: 2178051
Article link: http://sikaile.net/kejilunwen/jisuanjikexuelunwen/2178051.html