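A Multi-Core Processor Cache Coherence Protocol for Big Data Processing
Published: 2018-04-09 11:17
Topic: big data; Focus: MERSI protocol; Source: National University of Defense Technology (國防科學技術大學), master's thesis, 2014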
[Abstract]: With the rapid development of high technology, the volume of data being generated keeps growing in ways that are hard to predict, so microprocessors aimed at big data processing must handle large amounts of data of many different types quickly. Big data has low value density, so the amount of data that processor cores need to exchange with one another decreases. Continuing to use a multi-core cache coherence protocol designed for scientific computing would increase the pressure on the whole system and reduce processor frequency. Based on the characteristics of big data, this thesis designs a multi-core processor cache coherence protocol for big data processing, the MERSI protocol. The main work is as follows:
(1) To address the overhead of multiple shared copies responding at the same time: in the MERSI protocol, when the system holds several shared copies of a block, exactly one copy is in the shared-responder state and the remaining copies are in the "S" state. When a remote processor operates on the data, the copy in the shared-responder state answers and the "S" copies stay silent. This avoids the system overhead of several shared copies answering the same remote operation.
(2) To address unbalanced cache load: in the MERSI protocol, the shared-responder copy moves to the corresponding other state after it answers, and the requester's copy takes over the shared-responder state. The next operation is answered by the new shared-responder copy, which prevents one cache from being overloaded while the other caches starve.
(3) To address the overhead of the cache ping-pong effect: the MERSI protocol uses a hybrid of write-invalidate and write-update. Write-update is used when there are only two shared copies in the system, and write-invalidate is used when there are more than two. The hybrid write policy substantially reduces the system overhead caused by the ping-pong effect.
(4) For performance evaluation, five programs from the SPLASH-2 parallel benchmark suite (LU, Ocean, Radix, FFT, and Water) are used to test the multi-core cache coherence protocol. The tests first measure how the organization of the directory structure and the cache block size affect system performance, then test and analyze the effect of the number of processor cores and of the network topology. Finally, MERSI is compared with the MESI protocol: on average, MERSI improves performance by 3.58% and reduces the L1 cache miss rate by 3.18%. The design goals of the MERSI protocol are largely met.
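The three mechanisms in the abstract (a single responding copy, hand-off of the responder role to the requester, and the hybrid write policy) can be illustrated with a toy state model. This is only a sketch under stated assumptions, not the thesis implementation: the state letter `R` for the shared-responder state, the class and method names, and the choice to fill the first reader in state `E` from memory are all hypothetical, and data values, write-backs, the directory, and evictions are not modeled.

```python
from enum import Enum


class State(Enum):
    M = "modified"
    E = "exclusive"
    R = "shared-responder"  # assumed letter for the shared-responder state
    S = "shared"
    I = "invalid"


class Line:
    """Toy model of one memory block's coherence state across n caches."""

    def __init__(self, n_caches):
        self.state = [State.I] * n_caches

    def sharers(self):
        """Indices of caches holding a shared copy (R or S)."""
        return [i for i, s in enumerate(self.state) if s in (State.R, State.S)]

    def responder(self):
        """The single cache that answers remote requests, or None."""
        for i, s in enumerate(self.state):
            if s in (State.M, State.E, State.R):
                return i
        return None

    def read(self, requester):
        """Read by `requester`; returns the cache that answered, if any."""
        if self.state[requester] is not State.I:
            return None  # local hit, no remote response needed
        src = self.responder()
        if src is None:
            # No cached copy: assume the block is filled from memory in E.
            self.state[requester] = State.E
        else:
            # Only the responder answers; it then drops to S and the
            # requester becomes the new responder (role hand-off).
            self.state[src] = State.S
            self.state[requester] = State.R
        return src

    def write(self, writer):
        """Write by `writer`; returns the policy this sketch applies."""
        if self.state[writer] in (State.M, State.E):
            self.state[writer] = State.M
            return "local"
        others = [i for i in self.sharers() if i != writer]
        if self.state[writer] is not State.I and len(others) == 1:
            # Exactly two shared copies: write-update, both stay valid.
            return "update"
        # More than two shared copies (or a write miss): write-invalidate.
        for i in others:
            self.state[i] = State.I
        self.state[writer] = State.M
        return "invalidate"
```

With four caches, a read by cache 0 followed by a read by cache 1 leaves cache 1 as the responder; a write by cache 0 at that point takes the write-update path because only two shared copies exist. After a further read by cache 2, a write takes the write-invalidate path and leaves the writer as the only valid copy.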
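[Degree-granting institution]: National University of Defense Technology
[Degree level]: Master's
[Year degree granted]: 2014
[Classification number]: TP332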
Article ID: 1726190
Article link: http://sikaile.net/kejilunwen/jisuanjikexuelunwen/1726190.html