An Arbitrary-Step NUCA Data Promotion Technique Combined with Bank Coherence on CMP
[Abstract]: Computers have become indispensable tools for daily life and work, and users expect ever higher processing speed, larger storage capacity, and more convenient, user-friendly operation. To raise processor speed, manufacturers kept increasing the clock frequency, but higher frequencies bring higher power consumption, which has become the bottleneck for further speed gains. The chip multiprocessor (CMP) emerged in response: it integrates multiple processor cores on a single chip to increase computing power. CMPs are now mainstream in the market, so research on CMP chips is necessary. Meanwhile, integrated circuit manufacturing processes have advanced rapidly and on-chip cache capacity keeps growing, but the wire delay of a large on-chip cache grows with its capacity, and this growing wire delay significantly affects CPU processing speed. In response, Kim C et al. proposed the non-uniform cache architecture (NUCA), which allows different banks of the cache to have different access latencies and thus achieves a lower average access latency than the earlier uniform cache architecture (UCA). In dynamic NUCA (DNUCA), the cache supports migration of cache lines (i.e., data blocks): data that is hit can be moved to a bank closer to the accessing processor, which reduces the latency when the CPU accesses the same data again. This movement of data within the cache is called data promotion or block migration.
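The promotion idea described above can be illustrated with a toy model. This is only a minimal sketch under stated assumptions, not the simulator or the design evaluated in the thesis: the latency constants, the one-block-per-bank `BankChain` class, and the "swap with the next-closer bank" policy are all hypothetical choices made for illustration.

```python
from typing import List, Optional

BASE_LATENCY = 4  # assumed cycles to reach the bank nearest the CPU
HOP_LATENCY = 2   # assumed extra cycles per bank of distance


def bank_latency(bank: int) -> int:
    """Access latency of a bank; bank 0 is the closest to the CPU."""
    return BASE_LATENCY + HOP_LATENCY * bank


class BankChain:
    """Hypothetical chain of banks ordered by distance, one block per bank."""

    def __init__(self, blocks: List[Optional[str]]) -> None:
        self.banks = list(blocks)

    def access(self, block: str) -> Optional[int]:
        """Return the hit latency and promote the block one bank closer.

        Returns None on a miss (miss handling is outside this sketch).
        """
        if block not in self.banks:
            return None
        i = self.banks.index(block)
        latency = bank_latency(i)
        if i > 0:
            # Fixed one-step promotion: swap with the next-closer bank.
            # The displaced block moves one bank farther from the CPU.
            self.banks[i - 1], self.banks[i] = self.banks[i], self.banks[i - 1]
        return latency


chain = BankChain(["A", "B", "C", "D"])
latencies = [chain.access("D") for _ in range(4)]
```

Under these assumed constants, each repeated access to `D` is served from a closer bank than the previous one, while the blocks it swaps with are pushed one bank farther away.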
Data promotion requires finding a target bank to hold the promoted data. Some existing promotion techniques ignore the actual state of the target bank, and the fixed promotion step they use is likely to evict more useful data from the target bank, or to displace it to a bank farther from the CPU. This causes cache pollution, so data promotion does not achieve its full benefit. On a CMP, improving the promotion technique raises another important issue: shared data. Multiple cores on one chip share an L2 or L3 cache and may access the same shared data at the same time. Data promotion, however, moves the data accessed by the current CPU to a bank closer to that CPU so that the next access to the same data is faster. When multiple CPUs access the same shared data, the data is therefore "pulled" toward the middle of the NUCA, which limits the benefit of promotion. The improved promotion technique is therefore combined with bank coherence, which allows shared data to have multiple replicas in the NUCA, each belonging to a different CPU, and keeps the replicas consistent. This resolves the contention for shared data and speeds up CPU access to it. Maintaining coherence requires recording the state of each piece of data, and the promotion strategy proposed in this thesis uses exactly these cache line states to select the target bank for migration; this is why it is combined with bank coherence.
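To make the idea of state-driven target selection concrete, here is a minimal sketch. The abstract does not give the actual states or selection rules, so everything below is an assumption: the MSI-like `LineState` values, the `EVICTION_COST` ordering, and the function `choose_target_bank` are hypothetical and are not the thesis's algorithm. The point is only that the promotion step follows from the state of the candidate banks rather than being fixed.

```python
from enum import Enum
from typing import List, Optional


class LineState(Enum):
    INVALID = 0   # assumed: slot holds no useful data
    SHARED = 1    # assumed: another replica of this line exists elsewhere
    MODIFIED = 2  # assumed: only up-to-date copy, costly to displace


# Assumed preference order: a lower value is cheaper to overwrite.
EVICTION_COST = {LineState.INVALID: 0, LineState.SHARED: 1, LineState.MODIFIED: 2}


def choose_target_bank(states: List[LineState], current: int) -> Optional[int]:
    """Pick a bank closer than `current` (index 0 is nearest the CPU).

    Returns the bank whose resident line is cheapest to displace, preferring
    the nearer bank on ties, or None if every closer bank holds a MODIFIED
    line (the data then stays where it is).
    """
    best: Optional[int] = None
    for bank in range(current):
        cost = EVICTION_COST[states[bank]]
        if states[bank] is LineState.MODIFIED:
            continue  # never displace the only valid copy in this sketch
        if best is None or cost < EVICTION_COST[states[best]]:
            best = bank
    return best


M, S, I = LineState.MODIFIED, LineState.SHARED, LineState.INVALID
step_one = choose_target_bank([M, S, I, S], current=3)    # free slot one bank closer
step_three = choose_target_bank([I, M, M, S], current=3)  # free slot next to the CPU
blocked = choose_target_bank([M, M, S], current=2)        # nothing safe to displace
```

In this sketch the same data in bank 3 is promoted by one bank in the first case and by three banks in the second, which is the sense in which the step size is arbitrary rather than fixed.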
This thesis first gives a brief introduction to the research background and related technologies, then introduces several basic simulation tools for computer architecture research, with a detailed description of the Simics simulator. It next presents the existing fixed-step data promotion technique and its problems. After bank coherence is introduced, the arbitrary-step data promotion technique combined with bank coherence on CMP is described in detail. Finally, the technique is evaluated by full-system simulation with the NAS Parallel Benchmark (NPB) programs. The results show that it effectively reduces the latency of processor accesses to the shared cache. Compared with the design of Kim C et al., average IPC increases by 8.19%.
[Degree-granting institution]: Jilin University
[Degree level]: Master's
[Year degree granted]: 2012
[Classification number]: TP332