
Research on Key Technologies for Distributed Parallel Simulation of Large-Scale Parallel Systems-on-Chip

Published: 2018-01-05 03:15

  Source: National University of Defense Technology, master's thesis, 2012. Document type: degree thesis.


  Keywords: large-scale parallel system-on-chip; distributed parallel simulator; instruction semantic mapping; atomic instructions; software distributed shared memory


[Abstract]: With clock frequencies no longer increasing, large-scale parallel systems-on-chip, represented by multi-core and many-core architectures, have become the focus of microprocessor research and implementation. Following Moore's law, additional chip resources are being converted into more processor cores per chip, and processors with on the order of a thousand cores are no longer far off. As the number of target cores grows, the performance of traditional serial simulators degrades sharply, while the design space of large-scale parallel systems-on-chip has expanded several-fold. The efficiency of architectural design-space exploration therefore drops sharply, and the simulator, an important research tool, faces a major challenge.

The parallel computing capability of the host platform can be used to exploit the coarse-grained parallelism that exists naturally in the target machine model. This thesis proposes a distributed parallel simulation (DPS) acceleration framework for large-scale parallel systems-on-chip, which aims to improve simulator performance by changing the traditional serial simulation mechanism. It focuses on two problems in distributed parallel simulation: the efficiency of simulating atomic instructions and the efficiency of simulating shared state.

To address the low efficiency of simulating atomic instructions with locks, the thesis proposes a parallel simulation technique for atomic instructions based on instruction semantic mapping (ISM). The technique maps each atomic instruction of the target machine one-to-one onto an atomic instruction of the host with the same semantics, and executes the host atomic instruction in place of simulating the target one. The method is simple to implement, makes the correctness of parallel atomic-instruction simulation easy to guarantee, and outperforms the locking method, with a performance improvement of up to 30%.

To address the problem that the hosts in a distributed parallel simulation do not share memory and therefore cannot directly simulate a shared-memory target machine, the thesis proposes an efficient shared-state simulation technique based on software distributed shared memory (SDSM). Two SDSM models are implemented, one at the Host level of abstraction and one at the Simulator level. Experimental results show that the technique can simulate a shared-memory target machine correctly and efficiently on distributed hosts.

Based on this framework and these techniques, the thesis implements the distributed parallel simulator DPFTsim on top of the FTsim simulator. Experimental results show that the DPS framework, ISM, and SDSM effectively accelerate the simulation of large-scale parallel systems-on-chip through distributed parallelism. With 10 simulation threads, DPFTsim achieves 4.5 times the performance of the serial simulator FTsim.
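
The contrast between the locking method and instruction semantic mapping can be illustrated with a minimal C++ sketch. This is not DPFTsim's code: the abstract does not name the target's atomic instructions, so compare-and-swap and fetch-and-add are assumed here purely for illustration, and every identifier (`LockedWord`, `sim_cas_locked`, `sim_cas_mapped`, `sim_fetch_add_mapped`, `self_check`) is hypothetical. The locked variant guards one simulated memory word with a host mutex, while the mapped variant hands the operation directly to the corresponding host atomic operation.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <mutex>

// Hypothetical model of one word of simulated target memory,
// protected by a host-side lock (the baseline "locking method").
struct LockedWord {
    std::mutex lock;
    std::uint32_t value = 0;
};

// Baseline: simulate a target compare-and-swap by taking a lock,
// then performing the compare and the store as ordinary operations.
bool sim_cas_locked(LockedWord& w, std::uint32_t expected, std::uint32_t desired) {
    std::lock_guard<std::mutex> guard(w.lock);
    if (w.value != expected) return false;
    w.value = desired;
    return true;
}

// ISM-style: the (assumed) target CAS is mapped one-to-one onto the
// host's atomic compare-exchange, so no simulator-level lock is needed.
bool sim_cas_mapped(std::atomic<std::uint32_t>& w,
                    std::uint32_t expected, std::uint32_t desired) {
    return w.compare_exchange_strong(expected, desired);
}

// ISM-style mapping of an (assumed) target fetch-and-add instruction.
// Returns the value held before the addition.
std::uint32_t sim_fetch_add_mapped(std::atomic<std::uint32_t>& w, std::uint32_t delta) {
    return w.fetch_add(delta);
}

// Checks that both variants produce the same observable results
// for a successful and a failing compare-and-swap.
bool self_check() {
    LockedWord locked;
    std::atomic<std::uint32_t> mapped{0};
    bool ok_locked = sim_cas_locked(locked, 0, 7);   // succeeds: 0 -> 7
    bool ok_mapped = sim_cas_mapped(mapped, 0, 7);   // succeeds: 0 -> 7
    assert(ok_locked == ok_mapped);
    bool fail_locked = sim_cas_locked(locked, 0, 9); // fails: value is 7
    bool fail_mapped = sim_cas_mapped(mapped, 0, 9); // fails: value is 7
    assert(fail_locked == fail_mapped);
    return locked.value == mapped.load();
}
```

In a parallel simulator, several simulation threads could call the mapped functions on the same word concurrently, and the host hardware would provide the atomicity. Whether this gives a gain like the up-to-30% figure reported in the abstract depends on the host, the workload, and the simulator, none of which this sketch measures.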
[Degree-granting institution]: National University of Defense Technology
[Degree level]: Master's
[Year degree granted]: 2012
[Classification number]: TP332


Article link: http://sikaile.net/kejilunwen/jisuanjikexuelunwen/1381321.html
