Design and Implementation of an Interpretive Instruction Set Full-System Simulator
Published: 2019-03-24 19:44
[Abstract]: As embedded systems are applied more and more widely, embedded applications take on more functions while their update cycles keep getting shorter. This puts heavy pressure on design and development and calls for hardware/software co-development, which has driven the rapid development of instruction set simulators. Instruction set simulators are also widely used in the design and verification of new microprocessor architectures. Providing a fast full-system instruction set simulator is therefore of both theoretical and practical significance.

Interpretive instruction set simulation is flexible and accurate, but slow. To address this weakness, the thesis designs and implements IISimulator, an interpretive instruction set simulator based on a shared block-level cache. The simulator exploits the temporal and spatial locality of program execution: it caches the results of the decode stage in units of instruction blocks, and when a block is executed again, its cached decode results are used directly, so the time-consuming decode stage is skipped. A shared memory pool manages the memory that holds the decode results, which reduces the memory-management overhead introduced by the block-level cache.

In the testing phase, a set of representative target-machine applications was used to measure the simulator's performance. Simulation speed was recorded in three configurations (no cache, instruction-level cache, and block-level cache) and compared; the results show that the block-level cache clearly improves the speed of the interpretive simulator. Runs with and without the shared memory pool were also compared, and the results show that the pool effectively reduces the memory-management overhead caused by the cache. Finally, IISimulator was compared with the full-system simulators SkyEye and SimpleScalar, and its average speed was higher. These results indicate that the proposed improvements greatly increase the execution efficiency of an interpretive instruction set simulator.
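The block-level decode cache described in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the thesis: the toy 16-bit instruction encoding, the opcodes `OP_ADDI`, `OP_DEC_JNZ`, and `OP_HALT`, and the names `ToySimulator`, `enc`, and `run_demo` are all hypothetical. A block is assumed to run from its entry address up to and including the next control-transfer instruction, and the shared memory pool is not modeled.

```python
from dataclasses import dataclass

# Hypothetical toy ISA (not from the thesis): 16-bit words,
# opcode in bits 15-12, register in bits 11-8, immediate in bits 7-0.
OP_ADDI, OP_DEC_JNZ, OP_HALT = 0x1, 0x2, 0xF


def enc(op, reg=0, imm=0):
    """Encode one toy instruction word."""
    return (op << 12) | (reg << 8) | imm


@dataclass
class Decoded:
    op: int
    reg: int
    imm: int


class ToySimulator:
    def __init__(self, program, use_block_cache=True):
        self.mem = list(program)
        self.regs = [0] * 4
        self.pc = 0
        self.use_block_cache = use_block_cache
        self.block_cache = {}   # block entry address -> list of Decoded
        self.decode_count = 0   # how many words went through decode()

    def decode(self, word):
        self.decode_count += 1
        return Decoded((word >> 12) & 0xF, (word >> 8) & 0xF, word & 0xFF)

    def decode_block(self, start):
        """Decode from `start` up to and including the next branch or halt."""
        block, pc = [], start
        while True:
            d = self.decode(self.mem[pc])
            block.append(d)
            pc += 1
            if d.op in (OP_DEC_JNZ, OP_HALT):
                return block

    def fetch_block(self, pc):
        if not self.use_block_cache:
            return self.decode_block(pc)
        block = self.block_cache.get(pc)
        if block is None:               # miss: decode once and remember
            block = self.decode_block(pc)
            self.block_cache[pc] = block
        return block                    # hit: decode stage is skipped

    def run(self):
        while True:
            for d in self.fetch_block(self.pc):
                self.pc += 1
                if d.op == OP_ADDI:
                    self.regs[d.reg] += d.imm
                elif d.op == OP_DEC_JNZ:
                    self.regs[d.reg] -= 1
                    if self.regs[d.reg] != 0:
                        self.pc = d.imm  # taken branch ends the block
                elif d.op == OP_HALT:
                    return


# r0 = 5; loop: r1 += 3; r0 -= 1; repeat while r0 != 0; halt
PROGRAM = [
    enc(OP_ADDI, 0, 5),
    enc(OP_ADDI, 1, 3),
    enc(OP_DEC_JNZ, 0, 1),
    enc(OP_HALT),
]


def run_demo(use_block_cache):
    sim = ToySimulator(PROGRAM, use_block_cache)
    sim.run()
    return sim.regs[1], sim.decode_count
```

In this sketch, `run_demo(True)` and `run_demo(False)` compute the same register value, but the cached run decodes each block only on its first visit, so its `decode_count` is lower. A real simulator would also need to invalidate cached blocks when the target program writes to its own code, which this sketch ignores.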
[Degree-granting institution]: Huazhong University of Science and Technology
[Degree level]: Master's
[Year degree granted]: 2012
[Classification number]: TP368.1; TP337
Article ID: 2446621
Article link: http://sikaile.net/kejilunwen/jisuanjikexuelunwen/2446621.html