STREAM Parallel Computing and Optimization for the FT1000 Microprocessor

Published: 2018-09-12 13:48

[Abstract]: STREAM is a benchmark program for measuring memory performance on microprocessors, and achieving high performance with it on the multi-core, multi-threaded FT1000 microprocessor is a challenging research task. Based on the multi-level cache hierarchy, the instruction pipelines of the four STREAM programs are optimized: a multi-level loop unrolling method is designed according to the number of registers, the number of data prefetches is determined from the instruction latency and the cache line size, and the optimized subroutines are written in assembly language. A parallel STREAM program is designed on top of the OpenMP parallel environment, and the localized data allocation scheme is optimized. Test results show that the optimized STREAM performs 19.2% to 64.2% better than the original serial program. After optimization, the peak memory access performance of the parallel program reaches 8.5 GB/s, up to 22.7% higher than the peak memory access performance before optimization.
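The techniques named in the abstract (loop unrolling, software prefetching, OpenMP parallelization, and localized data allocation) can be illustrated with a minimal C sketch of the Triad kernel, one of the four STREAM kernels. This is not the authors' code: the paper's subroutines are written in FT1000 assembly, whereas this sketch uses portable C with the GCC/Clang builtin `__builtin_prefetch`. The function names `init_arrays` and `triad`, the unroll factor `UNROLL`, and the prefetch distance `PREFETCH_DIST` are hypothetical values chosen for illustration; the paper derives its own from the register count, instruction latency, and cache line size.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative values only (assumptions, not taken from the paper). */
#define UNROLL 4          /* hypothetical unroll factor */
#define PREFETCH_DIST 64  /* hypothetical prefetch distance, in elements */

/* Localized data allocation via first touch: each thread initializes the
 * part of the arrays it is assigned by the static schedule, so on systems
 * with a first-touch page placement policy those pages tend to end up
 * close to the thread that will use them later. */
void init_arrays(double *a, double *b, double *c, size_t n)
{
    #pragma omp parallel for schedule(static)
    for (long i = 0; i < (long)n; i++) {
        a[i] = 0.0;
        b[i] = 1.0;
        c[i] = 2.0;
    }
}

/* STREAM Triad: a[i] = b[i] + scalar * c[i], unrolled by UNROLL.
 * n must be a multiple of UNROLL in this sketch. */
void triad(double *restrict a, const double *restrict b,
           const double *restrict c, double scalar, size_t n)
{
    assert(n % UNROLL == 0);

    /* Same static schedule as init_arrays, so each thread works on
     * roughly the same index range it touched first. */
    #pragma omp parallel for schedule(static)
    for (long i = 0; i < (long)n; i += UNROLL) {
        /* Request data a fixed distance ahead; the guard keeps the
         * address expressions inside the arrays. */
        if ((size_t)i + PREFETCH_DIST < n) {
            __builtin_prefetch(&b[i + PREFETCH_DIST], 0, 0);
            __builtin_prefetch(&c[i + PREFETCH_DIST], 0, 0);
            __builtin_prefetch(&a[i + PREFETCH_DIST], 1, 0);
        }
        a[i]     = b[i]     + scalar * c[i];
        a[i + 1] = b[i + 1] + scalar * c[i + 1];
        a[i + 2] = b[i + 2] + scalar * c[i + 2];
        a[i + 3] = b[i + 3] + scalar * c[i + 3];
    }
}
```

Compiled with an OpenMP flag such as `-fopenmp`, the loops run in parallel; without it the pragmas are ignored and the same code runs serially. A caller would invoke `init_arrays` once and then time repeated calls to `triad`, as the STREAM benchmark does for each kernel.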
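[Affiliation]: Key Laboratory of Parallel and Distributed Processing, National University of Defense Technology
[Funding]: National 863 Program of China (2012AA01A301); National Natural Science Foundation of China (60970033, 91430218)
[Classification number]: TP332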


Article ID: 2239191


Article link: http://sikaile.net/kejilunwen/jisuanjikexuelunwen/2239191.html
