PRF: a process-RAM-feedback performance model to reveal bottl
Published: 2021-07-03 13:41
Performance models provide insightful perspectives for predicting performance and proposing optimization guidance. Although there has been much research, pinpointing the bottlenecks of various memory access patterns and reaching highly accurate predictions for both regular and irregular programs on various hardware configurations are still not trivial. This work proposes a novel model called process-RAM-feedback (PRF) to quantify the overhead of computation and data transmission time on general-purpose mu...
【Source】: High Technology Letters. 2020, 26(03). EI
【Pages】: 14
【Contents】:
0 Introduction
1 Related work
2 The PRF performance model
2.1 Process phase
2.2 RAM phase
2.3 Feedback optimization phase
3 Experimental testbed
4 Validation
4.1 Convolution
4.1.1 Convolution operation
4.1.2 Performance prediction for naive code
4.1.3 Optimization method and modeling analysis
4.1.4 Optimization guidance
4.2 SpMV
4.2.1 Test matrices
4.2.2 Performance prediction and bottlenecks
4.2.3 Feedback performance optimization
4.3 Sn-sweep
4.3.1 Sn-sweep operation
4.3.2 Performance prediction and bottleneck analysis
4.3.3 Optimization method and feedback
5 Conclusion
Article ID: 3262692
Article link: http://sikaile.net/kejilunwen/jisuanjikexuelunwen/3262692.html