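Benchmark (surveying)
Bottleneck
Computer science
Parallel computing
Memory bandwidth
Bandwidth (computing)
Upper and lower bounds
Profiling (computer programming)
Embedded system
Computer network
Operating system
Mathematics
Mathematical analysis
Geodesy
Geography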
Authors
Iksoo Eo,Woojong Han,Yoomi Park
Identifier
DOI:10.1109/iceic54506.2022.9748279
Abstract
The computational performance of an HPC system depends heavily on the balance between the peak performance of its processing elements and its memory bandwidth. Because external memory is often the constraining factor in HPC, a relatively simple roofline model can provide insight into the bounds and bottlenecks of HPC performance. It may not give accurate performance numbers for a specific workload, but it offers practical insight to both programmers and hardware architects on where to optimize. We run the representative benchmarks STREAM, HPCG, and HPL on ARM and x86 nodes (servers). We compare the peak performance and memory bandwidth published by the vendor with profile data gathered with STREAM, HPCG, and HPL to validate the simple roofline model. The HPCG and HPL results show that HPCG is memory bound while HPL is compute bound. The roofline model also shows the balance point of each architecture between memory bandwidth and peak computational performance.
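The roofline bound described in the abstract can be illustrated with a minimal sketch. The function names and the machine figures below (1000 GFLOP/s peak, 200 GB/s bandwidth) are hypothetical and chosen only for illustration; they are not values from the paper. The sketch assumes the usual roofline formulation, in which attainable performance is the smaller of the compute roof and the memory roof (bandwidth times arithmetic intensity), and the balance point is the intensity at which the two roofs meet.

```python
def attainable_gflops(peak_gflops: float, bandwidth_gbs: float, intensity: float) -> float:
    """Roofline bound in GFLOP/s for a kernel with the given
    arithmetic intensity (FLOP per byte moved to/from memory)."""
    memory_roof = bandwidth_gbs * intensity  # GB/s * FLOP/byte = GFLOP/s
    return min(peak_gflops, memory_roof)


def balance_point(peak_gflops: float, bandwidth_gbs: float) -> float:
    """Arithmetic intensity (FLOP/byte) at which the memory roof
    meets the compute roof."""
    return peak_gflops / bandwidth_gbs


def classify(peak_gflops: float, bandwidth_gbs: float, intensity: float) -> str:
    """Label a kernel as memory bound or compute bound under the model."""
    if intensity < balance_point(peak_gflops, bandwidth_gbs):
        return "memory bound"
    return "compute bound"


if __name__ == "__main__":
    # Hypothetical machine, not taken from the paper.
    PEAK = 1000.0  # GFLOP/s
    BW = 200.0     # GB/s

    print("balance point:", balance_point(PEAK, BW), "FLOP/byte")
    # A low-intensity and a high-intensity kernel (illustrative values).
    for name, ai in [("low-intensity kernel", 0.25), ("high-intensity kernel", 50.0)]:
        print(name, attainable_gflops(PEAK, BW, ai), "GFLOP/s,", classify(PEAK, BW, ai))
```

Under this model, a kernel whose arithmetic intensity lies below the balance point is limited by memory bandwidth, and one above it is limited by peak compute, which matches the memory-bound versus compute-bound distinction the abstract draws between HPCG and HPL.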