Convolution (computer science)
Computer science
Training (meteorology)
Algorithm
Parallel computing
Execution time
Computational science
Artificial intelligence
Computer graphics (images)
Artificial neural network
Physics
Meteorology
Authors
Peter Bryzgalov,Toshiyuki Maeda,Yutaro Shigeto
Identifier
DOI:10.1145/3452412.3462750
Abstract
Convolutional neural networks (CNN) drive successful machine learning applications in a growing number of areas. However, training a CNN may take a massive amount of time and expensive high-end GPU resources. CNN training time may change significantly depending on training parameters and GPU type. Therefore, an accurate estimation of CNN training time can help in selecting the training parameters and GPU type that minimise training time and cost. We focus on one training parameter that has a particularly significant effect on the training time: the mini-batch size. Predicting CNN training time over a wide range of mini-batch sizes is challenging because a small variation in mini-batch size can change the selection of convolution algorithms and cause abrupt changes in training time, which is also affected by non-GPU operations. This paper shows our approach to predicting CNN training time over a wide range of mini-batch sizes by utilising a proxy application to benchmark convolutional and dense layers and by accounting for non-GPU time. In contrast to prior works, which build one prediction model for all possible CNN configurations, we build simple models that each make highly accurate predictions for one particular CNN. We evaluate our approach using several CNN samples and GPU types and demonstrate that it can yield highly accurate predictions on unseen mini-batch sizes, with a mean percentage error averaged over all experiments equal to 1.38% (the minimum is 0.21% and the maximum is 5.01%).
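The abstract's idea of a simple, per-CNN prediction model can be illustrated with a sketch: per-layer times are benchmarked (by a proxy application) at a few mini-batch sizes, and the time for an unseen size is estimated as the sum of interpolated layer times plus a constant non-GPU term. This is not the authors' code; the benchmark numbers, the linear interpolation, and the constant non-GPU overhead are all hypothetical simplifications (the paper notes that convolution-algorithm switches can make the real curves non-smooth, which a plain interpolation would miss).

```python
# Illustrative sketch (hypothetical data, not the paper's method in full):
# predict one training-iteration time for an unseen mini-batch size from
# per-layer proxy benchmarks plus a constant non-GPU overhead.

# Proxy-app benchmark: per-layer times in ms at a few mini-batch sizes.
benchmarks = {
    # mini-batch size: [conv1, conv2, dense] times in ms (made-up values)
    8:  [1.2, 2.1, 0.4],
    16: [2.0, 3.9, 0.7],
    32: [3.8, 7.6, 1.3],
}
NON_GPU_MS = 0.9  # assumed constant non-GPU time per iteration


def _interp(b, lo, hi, t_lo, t_hi):
    """Linearly interpolate one layer's time between two benchmarked sizes."""
    return t_lo + (t_hi - t_lo) * (b - lo) / (hi - lo)


def predict_iteration_ms(batch):
    """Predict one training-iteration time (ms) for a mini-batch size."""
    sizes = sorted(benchmarks)
    lo = max(s for s in sizes if s <= batch)
    hi = min(s for s in sizes if s >= batch)
    if lo == hi:  # size was benchmarked directly
        layer_times = benchmarks[lo]
    else:         # interpolate each layer's time independently
        layer_times = [_interp(batch, lo, hi, a, b)
                       for a, b in zip(benchmarks[lo], benchmarks[hi])]
    return sum(layer_times) + NON_GPU_MS


print(predict_iteration_ms(24))  # unseen size between the 16 and 32 benchmarks
```

Modelling each layer separately is what lets the per-CNN model stay simple: only the layers actually present in the target network need benchmarking.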