Concepts: Maxima and minima, Computer science, Convergence (economics), Artificial neural network, Saddle point, Rate of convergence, Polynomial, Artificial intelligence, Process (computing), Boundary (topology), Algorithm, Mathematical optimization, Mathematics, Economic growth, Computer network, Operating system, Channel (broadcasting), Mathematical analysis, Economics, Geometry
Authors
Purnendu Mishra, Kishor Sarawadekar
Identifier
DOI: 10.1109/tencon.2019.8929465
Abstract
Learning rate (LR) is one of the most important hyper-parameters in any deep neural network (DNN) optimization process. It controls the speed at which the network converges toward a global minimum as it navigates the non-convex loss surface. The performance of a DNN is affected by the presence of local minima, saddle points, etc. in the loss surface. Decaying the learning rate by a factor after a fixed number of epochs, or exponentially, is the conventional way of varying the LR. Recently, two new approaches for setting the learning rate have been introduced, namely the cyclical learning rate and stochastic gradient descent with warm restarts. In both of these approaches, the learning rate is varied in a cyclic pattern between two boundary values. This paper introduces another warm-restart technique, inspired by these two approaches, that uses the "poly" LR policy. The proposed technique is called polynomial learning rate with warm restart, and it requires only a single warm restart. The proposed LR policy helps the DNN converge faster and yields slightly higher classification accuracy. Its performance is demonstrated on the CIFAR-10, CIFAR-100 and Tiny ImageNet datasets with CNN, ResNet and Wide Residual Network (WRN) architectures.
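To make the idea concrete, below is a minimal Python sketch of a "poly" schedule with a single warm restart: the LR decays polynomially to the restart point, is reset to the base value, and then decays again over the remaining budget. The decay power (0.9), the restart position, and the function names are illustrative assumptions, not the paper's exact specification.

def poly_lr(base_lr, step, total_steps, power=0.9):
    # Standard "poly" decay: LR falls from base_lr toward 0 over total_steps.
    return base_lr * (1.0 - step / float(total_steps)) ** power

def poly_lr_single_restart(base_lr, step, total_steps, restart_step, power=0.9):
    # Poly decay with one warm restart: decay until restart_step, then reset
    # the LR to base_lr and run a fresh poly decay over the remaining steps.
    # Restart position and power are assumptions for illustration only.
    if step < restart_step:
        return poly_lr(base_lr, step, restart_step, power)
    return poly_lr(base_lr, step - restart_step, total_steps - restart_step, power)

if __name__ == "__main__":
    # Example: 100-epoch budget, base LR 0.1, single restart at epoch 60.
    for epoch in (0, 30, 59, 60, 80, 99):
        print(epoch, round(poly_lr_single_restart(0.1, epoch, 100, 60), 4))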