Conjugate gradient method
Algorithm
Stochastic gradient descent
Hessian matrix
Computer science
Rate of convergence
Line search
Convergence (economics)
Gradient descent
Mathematical optimization
Descent direction
Mathematics
Artificial intelligence
Applied mathematics
Artificial neural network
Key (lock)
Economic growth
Economy
Radius
Computer security
Identifiers
DOI:10.1016/j.eswa.2022.117719
Abstract
Due to their faster convergence rate than gradient descent algorithms and lower computational cost than second-order algorithms, conjugate gradient (CG) algorithms have been widely used in machine learning. This paper considers the conjugate gradient method in the mini-batch setting. Concretely, we propose a stable adaptive stochastic conjugate gradient (SCG) algorithm by incorporating both the stochastic recursive gradient algorithm (SARAH) and second-order information into a CG-type algorithm. Unlike most existing CG algorithms, which spend considerable time determining the step size by line search and may fail in stochastic optimization, the proposed algorithms use a local quadratic model to estimate the step size sequence without computing Hessian information, which lets them attain a computational cost as low as that of first-order algorithms. We establish a linear convergence rate for a class of SCG algorithms when the loss function is strongly convex. Moreover, we show that the complexity of the proposed algorithm matches that of modern stochastic optimization algorithms. As a by-product, we develop a practical variant of the proposed algorithm by setting a stopping criterion for the number of inner-loop iterations. Various numerical experiments on machine learning problems demonstrate the efficiency of the proposed algorithms.
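The abstract does not include pseudocode, so the following is a minimal Python sketch of how the ingredients it names could fit together: a SARAH recursive gradient estimator inside an inner loop, a Fletcher-Reeves-style CG direction, and a step size chosen by a local quadratic model whose curvature is estimated from a gradient difference rather than an explicit Hessian. The ridge-regression loss, the function names (scg, grad), the restart safeguard, and all hyperparameters are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch of a SARAH-based stochastic conjugate gradient step.
# The loss, names, and constants are illustrative, not the paper's algorithm.
import numpy as np

rng = np.random.default_rng(0)
n, d, lam = 512, 20, 0.1
A = rng.normal(size=(n, d))
b = A @ rng.normal(size=d) + 0.1 * rng.normal(size=n)

def grad(w, idx):
    """Mini-batch gradient of the strongly convex loss
    f(w) = (1/2n)||Aw - b||^2 + (lam/2)||w||^2."""
    Ai, bi = A[idx], b[idx]
    return Ai.T @ (Ai @ w - bi) / len(idx) + lam * w

def scg(w, epochs=10, batch=32, eps=1e-4):
    for _ in range(epochs):
        v = grad(w, np.arange(n))    # full gradient anchors the SARAH estimator
        dirn = -v                    # initial search direction
        for _ in range(n // batch):  # inner loop with recursive gradients
            idx = rng.choice(n, batch, replace=False)
            if v @ dirn >= 0:        # safeguard: restart if not a descent direction
                dirn = -v
            # Curvature of a local quadratic model along dirn, estimated from a
            # mini-batch gradient difference instead of an explicit Hessian.
            curv = dirn @ (grad(w + eps * dirn, idx) - grad(w, idx)) / eps
            alpha = -(v @ dirn) / max(curv, 1e-12)   # quadratic-model step size
            w_new = w + alpha * dirn
            # SARAH recursive gradient estimator.
            v_new = grad(w_new, idx) - grad(w, idx) + v
            # Fletcher-Reeves-style conjugacy coefficient (one of several CG choices).
            beta = (v_new @ v_new) / max(v @ v, 1e-12)
            dirn = -v_new + beta * dirn
            w, v = w_new, v_new
    return w

w_star = scg(np.zeros(d))
print("final full-gradient norm:", np.linalg.norm(grad(w_star, np.arange(n))))
```

In this sketch, the one full gradient per outer loop keeps the recursive estimator anchored, while the gradient-difference curvature estimate is what keeps the per-iteration cost at the first-order level the abstract claims, since no Hessian is ever formed.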