Computer Science
Artificial Neural Network
Artificial Intelligence
Deep Learning
Recurrent Neural Network
Activation Function
Convolutional Neural Network
Feedforward Neural Network
Pattern Recognition (Psychology)
Backpropagation
Gradient Descent
Machine Learning
Convergence (Economics)
Feedforward
Authors
Huiping Zhuang, Yi Wang, Qinglai Liu, Zhiping Lin
Source
Journal: IEEE Transactions on Neural Networks and Learning Systems
[Institute of Electrical and Electronics Engineers]
Date: 2021-04-09
Pages: 1-8
Citations: 1
Identifiers
DOI: 10.1109/tnnls.2021.3069883
Abstract
Training neural networks with backpropagation (BP) requires a sequential passing of activations and gradients. This sequential dependence has been recognized as the lockings (i.e., the forward, backward, and update lockings) among modules (each module containing a stack of layers) inherited from BP. In this brief, we propose a fully decoupled training scheme using delayed gradients (FDG) to break all these lockings. The FDG splits a neural network into multiple modules and trains them independently and asynchronously on different workers (e.g., GPUs). We also introduce a gradient shrinking process to reduce the stale-gradient effect caused by the delayed gradients. Our theoretical proofs show that the FDG can converge to critical points under certain conditions. Experiments are conducted by training deep convolutional neural networks to perform classification tasks on several benchmark data sets. These experiments show comparable or better results of our approach compared with state-of-the-art methods in terms of generalization and acceleration. We also show that the FDG is able to train various networks, including extremely deep ones (e.g., ResNet-1202), in a decoupled fashion.
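As a rough illustration of the scheme described in the abstract, the sketch below splits a tiny network into two modules: the downstream (tail) module trains immediately on the activation it receives, while the upstream (head) module updates only after a fixed delay, using the stale boundary gradient scaled by a shrinking factor. This is a minimal single-process simulation assuming PyTorch; the module split, the DELAY and SHRINK values, and the recompute-on-update strategy are illustrative assumptions, not the authors' implementation, which runs the modules asynchronously on separate workers.

# Minimal single-process sketch of decoupled training with delayed,
# shrunken gradients. Names such as DELAY and SHRINK are assumed here
# for illustration only.
from collections import deque

import torch
import torch.nn as nn

torch.manual_seed(0)

# Two modules that could, in principle, live on different workers (GPUs).
head = nn.Sequential(nn.Linear(20, 64), nn.ReLU())                    # module 1
tail = nn.Sequential(nn.Linear(64, 32), nn.ReLU(), nn.Linear(32, 2))  # module 2

opt_head = torch.optim.SGD(head.parameters(), lr=0.1)
opt_tail = torch.optim.SGD(tail.parameters(), lr=0.1)
loss_fn = nn.CrossEntropyLoss()

DELAY = 2          # how many steps the head's gradient lags behind
SHRINK = 0.5       # shrinking factor applied to stale gradients (assumed value)
pending = deque()  # (input batch, boundary gradient) awaiting the head update

for step in range(100):
    x = torch.randn(16, 20)
    y = torch.randint(0, 2, (16,))

    # Module 1 (head): forward only; the graph is cut at the module boundary.
    with torch.no_grad():
        h = head(x)
    h.requires_grad_(True)

    # Module 2 (tail): trains immediately on the received activation.
    loss = loss_fn(tail(h), y)
    opt_tail.zero_grad()
    loss.backward()              # also yields h.grad, the boundary gradient
    opt_tail.step()

    pending.append((x, h.grad.detach()))

    # Module 1: update only when its boundary gradient is DELAY steps old.
    if len(pending) > DELAY:
        old_x, old_g = pending.popleft()
        h_old = head(old_x)                 # recompute with current weights
        opt_head.zero_grad()
        h_old.backward(SHRINK * old_g)      # shrink the delayed gradient
        opt_head.step()

    if step % 20 == 0:
        print(f"step {step:3d}  tail loss {loss.item():.4f}")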