Computer science
Convolutional neural network
Artificial intelligence
Image (mathematics)
Pattern recognition (psychology)
Deep learning
Machine learning
Authors
Yu Yang, Yi Zhang, Zeyu Cheng, Zhe Song, Chengkai Tang
Identifier
DOI:10.1016/j.engappai.2023.107079
Abstract
A broad range of prior research has demonstrated that attention mechanisms offer great potential for advancing the performance of deep convolutional neural networks (CNNs). However, most existing approaches either neglect to model attention jointly along the channel and spatial dimensions or do so at the cost of higher model complexity and a heavier computational burden. To resolve this dilemma, we propose multidimensional collaborative attention (MCA), a lightweight and efficient module that uses a three-branch architecture to simultaneously infer attention along the channel, height, and width dimensions at almost no additional overhead. Within MCA, the squeeze transformation employs an adaptive combination mechanism that merges dual cross-dimension feature responses, enhancing the informativeness and discriminability of the resulting feature descriptors; the excitation transformation employs a gating mechanism that adaptively determines the coverage of interaction when capturing local feature interactions, sidestepping the trade-off between performance and computational overhead. MCA is simple yet general: it can be plugged into various classic CNNs as a plug-and-play module and trained together with the vanilla networks in an end-to-end manner. Extensive experimental results for image recognition on the CIFAR and ImageNet-1K datasets demonstrate the superiority of our method over other state-of-the-art (SOTA) counterparts. In addition, we provide insight into the practical benefits of MCA by visually inspecting GradCAM++ visualization results. The code is available at https://github.com/ndsclark/MCANet.
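To make the abstract's description concrete, below is a minimal, hypothetical sketch of what a three-branch attention module of this kind could look like in PyTorch. It is not the authors' implementation (see https://github.com/ndsclark/MCANet for the official code): the class names (BranchAttention, MCASketch), the choice of average plus standard-deviation pooling as the "dual cross-dimension feature responses", the single learnable weight used for their adaptive combination, and the ECA-style kernel-size heuristic standing in for the gating mechanism are all illustrative assumptions.

```python
# Hypothetical sketch of a three-branch (channel/height/width) attention module
# in the spirit of the abstract. NOT the authors' MCA; names and design details
# here are assumptions for illustration only.
import math
import torch
import torch.nn as nn


class BranchAttention(nn.Module):
    """One branch: squeeze via two pooled descriptors that are adaptively
    combined, then excitation via a local 1D convolution whose kernel size
    (interaction coverage) is derived from the attended dimension's length."""

    def __init__(self, dim: int, gamma: int = 2, beta: int = 1):
        super().__init__()
        # Assumed gating heuristic (ECA-style): map the dimension length to an
        # odd kernel size, so coverage grows slowly with the dimension.
        k = int(abs((math.log2(dim) + beta) / gamma))
        k = k if k % 2 else k + 1
        self.conv = nn.Conv1d(1, 1, kernel_size=k, padding=k // 2, bias=False)
        # Learnable weight for adaptively combining the two descriptors.
        self.alpha = nn.Parameter(torch.tensor(0.5))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = x.size()
        avg = x.mean(dim=(2, 3))                   # (B, C) average-pooled response
        std = x.std(dim=(2, 3))                    # (B, C) std-pooled response
        w = torch.sigmoid(self.alpha)
        y = w * avg + (1.0 - w) * std              # adaptive combination (squeeze)
        y = self.conv(y.unsqueeze(1)).squeeze(1)   # local interaction (excitation)
        return x * torch.sigmoid(y).view(b, c, 1, 1)


class MCASketch(nn.Module):
    """Three branches attend over channel, height, and width by permuting the
    target dimension into the channel slot; their outputs are averaged."""

    def __init__(self, channels: int, height: int, width: int):
        super().__init__()
        self.branch_c = BranchAttention(channels)
        self.branch_h = BranchAttention(height)
        self.branch_w = BranchAttention(width)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        out_c = self.branch_c(x)
        # Height branch: move H into the channel position, attend, move back.
        out_h = self.branch_h(x.permute(0, 2, 1, 3)).permute(0, 2, 1, 3)
        # Width branch: move W into the channel position, attend, move back.
        out_w = self.branch_w(x.permute(0, 3, 2, 1)).permute(0, 3, 2, 1)
        return (out_c + out_h + out_w) / 3.0


if __name__ == "__main__":
    x = torch.randn(2, 64, 32, 32)
    mca = MCASketch(channels=64, height=32, width=32)
    print(mca(x).shape)  # torch.Size([2, 64, 32, 32]) -- shape-preserving
</pre>
```

Because the module preserves the input shape, it can be dropped after any convolutional block, which matches the abstract's claim that MCA is plug-and-play and trainable end to end with the host network; the 1D convolutions and a single scalar per branch are what keep the overhead near zero in this sketch.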