AGLNet: Towards real-time semantic segmentation of self-driving images via attention-guided lightweight network

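Computer science; Segmentation; Artificial intelligence; Encoder; Pyramid (geometry); Convolutional neural network; Inference; Feature (linguistics); Convolution (computer science); Frame rate; Computer vision; Edge device; Coding (set theory); Pattern recognition (psychology); Artificial neural network; Cloud computing; Linguistics; Philosophy; Physics; Set (abstract data type); Optics; Programming language; Operating system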

Authors

Quan Zhou, Yu Wang, Yawen Fan, Xiaofu Wu, Suofei Zhang, Bin Kang, Longin Jan Latecki

Source

Journal: Applied Soft Computing [Elsevier BV]
Date: 2020-09-02 Volume/issue: 96: 106682-106682 Citations: 92

Links

sciencedirect.com, doi.org

Identifier

DOI: 10.1016/j.asoc.2020.106682

Abstract

The heavy computational burden of convolutional neural networks (CNNs) limits their use on edge devices for image semantic segmentation, which plays a significant role in many real-world applications such as augmented reality, robotics, and self-driving. To address this problem, this paper presents an attention-guided lightweight network, AGLNet, which employs an encoder–decoder architecture for real-time semantic segmentation. The encoder adopts a novel residual module to extract feature representations, in which two operations, channel split and channel shuffle, greatly reduce computation cost while maintaining high segmentation accuracy. In the decoder, instead of relying on complicated dilated convolutions and hand-designed architectures, two types of attention mechanism are employed to upsample features to match the input resolution. First, a factorized attention pyramid module (FAPM) explores hierarchical spatial attention from the high-level output while keeping the number of model parameters low. Second, to delineate object shapes and boundaries, a global attention upsample module (GAUM) is adopted as global guidance for high-level features. Comprehensive experiments demonstrate that the approach achieves state-of-the-art results in terms of speed and accuracy on three self-driving datasets: CityScapes, CamVid, and Mapillary Vistas. AGLNet achieves 71.3%, 69.4%, and 30.7% mean IoU on these datasets, respectively, with only 1.12M model parameters. It also reaches inference speeds of 52 FPS, 90 FPS, and 53 FPS, respectively, on a single GTX 1080Ti GPU. The code is open-source and available at https://github.com/xiaoyufenfei/Efficient-Segmentation-Networks.
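
The abstract names channel split and channel shuffle but does not say how they are implemented. The sketch below shows one common formulation of the two operations (the one popularized by the ShuffleNet family), written in plain Python on a list of channels. The function names, the two-way split, and the default group count are assumptions made for illustration, not details taken from AGLNet or its repository.

```python
# Illustrative sketch only: channel split and channel shuffle on a list of
# channels. Function names and the group count are assumptions, not AGLNet code.

def channel_split(channels):
    """Split a list of channels into two halves (first half, second half)."""
    half = len(channels) // 2
    return channels[:half], channels[half:]


def channel_shuffle(channels, groups=2):
    """Interleave channels across `groups` equal-sized groups.

    Equivalent to viewing the channel axis as (groups, n // groups),
    transposing to (n // groups, groups), and flattening again.
    """
    n = len(channels)
    if n % groups != 0:
        raise ValueError("number of channels must be divisible by groups")
    per_group = n // groups
    return [channels[g * per_group + i]
            for i in range(per_group)
            for g in range(groups)]


if __name__ == "__main__":
    feature_map = ["c0", "c1", "c2", "c3", "c4", "c5"]
    left, right = channel_split(feature_map)
    # In a real residual module, each branch would be processed by its own
    # convolutions here before the branches are merged again.
    merged = left + right
    print(channel_shuffle(merged, groups=2))
```

After the shuffle, channels that started in different branches sit next to each other, so a later split hands each branch a mix of both. Real implementations apply the same index permutation to the channel axis of a tensor instead of a Python list.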
