Authors
Jin Liu, Fangyu Zhang, Ziyin Zhou, Jiajun Wang
Identifier
DOI:10.1016/j.neucom.2022.11.084
Abstract
As a key technology for scene understanding, real-time semantic segmentation has been an important topic in computer vision in recent years. However, some lightweight networks designed for real-time semantic segmentation have only a limited receptive field, so they cannot effectively perceive multi-scale objects in images. In addition, although fast-downsampling feature extraction networks reduce the amount of computation, they lose detailed information, resulting in poor prediction accuracy. In this paper, we propose an efficient lightweight semantic segmentation network called BFMNet to address these issues. First, we use a lightweight bilateral structure to encode semantic and detailed information from images separately and introduce feature interactions during the encoding process. Furthermore, we design a novel Multi-Scale Context Aggregation Module (MSCAM) to help the network perceive information about multi-scale objects, which is crucial for semantic segmentation. Finally, we introduce a new fusion module (AEFM) that uses attention to guide bilateral feature fusion. Our network achieves competitive results on three popular semantic segmentation benchmarks: Cityscapes, CamVid and COCO-Stuff. Specifically, on a single 2080Ti GPU, our network yields 77.7% mIoU at 63.7 FPS on the Cityscapes test set with an input resolution of 768×1536. Considering the speed-accuracy trade-off, we also report results at a 1024×2048 input resolution: 78.9% mIoU at 31.4 FPS. On the CamVid test set, our network achieves 75.6% mIoU at 95.8 FPS, and on the COCO-Stuff validation set it achieves 31.2% mIoU.
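The abstract outlines a bilateral architecture with a Multi-Scale Context Aggregation Module (MSCAM) and an attention-based fusion module (AEFM) but does not spell out their internals. The sketch below is a minimal, illustrative PyTorch rendition of such components, assuming parallel dilated-convolution branches for multi-scale context and channel attention for fusing a high-resolution detail branch with a low-resolution semantic branch. The class names follow the abstract, but every design choice inside them (dilation rates, the attention form, the weighting scheme) is an assumption, not the authors' implementation.

```python
# Minimal sketch of a bilateral fusion pipeline, assuming dilated convolutions
# for multi-scale context and channel attention for fusion (illustrative only;
# not the BFMNet implementation described in the paper).
import torch
import torch.nn as nn
import torch.nn.functional as F

class MSCAM(nn.Module):
    """Aggregates context from parallel dilated branches (assumed design)."""
    def __init__(self, channels, dilations=(1, 2, 4, 8)):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Sequential(
                nn.Conv2d(channels, channels, 3, padding=d, dilation=d, bias=False),
                nn.BatchNorm2d(channels),
                nn.ReLU(inplace=True),
            )
            for d in dilations
        )
        self.project = nn.Conv2d(channels * len(dilations), channels, 1)

    def forward(self, x):
        # Concatenate multi-scale responses, then project back to `channels`.
        return self.project(torch.cat([b(x) for b in self.branches], dim=1))

class AEFM(nn.Module):
    """Fuses detail and semantic features via channel attention (assumed design)."""
    def __init__(self, channels):
        super().__init__()
        self.attn = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels, 1),
            nn.Sigmoid(),
        )

    def forward(self, detail, semantic):
        # Upsample the semantic branch to the detail resolution, then weight
        # the two branches with a learned per-channel attention map.
        semantic = F.interpolate(semantic, size=detail.shape[2:],
                                 mode="bilinear", align_corners=False)
        w = self.attn(detail + semantic)
        return w * detail + (1 - w) * semantic

if __name__ == "__main__":
    detail = torch.randn(1, 64, 96, 192)   # high-resolution detail branch
    semantic = torch.randn(1, 64, 24, 48)  # low-resolution semantic branch
    fused = AEFM(64)(detail, MSCAM(64)(semantic))
    print(fused.shape)  # torch.Size([1, 64, 96, 192])
```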