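Computer science
Pyramid (geometry)
Pooling
Segmentation
Artificial intelligence
Computer vision
Pattern recognition (psychology)
Optics
Physics
Authors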
Xingguo Song,Xiaojie Fang,Xiangyin Meng,Fang Xu,Maoting Lv,Yue Zhuo
Identifier
DOI:10.1016/j.engappai.2024.107988
Abstract
Real-time semantic segmentation demands both high speed and high accuracy. However, recent semantic segmentation methods generally use models pre-trained on ImageNet as their backbone to enlarge the field of view quickly. Because of the low resolution and the large number of image categories, these methods often suffer from an insufficient receptive field and channel redundancy. To tackle these problems, we propose a novel block structure named the Atrous block (A block). By repeating this block structure, we build a dedicated backbone tailored to the semantic segmentation task, eliminating the need for additional modules. Depending on the number of output channels of the backbone, we obtain variants of our architecture, called Atrous Network (ANet). Furthermore, we propose a lightweight decoder to recover the spatial details lost during encoding. To demonstrate the effectiveness of our methods, we validate them on the CityScapes and CamVid datasets, using mixed precision during training. Compared with state-of-the-art methods, our models achieve competitive results. Among them, our small model achieves 75.2% mean intersection over union (mIOU) on the CityScapes dataset and 78.3% mIOU on the CamVid dataset, both without ImageNet pretraining.
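
The mIOU figures quoted in the abstract refer to a standard metric: for each class, the number of pixels where prediction and ground truth agree, divided by the number of pixels where either one assigns that class, averaged over classes. The following is a minimal sketch of that definition, not the authors' evaluation code. The function name `mean_iou`, the flattened label lists, and the `ignore_index=255` convention for unlabeled pixels are all assumptions made for illustration.

```python
def mean_iou(pred, target, num_classes, ignore_index=255):
    """Mean intersection over union over the classes that actually occur.

    pred, target: flat sequences of integer class labels (one per pixel).
    ignore_index: ground-truth label to skip (assumed convention, e.g. 255).
    Predictions are assumed to lie in the range [0, num_classes).
    """
    inter = [0] * num_classes  # pixels where pred == target == c
    union = [0] * num_classes  # pixels where pred == c or target == c
    for p, t in zip(pred, target):
        if t == ignore_index:
            continue  # unlabeled pixel: does not count for any class
        if p == t:
            inter[t] += 1
            union[t] += 1
        else:
            union[p] += 1
            union[t] += 1
    # Average only over classes that appear in pred or target.
    ious = [i / u for i, u in zip(inter, union) if u > 0]
    return sum(ious) / len(ious) if ious else float("nan")


# Toy example: three classes, six pixels, last pixel unlabeled.
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 255]
score = mean_iou(pred, target, num_classes=3)
```

In the toy example the per-class IoUs are 1/2, 2/3, and 1, so `score` is their average. Benchmark evaluations normally accumulate the intersection and union counts over the whole validation set before dividing, rather than averaging per-image scores.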