Computer science
Semantic gap
Upsampling
Encoder
Artificial intelligence
Pyramid (geometry)
Feature extraction
Semantic features
Segmentation
Robustness (evolution)
Feature (linguistics)
Pattern recognition (psychology)
Computer vision
Image (mathematics)
Image retrieval
Optics
Physics
Philosophy
Operating systems
Gene
Biochemistry
Chemistry
Linguistics
Authors
Mucong Ye, Jingpeng Ouyang, Ge Chen, Jing Zhang, Xiaogang Yu
Identifier
DOI:10.1109/icpr48806.2021.9413224
Abstract
Multi-scale feature fusion has been an effective way of improving the performance of semantic segmentation. However, current methods generally fail to consider the semantic gap between shallow (low-level) and deep (high-level) features, and thus the fusion may not be optimal. In this paper, to address the issue of the semantic gap between features from different layers, we propose a unified framework based on the U-shape encoder-decoder architecture, named Enhanced Feature Pyramid Network (EFPN). Specifically, a semantic enhancement module (SEM), an edge extraction module (EEM), and a context aggregation module (CAM) are incorporated into the decoder network to improve the robustness of multi-level feature aggregation. In addition, a global fusion model (GFM) is proposed in the encoder branch to capture more semantic information in the deep layers and effectively transmit the high-level semantic features to each layer. Extensive experiments show that the proposed framework achieves state-of-the-art results on three public datasets, namely PASCAL VOC 2012, Cityscapes, and PASCAL Context. Furthermore, we also demonstrate that the proposed method is effective for other visual tasks that require frequent feature fusion and upsampling.
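The abstract describes the generic pattern EFPN builds on: a U-shape encoder-decoder in which deep, semantically strong features are repeatedly upsampled and fused with shallower encoder features. The internals of SEM, EEM, CAM, and GFM are not given in the abstract, so the sketch below (PyTorch is assumed) only illustrates plain FPN-style top-down fusion with upsampling; the class name, channel widths, and convolutions are illustrative placeholders, not the authors' implementation.

```python
# Minimal sketch of FPN-style top-down fusion: project each encoder stage
# with a 1x1 conv, upsample the deeper map, and add it to the shallower one.
# The 1x1/3x3 convs stand in for EFPN's decoder-side modules, whose details
# the abstract does not specify.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopDownFusion(nn.Module):
    def __init__(self, in_channels=(256, 512, 1024, 2048), out_channels=256):
        super().__init__()
        # Lateral 1x1 convs bring every encoder stage to a common channel width.
        self.lateral = nn.ModuleList(
            [nn.Conv2d(c, out_channels, kernel_size=1) for c in in_channels]
        )
        # 3x3 convs smooth each fused map after the addition.
        self.smooth = nn.ModuleList(
            [nn.Conv2d(out_channels, out_channels, kernel_size=3, padding=1)
             for _ in in_channels]
        )

    def forward(self, feats):
        # feats: encoder outputs ordered shallow -> deep (decreasing resolution).
        laterals = [conv(f) for conv, f in zip(self.lateral, feats)]
        fused = laterals[-1]                    # start from the deepest, most semantic map
        outputs = [self.smooth[-1](fused)]
        for i in range(len(laterals) - 2, -1, -1):
            # Upsample the deeper map to the shallower map's resolution and fuse.
            fused = laterals[i] + F.interpolate(
                fused, size=laterals[i].shape[-2:],
                mode="bilinear", align_corners=False)
            outputs.insert(0, self.smooth[i](fused))
        return outputs                          # fused multi-scale maps, shallow -> deep

if __name__ == "__main__":
    # Toy shapes resembling ResNet stages for a 512x512 input.
    feats = [torch.randn(1, c, s, s)
             for c, s in zip((256, 512, 1024, 2048), (128, 64, 32, 16))]
    for f in TopDownFusion()(feats):
        print(tuple(f.shape))
```

The element-wise addition in the fusion step is where the semantic gap the abstract points to would show up; EFPN's decoder modules are presumably inserted around this step to make the aggregation of shallow and deep features more robust.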