RGB color model
Artificial intelligence
Segmentation
Computer science
Robustness
Computer vision
Channel
Fusion
Feature
Pattern recognition
Authors
Hao Zhou, Lu Qi, Hai Huang, Yang Xu, Zhaoliang Wan, Xianglong Wen
Identifier
DOI:10.1016/j.patcog.2021.108468
Abstract
Incorporating depth (D) information into RGB images has proven effective and robust for semantic segmentation. However, fusing the two modalities is not trivial because of their inherent discrepancy in physical meaning: RGB encodes color appearance, whereas D encodes depth geometry. In this paper, we propose a co-attention network (CANet) to build sound interaction between RGB and depth features. The key component of CANet is the co-attention fusion part, which comprises three modules. Specifically, the position and channel co-attention fusion modules adaptively fuse RGB and depth features along the spatial and channel dimensions, respectively. An additional fusion co-attention module further integrates the outputs of these two modules to obtain a more representative feature, which is used for the final semantic segmentation. Extensive experiments demonstrate the effectiveness of CANet in fusing RGB and depth features, achieving state-of-the-art performance on two challenging RGB-D semantic segmentation datasets, NYUDv2 and SUN-RGBD.
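The abstract does not include code, and the exact CANet formulation is not reproduced here. As a rough illustration only, a minimal NumPy sketch of the general idea behind position-wise (spatial) and channel-wise cross-modal attention fusion might look like the following; the function names, the residual combination, and the use of depth features as keys/values are all illustrative assumptions, not the paper's actual architecture:

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def position_coattention_fuse(f_rgb, f_d):
    """Spatial (position) co-attention fusion sketch.

    f_rgb, f_d: feature maps of shape (C, H, W).
    Queries come from RGB, keys/values from depth (an assumption),
    so every RGB position attends over all depth positions.
    """
    C, H, W = f_rgb.shape
    q = f_rgb.reshape(C, H * W)                    # (C, HW) RGB queries
    k = f_d.reshape(C, H * W)                      # (C, HW) depth keys
    v = f_d.reshape(C, H * W)                      # (C, HW) depth values
    attn = softmax(q.T @ k / np.sqrt(C), axis=-1)  # (HW, HW) cross-modal affinity
    fused = (v @ attn.T).reshape(C, H, W)          # aggregate depth values per position
    return f_rgb + fused                           # residual fusion

def channel_coattention_fuse(f_rgb, f_d):
    """Channel co-attention fusion sketch.

    A globally pooled depth descriptor gates the RGB channels
    (sigmoid gating is an illustrative choice).
    """
    C = f_rgb.shape[0]
    g = f_d.reshape(C, -1).mean(axis=1)            # (C,) pooled depth descriptor
    w = 1.0 / (1.0 + np.exp(-g))                   # sigmoid channel weights
    return f_rgb * w[:, None, None] + f_d          # reweight RGB, add depth
```

In the paper's design, the outputs of these two branches are then merged by a third fusion co-attention module before the segmentation head; in a sketch like this one, that could be as simple as summing or concatenating the two fused maps.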