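RGB color model
Computer science
Artificial intelligence
Segmentation
Block (permutation group theory)
Encoder
Fusion mechanism
Pattern recognition (psychology)
Feature extraction
Feature (linguistics)
Fusion
Computer vision
Mathematics
Linguistics
Philosophy
Geometry
Lipid bilayer fusion
Operating system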
Authors
Wujie Zhou,Sijia Lv,Jingsheng Lei,Ting Luo,Lu Yu
Source
Journal: IEEE Transactions on Emerging Topics in Computational Intelligence
[Institute of Electrical and Electronics Engineers]
Date: 2022-04-11
Volume (issue): 7 (2): 598-603
Citations: 9
Identifier
DOI: 10.1109/tetci.2022.3160720
Abstract
RGB-D indoor multiclass scene understanding is a pixelwise task that interprets RGB-D images, using depth information to improve the RGB features for higher performance. We propose a novel asymmetric encoder structure for RGB-D indoor scene understanding that uses a reverse fusion network (RFNet) with an attention mechanism and a simplified feature extraction block. Specifically, pre-trained ResNet34 and VGG16 networks (two asymmetric input streams) are used as the backbones of the information extraction paths, while additive fusion and attention modules further enhance network performance. The strong feature extraction ability of classical networks and the advantages of two-way reverse fusion enable this novel semantic segmentation network to narrow the gap between low- and high-level features, such that the features are better merged for segmentation. We achieved segmentation performances (MIoU) of 53.5% and 50.7% on the SUN RGB-D and NYUDv2 datasets, respectively, thereby outperforming other state-of-the-art approaches.
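The MIoU figures quoted in the abstract refer to mean intersection over union, the usual score for pixelwise segmentation. The abstract does not describe the evaluation protocol, so the following is only a minimal sketch of the standard definition, not the authors' evaluation code. The function names, the flat label lists, and the convention that labels outside `0..num_classes-1` (for example 255) mark pixels to ignore are all assumptions made for illustration.

```python
from typing import List, Sequence


def confusion_matrix(y_true: Sequence[int], y_pred: Sequence[int],
                     num_classes: int) -> List[List[int]]:
    """Count pixels by (ground-truth class, predicted class).

    Assumption: ground-truth labels outside 0..num_classes-1 (e.g. 255)
    mark pixels to ignore; predictions are assumed to be valid class ids.
    """
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        if 0 <= t < num_classes:
            cm[t][p] += 1
    return cm


def mean_iou(cm: List[List[int]]) -> float:
    """Average per-class IoU = TP / (TP + FP + FN).

    Classes that appear in neither ground truth nor prediction are skipped.
    """
    n = len(cm)
    ious = []
    for c in range(n):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp                        # true class c, predicted otherwise
        fp = sum(cm[r][c] for r in range(n)) - tp   # predicted c, true class differs
        denom = tp + fp + fn
        if denom > 0:
            ious.append(tp / denom)
    return sum(ious) / len(ious) if ious else 0.0


# Toy example: six pixels, three classes (hypothetical data).
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
print(mean_iou(confusion_matrix(y_true, y_pred, num_classes=3)))
```

In the toy example the per-class IoUs are 1/3, 2/3, and 1/2, which average to 0.5. On a real benchmark the confusion matrix would be accumulated over every image in the test split before the per-class ratios are taken.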