Keywords
Artificial intelligence
Computer science
Computer vision
Fusion
Fusion mechanism
Modality
RGB color model
Encoder
Feature
Benchmark
Pattern recognition
Authors
Tianyou Chen, Jin Xiao, Xiaoguang Hu, Guofeng Zhang, Shaojie Wang
Identifier
DOI:10.1016/j.neucom.2022.12.004
Abstract
Existing state-of-the-art RGB-D saliency detection models mainly utilize the depth information as complementary cues to enhance the RGB information. However, depth maps are easily influenced by the environment and hence often contain considerable noise. Thus, indiscriminately integrating multi-modality (i.e., RGB and depth) features may induce noise-degraded saliency maps. In this paper, we propose a novel Adaptive Fusion Network (AFNet) to solve this problem. Specifically, we design a triplet encoder network consisting of three subnetworks to process RGB, depth, and fused features, respectively. The three subnetworks are interlinked and form a grid net to facilitate mutual refinement of these multi-modality features. Moreover, we propose a Multi-modality Feature Interaction (MFI) module to exploit complementary cues between the depth and RGB modalities and adaptively fuse the multi-modality features. Finally, we design the Cascaded Feature Interweaved Decoder (CFID) to exploit complementary information between multi-level features and refine them iteratively to achieve accurate saliency detection. Experimental results on six commonly used benchmark datasets verify that the proposed AFNet outperforms 20 state-of-the-art counterparts in terms of six widely adopted evaluation metrics. Source code will be publicly available at https://github.com/clelouch/AFNet upon paper acceptance.
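To make the abstract's core idea concrete, the following is a minimal PyTorch sketch of adaptive gated fusion of RGB and depth features: a learned per-pixel gate down-weights unreliable depth cues rather than integrating the two modalities indiscriminately. This is an illustrative assumption of how such a fusion step could look, not the authors' actual MFI module; all class, variable, and layer choices here are hypothetical (consult the linked repository for the real implementation).

```python
# Hypothetical sketch of adaptive RGB-depth fusion (not the authors' AFNet/MFI code).
import torch
import torch.nn as nn


class AdaptiveFusion(nn.Module):
    """Gated fusion of RGB and depth feature maps.

    A spatial gate in [0, 1], predicted from both modalities, suppresses
    depth features at locations where they are deemed noisy.
    """

    def __init__(self, channels: int):
        super().__init__()
        # Predict a one-channel spatial gate from the concatenated features.
        self.gate = nn.Sequential(
            nn.Conv2d(2 * channels, channels, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, 1, kernel_size=1),
            nn.Sigmoid(),
        )
        # Blend RGB features with the gated depth features.
        self.fuse = nn.Conv2d(2 * channels, channels, kernel_size=3, padding=1)

    def forward(self, rgb: torch.Tensor, depth: torch.Tensor) -> torch.Tensor:
        g = self.gate(torch.cat([rgb, depth], dim=1))   # shape: (B, 1, H, W)
        # Depth contributes only where the gate considers it reliable.
        return self.fuse(torch.cat([rgb, g * depth], dim=1))


if __name__ == "__main__":
    rgb = torch.randn(2, 64, 32, 32)    # toy RGB feature map
    depth = torch.randn(2, 64, 32, 32)  # toy depth feature map
    out = AdaptiveFusion(64)(rgb, depth)
    print(out.shape)  # torch.Size([2, 64, 32, 32])
```

In the paper's design, such adaptively fused features would additionally flow through a third (fused) encoder branch and be refined jointly with the RGB and depth branches in the grid structure; this sketch shows only the gating idea at a single feature level.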