Authors
Hegui Zhu,Jia Ni,Xi Yang,Libo Zhang
Identifier
DOI:10.1016/j.patcog.2024.110693
Abstract
Currently, most RGB-Depth salient object detection (SOD) methods adopt an encoder–decoder architecture, yet they often fail to exploit the encoding and decoding features fully. This paper rethinks the differences and correlations between the two and proposes the Cross-Modal Inverse Guidance Network (CMIGNet) for SOD. Specifically, a Multi-level Feature Guidance Enhancement (MFGE) module is integrated into every layer of the backbone network. It employs a high-level decoding feature to guide the low-level RGB and depth encoding features, enabling rapid identification of salient regions and removal of noise. The dual-stream encoding features guided by the MFGE module are then combined by the proposed Dual-Stream Interactive Fusion (DSIF) module, which reduces the dependence on either modal feature during fusion; consequently, the impact on the results is lessened in complex scenes where one modality is missing or misleading. Finally, edge information is supplemented by the proposed Edge Refinement Awareness (ERA) module to generate the final saliency map. Comparisons on seven widely used RGB-D datasets and one recent, more challenging dataset show that the proposed CMIGNet is highly competitive with state-of-the-art RGB-Depth SOD models. Additionally, the model is lighter and faster.
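The abstract does not give the internal equations of the MFGE module, but the described inverse-guidance idea, a high-level decoding feature acting as a spatial gate over the low-level RGB and depth encoding features, can be illustrated with a minimal NumPy sketch. The function names, the nearest-neighbor upsampling, and the residual gating form `enc * attn + enc` below are all assumptions for illustration, not the paper's actual implementation.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def upsample2x(f):
    # Nearest-neighbor 2x upsampling of a (C, H, W) feature map
    # (assumed here; the paper may use bilinear interpolation).
    return f.repeat(2, axis=1).repeat(2, axis=2)

def inverse_guidance(dec_high, enc_rgb, enc_depth):
    """Hypothetical sketch of the MFGE idea: a high-level decoding
    feature gates the low-level RGB and depth encoding features,
    highlighting salient regions and suppressing background noise."""
    attn = sigmoid(upsample2x(dec_high))         # attention map in (0, 1)
    guided_rgb = enc_rgb * attn + enc_rgb        # residual gating (assumed form)
    guided_depth = enc_depth * attn + enc_depth
    return guided_rgb, guided_depth

# Toy shapes: 8-channel low-level maps at 8x8, high-level decoding map at 4x4.
rgb = np.random.rand(8, 8, 8)
depth = np.random.rand(8, 8, 8)
dec = np.random.rand(8, 4, 4)
g_rgb, g_depth = inverse_guidance(dec, rgb, depth)
```

Because the gate lies in (0, 1), each guided feature stays between the original encoding feature and twice its value, so the guidance reweights rather than replaces the low-level responses.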