Computer vision
Salience
RGB color model
Artificial intelligence
Scale (ratio)
Computer science
Fusion
Object (grammar)
Cartography
Geography
Linguistics
Philosophy
Authors
M. Zhong, Jing Sun, Peng Ren, Fasheng Wang, Fuming Sun
Identifier
DOI:10.1016/j.knosys.2024.112126
Abstract
In recent years, excellent RGB-D salient object detection performance has been achieved. However, existing detection methods generally require a large number of model parameters in pursuit of high accuracy. To alleviate this problem, we propose a Multi-scale Awareness and Global fusion Network for RGB-D salient object detection, named MAGNet, which has only 16.1M parameters and 9.9G FLOPs. Specifically, we observe that convolutional neural networks (CNNs) can strongly perceive local spatial structures, whereas attention mechanisms can perform global correlation analysis of the input. We therefore exploit the advantages of both to design two kinds of cross-modal feature fusion modules. To reduce the computational complexity of the model, we design a multi-scale awareness fusion module (MAFM) that fully leverages the rich texture and edge information in low-level feature maps. For the high-level feature maps, we combine an attention mechanism with a CNN to design a global fusion module (GFM), which enables the model to better capture the semantic information of the two modalities by learning the correspondence between RGB and depth images. We then employ the proposed multi-level convolution module (MCM) to generate the predicted map through a step-by-step decoding process, gradually recovering finer detection results. Finally, extensive experiments on six datasets show that the proposed MAGNet not only achieves advanced detection performance but also drastically reduces the number of model parameters. Source code is available at https://github.com/mingyu6346/MAGNet.
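To make the two fusion ideas in the abstract concrete, below is a minimal PyTorch sketch of (a) a convolutional multi-scale fusion of low-level RGB and depth features, in the spirit of the MAFM, and (b) an attention-based global fusion of high-level features, in the spirit of the GFM. This is an illustrative sketch under stated assumptions, not the authors' MAGNet code; the class names, channel sizes, and kernel choices here are hypothetical, and the real implementation lives at the linked GitHub repository.

```python
# Hedged sketch of the abstract's two cross-modal fusion ideas.
# NOT the authors' MAGNet implementation; all shapes/layers are assumptions.
import torch
import torch.nn as nn


class LocalFusionSketch(nn.Module):
    """MAFM-like idea: fuse low-level RGB/depth features with cheap
    multi-scale convolutions to preserve texture and edge detail."""

    def __init__(self, channels: int):
        super().__init__()
        # Parallel branches with different receptive fields (assumed scales).
        self.branch3 = nn.Conv2d(2 * channels, channels, 3, padding=1)
        self.branch5 = nn.Conv2d(2 * channels, channels, 5, padding=2)
        self.merge = nn.Conv2d(2 * channels, channels, 1)

    def forward(self, rgb: torch.Tensor, depth: torch.Tensor) -> torch.Tensor:
        x = torch.cat([rgb, depth], dim=1)  # cross-modal concatenation
        multi = torch.cat([self.branch3(x), self.branch5(x)], dim=1)
        return self.merge(multi)


class GlobalFusionSketch(nn.Module):
    """GFM-like idea: use attention for global RGB-depth correlation on
    high-level (low-resolution) features, refined by a small CNN."""

    def __init__(self, channels: int, heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(channels, heads, batch_first=True)
        self.proj = nn.Conv2d(channels, channels, 3, padding=1)

    def forward(self, rgb: torch.Tensor, depth: torch.Tensor) -> torch.Tensor:
        b, c, h, w = rgb.shape
        q = rgb.flatten(2).transpose(1, 2)    # (B, HW, C): RGB as queries
        kv = depth.flatten(2).transpose(1, 2)  # depth as keys/values
        fused, _ = self.attn(q, kv, kv)        # global cross-modal correlation
        fused = fused.transpose(1, 2).reshape(b, c, h, w)
        return self.proj(fused + rgb)          # residual CNN refinement


if __name__ == "__main__":
    rgb_lo = torch.randn(1, 32, 64, 64)   # low-level feature map (assumed size)
    dep_lo = torch.randn(1, 32, 64, 64)
    rgb_hi = torch.randn(1, 64, 16, 16)   # high-level feature map (assumed size)
    dep_hi = torch.randn(1, 64, 16, 16)
    print(LocalFusionSketch(32)(rgb_lo, dep_lo).shape)   # torch.Size([1, 32, 64, 64])
    print(GlobalFusionSketch(64)(rgb_hi, dep_hi).shape)  # torch.Size([1, 64, 16, 16])
```

The split mirrors the abstract's reasoning: convolutions are kept on the large low-level maps where attention would be expensive, while attention runs only on the small high-level maps, which is one plausible way a model stays at a modest parameter and FLOP budget.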