CATNet: A Cascaded and Aggregated Transformer Network for RGB-D Salient Object Detection

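Computer science, Artificial intelligence, RGB color model, Pattern recognition (psychology), Decoding methods, Computer vision, Preprocessor, Feature extraction, Upsampling, Algorithm, Image (mathematics)
Authors
Fuming Sun, Peng Ren, Bowen Yin, Fasheng Wang, Haojie Li
Source
Journal: IEEE Transactions on Multimedia [Institute of Electrical and Electronics Engineers]
Volume: 26, pages 2249-2262; Cited by: 26
Identifier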
DOI:10.1109/tmm.2023.3294003
Abstract

Salient object detection (SOD) is an important preprocessing operation for various computer vision tasks. Most existing RGB-D SOD models employ addition or concatenation strategies to directly aggregate and decode multi-scale features to predict saliency maps. However, because features at different scales differ greatly, these aggregation strategies may lead to information loss or redundancy, and few methods explicitly consider how to establish connections between features at different scales in the decoding process, which degrades detection performance. To this end, we propose a cascaded and aggregated Transformer network (CATNet), which consists of three key modules: an attention feature enhancement module (AFEM), a cross-modal fusion module (CMFM), and a cascaded correction decoder (CCD). Specifically, the AFEM is built on atrous spatial pyramid pooling and enhances high-level features by capturing their multi-scale semantic information and global context through dilated convolution and a multi-head self-attention mechanism. The CMFM enhances and then fuses the RGB and depth features, alleviating the problem of poor-quality depth maps. The CCD is composed of two subdecoders arranged in a cascade. It is designed to suppress noise in low-level features and mitigate the differences between features at different scales. Moreover, the CCD uses a feedback mechanism to correct and repair the output of the subdecoder by exploiting supervised features, which mitigates the information loss caused by upsampling during multi-scale feature aggregation. Extensive experimental results demonstrate that the proposed CATNet achieves superior performance over 14 state-of-the-art RGB-D methods on 7 challenging benchmarks. The code is released at https://github.com/ROC-Star/CATNet/ .
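The abstract says the AFEM builds on atrous spatial pyramid pooling, which applies dilated convolutions at several rates to gather multi-scale context. The snippet below is only a minimal 1-D sketch of that general idea in plain Python. It is not the authors' implementation (see the released repository for that), and the function names `dilated_conv1d`, `receptive_field`, and `multi_scale_branches` are hypothetical, chosen for illustration.

```python
def receptive_field(kernel_size, dilation):
    """Number of input positions covered by one output of a dilated kernel."""
    return (kernel_size - 1) * dilation + 1


def dilated_conv1d(signal, kernel, dilation=1):
    """Valid-mode 1-D dilated cross-correlation (no padding, stride 1).

    Kernel taps are spaced `dilation` samples apart, so the same number of
    weights covers a wider span of the input.
    """
    span = receptive_field(len(kernel), dilation)
    return [
        sum(w * signal[start + i * dilation] for i, w in enumerate(kernel))
        for start in range(len(signal) - span + 1)
    ]


def multi_scale_branches(signal, kernel, dilations):
    """Apply one kernel at several dilation rates (illustrative pyramid)."""
    return {d: dilated_conv1d(signal, kernel, d) for d in dilations}


if __name__ == "__main__":
    ramp = [1, 2, 3, 4, 5, 6, 7]
    difference = [1, 0, -1]  # compares two samples 2 * dilation apart
    for rate, out in multi_scale_branches(ramp, difference, [1, 2, 3]).items():
        print(rate, receptive_field(len(difference), rate), out)
```

With a 3-tap kernel, rates 1, 2, and 3 cover 3, 5, and 7 input samples respectively, so each branch sees a different scale at the same parameter count. A real pyramid module would operate on 2-D feature maps, pad the input so all branches keep the same size, and then combine the branch outputs.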