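Computer science
Artificial intelligence
Convolutional neural network
Transformer
Encoder
Pattern recognition (psychology)
Segmentation
Feature learning
Object detection
Deep learning
Computer vision
Engineering
Voltage
Electrical engineering
Operating system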
Authors
Kyeong-Beom Park,Jae Yeol Lee
Source
Journal: IEEE Access
[Institute of Electrical and Electronics Engineers]
Date: 2022-01-01
Volume and pages: 10: 122347-122360
Citations: 5
Identifier
DOI:10.1109/access.2022.3223424
Abstract
Camouflaged object detection (COD) seeks to find concealed objects hidden in natural surroundings. Unlike salient object detection, COD is challenging because foreground objects are intrinsically similar to their background surroundings and must still be told apart. Convolutional neural network (CNN)-based approaches have been proposed to overcome this challenge. However, they have inherent limitations in modeling and extracting global contexts. Transformer-based approaches have been proposed to tackle this problem and can maintain the semantic features of input images, but they are limited in learning localized spatial features within a limited receptive field. Therefore, one of the main challenges is to conduct accurate and robust COD while maintaining global contexts without sacrificing low-level contexts. This study proposes a novel concealed object detection and segmentation method using a Transformer and CNN-based advanced U-Net (TCU-Net). TCU-Net extracts globalized semantic features using a Swin Transformer-based encoder and localized spatial features using an attentive inception decoder. In particular, multi-dilated residual (MDR) blocks connecting the encoder and decoder generate refined multi-level features to improve discriminability. Finally, the attentive inception decoder generates the final camouflaged object mask while maintaining the localized spatial information. Instead of simply up-sampling the feature map, the attentive inception decoder conducts cascaded deconvolution through inception and attention modules. The model is optimized with a weighted hybrid loss function consisting of the binary cross entropy (BCE) and intersection over union (IoU) losses. We comprehensively compared the proposed TCU-Net with previous studies by analyzing different metrics on four public datasets: CAMO, CHAMELEON, COD10K, and NC4K. An ablation study was also conducted on network architectures and loss functions to verify the advantages of the proposed approach. Experimental analysis on the public datasets shows that the proposed TCU-Net outperforms previous approaches.
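The weighted hybrid loss mentioned in the abstract can be illustrated with a minimal NumPy sketch. The abstract does not give the exact formulation, so everything specific here is an assumption: the function names (`bce_loss`, `soft_iou_loss`, `hybrid_loss`), the soft (differentiable) form of the IoU term, and the default weights `w_bce` and `w_iou` are hypothetical and are not taken from the paper.

```python
import numpy as np


def bce_loss(pred, target, eps=1e-7):
    """Mean binary cross entropy between predicted probabilities and a binary mask."""
    pred = np.clip(pred, eps, 1.0 - eps)  # avoid log(0)
    per_pixel = -(target * np.log(pred) + (1.0 - target) * np.log(1.0 - pred))
    return float(np.mean(per_pixel))


def soft_iou_loss(pred, target, eps=1e-7):
    """1 - soft IoU, where intersection and union are computed on probabilities."""
    inter = np.sum(pred * target)
    union = np.sum(pred) + np.sum(target) - inter
    return float(1.0 - (inter + eps) / (union + eps))


def hybrid_loss(pred, target, w_bce=1.0, w_iou=1.0):
    """Weighted sum of BCE and IoU losses (the weights are assumed values)."""
    return w_bce * bce_loss(pred, target) + w_iou * soft_iou_loss(pred, target)


if __name__ == "__main__":
    mask = np.array([[1.0, 0.0], [0.0, 1.0]])   # toy ground-truth mask
    good = np.array([[0.9, 0.1], [0.2, 0.8]])   # prediction close to the mask
    bad = 1.0 - good                            # prediction far from the mask
    print("good:", hybrid_loss(good, mask))
    print("bad: ", hybrid_loss(bad, mask))
```

The BCE term penalizes each pixel independently, while the IoU term scores the overlap of the whole predicted region with the mask, so combining them is one common way to balance pixel-level and region-level accuracy in segmentation.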