Keywords: computer science; segmentation; convolutional neural network; Transformer; artificial intelligence; encoder; ground truth; image segmentation; deep learning
Authors
Shuang Liu, Jiafeng Zhang, Zhong Zhang, Xiaozhong Cao, T. S. Durrani
Source
Journal: IEEE Transactions on Geoscience and Remote Sensing
Publisher: Institute of Electrical and Electronics Engineers
Date: 2023-01-01
Volume/Issue: 1-1
Identifier
DOI: 10.1109/tgrs.2023.3265384
Abstract
Recently, convolutional neural networks (CNNs) have dominated the ground-based cloud image segmentation task, but they disregard long-range dependencies because of the limited size of their filters. Although Transformer-based methods can overcome this limitation, they only learn long-range dependencies at a single scale and hence fail to capture the multi-scale information of cloud images. Multi-scale information is beneficial to ground-based cloud image segmentation, because features from small scales tend to capture detailed information while features from large scales learn global information. In this paper, we propose a novel deep network named Integration Transformer (InTransformer), which builds long-range dependencies at different scales. To this end, we propose the Hybrid Multi-head Transformer Block (HMTB) to learn multi-scale long-range dependencies, and hybridize CNN and HMTB as the encoder at different scales, so that the encoder extracts multi-scale representations capturing both local information and long-range dependencies. Meanwhile, to fuse patch tokens from different scales, we propose the Mutual Cross-Attention Module (MCAM) for the decoder of InTransformer, which adequately interacts multi-scale patch tokens in a bidirectional way. We have conducted a series of experiments on the large ground-based cloud detection database TLCDD and on SWIMSEG. The experimental results show that our method outperforms other methods, proving the effectiveness of the proposed InTransformer.
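To make the bidirectional fusion idea concrete, below is a minimal sketch of mutual cross-attention between patch tokens from two scales, in the spirit of the MCAM described in the abstract. All names (`MutualCrossAttention`, `CrossAttention`, `dim`, `num_heads`) and design details such as the residual connections and layer norms are illustrative assumptions, not the authors' published implementation.

```python
# A hypothetical sketch: each scale's tokens query the other scale's tokens,
# so fine-scale (detail) and coarse-scale (global) features exchange information.
import torch
import torch.nn as nn


class CrossAttention(nn.Module):
    """Multi-head attention where queries come from one token set
    and keys/values from another."""

    def __init__(self, dim: int, num_heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm_q = nn.LayerNorm(dim)
        self.norm_kv = nn.LayerNorm(dim)

    def forward(self, q_tokens: torch.Tensor, kv_tokens: torch.Tensor) -> torch.Tensor:
        q = self.norm_q(q_tokens)
        kv = self.norm_kv(kv_tokens)
        out, _ = self.attn(q, kv, kv)   # queries attend to the other scale
        return q_tokens + out           # residual connection (an assumption)


class MutualCrossAttention(nn.Module):
    """Fuses small-scale and large-scale patch tokens bidirectionally:
    each scale queries the other."""

    def __init__(self, dim: int, num_heads: int = 4):
        super().__init__()
        self.small_queries_large = CrossAttention(dim, num_heads)
        self.large_queries_small = CrossAttention(dim, num_heads)

    def forward(self, small: torch.Tensor, large: torch.Tensor):
        # small: (B, N_s, dim) fine-scale tokens; large: (B, N_l, dim) coarse tokens
        fused_small = self.small_queries_large(small, large)  # details gather global context
        fused_large = self.large_queries_small(large, small)  # global tokens gather detail
        return fused_small, fused_large


if __name__ == "__main__":
    mcam = MutualCrossAttention(dim=64)
    s = torch.randn(2, 196, 64)   # e.g. 14x14 fine-scale patch tokens
    l = torch.randn(2, 49, 64)    # e.g. 7x7 coarse-scale patch tokens
    fs, fl = mcam(s, l)
    print(fs.shape, fl.shape)     # (2, 196, 64) and (2, 49, 64)
```

Token counts, embedding dimension, and head count here are placeholders; the point is only that the fusion runs in both directions, so neither scale is treated as the sole source of queries.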