Keywords: Computer Science, Encoder, Segmentation, Transformer, Artificial Intelligence
Authors
Xinyi Zeng, Pinxian Zeng, Tao Cheng, Peng Wang, Binyu Yan, Yan Wang
Identifier
DOI: 10.1007/978-3-031-43901-8_48
Abstract
3D Spatially Aligned Multi-modal MRI Brain Tumor Segmentation (SAMM-BTS) is a crucial task for clinical diagnosis. While Transformer-based models have shown outstanding success in this field due to their ability to model global features using the self-attention mechanism, they still face two challenges. First, owing to its high computational complexity and its deficiencies in modeling local features, the traditional self-attention mechanism is ill-suited for SAMM-BTS tasks, which require modeling both global and local volumetric features within an acceptable computational overhead. Second, existing models merely stack spatially aligned multi-modal data along the channel dimension, without any dedicated processing of such multi-channel data in the model's internal design. To address these challenges, we propose a Transformer-based model for the SAMM-BTS task, namely DBTrans, with dual-branch architectures for both the encoder and decoder. Specifically, the encoder implements two parallel feature extraction branches, including a local branch based on Shifted Window Self-attention and a global branch based on Shuffle Window Cross-attention, to capture both local and global information with linear computational complexity. In addition, we add an extra global branch based on Shifted Window Cross-attention to the decoder, introducing the key and value matrices from the corresponding encoder block, which allows the segmented target to access a more complete context during up-sampling. Furthermore, the dual-branch designs in both the encoder and decoder are integrated with improved channel attention mechanisms to fully exploit the contributions of features at different channels. Experimental results demonstrate the superiority of our DBTrans model in both qualitative and quantitative measures. Codes will be released at https://github.com/Aru321/DBTrans.
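The core idea behind the encoder's two branches can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the "shuffle" below is modeled as a strided (grid-style) partition so that each window gathers voxels from distant regions, and the cross-attention simply takes queries from the local partition and keys/values from the shuffled one. All function names are hypothetical; because attention runs independently inside fixed-size windows, the total cost stays linear in the number of voxels.

```python
import numpy as np

def local_windows(x, ws):
    """Partition a (D, H, W, C) volume into non-overlapping ws**3 windows."""
    D, H, W, C = x.shape
    x = x.reshape(D // ws, ws, H // ws, ws, W // ws, ws, C)
    x = x.transpose(0, 2, 4, 1, 3, 5, 6)       # group the window indices first
    return x.reshape(-1, ws ** 3, C)           # (n_windows, ws**3, C)

def shuffled_windows(x, ws):
    """Strided partition: each 'window' collects voxels spread across the
    whole volume, so attention inside it mixes distant regions (global branch)."""
    D, H, W, C = x.shape
    x = x.reshape(ws, D // ws, ws, H // ws, ws, W // ws, C)
    x = x.transpose(1, 3, 5, 0, 2, 4, 6)       # strided cells become windows
    return x.reshape(-1, ws ** 3, C)

def window_attention(q_tokens, kv_tokens):
    """Scaled dot-product attention run independently per window; with q and kv
    from different partitions this acts as a (toy) window cross-attention."""
    scale = q_tokens.shape[-1] ** -0.5
    logits = np.einsum('wqc,wkc->wqk', q_tokens, kv_tokens) * scale
    attn = np.exp(logits - logits.max(axis=-1, keepdims=True))
    attn /= attn.sum(axis=-1, keepdims=True)   # softmax over keys per window
    return np.einsum('wqk,wkc->wqc', attn, kv_tokens)

# Toy multi-modal feature volume: 4x4x4 voxels, 8 channels (modalities stacked).
x = np.random.rand(4, 4, 4, 8)
local = local_windows(x, ws=2)       # local branch tokens
shuffled = shuffled_windows(x, ws=2) # global branch tokens
out = window_attention(local, shuffled)
```

Both partitions are pure permutations of the voxel set, so they are cheap to compute and trivially invertible, which is what lets the two branches be fused back into a single feature map afterwards.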