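Encoder
Artificial intelligence
Image (mathematics)
Computer science
Image segmentation
Segmentation
Semantic gap
Encoding (memory)
Pattern recognition (psychology)
Operating system
Image retrieval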
Authors
Haonan Wang, Peng Cao, Jinzhu Yang, Osmar R. Zaïane
Source
Journal: Neural Networks
[Elsevier]
Date: 2024-07-17
Volume/Issue: 178: 106546-106546
Citations: 4
Identifier
DOI: 10.1016/j.neunet.2024.106546
Abstract
Current state-of-the-art medical image segmentation techniques predominantly employ the encoder-decoder architecture. Despite its widespread use, this U-shaped framework is limited in how effectively it captures multi-scale features through simple skip connections. In this study, we thoroughly analyze the potential weaknesses of skip connections across various segmentation tasks and identify two semantic gaps that are crucial to consider: the semantic gap among multi-scale features in different encoding stages and the semantic gap between the encoder and the decoder. To bridge these semantic gaps, we introduce a novel segmentation framework, which incorporates a Dual Attention Transformer (DAT) module for capturing channel-wise and spatial-wise relationships, and a Decoder-guided Recalibration Attention module for fusing DAT tokens and decoder features. These modules establish a principle of learnable connection that resolves the semantic gaps, leading to a high-performance segmentation model for medical images. Furthermore, the framework provides a new paradigm for effectively incorporating the attention mechanism into the traditional convolution-based architecture. Comprehensive experimental results demonstrate that our model achieves consistent, significant gains and outperforms state-of-the-art methods with relatively fewer parameters. This study contributes to the advancement of medical image segmentation by offering a more effective and efficient framework for addressing the limitations of current encoder-decoder architectures. Code: https://github.com/McGregorWwww/UDTransNet.