Dual (grammatical number)
Scale (ratio)
Segmentation
Artificial intelligence
Computer vision
Computer science
Image (mathematics)
Cartography
Geography
Art
Literature
Authors
Xiang Li, Chong Fu, Qun Wang, Wenchao Zhang, Chiu-Wing Sham, Junxin Chen
Identifier
DOI:10.1016/j.knosys.2024.112050
Abstract
Convolutional Neural Networks (CNNs), particularly UNet, have become prevalent in medical image segmentation tasks. However, CNNs inherently struggle to capture global dependencies owing to their intrinsic localities. Although Transformers have shown superior performance in modeling global dependencies, they encounter the challenges of high model complexity and dependencies on large-scale pre-trained models. Furthermore, the current attention mechanisms of Transformers only consider single-scale feature interactions, making it difficult to analyze feature correlations at different scales in the same attention layer. In this paper, we propose DMSA-UNet, which strengthens the global analysis capability and maximally preserves the local inductive bias capability while maintaining low model complexity. Specifically, we reformulate vanilla self-attention as efficient Dual Multi-Scale Attention (DMSA) that captures multi-scale-enhanced global information along both spatial and channel dimensions with linear complexity and pixel granularity. We also introduce a context-gated linear unit in DMSA for each feature to obtain adaptive attention based on neighboring contexts. To preserve the convolutional properties, DMSAs are inserted directly between the UNet's convolutional blocks rather than replacing them. Because DMSA has multi-scale adaptive aggregation capability, the deepest convolutional block of UNet is removed to mitigate the noise interference caused by fixed convolutional kernels with large receptive fields. We further leverage efficient convolution to reduce computational redundancy. DMSA-UNet is highly competitive in terms of model complexity, with 33% fewer parameters and 15% fewer FLOPs (at 224² resolution) than UNet. Extensive experimental results on four different medical datasets demonstrate that DMSA-UNet outperforms other state-of-the-art approaches without any pre-trained models.
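The abstract describes two ingredients of DMSA: attention computed with linear (rather than quadratic) complexity in the number of pixels, and a context gate that modulates each feature by its neighboring context. The following is a minimal NumPy sketch of those two generic ideas only; it is not the authors' implementation, and the function names, the kernel feature map, and the 1-D neighborhood window are all illustrative assumptions.

```python
import numpy as np

def linear_attention(q, k, v):
    """Linear-complexity attention via the kernel feature-map trick.

    Instead of softmax(Q K^T) V, which is quadratic in the number of
    pixels N, compute phi(Q) (phi(K)^T V) / (phi(Q) phi(K)^T 1),
    which is linear in N. phi is an illustrative positive map (ReLU + eps).
    """
    phi = lambda x: np.maximum(x, 0.0) + 1e-6
    q, k = phi(q), phi(k)
    kv = k.T @ v                 # (d, d): keys/values aggregated once over pixels
    z = q @ k.sum(axis=0)        # (N,): per-pixel normalizer
    return (q @ kv) / z[:, None]

def context_gate(x, win=3):
    """Gate each pixel's features by a sigmoid of its local context mean
    (a stand-in for the context-gated linear unit mentioned in the abstract)."""
    n = len(x)
    gates = np.empty_like(x)
    for i in range(n):
        lo, hi = max(0, i - win // 2), min(n, i + win // 2 + 1)
        gates[i] = 1.0 / (1.0 + np.exp(-x[lo:hi].mean(axis=0)))
    return x * gates

rng = np.random.default_rng(0)
x = rng.standard_normal((16, 8))   # 16 "pixels" (flattened), 8 channels
out = context_gate(linear_attention(x, x, x))
print(out.shape)                   # (16, 8)
```

Because keys and values are aggregated into a single (d, d) matrix before being queried, doubling the number of pixels only doubles the cost, which is the property that lets attention of this kind be inserted between convolutional blocks at full pixel granularity.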