Computer science
Artificial intelligence
Segmentation
Encoder
Convolutional neural network
Image segmentation
Deep learning
Pattern recognition (psychology)
Spatial analysis
Computer vision
Feature learning
Transformer
Engineering
Geography
Voltage
Electrical engineering
Operating system
Remote sensing
Authors
Yinghua Fu,Junfeng Liu,Jun Shi
Identifier
DOI:10.1016/j.compbiomed.2024.107938
Abstract
Deep learning architectures based on convolutional neural networks (CNNs) and Transformers have achieved great success in medical image segmentation. Models based on the encoder–decoder framework, such as U-Net, have been successfully employed in many realistic scenarios. However, due to the low contrast between object and background, the varied shapes and scales of objects, and the complex backgrounds in medical images, it is difficult to locate targets and obtain good segmentation performance by extracting effective information from images. In this paper, an encoder–decoder architecture based on Transformer-built spatial and channel attention modules is proposed for medical image segmentation. Concretely, Transformer-based spatial and channel attention modules are used to extract complementary global spatial and channel information at different layers of the U-shaped network, which benefits the learning of detailed features at different scales. To better fuse the spatial and channel information from Transformer features, a spatial and channel feature fusion block is designed for the decoder. The proposed network inherits the advantages of both CNNs and Transformers, combining local feature representation with long-range dependency modeling for medical images. Qualitative and quantitative experiments demonstrate that the proposed method outperforms eight state-of-the-art segmentation methods on five public medical image datasets spanning different modalities, achieving, for example, Dice scores of 80.23% and 93.56% and Intersection over Union (IoU) scores of 67.13% and 88.94% on the Multi-organ Nucleus Segmentation (MoNuSeg) and Combined Healthy Abdominal Organ Segmentation with Computed Tomography scans (CHAOS-CT) datasets, respectively.
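As a rough illustration of the idea behind the abstract (not the paper's actual implementation, whose module designs and fusion block are more elaborate), the distinction between spatial and channel attention can be sketched with plain scaled dot-product attention applied along different axes of a feature map: attending over spatial positions captures dependencies between locations, while attending over the transposed map captures dependencies between channels. All names and toy values below are hypothetical.

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of floats.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def attention(q, k, v):
    # Scaled dot-product self-attention; q, k, v are lists of row vectors.
    d = len(q[0])
    out = []
    for qi in q:
        scores = [sum(a * b for a, b in zip(qi, kj)) / math.sqrt(d) for kj in k]
        w = softmax(scores)
        out.append([sum(wi * vj[c] for wi, vj in zip(w, v)) for c in range(len(v[0]))])
    return out

def transpose(m):
    return [list(col) for col in zip(*m)]

# A toy feature map with N = 4 spatial positions and C = 3 channels,
# stored as an N x C matrix (a real model would use batched tensors).
feat = [[0.1, 0.9, 0.3],
        [0.8, 0.2, 0.5],
        [0.4, 0.7, 0.6],
        [0.9, 0.1, 0.2]]

# Spatial attention: tokens are the N positions, so the attention
# weights form an N x N map over spatial locations.
spatial_out = attention(feat, feat, feat)

# Channel attention: transpose so tokens are the C channels, attend,
# then transpose back; the weights form a C x C inter-channel map.
channel_out = transpose(attention(transpose(feat), transpose(feat), transpose(feat)))

print(len(spatial_out), len(spatial_out[0]))  # 4 3
print(len(channel_out), len(channel_out[0]))  # 4 3
```

Both branches preserve the feature-map shape, which is what allows such complementary spatial and channel outputs to be combined by a fusion block in the decoder.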