Keywords: computer science; artificial intelligence; image segmentation; convolutional neural network; Transformer; scale-space segmentation; pattern recognition; inference; segmentation-based object classification; computational complexity theory; computer vision; algorithm; parallel computing
Authors
Along He, Kai Wang, Tao Li, Chengkun Du, Shuang Xia, Huazhu Fu
Source
Journal: IEEE Transactions on Medical Imaging (Institute of Electrical and Electronics Engineers)
Date: 2023-04-05
Volume/Issue: 42 (9): 2763-2775
Citations: 80
Identifier
DOI: 10.1109/tmi.2023.3264513
Abstract
Accurate medical image segmentation is of great significance for computer-aided diagnosis. Although methods based on convolutional neural networks (CNNs) have achieved good results, they are weak at modeling long-range dependencies, which are essential for segmentation tasks to build global context. Transformers can establish long-range dependencies among pixels via self-attention, complementing local convolution. In addition, multi-scale feature fusion and feature selection are crucial for medical image segmentation tasks, yet they are ignored by Transformers. However, directly applying self-attention to CNNs is challenging because of its quadratic computational complexity on high-resolution feature maps. Therefore, to integrate the merits of CNNs, multi-scale channel attention and Transformers, we propose an efficient hierarchical hybrid vision Transformer (H2Former) for medical image segmentation. With these merits, the model can be data-efficient in the limited-medical-data regime. Experimental results show that our approach outperforms previous Transformer-based, CNN-based and hybrid methods on three 2D and two 3D medical image segmentation tasks, while remaining efficient in model parameters, FLOPs and inference time. For example, H2Former outperforms TransUNet by 2.29% in IoU score on the KVASIR-SEG dataset with 30.77% of the parameters and 59.23% of the FLOPs.
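The quadratic-complexity bottleneck the abstract cites can be made concrete with a minimal NumPy sketch of naive single-head self-attention over a flattened feature map. This is a generic illustration, not the H2Former implementation: the identity Q/K/V projections and the feature dimension are assumptions chosen to keep the sketch short.

```python
import numpy as np

def self_attention(x):
    """Naive single-head self-attention over a flattened feature map.

    x: (N, C) array, where N = H*W pixels. The attention matrix is
    N x N, so compute and memory grow quadratically with spatial
    resolution -- the bottleneck for high-resolution feature maps.
    """
    C = x.shape[1]
    q, k, v = x, x, x                  # identity projections (sketch only)
    attn = q @ k.T / np.sqrt(C)        # (N, N) scores -- quadratic in N
    attn = np.exp(attn - attn.max(axis=1, keepdims=True))
    attn /= attn.sum(axis=1, keepdims=True)  # row-wise softmax
    return attn @ v                    # (N, C) attended features

# Doubling H and W quadruples N, so the attention matrix grows 16x:
# a 64x64 map yields a 4096^2 matrix, a 128x128 map a 16384^2 one.
H = W = 8
x = np.random.rand(H * W, 32)
out = self_attention(x)
```

Each output row is a convex combination of the input rows, which is why hybrid designs like the one described above restrict full self-attention to coarser stages and let convolutions handle the high-resolution ones.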