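Computer science
Segmentation
Modular design
Transformer
Artificial intelligence
Exploit
Convolutional neural network
Encoder
Architecture
Image segmentation
Pattern recognition (psychology)
Computer vision
Physics
Operating system
Art
Visual arts
Quantum mechanics
Computer security
Voltage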
Authors
Wentao Liu, Tong Tian, Weijin Xu, Huihua Yang, Xipeng Pan, Songlin Yan, Lemeng Wang
Identifier
DOI:10.1007/978-3-031-16443-9_23
Abstract
The success of the Transformer in computer vision has attracted increasing attention in the medical imaging community. For medical image segmentation in particular, many excellent hybrid architectures based on convolutional neural networks (CNNs) and Transformers have been presented and have achieved impressive performance. However, most of these methods, which embed modular Transformers into CNNs, struggle to reach their full potential. In this paper, we propose a novel hybrid architecture for medical image segmentation called PHTrans, which hybridizes Transformer and CNN in parallel in its main building blocks to produce hierarchical representations from global and local features and adaptively aggregate them, aiming to fully exploit their strengths to obtain better segmentation performance. Specifically, PHTrans follows the U-shaped encoder-decoder design and introduces the parallel hybrid module in deep stages, where convolution blocks and the modified 3D Swin Transformer learn local features and global dependencies separately; a sequence-to-volume operation then unifies the dimensions of the outputs to achieve feature aggregation. Extensive experimental results on both the Multi-Atlas Labeling Beyond the Cranial Vault and Automated Cardiac Diagnosis Challenge datasets corroborate its effectiveness, consistently outperforming state-of-the-art methods. The code is available at: https://github.com/lseventeen/PHTrans.
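The sequence-to-volume step mentioned in the abstract can be illustrated with a minimal NumPy sketch. It is not taken from the PHTrans repository: the function names, the (B, N, C) token layout, the (B, C, D, H, W) volume layout, and the use of element-wise summation for aggregation are all assumptions made for illustration only.

```python
import numpy as np


def volume_to_sequence(volume):
    """Flatten a (B, C, D, H, W) feature volume into (B, N, C) tokens, N = D*H*W.

    Hypothetical helper: the token layout is an assumption, not the paper's code.
    """
    b, c, d, h, w = volume.shape
    return volume.reshape(b, c, d * h * w).transpose(0, 2, 1)


def sequence_to_volume(tokens, volume_shape):
    """Reshape (B, N, C) tokens back into a (B, C, D, H, W) volume.

    volume_shape is the spatial size (D, H, W); N must equal D*H*W.
    """
    b, n, c = tokens.shape
    d, h, w = volume_shape
    if n != d * h * w:
        raise ValueError(f"expected {d * h * w} tokens, got {n}")
    return tokens.reshape(b, d, h, w, c).transpose(0, 4, 1, 2, 3)


def aggregate(local_features, global_tokens):
    """Combine the two branches once their dimensions match.

    Assumption: element-wise summation stands in for the aggregation step;
    the abstract does not specify the operator.
    """
    spatial = local_features.shape[2:]
    return local_features + sequence_to_volume(global_tokens, spatial)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    conv_out = rng.standard_normal((1, 4, 2, 3, 3))   # stand-in for the CNN branch
    attn_out = rng.standard_normal((1, 2 * 3 * 3, 4))  # stand-in for the Transformer branch
    print(aggregate(conv_out, attn_out).shape)
```

The point of the sketch is only that the Transformer branch works on a token sequence while the convolution branch works on a volume, so one of the two has to be reshaped before the features can be combined.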