Keywords: Artificial intelligence, Architecture, Segmentation, Transformer, Computer science, Image segmentation, Computer vision, Engineering, Electrical engineering
Authors
Jieneng Chen,Jieru Mei,Xianhang Li,Yongyi Lu,Qihang Yu,Qingyue Wei,Xiangde Luo,Yutong Xie,Ehsan Adeli,Yan Wang,Matthew P. Lungren,Shaoting Zhang,Lei Xing,Le Lü,Alan Yuille,Yuyin Zhou
Identifiers
DOI: 10.1016/j.media.2024.103280
Abstract
Medical image segmentation is crucial for healthcare, yet convolution-based methods like U-Net face limitations in modeling long-range dependencies. To address this, Transformers designed for sequence-to-sequence prediction have been integrated into medical image segmentation. However, a comprehensive understanding of how Transformers' self-attention operates within U-Net components is lacking. TransUNet, first introduced in 2021, is widely recognized as one of the first models to integrate the Transformer into medical image analysis. In this study, we present the versatile TransUNet framework, which encapsulates Transformers' self-attention into two key modules: (1) a Transformer encoder that tokenizes image patches from a convolutional neural network (CNN) feature map, facilitating global context extraction, and (2) a Transformer decoder that refines candidate regions through cross-attention between proposals and U-Net features. These modules can be flexibly inserted into the U-Net backbone, yielding three configurations: Encoder-only, Decoder-only, and Encoder+Decoder. TransUNet provides a library covering both 2D and 3D implementations, enabling users to easily tailor the chosen architecture. Our findings highlight the encoder's efficacy in modeling interactions among multiple abdominal organs and the decoder's strength in handling small targets like tumors. TransUNet excels in diverse medical applications, such as multi-organ segmentation, pancreatic tumor segmentation, and hepatic vessel segmentation. Notably, it achieves significant average Dice improvements of 1.06% and 4.30% for multi-organ segmentation and pancreatic tumor segmentation, respectively, over the highly competitive nnU-Net, and surpasses the top-1 solution in the BraTS2021 challenge. The 2D and 3D code and models are available at https://github.com/Beckschen/TransUNet and https://github.com/Beckschen/TransUNet-3D, respectively.
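To make the two modules concrete, below is a minimal PyTorch sketch of (1) an encoder that tokenizes a CNN feature map for global self-attention and (2) a decoder that cross-attends a fixed set of learnable proposal queries against U-Net features. This is an illustrative sketch, not the authors' released code: the class names, the dimensions (embed_dim=256, depth=4, 20 queries), and the DETR-style learnable queries are all assumptions made here; the actual implementations live in the repositories linked in the abstract.

```python
# Hypothetical sketch of the two TransUNet modules described in the abstract.
# All names and hyperparameters are illustrative assumptions, not the paper's.
import torch
import torch.nn as nn

class TransformerEncoderModule(nn.Module):
    """Module (1): tokenize a CNN feature map into patch tokens and apply
    self-attention so every patch can attend to global context."""
    def __init__(self, in_channels=512, embed_dim=256, num_heads=8, depth=4):
        super().__init__()
        # A 1x1 conv projects each spatial location of the feature map to a token.
        self.proj = nn.Conv2d(in_channels, embed_dim, kernel_size=1)
        layer = nn.TransformerEncoderLayer(
            d_model=embed_dim, nhead=num_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=depth)

    def forward(self, feat):                       # feat: (B, C, H, W)
        b, _, h, w = feat.shape
        tokens = self.proj(feat).flatten(2).transpose(1, 2)  # (B, H*W, D)
        tokens = self.encoder(tokens)              # global self-attention
        return tokens.transpose(1, 2).reshape(b, -1, h, w)   # back to a map

class TransformerDecoderModule(nn.Module):
    """Module (2): refine candidate-region queries via cross-attention
    between proposals and U-Net features (DETR-style queries assumed)."""
    def __init__(self, embed_dim=256, num_heads=8, num_queries=20):
        super().__init__()
        self.queries = nn.Parameter(torch.randn(num_queries, embed_dim))
        self.cross_attn = nn.MultiheadAttention(
            embed_dim, num_heads, batch_first=True)
        self.norm = nn.LayerNorm(embed_dim)

    def forward(self, unet_feat):                  # unet_feat: (B, D, H, W)
        b = unet_feat.size(0)
        kv = unet_feat.flatten(2).transpose(1, 2)  # (B, H*W, D) keys/values
        q = self.queries.unsqueeze(0).expand(b, -1, -1)      # (B, Q, D)
        refined, _ = self.cross_attn(q, kv, kv)    # proposals attend to features
        return self.norm(q + refined)              # (B, Q, D) refined proposals

# Illustrative wiring at a U-Net bottleneck (shapes are hypothetical).
enc = TransformerEncoderModule()
dec = TransformerDecoderModule()
bottleneck = torch.randn(2, 512, 14, 14)
global_feat = enc(bottleneck)                      # (2, 256, 14, 14)
proposals = dec(global_feat)                       # (2, 20, 256)
print(global_feat.shape, proposals.shape)
```

In this reading, the Encoder-only and Decoder-only configurations the abstract mentions would drop one of the two modules, while Encoder+Decoder chains them as in the wiring example; per the paper's findings, the encoder helps when many organs interact globally and the decoder's proposal refinement helps with small targets such as tumors.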