Hybrid Masked Image Modeling for 3D Medical Image Segmentation

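Computer science; Artificial intelligence; Segmentation; Pattern recognition (psychology); Encoder; Image segmentation; Pixel; Computer vision; Machine learning; Operating system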

Authors

Zhaohu Xing, Lei Zhu, Lequan Yu, Zhiheng Xing, Liang Wan

Source

Journal: IEEE Journal of Biomedical and Health Informatics [Institute of Electrical and Electronics Engineers]
Date: 2024-01-30 Volume (Issue): 28 (4): 2115-2125 Citations: 24

Links

arxiv.org, nih.gov, doi.org

Identifier

DOI: 10.1109/jbhi.2024.3360239

Abstract

Masked image modeling (MIM) with transformer backbones has recently been exploited as a powerful self-supervised pre-training technique. Existing MIM methods mask random patches of the image and reconstruct the missing pixels, which considers semantic information only at a low level and leads to long pre-training times. This paper presents HybridMIM, a novel hybrid self-supervised learning method based on masked image modeling for 3D medical image segmentation. Specifically, we design a two-level masking hierarchy to specify which patches in sub-volumes are masked and how, effectively providing the constraints of higher-level semantic information. We then learn the semantic information of medical images at three levels: 1) partial region prediction to reconstruct key contents of the 3D image, which largely reduces the pre-training time (pixel-level); 2) patch-masking perception to learn the spatial relationship between the patches in each sub-volume (region-level); and 3) dropout-based contrastive learning between samples within a mini-batch, which further improves the generalization ability of the framework (sample-level). The proposed framework is versatile: it supports both CNN and transformer encoder backbones, and also enables pre-training of decoders for image segmentation. We conduct comprehensive experiments on five widely used public medical image segmentation datasets: BraTS2020, BTCV, MSD Liver, MSD Spleen, and BraTS2023. The experimental results show the clear superiority of HybridMIM over competing supervised methods, masked pre-training approaches, and other self-supervised methods in terms of quantitative metrics, speed, and qualitative observations. The code for HybridMIM is available at https://github.com/ge-xing/HybridMIM.
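
The two-level masking idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not specify how the hierarchy is sampled, so the function name `two_level_mask`, the parameters `region_ratio` and `patch_ratio`, and the uniform random sampling are all assumptions made here for illustration. The sketch first picks which sub-volumes are masked, then picks which patches inside each chosen sub-volume are masked, and records a per-sub-volume count that a region-level task might plausibly use as a target.

```python
import itertools
import random


def two_level_mask(grid=(4, 4, 4), sub=(2, 2, 2),
                   region_ratio=0.5, patch_ratio=0.5, seed=0):
    """Hypothetical two-level masking over a 3D patch grid.

    Returns (mask, counts):
      mask   maps each patch coordinate (z, y, x) to True if masked.
      counts maps each sub-volume index to its number of masked patches.
    """
    rng = random.Random(seed)
    # Number of sub-volumes along each axis (grid is assumed to divide evenly).
    n_sub = tuple(g // s for g, s in zip(grid, sub))
    sub_ids = list(itertools.product(*(range(n) for n in n_sub)))
    patches_per_sub = sub[0] * sub[1] * sub[2]

    # Level 1 (assumed): decide WHICH sub-volumes receive any masking.
    n_regions = round(region_ratio * len(sub_ids))
    chosen = set(rng.sample(sub_ids, n_regions))

    mask = {p: False for p in itertools.product(*(range(g) for g in grid))}
    counts = {}
    for sid in sub_ids:
        # Patch coordinates that belong to this sub-volume.
        local = [tuple(i * s + o for i, s, o in zip(sid, sub, off))
                 for off in itertools.product(*(range(s) for s in sub))]
        # Level 2 (assumed): decide HOW the chosen sub-volume is masked.
        k = round(patch_ratio * patches_per_sub) if sid in chosen else 0
        for p in rng.sample(local, k):
            mask[p] = True
        counts[sid] = k
    return mask, counts


mask, counts = two_level_mask()
print(sum(mask.values()), "of", len(mask), "patches masked")
```

With the default arguments, 4 of the 8 sub-volumes are selected and 4 of the 8 patches in each are masked, so 16 of 64 patches are masked regardless of the seed; only their positions vary.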
