人工智能
图像分割
计算机视觉
计算机科学
情态动词
图像(数学)
分割
培训(气象学)
尺度空间分割
上下文图像分类
模式识别(心理学)
物理
化学
气象学
高分子化学
作者
Zhiyuan Cai,Li Lin,Huaqing He,Pujin Cheng,Xiaoying Tang
出处
期刊:IEEE Transactions on Medical Imaging
[Institute of Electrical and Electronics Engineers]
日期:2024-01-01
卷期号:: 1-1
被引量:1
标识
DOI:10.1109/tmi.2024.3422102
摘要
A large-scale labeled dataset is a key factor for the success of supervised deep learning in most ophthalmic image analysis scenarios. However, limited annotated data is very common in ophthalmic image analysis, since manual annotation is time-consuming and labor-intensive. Self-supervised learning (SSL) methods bring huge opportunities for better utilizing unlabeled data, as they do not require massive annotations. To utilize as many unlabeled ophthalmic images as possible, it is necessary to break the dimension barrier, simultaneously making use of both 2D and 3D images as well as alleviating the issue of catastrophic forgetting. In this paper, we propose a universal self-supervised Transformer framework named Uni4Eye++ to discover the intrinsic image characteristic and capture domain-specific feature embedding in ophthalmic images. Uni4Eye++ can serve as a global feature extractor, which builds its basis on a Masked Image Modeling task with a Vision Transformer architecture. On the basis of our previous work Uni4Eye, we further employ an image entropy guided masking strategy to reconstruct more-informative patches and a dynamic head generator module to alleviate modality confusion. We evaluate the performance of our pre-trained Uni4Eye++ encoder by fine-tuning it on multiple downstream ophthalmic image classification and segmentation tasks. The superiority of Uni4Eye++ is successfully established through comparisons to other state-of-the-art SSL pre-training methods. Our code is available at Github
科研通智能强力驱动
Strongly Powered by AbleSci AI