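Computer science
Pattern
Artificial intelligence
Machine vision
Computer vision
Vision science
Medical imaging
Human-computer interaction
Social science
Sociology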
Authors
Arshi Parvaiz,Muhammad Anwaar Khalid,Rukhsana Zafar,Huma Ameer,Muhammad Umair Ali,Muhammad Moazam Fraz
Identifier
DOI:10.1016/j.engappai.2023.106126
Abstract
Vision Transformers (ViTs), with their strong capacity to extract the information contained in images, have become one of the dominant contemporary architectures in computer vision. Many researchers use them both for new experiments and to revisit earlier ones. In this article, we investigate the intersection of vision transformers and medical images. We present an overview of the ViT-based frameworks that researchers have used to address challenges in medical computer vision. We survey the applications of Vision Transformers across areas of medical computer vision, including image-based disease classification, anatomical structure segmentation, registration, region-based lesion detection, captioning, report generation, and reconstruction, using multiple medical imaging modalities that assist in medical diagnosis and, in turn, treatment. We also describe several imaging modalities used in medical computer vision. To give further insight, the self-attention mechanism of transformers is briefly explained. Finally, the ViT-based solutions for each image analytics task are critically analyzed, open challenges are discussed, and pointers to possible solutions and future directions are given. We hope this review will open future research directions for medical computer vision researchers.