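Computer science
Artificial intelligence
Medical diagnosis
Computer vision
Capsule endoscopy
Contextual image classification
Image processing
Pattern recognition (psychology)
Transformer
Endoscopy
Image (mathematics)
Medicine
Engineering
Radiology
Voltage
Electrical engineering
Authors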
Wei Wang,Xin Yang,Jinhui Tang
Source
Journal: IEEE Transactions on Circuits and Systems for Video Technology
[Institute of Electrical and Electronics Engineers]
Date: 2023-05-25
Volume (issue): 33 (9): 4452-4461
Citations: 6
Identifier
DOI:10.1109/tcsvt.2023.3277462
Abstract
Automated classification of gastrointestinal endoscope images can help reduce the workload of doctors and improve the accuracy of diagnoses. The rapidly developed vision Transformer, represented by Swin Transformer, has become an impressive technique for medical image classification. However, Swin Transformer cannot capture long-range dependency well in complex gastrointestinal endoscopy images. As a result, it fails to effectively represent the features of some widely spread targets in digestive tract images, such as normal-z-line and esophagitis. To solve this problem, we propose a novel vision Transformer model based on hybrid shifted windows for digestive tract image classification, which can capture short-range and long-range dependency concurrently. Extensive experiments demonstrate the superiority of our method over state-of-the-art methods, with a classification accuracy of 95.42% on the Kvasir v2 dataset and 86.81% on the HyperKvasir dataset.