Vision Transformer‐based recognition of diabetic retinopathy grade

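Keywords: Softmax function; Computer science; Artificial intelligence; Convolutional neural network; Transformer; Pattern recognition (psychology); Deep learning; Preprocessor; Diabetic retinopathy; Computer vision; Engineering; Voltage; Medicine; Electrical engineering; Endocrinology; Diabetes
Authors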
Jianfang Wu,Ruo Hu,Zhenghong Xiao,Jiaxu Chen,Jing-Wei Liu
Source
Journal: Medical Physics [Wiley]
Volume (issue): 48 (12): 7850-7863 Cited by: 79
Identifier
DOI:10.1002/mp.15312
Abstract

In the domain of natural language processing, Transformers are recognized as state-of-the-art models, which, unlike typical convolutional neural networks (CNNs), do not rely on convolution layers. Instead, Transformers employ multi-head attention mechanisms as the main building block to capture long-range contextual relations between image pixels. Recently, CNNs have dominated the deep learning solutions for diabetic retinopathy grade recognition. However, spurred by the advantages of Transformers, we propose a Transformer-based method suited to recognizing the grade of diabetic retinopathy.

The purposes of this work are to demonstrate that (i) a pure attention mechanism is suitable for diabetic retinopathy grade recognition and (ii) Transformers can replace traditional CNNs for diabetic retinopathy grade recognition.

This paper proposes a Vision Transformer-based method to recognize the grade of diabetic retinopathy. Fundus images are subdivided into non-overlapping patches, which are flattened into sequences and then undergo a linear and positional embedding process to preserve positional information. The generated sequence is then input into several multi-head attention layers to generate the final representation. In the classification stage, the first token of the sequence is input to a softmax classification layer to produce the recognition output.

The dataset for training and testing comprises fundus images of different resolutions, subdivided into patches. We evaluate our method against current CNNs and extreme learning machines and achieve strong performance. Specifically, the proposed deep learning architecture attains an accuracy of 91.4%, specificity = 0.977 (95% confidence interval (CI) 0.951-1), precision = 0.928 (95% CI 0.852-1), sensitivity = 0.926 (95% CI 0.863-0.989), quadratic weighted kappa score = 0.935, and area under the curve (AUC) = 0.986.

Our comparative experiments against current methods show that our model is competitive and highlight that an attention mechanism based on a Vision Transformer model is promising for the diabetic retinopathy grade recognition task.
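
The pipeline the abstract describes (non-overlapping patches, flattening, linear and positional embedding, multi-head attention, softmax on the first token) can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the function names, the tiny sizes (`patch`, `dim`, `heads`, `layers`), the choice of five output classes, and the random weights standing in for trained parameters are all assumptions made for illustration. Layer normalization and the feed-forward sublayers of a full Vision Transformer are omitted for brevity.

```python
import math
import random

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def softmax(v):
    """Numerically stable softmax over a list of floats."""
    m = max(v)
    e = [math.exp(x - m) for x in v]
    s = sum(e)
    return [x / s for x in e]

def rand_matrix(rng, rows, cols, scale=0.1):
    """Random weights; a stand-in for parameters that would be learned."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]

def image_to_patches(image, patch):
    """Split an H x W grayscale image into flattened, non-overlapping patches."""
    h, w = len(image), len(image[0])
    assert h % patch == 0 and w % patch == 0, "image must tile evenly"
    return [
        [image[r + i][c + j] for i in range(patch) for j in range(patch)]
        for r in range(0, h, patch)
        for c in range(0, w, patch)
    ]

def attention_head(tokens, wq, wk, wv):
    """Scaled dot-product self-attention for a single head."""
    q, k, v = matmul(tokens, wq), matmul(tokens, wk), matmul(tokens, wv)
    scale = math.sqrt(len(q[0]))
    k_t = [list(col) for col in zip(*k)]
    weights = [softmax([s / scale for s in row]) for row in matmul(q, k_t)]
    return matmul(weights, v)

def vit_forward(image, patch=4, dim=8, heads=2, layers=2, classes=5, seed=0):
    """Return class probabilities for one image (all sizes are hypothetical)."""
    rng = random.Random(seed)
    # 1. Patches -> flattened vectors -> linear embedding.
    patches = image_to_patches(image, patch)
    tokens = matmul(patches, rand_matrix(rng, patch * patch, dim))
    # 2. Prepend a class token, then add positional embeddings.
    tokens = rand_matrix(rng, 1, dim) + tokens
    pos = rand_matrix(rng, len(tokens), dim)
    tokens = [[t + p for t, p in zip(tr, pr)] for tr, pr in zip(tokens, pos)]
    # 3. Several multi-head attention layers with residual connections.
    head_dim = dim // heads
    for _ in range(layers):
        outs = [
            attention_head(
                tokens,
                rand_matrix(rng, dim, head_dim),
                rand_matrix(rng, dim, head_dim),
                rand_matrix(rng, dim, head_dim),
            )
            for _ in range(heads)
        ]
        # Concatenate the heads per token, then mix with an output projection.
        concat = [sum((h[i] for h in outs), []) for i in range(len(tokens))]
        mixed = matmul(concat, rand_matrix(rng, dim, dim))
        tokens = [[t + m for t, m in zip(tr, mr)] for tr, mr in zip(tokens, mixed)]
    # 4. Softmax classification layer applied to the first (class) token.
    logits = matmul([tokens[0]], rand_matrix(rng, dim, classes))[0]
    return softmax(logits)

if __name__ == "__main__":
    # An 8 x 8 synthetic "image"; a real fundus image would be far larger.
    image = [[((r * 8 + c) % 16) / 16 for c in range(8)] for r in range(8)]
    print(vit_forward(image))
```

With untrained weights the output probabilities are meaningless; the sketch only shows how data flows from patches to a per-grade probability vector.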