ChatGPT-3.5 and ChatGPT-4 dermatological knowledge level based on the Specialty Certificate Examination in Dermatology

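Keywords: Specialty, Certificate, Examination (biology), Medicine, Dermatosis, Clinical practice, Significant difference, Dermatology, Family medicine, Computer science, Internal medicine, Algorithm, Biology, Paleontology
Authors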
Miłosz Lewandowski,Paweł Łukowicz,Dariusz Świetlik,Wioletta Barańska‐Rybak
Source
Journal: Clinical and Experimental Dermatology [Wiley]
Citations: 26
Identifier
DOI:10.1093/ced/llad255
Abstract

Background: The global use of artificial intelligence (AI) has the potential to revolutionize the healthcare industry. Although AI is becoming more popular, there is still a lack of evidence on its use in dermatology.

Objectives: To determine the capacity of ChatGPT-3.5 and ChatGPT-4 to support dermatology knowledge and clinical decision-making in medical practice.

Methods: Three Specialty Certificate Examination in Dermatology tests, in English and Polish, consisting of 120 single-best-answer, multiple-choice questions each, were used to assess the performance of ChatGPT-3.5 and ChatGPT-4.

Results: ChatGPT-4 exceeded the 60% pass rate in every performed test, with a minimum of 80% and 70% correct answers for the English and Polish versions, respectively. ChatGPT-4 performed significantly better than ChatGPT-3.5 on each exam (P < 0.01), regardless of language. Furthermore, ChatGPT-4 answered clinical picture-type questions with an average accuracy of 93.0% and 84.2% for questions in English and Polish, respectively. The differences between the tests in Polish and English were not significant; however, both ChatGPT-3.5 and ChatGPT-4 performed better overall in English than in Polish, by an average of 8 percentage points for each test. In most of the tests, incorrect ChatGPT answers were highly correlated with a lower difficulty index, which denotes questions of higher difficulty (P < 0.05).

Conclusions: The dermatology knowledge level of ChatGPT was high, and ChatGPT-4 performed significantly better than ChatGPT-3.5. Although the use of ChatGPT will not replace a doctor’s final decision, physicians should support the development of AI in dermatology to raise the standards of medical care.