Performance of ChatGPT on a Radiology Board-style Examination: Insights into Current Strengths and Limitations

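Medicine, Style (visual arts), Recall, Order (exchange), Radiology, Medical physics, Cognitive psychology, Psychology, Archaeology, Finance, Economics, History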
Authors
Rajesh Bhayana, Satheesh Krishna, Robert Bleakney
Source
Journal: Radiology [Radiological Society of North America]
Volume (issue): 307 (5) Citations: 232
Identifier
DOI: 10.1148/radiol.230582
Abstract

Background ChatGPT is a powerful artificial intelligence large language model with great potential as a tool in medical practice and education, but its performance in radiology remains unclear.
Purpose To assess the performance of ChatGPT on radiology board-style examination questions without images and to explore its strengths and limitations.
Materials and Methods In this exploratory prospective study performed from February 25 to March 3, 2023, 150 multiple-choice questions designed to match the style, content, and difficulty of the Canadian Royal College and American Board of Radiology examinations were grouped by question type (lower-order [recall, understanding] and higher-order [apply, analyze, synthesize] thinking) and topic (physics, clinical). The higher-order thinking questions were further subclassified by type (description of imaging findings, clinical management, application of concepts, calculation and classification, disease associations). ChatGPT performance was evaluated overall, by question type, and by topic. Confidence of language in responses was assessed. Univariable analysis was performed.
Results ChatGPT answered 69% of questions correctly (104 of 150). The model performed better on questions requiring lower-order thinking (84%, 51 of 61) than on those requiring higher-order thinking (60%, 53 of 89) (P = .002). When compared with lower-order questions, the model performed worse on questions involving description of imaging findings (61%, 28 of 46; P = .04), calculation and classification (25%, two of eight; P = .01), and application of concepts (30%, three of 10; P = .01). ChatGPT performed as well on higher-order clinical management questions (89%, 16 of 18) as on lower-order questions (P = .88). It performed worse on physics questions (40%, six of 15) than on clinical questions (73%, 98 of 135) (P = .02). ChatGPT used confident language consistently, even when incorrect (100%, 46 of 46).
Conclusion Despite no radiology-specific pretraining, ChatGPT nearly passed a radiology board-style examination without images; it performed well on lower-order thinking questions and clinical management questions but struggled with higher-order thinking questions involving description of imaging findings, calculation and classification, and application of concepts. © RSNA, 2023 See also the editorial by Lourenco et al and the article by Bhayana et al in this issue.
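As a quick arithmetic check, the percentages reported in the Results can be recomputed from the counts stated there. The sketch below is illustrative only: the dictionary keys and the `percent` helper are hypothetical names, and rounding to the nearest whole percent is an assumption about how the reported figures were rounded. It does not try to reproduce the P values, because the abstract does not name the statistical test used.

```python
# (correct, total) pairs copied from the Results paragraph of the abstract.
# The keys are hypothetical labels chosen for this sketch.
counts = {
    "overall": (104, 150),
    "lower_order": (51, 61),
    "higher_order": (53, 89),
    "imaging_findings": (28, 46),
    "calculation_classification": (2, 8),
    "application_of_concepts": (3, 10),
    "clinical_management": (16, 18),
    "physics": (6, 15),
    "clinical": (98, 135),
}


def percent(correct, total):
    """Return the share of correct answers, rounded to a whole percent (assumed rounding)."""
    return round(100 * correct / total)


percentages = {name: percent(c, t) for name, (c, t) in counts.items()}

# Internal consistency: the two ways of splitting the 150 questions
# (by question type and by topic) should both add up to the overall counts.
by_type = [counts["lower_order"], counts["higher_order"]]
by_topic = [counts["physics"], counts["clinical"]]
type_sum = (sum(c for c, _ in by_type), sum(t for _, t in by_type))
topic_sum = (sum(c for c, _ in by_topic), sum(t for _, t in by_topic))

# Number of incorrect answers, which the abstract reports as 46.
incorrect = counts["overall"][1] - counts["overall"][0]

for name, value in percentages.items():
    print(f"{name}: {value}%")
```

Both splits sum to 104 of 150, and the 46 incorrect answers match the denominator quoted for the confident-language finding.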