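Medicine
Misinformation
Inter-rater reliability
Hand surgery
Introduction
Completeness (order theory)
Artificial intelligence
Medical education
Surgery
Family medicine
Psychology
Computer science
Rating scale
Computer security
Mathematical analysis
Developmental psychology
Mathematics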
Authors
Zayd M. Al Rawi, Benjamin J. Kirby, Peter Albrecht, Julia A.V. Nuelle, Daniel A. London
Source
Journal: Hand
[SAGE]
Date: 2024-03-25
Citations: 2
Identifier
DOI:10.1177/15589447241238372
Abstract
Background: Increased utilization of artificial intelligence (AI)-driven search and large language models by the lay and medical community requires us to evaluate the accuracy of AI responses to common hand surgery questions. We hypothesized that the answers to most hand surgery questions posed to an AI large language model would be correct. Methods: Using the topics covered in Green’s Operative Hand Surgery 8th Edition as a guide, 56 hand surgery questions were compiled and posed to ChatGPT (OpenAI, San Francisco, CA). Two attending hand surgeons then independently reviewed ChatGPT’s answers for response accuracy, completeness, and usefulness. A Cohen’s kappa analysis was performed to assess interrater agreement. Results: An average of 45 of the 56 questions posed to ChatGPT were deemed correct (80%), 39 responses were deemed useful (70%), and 32 responses were deemed complete (57%) by the reviewers. Kappa analysis demonstrated “fair to moderate” agreement between the two raters. Reviewers disagreed on 11 questions regarding correctness, 16 questions regarding usefulness, and 19 questions regarding completeness. Conclusions: Large language models have the potential to both positively and negatively impact patient perceptions and guide referral patterns based on the accuracy, completeness, and usefulness of their responses. While most responses fit these criteria, more precise responses are needed to ensure patient safety and avoid misinformation. Individual hand surgeons and surgical societies must understand these technologies and interface with the companies developing them to provide our patients with the best possible care.
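Note on the agreement statistic: the abstract reports a Cohen's kappa analysis but not the computation itself. As a rough illustration only, the sketch below computes kappa for two raters from the standard definition, kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e the agreement expected by chance. The function name `cohens_kappa` and the ten example ratings are hypothetical; they are not the study's data or the authors' code.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same items (hypothetical helper)."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty set of items")
    n = len(rater_a)

    # Observed agreement: fraction of items given the same label by both raters.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Chance agreement: sum over categories of the product of each rater's
    # marginal proportion for that category.
    count_a = Counter(rater_a)
    count_b = Counter(rater_b)
    categories = set(count_a) | set(count_b)
    p_e = sum(count_a[c] * count_b[c] for c in categories) / (n * n)

    if p_e == 1.0:
        # Both raters used one identical category throughout; kappa is
        # undefined (0/0), and returning 1.0 is a convention chosen here.
        return 1.0
    return (p_o - p_e) / (1 - p_e)


# Made-up ratings for illustration (1 = "correct", 0 = "not correct").
example_a = [1, 1, 1, 0, 1, 0, 1, 1, 0, 1]
example_b = [1, 1, 0, 0, 1, 1, 1, 1, 0, 1]
example_kappa = cohens_kappa(example_a, example_b)
print(f"kappa = {example_kappa:.3f}")
```

In this made-up example the raters agree on 8 of 10 items, yet kappa is noticeably lower than 0.8 because much of that agreement would be expected by chance when both raters mark most answers as correct.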