Authors
Emily S. Johnson, Eva K. Welch, Jacqueline Y. Kikuchi, H Barbier, Christine M. Vaccaro, Felicia L. Balzano, Katherine L. Dengler
Abstract
Use of the publicly available large language model Chat Generative Pre-trained Transformer (ChatGPT 3.5; OpenAI, 2022) is growing in health care despite its variable accuracy. The aim of this study was to assess the accuracy and readability of ChatGPT's responses to questions about surgical informed consent in urogynecology. Five fellowship-trained urogynecology attending physicians and 1 reconstructive female urologist evaluated ChatGPT's responses to questions about 4 surgical procedures: (1) retropubic midurethral sling, (2) total vaginal hysterectomy, (3) uterosacral ligament suspension, and (4) sacrocolpopexy. Questions involved procedure descriptions, risks/benefits/alternatives, and additional resources. Responses were rated using the DISCERN tool, a 4-point accuracy scale, and the Flesch-Kincaid Grade Level score. The median DISCERN tool overall rating was 3 (interquartile range [IQR], 3-4), indicating a moderate rating ("potentially important but not serious shortcomings"). Retropubic midurethral sling received the highest overall score (median, 4; IQR, 3-4), and uterosacral ligament suspension received the lowest (median, 3; IQR, 3-3). Using the 4-point accuracy scale, 44.0% of responses received a score of 4 ("correct and adequate"), 22.6% received a score of 3 ("correct but insufficient"), 29.8% received a score of 2 ("accurate and misleading information together"), and 3.6% received a score of 1 ("wrong or irrelevant answer"). ChatGPT performance was poor for discussion of benefits and alternatives for all surgical procedures, with some responses being inaccurate. The mean Flesch-Kincaid Grade Level score for all responses was 17.5 (SD, 2.1), corresponding to a postgraduate reading level. Overall, ChatGPT generated accurate responses to questions about surgical informed consent. However, portions of some responses were clearly false, highlighting the need for careful review of responses by qualified health care professionals.
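For readers unfamiliar with the readability metric, the Flesch-Kincaid Grade Level is computed from average sentence length and average syllables per word. The following is a minimal sketch of that standard formula, not the tool the authors used. The function names and the vowel-group syllable counter are illustrative assumptions, and dedicated readability software counts syllables more carefully, so scores from this sketch may differ from published values.

```python
import re


def fk_grade(words: int, sentences: int, syllables: int) -> float:
    """Standard Flesch-Kincaid Grade Level formula from raw counts."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


def count_syllables(word: str) -> int:
    """Rough heuristic (assumption): one syllable per run of vowels, minimum 1."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def fk_grade_of_text(text: str) -> float:
    """Estimate the grade level of a passage using the naive counters above."""
    # Sentences: split on terminal punctuation, ignoring empty fragments.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    # Words: runs of letters and apostrophes.
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    return fk_grade(len(words), len(sentences), syllables)
```

Longer sentences and more polysyllabic words both raise the score, which is consistent with dense medical vocabulary pushing responses toward the postgraduate level reported above.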