Medicine
Guidelines
Concordance
Antithrombotic
American Society of Anesthesiologists
Population
Chemoprophylaxis
Warfarin
Intensive Care Medicine
Physical Therapy
Surgery
Internal Medicine
Atrial Fibrillation
Environmental Health
Pathology
Authors
Akiro H. Duey, Katrina S. Nietsch, Bashar Zaidat, Renee Ren, Laura Mazudie Ndjonko, Nancy Shrestha, Rami Rajjoub, Wasil Ahmed, Timothy Hoang, Michael Saturno, Justin E. Tang, Zachary S. Gallate, Jun Kim, Samuel K. Cho
Identifier
DOI: 10.1016/j.spinee.2023.07.015
Abstract
BACKGROUND CONTEXT: Venous thromboembolism is a negative outcome of elective spine surgery. However, the use of thromboembolic chemoprophylaxis in this patient population is controversial due to the possible increased risk of epidural hematoma. ChatGPT is an artificial intelligence model that may be able to generate recommendations for thromboembolic prophylaxis in spine surgery.

PURPOSE: To evaluate the accuracy of ChatGPT recommendations for thromboembolic prophylaxis in spine surgery.

STUDY DESIGN: Comparative analysis.

PATIENT SAMPLE: None.

OUTCOME MEASURES: Accuracy of ChatGPT responses compared with the North American Spine Society (NASS) clinical guidelines, and whether responses were over-conclusive, supplementary, or incomplete.

METHODS: ChatGPT was prompted with questions from the 2009 NASS clinical guidelines for antithrombotic therapies, and its responses were evaluated for concordance with the guidelines. ChatGPT-3.5 responses were obtained on March 5, 2023, and ChatGPT-4.0 responses were obtained on April 7, 2023. A response was classified as accurate if it did not contradict the clinical guideline. Three additional categories were used to further characterize the responses relative to the NASS guidelines: a response was over-conclusive if it made a recommendation where the NASS guideline did not provide one, supplementary if it included additional relevant information not specified in the NASS guideline, and incomplete if it failed to provide relevant information included in the NASS guideline.

RESULTS: Twelve clinical guidelines were evaluated in total. Compared with the NASS clinical guidelines, ChatGPT-3.5 was accurate in 4 (33%) of its responses, while ChatGPT-4.0 was accurate in 11 (92%). ChatGPT-3.5 was over-conclusive in 6 (50%) responses, while ChatGPT-4.0 was over-conclusive in 1 (8%). ChatGPT-3.5 provided supplementary information in 8 (67%) of its responses, and ChatGPT-4.0 in 11 (92%). Four (33%) responses from ChatGPT-3.5 and 4 (33%) from ChatGPT-4.0 were incomplete.

CONCLUSIONS: ChatGPT was able to provide recommendations for thromboembolic prophylaxis with reasonable accuracy. ChatGPT-3.5 tended to cite nonexistent sources and was more likely to give specific recommendations, while ChatGPT-4.0 was more conservative in its answers. As ChatGPT is continuously updated, further validation is needed before it can be used to guide clinical practice.
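The abstract describes the prompting protocol only at a high level. As an illustration, the sketch below shows how guideline questions could be posed to both models through the OpenAI chat API and the replies collected for manual grading. This is an assumption-laden sketch, not the authors' code (the study most plausibly used the ChatGPT web interface), and the sample question is paraphrased rather than a verbatim NASS question.

    # Minimal sketch, not the authors' code: pose NASS guideline questions to
    # both models via the OpenAI chat API and collect replies for manual grading.
    # Assumes the openai Python package (v1.x) and OPENAI_API_KEY in the environment.
    from openai import OpenAI

    client = OpenAI()

    # Paraphrased example; the study used twelve questions drawn from the
    # 2009 NASS antithrombotic therapy guidelines.
    questions = [
        "What thromboembolic prophylaxis is recommended for patients "
        "undergoing elective spine surgery?",
    ]

    # Rough API counterparts of ChatGPT-3.5 and ChatGPT-4.0.
    models = ["gpt-3.5-turbo", "gpt-4"]

    replies = {}
    for model in models:
        for question in questions:
            completion = client.chat.completions.create(
                model=model,
                messages=[{"role": "user", "content": question}],
            )
            # Each reply was then graded by hand against the matching NASS
            # guideline as accurate, over-conclusive, supplementary, or incomplete.
            replies[(model, question)] = completion.choices[0].message.content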