Assessing the performance of AI chatbots in answering patients’ common questions about low back pain

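Readability, Medicine, Low back pain, Physical therapy, Disclaimer, Reading (process), Family medicine, Alternative medicine, Pathology, Philosophy, Linguistics, Political science, Law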
Authors
Simone P S Scaff, Felipe José Jandre dos Reis, Giovanni Ferreira, Eduardo da Silva Alves, Bruno T. Saragiotto
Source
Journal: Annals of the Rheumatic Diseases [BMJ]
Volume/Issue: ard-226202
Identifier
DOI: 10.1136/ard-2024-226202
Abstract

Objectives: The aim of this study was to assess the accuracy and readability of the answers generated by large language model (LLM)-chatbots to common patient questions about low back pain (LBP).

Methods: This cross-sectional study analysed responses to 30 LBP-related questions covering self-management, risk factors and treatment. The questions were developed by experienced clinicians and researchers and were piloted with a group of consumer representatives with lived experience of LBP. The questions were entered as prompts into ChatGPT 3.5, Bing, Bard (Gemini) and ChatGPT 4.0. Responses were evaluated for accuracy, readability and the presence of disclaimers about health advice. Accuracy was assessed by comparing the generated recommendations with the main guidelines for LBP. The responses were analysed by two independent reviewers and classified as accurate, inaccurate or unclear. Readability was measured with the Flesch Reading Ease Score (FRES).

Results: Out of 120 responses yielding 1069 recommendations, 55.8% were accurate, 42.1% inaccurate and 1.9% unclear. The treatment and self-management domains showed the highest accuracy, while risk factors had the most inaccuracies. Overall, LLM-chatbots provided answers that were 'reasonably difficult' to read, with a mean (SD) FRES of 50.94 (3.06). Disclaimers about health advice were present in around 70%–100% of the responses produced.

Conclusions: The use of LLM-chatbots as tools for patient education and counselling in LBP shows promising but variable results. These chatbots generally provide moderately accurate recommendations. However, the accuracy may vary depending on the topic of each question. The readability level of the answers was inadequate, potentially affecting the patient's ability to comprehend the information.
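The abstract reports readability as a Flesch Reading Ease Score. As a rough illustration of how such a score is derived, the sketch below applies the standard published Flesch formula to pre-counted totals. The function name and the example counts are hypothetical, and the abstract does not say which tool or syllable-counting rules the authors used, so this should not be read as their procedure.

```python
def flesch_reading_ease(total_words: int, total_sentences: int, total_syllables: int) -> float:
    """Return the Flesch Reading Ease Score for a text, given its raw counts.

    Standard formula:
        206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)
    Higher scores indicate text that is easier to read.
    """
    if total_words <= 0 or total_sentences <= 0:
        raise ValueError("total_words and total_sentences must be positive")
    words_per_sentence = total_words / total_sentences
    syllables_per_word = total_syllables / total_words
    return 206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word


if __name__ == "__main__":
    # Hypothetical counts for one chatbot answer (not data from the study).
    score = flesch_reading_ease(total_words=100, total_sentences=5, total_syllables=150)
    print(f"FRES: {score:.2f}")
```

Taking counts as input keeps the sketch independent of any particular syllable-counting heuristic, which is where readability tools most often disagree.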