Foundation metrics for evaluating effectiveness of healthcare conversations powered by generative AI

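Keywords: Personalization; Health care; Set (abstract data type); Process (computing); Comprehension; Computer science; Human-computer interaction; World Wide Web; Economics; Programming language; Economic growth; Operating system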
Authors
Mahyar Abbasian, Elahe Khatibi, Iman Azimi, David Oniani, Zahra Shakeri Hossein Abad, Alexander Thieme, Ram D. Sriram, Zhongqi Yang, Yanshan Wang, Bryant Lin, Olivier Gevaert, Li-Jia Li, Ramesh Jain, Amir M. Rahmani
Source
Journal: npj Digital Medicine [Springer Nature]
Volume (issue): 7 (1) Citations: 2
Identifier
DOI: 10.1038/s41746-024-01074-z
Abstract

Generative Artificial Intelligence is set to revolutionize healthcare delivery by transforming traditional patient care into a more personalized, efficient, and proactive process. Chatbots, serving as interactive conversational models, will probably drive this patient-centered transformation in healthcare. Through the provision of various services, including diagnosis, personalized lifestyle recommendations, dynamic scheduling of follow-ups, and mental health support, the objective is to substantially augment patient health outcomes, all the while mitigating the workload burden on healthcare providers. The life-critical nature of healthcare applications necessitates establishing a unified and comprehensive set of evaluation metrics for conversational models. Existing evaluation metrics proposed for various generic large language models (LLMs) demonstrate a lack of comprehension regarding medical and health concepts and their significance in promoting patients’ well-being. Moreover, these metrics neglect pivotal user-centered aspects, including trust-building, ethics, personalization, empathy, user comprehension, and emotional support. The purpose of this paper is to explore state-of-the-art LLM-based evaluation metrics that are specifically applicable to the assessment of interactive conversational models in healthcare. Subsequently, we present a comprehensive set of evaluation metrics designed to thoroughly assess the performance of healthcare chatbots from an end-user perspective. These metrics encompass an evaluation of language processing abilities, impact on real-world clinical tasks, and effectiveness in user-interactive conversations. Finally, we engage in a discussion concerning the challenges associated with defining and implementing these metrics, with particular emphasis on confounding factors such as the target audience, evaluation methods, and prompt techniques involved in the evaluation process.