Supervised machine learning compared to large language models for identifying functional seizures from medical records

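Confidence interval, Receiver operating characteristic, Logistic regression, Electroencephalography, Kappa, Neuroimaging, Epilepsy, Medicine, Convulsion, Machine learning, Psychology, Internal medicine, Audiology, Artificial intelligence, Psychiatry, Computer science, Mathematics, Geometry
Authors
Wesley T. Kerr, Katherine N. McFarlane, Gabriela Figueiredo Pucci, Danielle R. Carns, Alex Israel, Lianne Vighetti, Page B. Pennell, John M. Stern, Zongqi Xia, Yanshan Wang
Source
Journal: Epilepsia [Wiley]
Identifier
DOI: 10.1111/epi.18272
Abstract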

Objective: The Functional Seizures Likelihood Score (FSLS) is a supervised machine learning–based diagnostic score that was developed to differentiate functional seizures (FS) from epileptic seizures (ES). In contrast to this targeted approach, large language models (LLMs) can identify patterns in data for which they were not specifically trained. To evaluate the relative benefits of each approach, we compared the diagnostic performance of the FSLS to two LLMs: ChatGPT and GPT-4.

Methods: In total, 114 anonymized cases were constructed based on patients with documented FS, ES, mixed ES and FS, or physiologic seizure-like events (PSLEs). Text-based data were presented to the LLMs in three sequential prompts, showing the history of present illness (HPI), electroencephalography (EEG) results, and neuroimaging results. We compared the accuracy (number of correct predictions/number of cases) and area under the receiver-operating characteristic (ROC) curves (AUCs) of the LLMs to the FSLS using mixed-effects logistic regression.

Results: The accuracy of the FSLS was 74% (95% confidence interval [CI] 65%–82%) and the AUC was 85% (95% CI 77%–92%). GPT-4 was superior to both the FSLS and ChatGPT (p < .001), with an accuracy of 85% (95% CI 77%–91%) and AUC of 87% (95% CI 79%–95%). Cohen's kappa between the FSLS and GPT-4 was 40% (fair). The LLMs provided different predictions on different days when the same note was provided for 33% of patients, and the LLMs' self-rated certainty was moderately correlated with this observed variability (Spearman's rho²: 30% [fair, ChatGPT] and 63% [substantial, GPT-4]).

Significance: Both GPT-4 and the FSLS identified a substantial subset of patients with FS based on clinical history. The fair agreement in predictions highlights that the LLMs identified patients differently from the structured score. The inconsistency of the LLMs' predictions across days and their incomplete insight into their own consistency were concerning. This comparison highlights both benefits and cautions about how machine learning and artificial intelligence could identify patients with FS in clinical practice.
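
The abstract reports accuracy (correct predictions divided by number of cases), a 95% confidence interval, and Cohen's kappa between two predictors. The following is a minimal Python sketch of how such quantities can be computed. It is not the authors' analysis: the labels below are hypothetical toy data, the names `truth`, `model_a`, and `model_b` are invented for illustration, and the Wilson score interval is an assumed choice, since the abstract does not state which interval method was used.

```python
import math

def accuracy(y_true, y_pred):
    """Fraction of cases where the prediction matches the reference label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)

def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a proportion (about 95% coverage at z=1.96).

    Assumption: the abstract does not say which interval method was used.
    """
    p = successes / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half, centre + half

def cohen_kappa(a, b):
    """Cohen's kappa: agreement between two raters, corrected for chance."""
    n = len(a)
    labels = set(a) | set(b)
    p_observed = sum(x == y for x, y in zip(a, b)) / n
    # Chance agreement from each rater's marginal label frequencies.
    p_expected = sum((a.count(l) / n) * (b.count(l) / n) for l in labels)
    return (p_observed - p_expected) / (1 - p_expected)

# Hypothetical toy labels (NOT study data): FS = functional, ES = epileptic.
truth   = ["FS", "FS", "ES", "ES", "FS", "ES", "ES", "FS"]
model_a = ["FS", "ES", "ES", "ES", "FS", "ES", "FS", "FS"]
model_b = ["FS", "FS", "ES", "FS", "FS", "ES", "ES", "ES"]

acc_a = accuracy(truth, model_a)
acc_b = accuracy(truth, model_b)
ci_low, ci_high = wilson_interval(round(acc_a * len(truth)), len(truth))
kappa_ab = cohen_kappa(model_a, model_b)

print(f"accuracy A = {acc_a:.2f}, accuracy B = {acc_b:.2f}")
print(f"Wilson 95% CI for A: {ci_low:.2f} to {ci_high:.2f}")
print(f"Cohen's kappa (A vs B) = {kappa_ab:.2f}")
```

In this toy example both predictors reach the same accuracy, yet their agreement with each other is no better than chance. That is the kind of situation the kappa statistic is meant to expose: similar headline accuracy does not imply that two methods are identifying the same patients.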
