Automatic quantitative stroke severity assessment based on Chinese clinical named entity recognition with domain-adaptive pre-trained large language model

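Computer science; Artificial intelligence; Stroke (engine); Machine learning; Domain (mathematical analysis); Natural language processing; Mathematics; Mechanical engineering; Engineering; Mathematical analysis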
Authors
Zhanzhong Gu, Xiangjian He, Ping Yu, Wenjing Jia, Xiguang Yang, Gang Peng, Penghui Hu, Shiyan Chen, Hongjie Chen, Yiguang Lin
Source
Journal: Artificial Intelligence in Medicine [Elsevier BV]
Volume: 150, article 102822; Cited by: 6
Identifier
DOI:10.1016/j.artmed.2024.102822
Abstract

Stroke is a prevalent disease with a significant global impact. Effective assessment of stroke severity is vital for an accurate diagnosis, appropriate treatment, and optimal clinical outcomes. The National Institutes of Health Stroke Scale (NIHSS) is a widely used scale for quantitatively assessing stroke severity. However, the current manual scoring of NIHSS is labor-intensive, time-consuming, and sometimes unreliable. Applying artificial intelligence (AI) techniques to automate the quantitative assessment of stroke on vast amounts of electronic health records (EHRs) has attracted much interest. This study aims to develop an automatic, quantitative stroke severity assessment framework through automating the entire NIHSS scoring process on Chinese clinical EHRs. Our approach consists of two major parts: Chinese clinical named entity recognition (CNER) with a domain-adaptive pre-trained large language model (LLM) and automated NIHSS scoring. To build a high-performing CNER model, we first construct a stroke-specific, densely annotated dataset "Chinese Stroke Clinical Records" (CSCR) from EHRs provided by our partner hospital, based on a stroke ontology that defines semantically related entities for stroke assessment. We then pre-train a Chinese clinical LLM coined "CliRoberta" through domain-adaptive transfer learning and construct a deep learning-based CNER model that can accurately extract entities directly from Chinese EHRs. Finally, an automated, end-to-end NIHSS scoring pipeline is proposed by mapping the extracted entities to relevant NIHSS items and values, to quantitatively assess the stroke severity. Results obtained on a benchmark dataset CCKS2019 and our newly created CSCR dataset demonstrate the superior performance of our domain-adaptive pre-trained LLM and the CNER model, compared with the existing benchmark LLMs and CNER models. 
The high F1 score of 0.990 ensures the reliability of our model in accurately extracting the entities for the subsequent automatic NIHSS scoring. Subsequently, our automated, end-to-end NIHSS scoring approach achieved excellent inter-rater agreement (0.823) and intraclass consistency (0.986) with the ground truth and significantly reduced the processing time from minutes to a few seconds. Our proposed automatic and quantitative framework for assessing stroke severity demonstrates exceptional performance and reliability through directly scoring the NIHSS from diagnostic notes in Chinese clinical EHRs. Moreover, this study also contributes a new clinical dataset, a pre-trained clinical LLM, and an effective deep learning-based CNER model. The deployment of these advanced algorithms can improve the accuracy and efficiency of clinical assessment, and help improve the quality, affordability and productivity of healthcare services.
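
The final step described above, mapping extracted entities to NIHSS items and values and summing them into a severity score, can be illustrated with a minimal sketch. The abstract does not give the actual mapping rules or entity schema, so everything named here (the `Entity` type, the `RULES` table, the entity type labels, and `score_nihss`) is a hypothetical stand-in rather than the authors' implementation. Only the item numbers and point values follow the standard NIHSS definitions.

```python
from dataclasses import dataclass

# Hypothetical rule table: (entity type, normalized finding) -> (NIHSS item, points).
# Illustrative assumption only; the paper's real mapping is not given in the abstract.
RULES = {
    ("consciousness", "alert"): ("1a", 0),
    ("consciousness", "drowsy"): ("1a", 1),
    ("facial_palsy", "none"): ("4", 0),
    ("facial_palsy", "minor"): ("4", 1),
    ("dysarthria", "mild"): ("10", 1),
}


@dataclass(frozen=True)
class Entity:
    """An entity as a CNER model might emit it (assumed shape)."""
    type: str
    text: str


def score_nihss(entities):
    """Map entities to NIHSS items and return (per-item scores, total)."""
    item_scores = {}
    for ent in entities:
        rule = RULES.get((ent.type, ent.text))
        if rule is None:
            continue  # entity has no bearing on any NIHSS item in this table
        item, points = rule
        # If an item is mentioned more than once, keep the most severe value.
        item_scores[item] = max(item_scores.get(item, 0), points)
    return item_scores, sum(item_scores.values())


if __name__ == "__main__":
    found = [
        Entity("consciousness", "drowsy"),
        Entity("facial_palsy", "minor"),
        Entity("facial_palsy", "none"),
        Entity("medication", "aspirin"),
    ]
    items, total = score_nihss(found)
    print(items, total)
```

Keeping the most severe value per item is a design choice made for this sketch. A real pipeline would also need to handle negation, laterality, and items that the note never mentions.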