Automated NLP Extraction of Clinical Rationale for Treatment Discontinuation in Breast Cancer

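Discontinuation; Medicine; Cohort; Logistic regression; Receiver operating characteristic; Incidence; Internal medicine; Stage; Oncology; Cancer; Retrospective cohort study; Breast cancer; Artificial intelligence; Machine learning
Authors
Matthew S Alkaitis, Monica Agrawal, Gregory J Riely, Pedram Razavi, David Sontag
Source
Journal: JCO clinical cancer informatics [American Society of Clinical Oncology]
Volume/Issue: (5): 550-560
Identifier
DOI: 10.1200/cci.20.00139
Abstract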

Key oncology end points are not routinely encoded into electronic medical records (EMRs). We assessed whether natural language processing (NLP) can abstract treatment discontinuation rationale from unstructured EMR notes to estimate toxicity incidence and progression-free survival (PFS).

We constructed a retrospective cohort of 6,115 patients with early-stage and 701 patients with metastatic breast cancer initiating care at Memorial Sloan Kettering Cancer Center from 2008 to 2019. Each cohort was divided into training (70%), validation (15%), and test (15%) subsets. Human abstractors identified the clinical rationale associated with treatment discontinuation events. Concatenated EMR notes were used to train high-dimensional logistic regression and convolutional neural network models. Kaplan-Meier analyses were used to compare toxicity incidence and PFS estimated by our NLP models to estimates generated by manual labeling and time-to-treatment discontinuation (TTD).

Our best high-dimensional logistic regression models identified toxicity events in early-stage patients with an area under the curve of the receiver-operator characteristic of 0.857 ± 0.014 (standard deviation) and progression events in metastatic patients with an area under the curve of 0.752 ± 0.027 (standard deviation). NLP-extracted toxicity incidence and PFS curves were not significantly different from manually extracted curves (P = .95 and P = .67, respectively). By contrast, TTD overestimated toxicity in early-stage patients (P < .001) and underestimated PFS in metastatic patients (P < .001). Additionally, we tested an extrapolation approach in which 20% of the metastatic cohort were labeled manually, and NLP algorithms were used to abstract the remaining 80%. This extrapolated outcomes approach resolved PFS differences between receptor subtypes (P < .001 for hormone receptor+/human epidermal growth factor receptor 2- v human epidermal growth factor receptor 2+ v triple-negative) that could not be resolved with TTD.

NLP models are capable of abstracting treatment discontinuation rationale with minimal manual labeling.
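
To illustrate why the discontinuation rationale matters for the Kaplan-Meier comparison described above, the following is a minimal sketch, not the authors' implementation. It assumes that a TTD analysis counts every discontinuation as an event, while a PFS analysis counts only discontinuations labeled as progression and censors the rest; that censoring convention is an assumption made here for illustration. The function `kaplan_meier` and the toy `records` are hypothetical and the numbers are invented.

```python
from typing import List, Tuple


def kaplan_meier(times: List[float], events: List[int]) -> List[Tuple[float, float]]:
    """Return (time, survival probability) at each distinct event time.

    events[i] is 1 if the end point was observed at times[i], 0 if censored.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        observed = 0  # events at time t
        leaving = 0   # events plus censored observations at time t
        while i < len(data) and data[i][0] == t:
            observed += data[i][1]
            leaving += 1
            i += 1
        if observed:
            survival *= 1.0 - observed / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


# Hypothetical toy data: (months to discontinuation, labeled rationale).
records = [
    (3, "toxicity"),
    (5, "progression"),
    (8, "progression"),
    (10, "toxicity"),
    (12, "other"),
]
months = [m for m, _ in records]

# TTD: every discontinuation counts as an event, whatever the rationale.
ttd_curve = kaplan_meier(months, [1] * len(records))

# PFS (assumed convention): only progression is an event; others are censored.
pfs_curve = kaplan_meier(months, [int(r == "progression") for _, r in records])

for label, curve in (("TTD", ttd_curve), ("PFS", pfs_curve)):
    print(label, [(t, round(s, 3)) for t, s in curve])
```

On this toy input the TTD curve falls to about 0.4 by month 8, while the rationale-aware PFS curve is at about 0.5, which mirrors the direction of the bias reported above: without the rationale, non-progression discontinuations are counted as progression events.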
