RadBERT: Adapting Transformer-based Language Models to Radiology

Authors
An Yan, Julian McAuley, Xing Lü, Jiang Du, Eric Chang, Amilcare Gentili, Chun-Nan Hsu
Source
Journal: Radiology: Artificial Intelligence [Radiological Society of North America]
Volume (issue): 4 (4) · Cited by: 68
Identifier
DOI: 10.1148/ryai.210258
Abstract

Purpose
To investigate if tailoring a transformer-based language model to radiology is beneficial for radiology natural language processing (NLP) applications.

Materials and Methods
This retrospective study presents a family of bidirectional encoder representations from transformers (BERT)-based language models adapted for radiology, named RadBERT. Transformers were pretrained with either 2.16 or 4.42 million radiology reports from U.S. Department of Veterans Affairs health care systems nationwide on top of four different initializations (BERT-base, Clinical-BERT, robustly optimized BERT pretraining approach [RoBERTa], and BioMed-RoBERTa) to create six variants of RadBERT. Each variant was fine-tuned for three representative NLP tasks in radiology: (a) abnormal sentence classification: models classified sentences in radiology reports as reporting abnormal or normal findings; (b) report coding: models assigned a diagnostic code to a given radiology report for five coding systems; and (c) report summarization: given the findings section of a radiology report, models selected key sentences that summarized the findings. Model performance was compared by bootstrap resampling against five intensively studied transformer language models as baselines: BERT-base, BioBERT, Clinical-BERT, BlueBERT, and BioMed-RoBERTa.

Results
For abnormal sentence classification, all models performed well (accuracies above 97.5 and F1 scores above 95.0). RadBERT variants achieved significantly higher scores than the corresponding baselines when given only 10% or less of the 12 458 annotated training sentences. For report coding, all variants significantly outperformed the baselines for all five coding systems. The variant RadBERT-BioMed-RoBERTa performed best among all models for report summarization, achieving a Recall-Oriented Understudy for Gisting Evaluation-1 (ROUGE-1) score of 16.18, compared with 15.27 for the corresponding baseline (BioMed-RoBERTa; P < .004).

Conclusion
Transformer-based language models tailored to radiology improved performance on radiology NLP tasks compared with baseline transformer language models.

Keywords: Translation, Unsupervised Learning, Transfer Learning, Neural Networks, Informatics

Supplemental material is available for this article. © RSNA, 2022. See also the commentary by Wiggins and Tejani in this issue.
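The core adaptation step the abstract describes is continued pretraining: starting from an existing initialization and continuing masked language model (MLM) training on radiology report text. Below is a minimal sketch of that step using the Hugging Face transformers Trainer API. The base checkpoint choice, the toy reports, and the hyperparameters are illustrative assumptions; the VA report corpus is not public, and the paper does not publish this code.

```python
# Minimal sketch: domain-adaptive MLM pretraining on radiology report text.
# Corpus, hyperparameters, and output paths are assumptions for illustration.
from datasets import Dataset
from transformers import (AutoModelForMaskedLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

BASE = "roberta-base"  # one of the four initializations; BioMed-RoBERTa etc. slot in the same way

tokenizer = AutoTokenizer.from_pretrained(BASE)
model = AutoModelForMaskedLM.from_pretrained(BASE)

# Toy stand-in for the 2.16-4.42 million VA reports used in the paper.
reports = Dataset.from_dict({"text": [
    "findings: the lungs are clear. no pleural effusion or pneumothorax.",
    "impression: interval increase in the right upper lobe nodule.",
]})
tokenized = reports.map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=512),
    batched=True, remove_columns=["text"],
)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="radbert-mlm", num_train_epochs=1,
                           per_device_train_batch_size=2),
    train_dataset=tokenized,
    # The collator randomly masks 15% of tokens and builds MLM labels.
    data_collator=DataCollatorForLanguageModeling(tokenizer=tokenizer,
                                                  mlm_probability=0.15),
)
trainer.train()
```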
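For task (a), abnormal sentence classification, each variant is fine-tuned with a two-label classification head on annotated sentences. A minimal sketch follows, assuming the RadBERT checkpoint name later released on the Hugging Face Hub (UCSD-VA-health/RadBERT-RoBERTa-4m); if that name is unavailable, any BERT or RoBERTa checkpoint works identically, and the label scheme shown is an assumption, not the paper's published one.

```python
# Minimal sketch: fine-tuning a two-label head for abnormal sentence
# classification. Checkpoint name and label mapping are assumptions.
from datasets import Dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          DataCollatorWithPadding, Trainer, TrainingArguments)

MODEL = "UCSD-VA-health/RadBERT-RoBERTa-4m"  # assumed Hub name

tokenizer = AutoTokenizer.from_pretrained(MODEL)
# The classification head is freshly initialized; it only becomes meaningful
# after fine-tuning on labeled sentences such as the toy examples below.
model = AutoModelForSequenceClassification.from_pretrained(MODEL, num_labels=2)

train = Dataset.from_dict({
    "text": ["No acute cardiopulmonary abnormality.",
             "There is a 2 cm nodule in the right upper lobe."],
    "label": [0, 1],  # 0 = normal, 1 = abnormal (assumed scheme)
}).map(lambda b: tokenizer(b["text"], truncation=True), batched=True)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="radbert-cls", num_train_epochs=1,
                           per_device_train_batch_size=2),
    train_dataset=train,
    data_collator=DataCollatorWithPadding(tokenizer),  # pad batches dynamically
)
trainer.train()
```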
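The report-summarization comparison is scored with ROUGE-1, which measures unigram overlap between the model-selected sentences and a reference summary. Below is a minimal sketch using Google's rouge-score package; whether the authors used this exact implementation is an assumption, and the paper's scores (16.18 vs 15.27) are presumably on a 0-100 scale, whereas this library returns fractions in [0, 1].

```python
# Minimal sketch: ROUGE-1 scoring of an extractive summary against a
# reference impression. Example texts are invented for illustration.
from rouge_score import rouge_scorer

scorer = rouge_scorer.RougeScorer(["rouge1"], use_stemmer=True)

reference = "stable 2 cm right upper lobe nodule, no acute findings"
candidate = "there is a 2 cm nodule in the right upper lobe"  # model-selected sentence

score = scorer.score(reference, candidate)["rouge1"]
print(f"ROUGE-1 P={score.precision:.2f} R={score.recall:.2f} F={score.fmeasure:.2f}")
```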