计算机科学
概念证明
自然语言处理
变压器
人工智能
放射科
情报检索
医学
工程类
操作系统
电气工程
电压
作者
A Olthof,Peter M. A. van Ooijen,L. J. Cornelissen
标识
DOI:10.1177/14604582221131198
摘要
Radiology requests and reports contain valuable information about diagnostic findings and indications, and transformer-based language models are promising for more accurate text classification.In a retrospective study, 2256 radiologist-annotated radiology requests (8 classes) and reports (10 classes) were divided into training and testing datasets (90% and 10%, respectively) and used to train 32 models. Performance metrics were compared by model type (LSTM, Bertje, RobBERT, BERT-clinical, BERT-multilingual, BERT-base), text length, data prevalence, and training strategy. The best models were used to predict the remaining 40,873 cases' categories of the datasets of requests and reports.The RobBERT model performed the best after 4000 training iterations, resulting in AUC values ranging from 0.808 [95% CI (0.757-0.859)] to 0.976 [95% CI (0.956-0.996)] for the requests and 0.746 [95% CI (0.689-0.802)] to 1.0 [95% CI (1.0-1.0)] for the reports. The AUC for the classification of normal reports was 0.95 [95% CI (0.922-0.979)]. The predicted data demonstrated variability of both diagnostic yield for various request classes and request patterns related to COVID-19 hospital admission data.Transformer-based natural language processing is feasible for the multilabel classification of chest imaging request and report items. Diagnostic yield varies with the information in the requests.
科研通智能强力驱动
Strongly Powered by AbleSci AI