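Medicine
Generalizability theory
Receiver operating characteristic
Thyroid nodule
Segmentation
Sørensen–Dice coefficient
Cohen's kappa
Artificial intelligence
Dice
Retrospective cohort study
Machine learning
Radiology
Thyroid
Statistics
Surgery
Image segmentation
Computer science
Internal medicine
Mathematics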
Authors
Wenwen Xu, Xiaohong Jia, Zihan Mei, XiaoLin Gu, Yang Lu, Chi-Cheng Fu, Ruifang Zhang, Ying Gu, Xia Chen, Xiaomao Luo, Ning Li, Baoyan Bai, Qiaoying Li, Jiping Yan, Zhai Hong, Ling Guan, Bing Gong, Keyang Zhao, Qu Fang, Chuan He, Weiwei Zhan, Ting Luo, Huiting Zhang, Yijie Dong, JianQiao Zhou
Source
Journal: Radiology
[Radiological Society of North America]
Date: 2023-06-01
Volume (issue): 307 (5)
Citations: 3
Identifier
DOI:10.1148/radiol.221157
Abstract
Background: Artificial intelligence (AI) models have improved US assessment of thyroid nodules; however, the lack of generalizability limits the application of these models.
Purpose: To develop AI models for segmentation and classification of thyroid nodules in US using diverse data sets from nationwide hospitals and multiple vendors, and to measure the impact of the AI models on diagnostic performance.
Materials and Methods: This retrospective study included consecutive patients with pathologically confirmed thyroid nodules who underwent US using equipment from 12 vendors at 208 hospitals across China from November 2017 to January 2019. The detection, segmentation, and classification models were developed based on the subset or complete set of images. Model performance was evaluated by precision and recall, Dice coefficient, and area under the receiver operating characteristic curve (AUC) analyses. Three scenarios (diagnosis without AI assistance, with freestyle AI assistance, and with rule-based AI assistance) were compared with three senior and three junior radiologists to optimize incorporation of AI into clinical practice.
Results: A total of 10 023 patients (median age, 46 years [IQR 37–55 years]; 7669 female) were included. The detection, segmentation, and classification models had an average precision, Dice coefficient, and AUC of 0.98 (95% CI: 0.96, 0.99), 0.86 (95% CI: 0.86, 0.87), and 0.90 (95% CI: 0.88, 0.92), respectively. The segmentation model trained on the nationwide data and classification model trained on the mixed vendor data exhibited the best performance, with a Dice coefficient of 0.91 (95% CI: 0.90, 0.91) and AUC of 0.98 (95% CI: 0.97, 1.00), respectively. The AI model outperformed all senior and junior radiologists (P < .05 for all comparisons), and the diagnostic accuracies of all radiologists were improved (P < .05 for all comparisons) with rule-based AI assistance.
Conclusion: Thyroid US AI models developed from diverse data sets had high diagnostic performance among the Chinese population. Rule-based AI assistance improved the performance of radiologists in thyroid cancer diagnosis.
© RSNA, 2023. Supplemental material is available for this article.
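For readers unfamiliar with two of the metrics named in the abstract, the following is a minimal sketch of how a Dice coefficient and an AUC can be computed. It is not the authors' evaluation code: the function names `dice_coefficient` and `auc`, the flattened-mask input format, and the empty-mask convention are all assumptions made for illustration.

```python
from typing import Sequence


def dice_coefficient(pred: Sequence[int], truth: Sequence[int]) -> float:
    """Dice = 2|A ∩ B| / (|A| + |B|) for two flattened binary masks."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of pixels")
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    # Assumed convention: two empty masks count as perfect agreement.
    return 1.0 if total == 0 else 2.0 * intersection / total


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy data, hypothetical and unrelated to the study.
    print(dice_coefficient([1, 1, 0, 0], [1, 0, 0, 0]))
    print(auc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0]))
```

The pairwise AUC above is quadratic in the number of cases, which is fine for a toy example; a rank-based computation would be the usual choice for a data set of the size reported here.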