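Medicine
Overfitting
Artificial intelligence
Machine learning
Artificial neural network
Clinical trial
Biostatistics
Observational study
Medical physics
Epidemiology
Computer science
Pathology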
Authors
Seong Ho Park,Kyunghwa Han
Source
Journal: Radiology
[Radiological Society of North America]
Date: 2018-01-08
Volume (issue): 286 (3): 800-809
Citations: 673
Identifier
DOI:10.1148/radiol.2017171920
Abstract
The use of artificial intelligence in medicine is currently an issue of great interest, especially with regard to the diagnostic or predictive analysis of medical images. Adoption of an artificial intelligence tool in clinical practice requires careful confirmation of its clinical utility. Herein, the authors explain key methodology points involved in a clinical evaluation of artificial intelligence technology for use in medicine, especially high-dimensional or overparameterized diagnostic or predictive models in which artificial deep neural networks are used, mainly from the standpoints of clinical epidemiology and biostatistics. First, statistical methods for assessing the discrimination and calibration performances of a diagnostic or predictive model are summarized. Next, the effects of disease manifestation spectrum and disease prevalence on the performance results are explained, followed by a discussion of the difference between evaluating the performance with use of internal and external datasets, the importance of using an adequate external dataset obtained from a well-defined clinical cohort to avoid overestimating the clinical performance as a result of overfitting in high-dimensional or overparameterized classification model and spectrum bias, and the essentials for achieving a more robust clinical evaluation. Finally, the authors review the role of clinical trials and observational outcome studies for ultimate clinical verification of diagnostic or predictive artificial intelligence tools through patient outcomes, beyond performance metrics, and how to design such studies. © RSNA, 2018
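The abstract notes that disease prevalence affects performance results. As a minimal illustration (not taken from the article), the sketch below applies Bayes' rule to show how positive and negative predictive values shift with prevalence while sensitivity and specificity stay fixed. The function name `predictive_values` and the 90% sensitivity and specificity figures are assumptions chosen for the example.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied at a given disease prevalence.

    All three arguments are proportions in [0, 1]. The four intermediate
    values are the expected fractions of the tested population falling in
    each cell of the 2x2 table.
    """
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Hypothetical tool with 90% sensitivity and 90% specificity,
# evaluated at three illustrative prevalence levels.
for prevalence in (0.50, 0.10, 0.01):
    ppv, npv = predictive_values(0.90, 0.90, prevalence)
    print(f"prevalence={prevalence:.2f}  PPV={ppv:.3f}  NPV={npv:.3f}")
```

With these assumed figures, the positive predictive value falls from 0.90 at 50% prevalence to 0.50 at 10% prevalence, which is one reason results from a case-enriched dataset may not carry over to a lower-prevalence clinical cohort.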