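Medicine
Recall
Mammography
Confidence interval
Population
Sensitivity (control systems)
Breast cancer
Cancer
Internal medicine
Psychology
Environmental health
Electronic engineering
Engineering
Cognitive psychology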
Authors
M. Luke Marinovich,William Lotter,Andrew Waddell,Nehmat Houssami
Identifier
DOI:10.1177/09691413241262960
Abstract
Artificial intelligence (AI) algorithms have been retrospectively evaluated as replacement for one radiologist in screening mammography double-reading; however, methods for resolving discordance between radiologists and AI in the absence of 'real-world' arbitration may underestimate cancer detection rate (CDR) and recall. In 108,970 consecutive screens from a population screening program (BreastScreen WA, Western Australia), 20,120 were radiologist/AI discordant without real-world arbitration. Recall probabilities were randomly assigned for these screens in 1000 simulations. Recall thresholds for screen-detected and interval cancers (sensitivity) and no cancer (false-positive proportion, FPP) were varied to calculate mean CDR and recall rate for the entire cohort. Assuming 100% sensitivity, the maximum CDR was 7.30 per 1000 screens. To achieve >95% probability that the mean CDR exceeded the screening program CDR (6.97 per 1000), interval cancer sensitivities ≥63% (at 100% screen-detected sensitivity) and ≥91% (at 80% screen-detected sensitivity) were required. Mean recall rate was relatively constant across sensitivity assumptions, but varied by FPP. FPP > 6.5% resulted in recall rates that exceeded the program estimate (3.38%). CDR improvements depend on a majority of interval cancers being detected in radiologist/AI discordant screens. Such improvements are likely to increase recall, requiring careful monitoring where AI is deployed for screen-reading.
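The simulation outlined in the abstract can be illustrated with a minimal sketch. It is not the authors' code, and the abstract does not give the full procedure, so several things here are assumptions: the function name `simulate_discordant_recall` and all parameter names are hypothetical; each discordant screen is assumed to receive a uniform random recall probability and to be recalled when that value falls below the threshold for its outcome group (screen-detected sensitivity, interval cancer sensitivity, or FPP); cancers and recalls outside the discordant group are assumed to be fixed counts; and every recalled cancer is assumed to count as detected.

```python
import random
from statistics import mean


def simulate_discordant_recall(
    n_screens,             # total screens in the cohort
    fixed_cancers,         # cancers detected outside the discordant group (assumed fixed)
    fixed_recalls,         # recalls outside the discordant group (assumed fixed)
    n_screen_detected,     # discordant screens with a screen-detected cancer
    n_interval,            # discordant screens with an interval cancer
    n_no_cancer,           # discordant screens with no cancer
    sens_screen_detected,  # recall threshold for screen-detected cancers
    sens_interval,         # recall threshold for interval cancers
    fpp,                   # recall threshold for no-cancer screens
    n_sims=1000,
    seed=0,
):
    """Return (mean CDR per 1000 screens, mean recall rate in percent)."""
    rng = random.Random(seed)

    def count_recalled(n, threshold):
        # Assumption: a screen is recalled when its uniform random
        # recall probability falls below the group's threshold.
        return sum(rng.random() < threshold for _ in range(n))

    cdrs, recall_rates = [], []
    for _ in range(n_sims):
        sd = count_recalled(n_screen_detected, sens_screen_detected)
        ic = count_recalled(n_interval, sens_interval)
        fp = count_recalled(n_no_cancer, fpp)
        # Assumption: every recalled cancer counts as detected.
        cdrs.append(1000 * (fixed_cancers + sd + ic) / n_screens)
        recall_rates.append(100 * (fixed_recalls + sd + ic + fp) / n_screens)
    return mean(cdrs), mean(recall_rates)


if __name__ == "__main__":
    # Synthetic counts for illustration only; these are NOT the study data.
    cdr, recall = simulate_discordant_recall(
        n_screens=10_000, fixed_cancers=50, fixed_recalls=200,
        n_screen_detected=10, n_interval=5, n_no_cancer=500,
        sens_screen_detected=0.8, sens_interval=0.6, fpp=0.05,
        n_sims=200, seed=1,
    )
    print(f"mean CDR per 1000: {cdr:.2f}, mean recall rate: {recall:.2f}%")
```

Sweeping `sens_interval` and `fpp` over a grid and comparing the results with a program's observed CDR and recall rate would mirror the kind of threshold analysis the abstract reports. The counts in the usage example are synthetic.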