Authors
Noa Dagan, Ori Magen, Michael Leshchinsky, Maya Makov-Assif, Marc Lipsitch, Ben Y. Reis, Shlomit Yaron, Doron Netzer, Ran D. Balicer
Abstract
Background
Compared with traditional population-wide screening approaches, screening based on machine-learning models enables the targeted identification of high-risk individuals. We describe the development of machine-learning models that address the pressing need for identifying unknown hepatitis C virus (HCV) carriers and measure the real-world yield of this approach deployed in a nationwide setting.
Methods
Retrospective data on 18- to 79-year-old members of Israel's largest health care organization tested for HCV from 2013 to 2021 were used to train and test prediction models for identifying active HCV carriers. In August 2021, over 1.5 million members eligible for screening, according to the U.S. Preventive Services Task Force (USPSTF) recommendations, were prospectively evaluated by the top-performing model based on XGBoost, and a staged process of outreach to the highest-risk members began. In November 2022, the yield of the XGBoost-based screening was evaluated and compared with the concurrent testing of USPSTF screening–eligible members.
Results
The retrospective cohort used for model development included 492,290 individuals, with 0.1% confirmed active HCV carriers. The best-performing model, based on XGBoost, yielded an area under the receiver operating characteristic curve of 0.95. Selecting the top 0.1%, 1%, and 5% of high-risk individuals for screening translated to positive predictive values of 18.2%, 6.2%, and 1.9% and sensitivities of 13.0%, 44.4%, and 67.6%, respectively. During the prospective outreach, a total of 477 members were screened for HCV antibodies, and 38 were eventually found to be active HCV carriers, yielding an extrapolated number needed to screen (NNS) of 10. Among the 53,403 USPSTF screening–eligible members who were tested over the same period, 38 were found to be active HCV carriers, yielding an NNS of 1029.
Conclusions
A nationwide implementation of a machine-learning–based HCV screening managed to identify the same number of HCV carriers as the traditional screening approach while achieving over 100-fold-greater efficiency.
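The metrics reported above can be illustrated with a minimal Python sketch. It is not the authors' code: the function name `top_fraction_metrics` and the toy scores and labels are hypothetical, chosen only to show how positive predictive value and sensitivity are obtained when the highest-scoring fraction of a cohort is selected for screening. The last lines take the two NNS values exactly as reported in the abstract and compute their ratio, which is the basis of the "over 100-fold" efficiency statement.

```python
def top_fraction_metrics(scores, labels, fraction):
    """PPV and sensitivity when only the top `fraction` of scores is screened.

    Hypothetical helper for illustration; labels are 1 (carrier) or 0.
    """
    n_selected = max(1, round(len(scores) * fraction))
    # Rank individuals from highest to lowest predicted risk.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    true_pos = sum(label for _, label in ranked[:n_selected])
    total_pos = sum(labels)
    ppv = true_pos / n_selected
    sensitivity = true_pos / total_pos if total_pos else float("nan")
    return ppv, sensitivity


# Toy data (made up, not from the study): 10 individuals, 3 carriers.
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
toy_labels = [1, 0, 1, 0, 0, 0, 0, 1, 0, 0]
toy_ppv, toy_sensitivity = top_fraction_metrics(toy_scores, toy_labels, 0.2)

# NNS values as reported in the abstract (the model-based one is extrapolated).
reported_nns_model = 10
reported_nns_uspstf = 1029
fold_gain = reported_nns_uspstf / reported_nns_model

print(f"toy PPV={toy_ppv:.2f}, toy sensitivity={toy_sensitivity:.2f}")
print(f"efficiency ratio from reported NNS values: {fold_gain:.1f}")
```

Lowering the selected fraction generally raises the positive predictive value and lowers the sensitivity, which is the trade-off visible in the reported 0.1%, 1%, and 5% cut-offs.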