Authors
Bartha Alexandra Nantongo, Josephine Nabukenya, Peter Nabende, John Kamulegeya
Abstract
Objectives: To use machine learning models to predict which infants are at risk of defaulting on routine immunization (RI) in Uganda, and to identify the significant predictive features.

Materials and Methods: Principal component analysis was used to reduce dimensionality, and the datasets were balanced using the synthetic minority over-sampling technique. k-Nearest Neighbors, Decision Trees, Random Forests (RFs), Support Vector Machine (SVM), Naïve Bayes, Logistic Regression (LR), XGBoost, Adaptive Boosting, and Gradient Boosting were applied to Uganda’s 2016 Demographic and Health Survey data, with socioeconomic and demographic factors as predictors. Experiments were performed with and without k-fold cross-validation. Models were evaluated on accuracy, recall, precision, and area under the curve (AUC).

Results and Discussion: The rate of defaulting increased with infant age: 5.3% for Bacille Calmette-Guérin (BCG), 7.3% for pentavalentI, 22.9% for pentavalentIII, and 22.1% for measles. Significant predictors were immunization card, polio0, and cluster altitude for BCG; receipt of pneumococcal1 and BCG, and district, for pentavalentI; polio3 and pentavalentII for pentavalentIII; and polio active and pentavalentIII for measles. RF performed best at predicting vaccine defaulting, with accuracies of 96%, 95%, 94%, and 84% for BCG, pentavalentI, pentavalentIII, and measles, respectively. RF also had equal precision, recall, and AUC, at 1.0. In contrast, XGBoost, SVM, and LR showed the weakest power to discriminate infants who received the vaccine from defaulters, with AUC ≤0.57.

Conclusion: Immunization card, receipt of preceding vaccines, and district were the most influential predictors. RF was the best of the 9 classifiers at predicting defaulting on RI. The study recommends regular outreaches, daily vaccination, provision of immunization cards, and accessible water sources to reduce defaulting.
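The class-balancing step named in the methods (synthetic minority over-sampling) can be illustrated with a minimal sketch. This is not the authors' implementation, which the abstract does not describe. It is a standard-library-only illustration of the basic idea, in which each synthetic minority sample is interpolated between an existing minority sample and one of its nearest minority-class neighbours. The function name `smote`, the parameters `n_synthetic`, `k`, and `seed`, and the toy data are all assumptions made for illustration.

```python
import math
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Hypothetical helper: create n_synthetic points by interpolating between
    minority samples and their k nearest minority-class neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Toy, made-up data (not from the survey): 8 majority rows, 3 minority rows.
majority = [(float(x), float(x % 3)) for x in range(8)]
minority = [(10.0, 10.0), (11.0, 10.5), (10.5, 11.5)]

# Add just enough synthetic minority rows to match the majority class size.
new_points = smote(minority, n_synthetic=len(majority) - len(minority), k=2)
balanced_minority = minority + new_points
print(len(majority), len(balanced_minority))
```

In practice, over-sampling of this kind is usually applied only to the training portion of each cross-validation fold, so that synthetic points do not leak into the data used for evaluation.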