Authors
Santiago Papini, Sonya B. Norman, Laura Campbell‐Sills, Xiaoying Sun, Feng He, Ronald C. Kessler, Robert J. Ursano, Sonia Jain, Murray B. Stein
Abstract
Importance: Military deployment involves significant risk for life-threatening experiences that can lead to posttraumatic stress disorder (PTSD). Accurate predeployment prediction of PTSD risk may facilitate the development of targeted intervention strategies to enhance resilience.
Objective: To develop and validate a machine learning (ML) model to predict postdeployment PTSD.
Design, Setting, and Participants: This diagnostic/prognostic study included 4771 soldiers from 3 US Army brigade combat teams who completed assessments between January 9, 2012, and May 1, 2014. Predeployment assessments occurred 1 to 2 months before deployment to Afghanistan, and follow-up assessments occurred approximately 3 and 9 months post deployment. Machine learning models to predict postdeployment PTSD were developed in the first 2 recruited cohorts using as many as 801 predeployment predictors from comprehensive self-report assessments. In the development phase, cross-validated performance metrics and predictor parsimony were considered to select an optimal model. Next, the selected model's performance was evaluated with area under the receiver operating characteristic curve and expected calibration error in a temporally and geographically distinct cohort. Data analyses were performed from August 1 to November 30, 2022.
Main Outcomes and Measures: Posttraumatic stress disorder diagnosis was assessed by clinically calibrated self-report measures. Participants were weighted in all analyses to address potential biases related to cohort selection and follow-up nonresponse.
Results: This study included 4771 participants (mean [SD] age, 26.9 [6.2] years), 4440 (94.7%) of whom were men. In terms of race and ethnicity, 144 participants (2.8%) identified as American Indian or Alaska Native, 242 (4.8%) as Asian, 556 (13.3%) as Black or African American, 885 (18.3%) as Hispanic, 106 (2.1%) as Native Hawaiian or other Pacific Islander, 3474 (72.2%) as White, and 430 (8.9%) as other or unknown race or ethnicity; participants could identify as of more than 1 race or ethnicity. A total of 746 participants (15.4%) met PTSD criteria post deployment. In the development phase, models had comparable performance (log loss range, 0.372-0.375; area under the curve range, 0.75-0.76). A gradient-boosting machine with 58 core predictors was selected over an elastic net with 196 predictors and a stacked ensemble of ML models with 801 predictors. In the independent test cohort, the gradient-boosting machine had an area under the curve of 0.74 (95% CI, 0.71-0.77) and low expected calibration error of 0.032 (95% CI, 0.020-0.046). Approximately one-third of participants with the highest risk accounted for 62.4% (95% CI, 56.5%-67.9%) of the PTSD cases. Core predictors cut across 17 distinct domains: stressful experiences, social network, substance use, childhood or adolescence, unit experiences, health, injuries, irritability or anger, personality, emotional problems, resilience, treatment, anxiety, attention or concentration, family history, mood, and religion.
Conclusions and Relevance: In this diagnostic/prognostic study of US Army soldiers, an ML model was developed to predict postdeployment PTSD risk with self-reported information collected before deployment. The optimal model showed good performance in a temporally and geographically distinct validation sample. These results indicate that predeployment stratification of PTSD risk is feasible and may facilitate the development of targeted prevention and early intervention strategies.
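Illustrative note: the abstract reports two summary quantities for the test cohort, the expected calibration error and the share of PTSD cases captured by the highest-risk third of participants. The sketch below shows one common way such quantities can be computed with participant weights. It is not the authors' code. The function names, the equal-width 10-bin scheme, and the way weights enter the calculation are all assumptions made for illustration; the abstract does not specify them.

```python
import numpy as np

def expected_calibration_error(y_true, y_prob, weights=None, n_bins=10):
    """Weighted ECE: weighted average over bins of |observed rate - mean predicted risk|.

    Equal-width bins on [0, 1] are an assumption; the study's binning is not stated.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_prob = np.asarray(y_prob, dtype=float)
    w = np.ones_like(y_prob) if weights is None else np.asarray(weights, dtype=float)
    # Assign each prediction to a bin; a probability of exactly 1.0 goes in the last bin.
    bin_ids = np.minimum((y_prob * n_bins).astype(int), n_bins - 1)
    ece = 0.0
    for b in range(n_bins):
        mask = bin_ids == b
        if not mask.any():
            continue  # empty bins contribute nothing
        wb = w[mask]
        observed = np.average(y_true[mask], weights=wb)
        predicted = np.average(y_prob[mask], weights=wb)
        ece += wb.sum() / w.sum() * abs(observed - predicted)
    return float(ece)

def case_share_in_top_fraction(y_true, y_prob, weights=None, fraction=1 / 3):
    """Weighted share of all positive cases found among the highest-risk `fraction`."""
    y_true = np.asarray(y_true, dtype=float)
    y_prob = np.asarray(y_prob, dtype=float)
    w = np.ones_like(y_prob) if weights is None else np.asarray(weights, dtype=float)
    order = np.argsort(-y_prob, kind="stable")  # highest predicted risk first
    cum_w = np.cumsum(w[order])
    top = order[cum_w <= fraction * w.sum()]    # participants inside the top fraction
    return float((w[top] * y_true[top]).sum() / (w * y_true).sum())
```

With real data, `y_true` would hold the postdeployment PTSD indicator, `y_prob` the model's predicted risk, and `weights` the participant weights mentioned under Main Outcomes and Measures. Confidence intervals like those reported would need an additional resampling step, which is not shown here.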