Authors
Zhuo Chen, Hui Ou‐Yang, Botao Sun, Jiashan Ding, Yu Zhang, Xinying Li
Abstract
Background: High-grade serous ovarian cancer (HGSOC) remains one of the most challenging gynecological malignancies, with over 70% of ovarian cancer patients ultimately experiencing disease progression. Current prognostic tools for progression-free survival (PFS) in HGSOC patients have limitations. This study aimed to develop an explainable machine learning (ML) model for predicting PFS in HGSOC patients.
Methods: Nine ML algorithms for PFS prediction were trained on a prospective cohort of 310 HGSOC patients consecutively enrolled at a large Chinese tertiary hospital between January 2017 and December 2020. The optimal model was internally validated with 1000 bootstrap resamples. The SHapley Additive exPlanations (SHAP) method was used to interpret the model in terms of feature importance and feature effects. The final model, constructed with the optimal feature subset, was deployed as an interactive web-based Shiny app.
Results: The random survival forest (RSF) model outperformed the other ML models. The RSF model constructed with the optimal feature subset on the optimal imputed dataset achieved a 1000-bootstrap C-index of 0.755 (95% CI: 0.750–0.780) and a Brier score of 0.183 (95% CI: 0.175–0.190). SHAP analysis identified residual tumor, HE4, FIGO stage, T stage, CA125, age, ascites volume, platelet count, and BMI as the top nine contributing factors. It also revealed potential nonlinear relationships, with important thresholds, between PFS risk and HE4, CA125, age, ascites volume, platelet count, and BMI. Additionally, interaction effects were found between residual tumor and age, HE4, and CA125. Finally, an interactive web-based Shiny app for the model was developed and is accessible at https://rsfmodels.shinyapps.io/ocRSF/.
Conclusion: An explainable ML model for PFS prediction in HGSOC patients was developed, and it outperformed the other candidate algorithms. The publicly accessible web tool based on the optimized model facilitates its use in clinical settings, potentially improving individualized patient management and treatment decision-making in HGSOC.
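To illustrate the kind of metric reported above, the following is a minimal sketch of Harrell's C-index with a percentile bootstrap interval, written in plain Python. It is not the authors' pipeline: the function names `harrell_c_index` and `bootstrap_c_index` are hypothetical, the data are synthetic, and the sketch only resamples fixed predictions, whereas a full internal validation would typically refit the model on each resample.

```python
import random


def harrell_c_index(times, events, risks):
    """Harrell's C: share of comparable pairs whose risk ordering matches
    their survival ordering. A pair (i, j) is comparable when subject i has
    an observed event and a strictly shorter follow-up time than subject j."""
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # censored subjects cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0  # higher risk progressed earlier
                elif risks[i] == risks[j]:
                    concordant += 0.5  # tied risk scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


def bootstrap_c_index(times, events, risks, n_boot=1000, seed=0):
    """Mean C-index and 95% percentile interval over n_boot resamples.
    Assumption: predictions are held fixed; the model is not refitted."""
    harrell_c_index(times, events, risks)  # fail early if nothing is comparable
    rng = random.Random(seed)
    n = len(times)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        try:
            stats.append(harrell_c_index(
                [times[k] for k in idx],
                [events[k] for k in idx],
                [risks[k] for k in idx],
            ))
        except ValueError:
            continue  # degenerate resample with no comparable pairs; redraw
    stats.sort()
    lower = stats[(n_boot * 25) // 1000]
    upper = stats[min(n_boot - 1, (n_boot * 975) // 1000)]
    return sum(stats) / n_boot, lower, upper


if __name__ == "__main__":
    # Synthetic toy data (illustrative only): months to progression,
    # event indicator (1 = progressed, 0 = censored), model risk score.
    toy_times = [5, 8, 12, 20, 24, 30]
    toy_events = [1, 1, 0, 1, 0, 1]
    toy_risks = [0.9, 0.7, 0.6, 0.65, 0.2, 0.1]
    print(harrell_c_index(toy_times, toy_events, toy_risks))
    print(bootstrap_c_index(toy_times, toy_events, toy_risks, n_boot=200))
```

A C-index of 0.5 corresponds to random ordering and 1.0 to perfect ordering, so the reported 0.755 sits between the two. In practice a maintained survival-analysis library would normally be used instead of this hand-rolled loop.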