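Medicine
Proportional hazards model
Brier score
Univariate
Breast cancer
Multivariate statistics
Receiver operating characteristic
Regression analysis
Survival analysis
Feature selection
Univariate analysis
Multivariate analysis
Cancer
Internal medicine
Statistics
Artificial intelligence
Computer science
Mathematics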
Authors
Yafei Wu, Yaheng Zhang, Siyu Duan, Chenming Gu, Chongtao Wei, Ya Fang
Identifier
DOI:10.1016/j.cmpb.2024.108310
Abstract
Studies have found that first primary cancer (FPC) survivors are at high risk of developing second primary breast cancer (SPBC). However, prognostic studies focusing specifically on patients with SPBC are lacking. This retrospective study used data from the Surveillance, Epidemiology, and End Results (SEER) Program. We selected female FPC survivors diagnosed with SPBC from 12 registries (January 1998 to December 2018) to construct prognostic models. SPBC patients from another five registries (January 2010 to December 2018) served as the validation set to test the models' generalization ability. Four machine learning models and a Cox proportional hazards regression (CoxPH) model were constructed to predict the overall survival of SPBC patients. Univariate and multivariable Cox regression analyses were used for feature selection. Model performance was assessed using the time-dependent area under the ROC curve (t-AUC) and the integrated Brier score (iBrier). A total of 10,321 female FPC survivors with SPBC (mean age [SD]: 66.03 [11.17]) were included for model construction. These patients were randomly split into a training set (mean age [SD]: 65.98 [11.15]) and a test set (mean age [SD]: 66.15 [11.23]) at a ratio of 7:3. In the validation set, a total of 3,638 SPBC patients (mean age [SD]: 66.28 [10.68]) were finally enrolled. Sixteen features were selected for model construction through the univariate and multivariable Cox regression analyses. Among the five models, the random survival forest model showed the best performance, with a t-AUC of 0.805 (95%CI: 0.803 - 0.807) and an iBrier of 0.123 (95%CI: 0.122 - 0.124) on the test set, as well as a t-AUC of 0.803 (95%CI: 0.801 - 0.807) and an iBrier of 0.098 (95%CI: 0.096 - 0.103) on the validation set. Feature importance ranking identified the six most important predictive features of the random survival forest model: age, stage, regional nodes positive, latency, radiation, and surgery.
The random survival forest model outperformed CoxPH and the other machine learning models in predicting the overall survival of patients with SPBC, which is helpful for monitoring high-risk populations.
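
For readers unfamiliar with the iBrier metric mentioned in the abstract, the following is a minimal sketch of how an integrated Brier score can be computed, not the authors' implementation. It assumes every event time is observed (no censoring); a study on registry data would normally also apply inverse-probability-of-censoring weights, which are omitted here. The function names `brier_score_at` and `integrated_brier` and the toy inputs are hypothetical.

```python
def brier_score_at(t, event_times, surv_probs_at_t):
    """Brier score at time t, assuming no censoring (simplifying assumption).

    event_times[i]      observed time of death for subject i
    surv_probs_at_t[i]  predicted probability that subject i survives past t
    """
    total = 0.0
    for event_time, predicted in zip(event_times, surv_probs_at_t):
        still_alive = 1.0 if event_time > t else 0.0
        total += (still_alive - predicted) ** 2
    return total / len(event_times)


def integrated_brier(times, event_times, surv_matrix):
    """Average the Brier score over the time grid with the trapezoidal rule.

    times        increasing evaluation times
    surv_matrix  surv_matrix[i][j] is the predicted survival probability
                 of subject i at times[j]
    """
    scores = [
        brier_score_at(t, event_times, [row[j] for row in surv_matrix])
        for j, t in enumerate(times)
    ]
    area = 0.0
    for j in range(1, len(times)):
        area += 0.5 * (scores[j - 1] + scores[j]) * (times[j] - times[j - 1])
    return area / (times[-1] - times[0])


# Toy data: two subjects who die at t=2 and t=5, evaluated at t=1, 3, 6.
event_times = [2, 5]
times = [1, 3, 6]
perfect = [[1.0, 0.0, 0.0], [1.0, 1.0, 0.0]]
uninformative = [[0.5, 0.5, 0.5], [0.5, 0.5, 0.5]]
print(integrated_brier(times, event_times, perfect))
print(integrated_brier(times, event_times, uninformative))
```

Lower values are better: a model that always predicts 0.5 scores 0.25, which is the usual reference point when reading values such as the 0.123 and 0.098 reported above.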