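Colorectal cancer
Biomarker
Interpretability
Biobank
Lasso (programming language)
Medicine
Machine learning
Oncology
Cancer
Artificial intelligence
Bioinformatics
Computational biology
Computer science
Internal medicine
Biology
Genetics
World Wide Web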
Authors
S Radhakrishnan, Dipanwita Nath, Dominic Russ, Laura Bravo Merodio, Priyani Lad, Folakemi Kola Daisi, Animesh Acharjee
Identifier
DOI:10.3389/fonc.2024.1505675
Abstract
Colorectal cancer is one of the leading causes of cancer-related mortality worldwide, and its incidence and mortality are predicted to rise globally over the next several decades. When detected early, colorectal cancer is treatable with surgery and medication, which motivates the development of prognostic and diagnostic biomarkers. Our study integrates machine learning models and protein network analysis to identify protein biomarkers for colorectal cancer. Our methodology leverages an extensive collection of proteome profiles from both healthy individuals and individuals with colorectal cancer. To identify potential biomarkers with high predictive ability, we evaluated three classifiers (LASSO, XGBoost, and LightGBM), tuning the hyperparameters of each model using grid search. LASSO achieved the highest AUC, 75%, in the UK Biobank dataset; the AUCs for LightGBM and XGBoost were 69.61% and 71.42%, respectively. To enhance the interpretability of our models, we quantified each protein's contribution to the model's predictions using SHapley Additive exPlanations (SHAP) values. Using these values, TFF3, LCN2, and CEACAM5 were identified as key biomarkers associated with cell adhesion and inflammation. Protein quantitative trait loci (pQTL) analyses provided further evidence for the involvement of TFF1, CEACAM5, and SELE in colorectal cancer, with possible connections to the PI3K/Akt and MAPK signaling pathways. By offering insights into colorectal cancer diagnostics and targeted therapeutics, our findings set the stage for further biomarker validation.
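
The abstract relies on two quantities: the AUC used to compare classifiers, and per-feature Shapley contributions used to rank proteins. The following is a minimal, standard-library-only sketch of both ideas, not the authors' pipeline. The function names `roc_auc` and `linear_shap`, the toy labels and scores, and the feature values are all hypothetical. The Shapley part assumes a purely linear score with independent features, in which case each contribution reduces to weight times deviation from the background mean; tree models such as XGBoost or LightGBM need a different explainer.

```python
from statistics import mean


def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs in which the
    positive sample gets the higher score; ties count as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def linear_shap(weights, background, x):
    """Shapley contributions for a linear score sum(w_j * x_j),
    ASSUMING independent features: phi_j = w_j * (x_j - mean_j),
    where mean_j is the feature mean over the background samples."""
    means = [mean(column) for column in zip(*background)]
    return [w * (xj - m) for w, xj, m in zip(weights, x, means)]


# Toy data (hypothetical, not taken from the study).
toy_labels = [1, 1, 0, 0]
toy_scores = [0.9, 0.4, 0.5, 0.1]
toy_auc = roc_auc(toy_labels, toy_scores)  # 3 of 4 pairs ranked correctly

toy_weights = [2.0, -1.0]                  # hypothetical linear coefficients
toy_background = [[0.0, 0.0], [2.0, 4.0]]  # feature means are [1.0, 2.0]
toy_sample = [3.0, 1.0]
toy_phi = linear_shap(toy_weights, toy_background, toy_sample)

if __name__ == "__main__":
    print("AUC:", toy_auc)
    print("Contributions:", toy_phi)
```

The contributions sum to the difference between the score of the sample and the score at the background mean, which is the additivity property that makes Shapley-style attributions convenient for ranking features.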