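Interpretability
Biomarker discovery
Computer science
Machine learning
Classifier (UML)
Artificial intelligence
Computational biology
Bioinformatics
Proteomics
Gene
Biology
Biochemistry
Authors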
Qi Zhou,Weicai Ye,Xiaolan Yu,Yun‐Juan Bao
Identifier
DOI:10.1016/j.cmpb.2024.108077
Abstract
The pathway-based strategy has recently been proposed for identifying biomarkers, with the advantages of higher biological interpretability and cross-data robustness than the conventional gene-based strategy. However, its utility in clinical applications has been limited by high computational complexity and ill-defined performance. The current study presents a machine learning-based computational framework that uses multi-omics data to identify a new type of biomarker, called pathway-derived core biomarkers, which combine the advantages of gene-based and pathway-based biomarkers. Machine-learning methods and a gene-pathway network were integrated to select the pathway-derived core biomarkers. Multiple machine-learning algorithms were used to construct and validate diagnostic models of the biomarkers based on more than 1400 multi-omics clinical samples of esophageal squamous cell carcinoma (ESCC). The results showed that the classifier models based on the new type of biomarker achieved superior performance in the training datasets, with an average AUC/accuracy of 0.98/0.95 for mRNAs and 0.89/0.81 for miRNAs, higher than the currently known classifier models based on the conventional gene-based and pathway-based strategies. In the testing cohorts, the AUC/accuracy increased by 6.1%/7.3% relative to the models based on the native gene-based biomarkers. The improved performance was further confirmed in independent validation cohorts. Specifically, the sensitivity/specificity increased by ∼3% and the variance decreased significantly, by ∼69%, compared with the native gene-based biomarkers.
Importantly, the pathway-derived core biomarkers also recovered 45% more previously reported biomarkers than the gene-based biomarkers and are more functionally relevant to ESCC etiology (involved in 14 versus 7 pathways related to ESCC or other cancers), highlighting the cross-data robustness that this new type of biomarker gains through enhanced functional relevance. The results demonstrated that the new type of biomarker not only has improved predictive performance and robustness but also exhibits higher functional interpretability, pointing to potential application in cancer diagnosis.
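For readers unfamiliar with the metrics quoted in the abstract (AUC, accuracy, sensitivity, specificity), the following minimal sketch shows how they are commonly computed for a binary classifier. It is not the authors' pipeline: the function names `auc` and `confusion_metrics`, the 0.5 decision threshold, and the toy labels and scores are all assumptions made for illustration.

```python
def auc(labels, scores):
    """Rank-based AUC: the probability that a randomly chosen positive
    sample scores higher than a randomly chosen negative one (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def confusion_metrics(labels, scores, threshold=0.5):
    """Accuracy, sensitivity and specificity at a fixed threshold
    (the 0.5 default is an illustrative assumption)."""
    preds = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    tn = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 0)
    fp = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 0)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present")
    return {
        "accuracy": (tp + tn) / len(labels),
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
    }


if __name__ == "__main__":
    # Hypothetical toy data: 1 = tumor, 0 = normal.
    toy_labels = [1, 1, 1, 0, 0, 0]
    toy_scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.1]
    print(auc(toy_labels, toy_scores))
    print(confusion_metrics(toy_labels, toy_scores))
```

AUC is threshold-independent, whereas accuracy, sensitivity, and specificity depend on the chosen cutoff, which is one reason the abstract reports them as separate figures.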