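Gene signature
Computer science
Feature selection
Univariate
Lasso (programming language)
Logistic regression
Proportional hazards model
Artificial intelligence
Support vector machine
Elastic net regularization
Machine learning
Regression
Pairwise comparison
Cross-validation
Stepwise regression
Predictive modelling
Data mining
Computational biology
Gene
Statistics
Multivariate statistics
Biology
Medicine
Internal medicine
Mathematics
Gene expression
Genetics
World Wide Web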
Authors
Eskezeia Y. Dessie, Jan‐Gowth Chang, Ya‐Sian Chang
Identifier
DOI:10.1016/j.compbiomed.2022.105493
Abstract
Lung adenocarcinoma (LUAD) is one of the most prevalent cancers and has high mortality, and its risk stratification is limited by a lack of reliable molecular biomarkers. Although several studies have sought to identify gene signatures involved in LUAD progression, most currently used gene feature selection methods do not fully account for strong pairwise correlations between genes, which leads to inconsistent gene selection. It is therefore crucial to develop new strategies for identifying reliable gene signatures that improve risk prediction.

In this study, a novel feature selection strategy was developed in four steps: (1) a univariate Cox regression model to select survival-associated genes; (2) ridge Cox regression integrated with an adaptive Lasso model to identify informative survival-associated genes; (3) a stepwise Cox regression model to identify the optimal gene signature; and (4) construction of a prognostic risk predictive model for LUAD (PRPML). The PRPML was developed based on four machine learning (ML) methods: logistic regression (LR), K-nearest neighbors (KNN), support vector machine with the radial kernel (SVMR), and average neural network (Avnet). The PRPML model successfully stratified patients with LUAD into high-risk and low-risk groups in three datasets and achieved areas under the curve (AUC) of 0.812 and 0.863 in the validation datasets. Finally, a potential nine-gene signature was identified and showed great potential for risk prediction.

Our study demonstrates that the developed strategy identified a potential nine-gene signature with accurate risk prediction performance, and that this signature could provide valuable clues to the molecular mechanism of LUAD.
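To make the risk-stratification and AUC evaluation mentioned in the abstract more concrete, here is a minimal pure-Python sketch. It is not the authors' PRPML, which is built on LR, KNN, SVMR, and Avnet classifiers. Instead it assumes a simple linear risk score over signature genes, a median cutoff for the high-risk and low-risk groups, and a rank-based AUC. The gene names (`GENE_A`, `GENE_B`), coefficients, expression values, and event labels are all invented for illustration and do not come from the study.

```python
from statistics import median


def risk_score(expression, coefficients):
    """Linear risk score: sum of coefficient * expression over signature genes."""
    return sum(coefficients[g] * expression[g] for g in coefficients)


def stratify(scores):
    """Label a score 'high' if it exceeds the cohort median, else 'low'.

    The median cutoff is an assumption; the abstract does not state one.
    """
    cutoff = median(scores)
    return ["high" if s > cutoff else "low" for s in scores]


def auc(scores, labels):
    """Rank-based AUC: chance that a positive case outscores a negative one.

    Ties between a positive and a negative score count as 0.5.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy inputs, invented for illustration only (not data from the study).
coefficients = {"GENE_A": 0.8, "GENE_B": -0.5}
patients = [
    {"GENE_A": 2.0, "GENE_B": 1.0},
    {"GENE_A": 0.5, "GENE_B": 2.0},
    {"GENE_A": 1.5, "GENE_B": 0.5},
    {"GENE_A": 0.2, "GENE_B": 1.5},
]
events = [1, 0, 1, 0]  # 1 = adverse outcome observed, 0 = not observed

scores = [risk_score(p, coefficients) for p in patients]
groups = stratify(scores)
toy_auc = auc(scores, events)
```

On this toy cohort the two patients with events fall in the high-risk group and the other two in the low-risk group, so the toy AUC is perfect by construction. In practice the coefficients would come from a fitted survival model and the AUC would be computed on held-out validation cohorts.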