Biobank
Omics
Computer science
Proportional hazards model
Machine learning
Metabolomics
Lasso (statistics)
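Disease
Elastic net regularization
Biomarker discovery
Biomarker
Predictive modelling
Data mining
Artificial intelligence
Feature selection
Bioinformatics
Medicine
Biology
Proteomics
Internal medicine
Biochemistry
World Wide Web
Gene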
Authors
Oscar Thomas Aguilar,Cheng Chang,Élise Bismuth,Manuel A. Rivas
Identifier
DOI:10.1101/2024.04.16.589819
Abstract
We train prediction and survival models using multi-omics data for disease risk identification and stratification. Existing work on disease prediction focuses on risk analysis using datasets of individual data types (metabolomic, genomic, demographic), while our study creates an integrated model for disease risk assessment. We compare machine learning models such as Lasso regression, multi-layer perceptron, XGBoost, and AdaBoost to analyze multi-omics data, incorporating ROC-AUC score comparisons for various diseases and feature combinations. Additionally, we train Cox proportional hazards models for each disease to perform survival analysis. Although the integration of multi-omics data significantly improves risk prediction for 8 diseases, we find that the contribution of metabolomic data is marginal when compared to standard demographic, genetic, and biomarker features. Nonetheless, we see that metabolomics is a useful replacement for the standard biomarker panel when that panel is not readily available.
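The ROC-AUC comparison across feature combinations mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: the function name `roc_auc`, the feature-set names, the labels, and the risk scores are all hypothetical toy values chosen only to show the mechanics. The sketch uses the rank-based definition of ROC-AUC, the probability that a randomly chosen case receives a higher risk score than a randomly chosen control, with ties counted as one half.

```python
from itertools import product


def roc_auc(labels, scores):
    """ROC-AUC as P(score of a random case > score of a random control).

    Ties between a case and a control count as half a win.
    """
    cases = [s for y, s in zip(labels, scores) if y == 1]
    controls = [s for y, s in zip(labels, scores) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for case_score, control_score in product(cases, controls):
        if case_score > control_score:
            wins += 1.0
        elif case_score == control_score:
            wins += 0.5
    return wins / (len(cases) * len(controls))


# Toy disease labels (1 = case, 0 = control); illustrative only.
labels = [1, 1, 1, 0, 0, 0, 0, 1]

# Hypothetical risk scores from models trained on different feature sets.
scores_by_feature_set = {
    "demographic": [0.6, 0.4, 0.7, 0.5, 0.3, 0.6, 0.2, 0.5],
    "demographic+metabolomic": [0.7, 0.5, 0.8, 0.4, 0.3, 0.6, 0.2, 0.6],
}

# One ROC-AUC per feature combination, ready for side-by-side comparison.
auc_by_feature_set = {
    name: roc_auc(labels, scores)
    for name, scores in scores_by_feature_set.items()
}

for name, auc in auc_by_feature_set.items():
    print(f"{name}: ROC-AUC = {auc:.3f}")
```

In a real analysis the scores would come from held-out predictions of each fitted model, and the same comparison would be repeated per disease.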