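Lasso (programming language)
Elastic net regularization
DNA methylation
Mean squared error
Selection (genetic algorithm)
Feature selection
Scad (company)
Computer science
Regression analysis
Artificial intelligence
Statistics
Regression
Linear regression
Machine learning
Mathematics
Biology
Genetics
Medicine
Gene
Internal medicine
World Wide Web
Gene expression
Myocardial infarction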
Authors
Pui Yin Lau, Wing K. Fung
Source
Journal: Legal Medicine
[Elsevier BV]
Date: 2020-11-01
Volume/issue: 47: 101744-101744
Citations: 9
Identifier
DOI: 10.1016/j.legalmed.2020.101744
Abstract
In forensic investigation, retrieving biological information from DNA evidence is a promising field of interest. One application is estimating the age of the donor from DNA methylation. A large number of studies have focused on age prediction using the 450 K Human Methylation Beadchip, and various marker selection methods and prediction models have been considered. However, there is a lack of research evaluating different high-dimensional variable selection methods for CpG sites in combination with various models for age prediction. The aim of this study is to evaluate four variable selection methods (forward selection, LASSO, elastic net and SCAD) combined with a classical statistical model and sophisticated machine learning models, based on the mean absolute deviation (MAD) and the root-mean-square error (RMSE). We used a publicly available 450 K data set containing 991 whole blood samples (age 19–101 years). We found that the multiple linear regression model with 16 markers selected by forward selection performed very well in age prediction (MAD = 3.76 years and RMSE = 5.01 years). In contrast, the highly advanced ultrahigh-dimensional variable selection methods and sophisticated machine learning algorithms appeared unnecessary for age prediction based on DNA methylation.
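The workflow the abstract describes (forward selection of markers, a multiple linear regression, evaluation by MAD and RMSE) can be sketched roughly as follows. This is a minimal illustration, not the authors' code. It runs on synthetic data rather than the 450 K data set; the function names (`mad`, `rmse`, `fit_ols`, `predict_ols`, `forward_select`), the selection criterion (residual sum of squares on the training set), the train/test split, and the choice of three informative columns are all assumptions made for the example. MAD is taken here to mean the mean absolute difference between predicted and true age.

```python
import numpy as np

def mad(y_true, y_pred):
    """Mean absolute deviation between true and predicted values."""
    return float(np.mean(np.abs(np.asarray(y_true) - np.asarray(y_pred))))

def rmse(y_true, y_pred):
    """Root-mean-square error between true and predicted values."""
    return float(np.sqrt(np.mean((np.asarray(y_true) - np.asarray(y_pred)) ** 2)))

def fit_ols(X, y):
    """Ordinary least squares with an intercept; returns [intercept, slopes...]."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict_ols(coef, X):
    return coef[0] + X @ coef[1:]

def forward_select(X, y, n_markers):
    """Greedy forward selection: at each step add the column that most
    reduces the training residual sum of squares (an assumed criterion)."""
    selected, remaining = [], list(range(X.shape[1]))
    for _ in range(n_markers):
        best_j, best_rss = None, np.inf
        for j in remaining:
            cols = selected + [j]
            coef = fit_ols(X[:, cols], y)
            rss = float(np.sum((y - predict_ols(coef, X[:, cols])) ** 2))
            if rss < best_rss:
                best_j, best_rss = j, rss
        selected.append(best_j)
        remaining.remove(best_j)
    return selected

# Synthetic stand-in for methylation beta values (hypothetical, not real data):
# 300 samples, 50 "CpG sites", of which columns 3, 17 and 42 drive "age".
rng = np.random.default_rng(0)
n, p = 300, 50
X = rng.uniform(0.0, 1.0, size=(n, p))
age = 40 + 40 * X[:, 3] + 30 * X[:, 17] - 25 * X[:, 42] + rng.normal(0, 3, n)

X_train, X_test = X[:200], X[200:]
y_train, y_test = age[:200], age[200:]

markers = forward_select(X_train, y_train, n_markers=3)
coef = fit_ols(X_train[:, markers], y_train)
pred = predict_ols(coef, X_test[:, markers])

test_mad, test_rmse = mad(y_test, pred), rmse(y_test, pred)
print("selected columns:", sorted(markers))
print(f"MAD = {test_mad:.2f}, RMSE = {test_rmse:.2f}")
```

RMSE is never smaller than MAD on the same predictions, which is consistent with the reported pair (3.76 and 5.01 years); a large gap between the two suggests a few samples with large errors.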