Genome-wide prediction for complex traits under the presence of dominance effects in simulated populations using GBLUP and machine learning methods

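Keywords: Heritability; Dominance (genetics); Statistics; Random forest; Best linear unbiased prediction; Support vector machine; Biology; Machine learning; Mean squared error; Artificial intelligence; Mathematics; Computer science; Genetics; Selection (genetic algorithm); Gene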

Authors

Anderson Antônio Carvalho Alves, Rebeka Magalhães da Costa, Tiago Bresolin, Gerardo Alves Fernandes Júnior, Rafael Espigolan, André Mauric Frossard Ribeiro, Roberto Carvalheiro, Lúcia Galvão de Albuquerque

Source

Journal: Journal of Animal Science [Oxford University Press]
Date: 2020-05-31 Volume (issue): 98 (6) Citations: 19

Links

oup.com nih.gov nih.gov doi.org

Identifiers

DOI: 10.1093/jas/skaa179

Abstract

The aim of this study was to compare the predictive performance of the Genomic Best Linear Unbiased Predictor (GBLUP) and machine learning methods (Random Forest, RF; Support Vector Machine, SVM; Artificial Neural Network, ANN) in simulated populations presenting different levels of dominance effects. The simulated genome comprised 50k SNP and 300 QTL, both biallelic and randomly distributed across 29 autosomes. A total of six traits were simulated considering different values for the narrow- and broad-sense heritability. In the purely additive scenario with low heritability (h2 = 0.10), the predictive ability obtained using GBLUP was slightly higher than that of the other methods, whereas ANN provided the highest accuracies for scenarios with moderate heritability (h2 = 0.30). The accuracies of dominance deviation predictions varied from 0.180 to 0.350 in GBLUP extended for dominance effects (GBLUP-D) and from 0.06 to 0.185 in RF, and they were null using the ANN and SVM methods. Although RF presented higher accuracies for total genetic effect predictions, the mean-squared error values in such a model were worse than those observed for GBLUP-D in scenarios with large additive and dominance variances. When applied to prescreen important regions, the RF approach detected QTL with high additive and/or dominance effects. Among machine learning methods, only RF was capable of implicitly capturing dominance effects without increasing the number of covariates in the model, resulting in higher accuracies for the total genetic and phenotypic values as the dominance ratio increased. Nevertheless, if the interest is to make inferences directly about dominance effects, GBLUP-D could be a more suitable method.
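
The distinction the abstract draws between narrow-sense and broad-sense heritability can be illustrated with a minimal single-locus sketch. It assumes the textbook Falconer parameterization for one biallelic QTL in Hardy-Weinberg equilibrium (genotypic values +a, d, -a), which is not necessarily how the authors simulated their 300 QTL. The function names and the example values are hypothetical and chosen only for illustration.

```python
def variance_components(p, a, d):
    """Additive and dominance variance for one biallelic locus.

    Assumption: Hardy-Weinberg equilibrium and Falconer's coding, where
    the two homozygotes have values +a and -a and the heterozygote has d.
    p is the frequency of the allele carried by the +a homozygote.
    """
    q = 1.0 - p
    alpha = a + d * (q - p)            # average effect of allele substitution
    var_a = 2.0 * p * q * alpha ** 2   # additive genetic variance
    var_d = (2.0 * p * q * d) ** 2     # dominance variance
    return var_a, var_d


def heritabilities(var_a, var_d, var_e):
    """Return (narrow-sense h2, broad-sense H2) given a residual variance."""
    var_p = var_a + var_d + var_e      # phenotypic variance
    return var_a / var_p, (var_a + var_d) / var_p


if __name__ == "__main__":
    # Hypothetical values: complete dominance (d = a) at allele frequency 0.5.
    var_a, var_d = variance_components(p=0.5, a=1.0, d=1.0)
    h2, H2 = heritabilities(var_a, var_d, var_e=0.25)
    print(f"Va={var_a}, Vd={var_d}, h2={h2}, H2={H2}")
```

The gap between the two heritabilities is the share of phenotypic variance due to dominance. A purely additive model can recover only the first component, which is the motivation for extending GBLUP with a dominance term or using a method that can capture non-additive effects implicitly.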
