遗传力
优势(遗传学)
统计
随机森林
最佳线性无偏预测
支持向量机
生物
机器学习
均方误差
人工智能
数学
计算机科学
遗传学
选择(遗传算法)
基因
作者
Anderson Antônio Carvalho Alves,Rebeka Magalhães da Costa,Tiago Bresolin,Gerardo Alves Fernandes Júnior,Rafael Espigolan,André Mauric Frossard Ribeiro,Roberto Carvalheiro,Lúcia Galvão de Albuquerque
摘要
Abstract The aim of this study was to compare the predictive performance of the Genomic Best Linear Unbiased Predictor (GBLUP) and machine learning methods (Random Forest, RF; Support Vector Machine, SVM; Artificial Neural Network, ANN) in simulated populations presenting different levels of dominance effects. Simulated genome comprised 50k SNP and 300 QTL, both biallelic and randomly distributed across 29 autosomes. A total of six traits were simulated considering different values for the narrow and broad-sense heritability. In the purely additive scenario with low heritability (h2 = 0.10), the predictive ability obtained using GBLUP was slightly higher than the other methods whereas ANN provided the highest accuracies for scenarios with moderate heritability (h2 = 0.30). The accuracies of dominance deviations predictions varied from 0.180 to 0.350 in GBLUP extended for dominance effects (GBLUP-D), from 0.06 to 0.185 in RF and they were null using the ANN and SVM methods. Although RF has presented higher accuracies for total genetic effect predictions, the mean-squared error values in such a model were worse than those observed for GBLUP-D in scenarios with large additive and dominance variances. When applied to prescreen important regions, the RF approach detected QTL with high additive and/or dominance effects. Among machine learning methods, only the RF was capable to cover implicitly dominance effects without increasing the number of covariates in the model, resulting in higher accuracies for the total genetic and phenotypic values as the dominance ratio increases. Nevertheless, whether the interest is to infer directly on dominance effects, GBLUP-D could be a more suitable method.
科研通智能强力驱动
Strongly Powered by AbleSci AI