Using machine learning to realize genetic site screening and genomic prediction of productive traits in pigs

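Artificial intelligence, Genomic selection, Support vector machine, Random forest, Feature selection, Machine learning, Gradient boosting, Computer science, Elastic net regularization, Trait, Selection (genetic algorithm), Best linear unbiased prediction, Biology, Genetics, Gene, Single-nucleotide polymorphism, Genotype, Programming language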
Authors
Tao Xiang,Tao Li,Jielin Li,Xin Li,Jia Wang
Source
Journal: The FASEB Journal [Wiley]
Volume (Issue): 37 (6) Citations: 8
Identifier
DOI:10.1096/fj.202300245r
Abstract

Genomic prediction, which is based on solving linear mixed-model (LMM) equations, is the most popular method for predicting breeding values or phenotypic performance for economic traits in livestock. With the need to further improve the performance of genomic prediction, nonlinear methods have been considered as an alternative and promising approach. The excellent ability to predict phenotypes in animal husbandry has been demonstrated by machine learning (ML) approaches, which have been rapidly developed. To investigate the feasibility and reliability of implementing genomic prediction using nonlinear models, the performances of genomic predictions for pig productive traits using the linear genomic selection model and nonlinear machine learning models were compared. Then, to reduce the high-dimensional features of genome sequence data, different machine learning algorithms, including the random forest (RF), support vector machine (SVM), extreme gradient boosting (XGBoost) and convolutional neural network (CNN) algorithms, were used to perform genomic feature selection as well as genomic prediction on reduced feature genome data. All of the analyses were processed on two real pig datasets: the published PIC pig dataset and a dataset comprising data from a national pig nucleus herd in Chifeng, North China. Overall, the accuracies of predicted phenotypic performance for traits T1, T2, T3 and T5 in the PIC dataset and average daily gain (ADG) in the Chifeng dataset were higher using the ML methods than the LMM method, while those for trait T4 in the PIC dataset and total number of piglets born (TNB) in the Chifeng dataset were slightly lower using the ML methods than the LMM method. Among all the different ML algorithms, SVM was the most appropriate for genomic prediction. For the genomic feature selection experiment, the most stable and most accurate results across different algorithms were achieved using XGBoost in combination with the SVM algorithm. Through feature selection, the number of genomic markers can be reduced to 1 in 20, while the predictive performance on some traits can even be improved compared to using the full genome data. Finally, we developed a new tool that can be used to execute combined XGBoost and SVM algorithms to realize genomic feature selection and phenotypic prediction.
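
The two-stage idea described in the abstract (rank markers with a boosted-tree model, keep roughly 1 in 20, then predict phenotypes with an SVM) can be illustrated with a minimal Python sketch. This is not the authors' tool. It runs on simulated genotypes rather than the PIC or Chifeng data, and it uses scikit-learn's `GradientBoostingRegressor` as a stand-in for XGBoost. All sizes, hyperparameters, variable names and the helper `svr_accuracy` are hypothetical. Accuracy is taken here to be the Pearson correlation between predicted and observed phenotypes, which is an assumption about the metric.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split
from sklearn.svm import SVR

rng = np.random.default_rng(0)

# Simulated data (hypothetical): SNP genotypes coded 0/1/2 and a phenotype
# driven by a few causal markers plus random noise.
n_animals, n_markers, n_causal = 200, 400, 5
X = rng.integers(0, 3, size=(n_animals, n_markers)).astype(float)
causal = rng.choice(n_markers, size=n_causal, replace=False)
y = X[:, causal].sum(axis=1) + rng.normal(0.0, 1.0, size=n_animals)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Stage 1: rank markers by boosted-tree importance, using training animals
# only so that the test animals do not influence the selection.
# GradientBoostingRegressor is a stand-in for XGBoost in this sketch.
booster = GradientBoostingRegressor(n_estimators=50, max_depth=2, random_state=0)
booster.fit(X_train, y_train)
n_keep = n_markers // 20  # keep 1 marker in 20
selected = np.argsort(booster.feature_importances_)[::-1][:n_keep]


def svr_accuracy(cols):
    """Fit an SVR on the given marker columns and return the Pearson
    correlation between predicted and observed test phenotypes."""
    model = SVR(kernel="rbf", C=1.0)
    model.fit(X_train[:, cols], y_train)
    pred = model.predict(X_test[:, cols])
    return float(np.corrcoef(pred, y_test)[0, 1])


# Stage 2: compare SVR prediction on all markers and on the reduced set.
r_full = svr_accuracy(np.arange(n_markers))
r_reduced = svr_accuracy(selected)
print(f"markers kept: {len(selected)} of {n_markers}")
print(f"accuracy, all markers:      {r_full:.3f}")
print(f"accuracy, selected markers: {r_reduced:.3f}")
```

Selecting markers inside the training split matters: ranking markers on the full dataset before splitting would leak test information and inflate the reported accuracy. In practice the split would typically be repeated as k-fold cross-validation.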
