Abstract
• Genomic selection (GS) considers marker effects across the whole genome.
• The use of high-density markers is one of the features of GS.
• GS is based on two distinct and related groups: training and breeding populations.
• Phenotyping is a key informant in GS to build up accuracy of statistical models.
• GS may revolutionize plant and tree breeding practices.

Association analysis is used to measure relations between markers and quantitative trait loci (QTL). However, its estimates ignore the genes with small effects that underpin quantitative traits. By contrast, genome-wide selection estimates marker effects across the whole genome on the target population based on a prediction model developed in the training population (TP). Whole-genome prediction models estimate all marker effects in all loci and capture small QTL effects. Here, we review several genomic selection (GS) models with respect to both the prediction accuracy and genetic gain from selection. Phenotypic selection or marker-assisted breeding protocols can be replaced by selection based on whole-genome predictions, in which phenotyping updates the model to build up the prediction accuracy.
Marker-assisted selection (MAS; see Glossary) has been used in plant improvement programs since the 1990s, after promising research results for tagging genes or mapping QTL. MAS and association genetics have been used in the detection of underlying major genes in gene pools and in their introgression to improve traits of major crop breeding programs. Nevertheless, they have shown some shortcomings: selection cycles are long, and the search for significant marker–QTL associations cannot capture ‘minor’ gene effects [1 (Heffner E.L. et al. Genomic selection for crop improvement. Crop Sci. 2009; 49: 1-12), 2 (Goddard M.E., Hayes B.J. Genomic selection. J. Anim. Breed. Genet. 2007; 124: 323-330), 3 (Xu Y. et al. Whole-genome strategies for marker-assisted plant breeding. Mol. Breed. 2012; 29: 833-854)]. The introduction of GS [4 (Meuwissen T.H.E. et al. Prediction of total genetic value using genome-wide dense marker maps. Genetics. 2001; 157: 1819-1829)] has paved the way to overcome these limitations using whole-genome prediction models. The use of high-density markers is one of the fundamental features of GS. Therefore, every trait locus is likely to be in linkage disequilibrium (LD) with at least one marker locus in the entire target population. Genome-wide selection removes the need to search for significant QTL–marker loci associations individually. Rather, GS accounts for many predictors simultaneously and is characterized by shrinking the estimates of random marker effects towards zero. Moreover, GS can accelerate breeding cycles so that the rate of genetic gain per unit of time and cost is enhanced [5 (Heffner E.L. et al. Plant breeding with genomic selection: gain per unit time and cost. Crop Sci. 2010; 50: 1681-1690)].
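The shrinkage of marker-effect estimates towards zero can be illustrated with a small ridge-regression sketch. This is only a toy illustration under stated assumptions: the data are simulated, the function name `ridge_marker_effects` and the penalty values are hypothetical, and ridge regression stands in for the broader family of whole-genome models rather than reproducing any method from the cited studies.

```python
import numpy as np

def ridge_marker_effects(X, y, lam):
    """Ridge estimate of marker effects: solve (X'X + lam*I) beta = X'y.

    X and y are centered first so that no intercept is needed.
    """
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    p = X.shape[1]
    return np.linalg.solve(Xc.T @ Xc + lam * np.eye(p), Xc.T @ yc)

# Simulated (hypothetical) data: 50 individuals, 200 markers coded -1, 0, 1.
rng = np.random.default_rng(0)
n, p = 50, 200
X = rng.integers(-1, 2, size=(n, p)).astype(float)
true_beta = rng.normal(0.0, 0.1, size=p)          # many small marker effects
y = X @ true_beta + rng.normal(0.0, 1.0, size=n)  # phenotype = signal + noise

# A larger penalty pulls all marker-effect estimates closer to zero.
norms = {lam: float(np.linalg.norm(ridge_marker_effects(X, y, lam)))
         for lam in (1.0, 10.0, 100.0)}
for lam, size in norms.items():
    print(f"lambda = {lam:6.1f}  ->  ||beta_hat|| = {size:.4f}")
```

All 200 markers receive an estimate even though only 50 individuals are available, and the overall size of the estimated effects falls as the penalty grows.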
GS has long been practiced in the field of animal breeding, but is in its infancy in crop [1, 6 (Bernardo R., Yu J. Prospects for genomewide selection for quantitative traits in maize. Crop Sci. 2007; 47: 1082-1090), 7 (Lorenz A.J. et al. Genomic selection in plant breeding. Adv. Agron. 2011; 110: 77-123)] and forest tree [8 (Wong C., Bernardo R. Genomewide selection in oil palm: increasing selection gain per unit time and cost with small populations. Theor. Appl. Genet. 2008; 116: 815-824), 9 (Grattapaglia D., Resende M.D.V. Genomic selection in forest tree breeding. Tree Genet. Genomes. 2011; 7: 241-255)] breeding. Genome-wide selection or GS estimates marker effects across the whole genome of the breeding population (BP) based on the prediction model developed in the TP (Figure 1). The TP is a group of related individuals (such as half-sibs or lines) that are both phenotyped and genotyped. The BP usually includes the descendants of a TP or a new variety that is related to the TP, and is genotyped but not phenotyped. Hence, GS relies on the degree of genetic similarity between the TP and BP in terms of the LD between marker and trait loci. GS identifies the individuals with the highest genomic estimated breeding values (GEBVs) instead of novel gene(s) in the target species. Given that many of the selections are replaced by selection on predictions, phenotyping can be considered as a key informant in GS to build up the accuracy of statistical models. MAS [10 (Collard B.C., Mackill D.J. Marker-assisted selection: an approach for precision plant breeding in the twenty-first century. Philos. Trans. R. Soc. Lond. B: Biol. Sci. 2008; 363: 557-572)], marker-assisted recurrent selection (MARS) [11 (Crossa J. et al. Prediction of genetic values of quantitative traits in plant breeding using pedigree and molecular markers. Genetics. 2010; 186: 713-724)], and gene pyramiding [12 (Servin B. et al. Toward a theory of marker-assisted gene pyramiding. Genetics. 2004; 168: 513-523)] are still important methods of selection to identify and further incorporate novel gene(s) in recurrent parents. These methods can be complemented with GS in integrated plant breeding programs (Figure 1). Therefore, with the advent of cutting-edge next-generation sequencing (NGS) and high-throughput phenotyping tools, GS may revolutionize practical applications of crop and forest tree improvement programs. In this review, we discuss estimating GEBV, the accuracy and gain of selection using genome-wide prediction models, compare GS versus other selection methods of plant breeding, and provide an outlook of GS in plant breeding schemes.

Plant breeding is a science of prediction. Various types of prediction model respond differently because they vary in their assumption(s) when treating the variance of complex traits. The standard linear model equation can be formulated as (Equation 1):

$$y = \mu + \sum_{k} x_k \beta_k + e, \qquad [1]$$

where $y$ is a vector of trait phenotype, $\mu$ is an overall phenotype mean, $k$ represents the locus, $x_k$ is the allelic state at the locus $k$, $\beta_k$ is the marker effect at the locus $k$, and $e \sim N(0, \sigma_e^2)$, where $e$ is the vector of random residual effects and $\sigma_e^2$ is the residual variance. In $x_k$, the allelic state of individuals can be coded as a matrix of 1, 0, or −1 for a diploid genotype value of AA, AB, or BB, respectively. The number of predictors (p) is usually far greater than the number of individuals (n).
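The marker coding and the linear model of Equation 1 can be sketched with a toy data set. The genotype calls, the mean, and the marker effects below are hypothetical values chosen for illustration, the residuals are set to zero so that the result is deterministic, and the helper name `code_genotypes` is not taken from any library.

```python
import numpy as np

# Coding described in the text: AA -> 1, AB -> 0, BB -> -1.
CODES = {"AA": 1, "AB": 0, "BB": -1}

def code_genotypes(calls):
    """Turn a list of per-individual genotype calls into a numeric matrix."""
    return np.array([[CODES[g] for g in row] for row in calls], dtype=float)

# Hypothetical calls for n = 3 individuals at p = 5 marker loci.
calls = [
    ["AA", "AB", "BB", "AA", "AB"],
    ["AB", "AB", "AA", "BB", "AA"],
    ["BB", "AA", "AB", "AB", "BB"],
]
X = code_genotypes(calls)
n, p = X.shape

# Equation 1 with hypothetical values and residuals e fixed at zero.
mu = 10.0
beta = np.array([0.5, -0.2, 0.0, 0.3, 0.1])
e = np.zeros(n)
y = mu + X @ beta + e

# With p > n, X'X cannot have full rank, so it cannot be inverted
# to obtain a unique least-squares solution for all p marker effects.
rank = np.linalg.matrix_rank(X.T @ X)
print("phenotypes:", y)
print(f"n = {n}, p = {p}, rank of X'X = {rank}")
```

Because the rank of X'X is bounded by n, a unique estimate of all p marker effects is not available from the normal equations alone once p exceeds n.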
When p greatly exceeds n, ordinary least-squares (OLS) estimates have a poor predictive ability because marker effects are treated as fixed effects, which leads to multicollinearity among predictors and overfitting, thereby making the model infeasible. The advent of GS [4] provides an opportunity to confront these challenges using alternative models, such as whole-genome regressions (Table 1, Figure 2). Whole-genome regression methods can be grouped into parametric and nonparametric models.

Table 1. Main features of genome-wide prediction models

| Model acronym (a) | Features | Refs |
|---|---|---|
| RR-BLUP | Assumes that all markers have equal variances with small but non-zero effect; applies homogeneous shrinkage of predictors towards zero, but allows for markers to have uneven effects; computed from a realized-relation matrix based on markers; some QTL are in LD to marker loci, whereas others are not | 4, 19 |
| LASSO | Combines both shrinkage and variable selection methods; RR-BLUP does not use variable selection, but outperforms LASSO when there is multicollinearity between the predictors | 70, 71 |
| EN | Double regularization using ℓ1 and ℓ2 penalty norms combines the merited features of these norms to confront the challenge of high-dimensional data | 72 |
| BRR | Induces homogeneous shrinkage of all marker effects towards zero and yields a Gaussian distribution of marker effects; similar to RR-BLUP, there is a problem of QTL linkages to the marker loci | 73 |
| BL | Applies both shrinkage and variable selection; has an exponential prior on marker variances resulting in a double exponential (DE) distribution; the DE distribution has a higher mass density at zero and heavier prior tails compared with a Gaussian distribution | 71, 74 |
| Bayes A | Utilizes an inverse chi-square (χ2) on marker variances yielding a scaled t-distribution for marker effects; similar to BL and in contrast to BRR, it shrinks tiny marker effects towards zero and larger values survive; has a higher peak of mass density at zero compared with the DE distribution | 4, 74 |
| Bayes B | Similar to Bayes A, uses an inverse χ2 resulting in a scaled t-distribution; unlike Bayes A, utilizes both shrinkage and variable selection methods; when π = 0, it is similar to Bayes A | 4, 20 |
| Bayes C | Applies both shrinkage and variable selection methods; characterized by a Gaussian distribution; Bayes B and Bayes C consist of a point of mass at zero in their slab priors | 74, 75, 76 |
| Bayes Cπ | A modified variant of Bayes B; used to alleviate the shortcomings of Bayes A and Bayes B; unlike Bayes B, π is not fixed, but estimated from the data | 75 |
| RKHS | Based on genetic distance and a kernel function with a smoothing parameter to regulate the distribution of QTL effects; effective for detecting nonadditive gene effects | 77, 78 |
| RF | Uses the regression model rooted in bootstrapping sample observations; takes the average of all tree nodes to find the best prediction model; captures the interactions between markers | 55, 66, 79 |

(a) EN, elastic net; RF, random forest; RKHS, reproducing kernel Hilbert spaces regression.

References cited in Table 1:
19. Heffner E.L. et al. Genomic selection accuracy for grain quality traits in biparental wheat populations. Crop Sci. 2011; 51: 2597-2606
20. Heffner E.L. et al. Genomic selection accuracy using multifamily prediction models in a wheat breeding program. Plant Genet. 2011; 4: 65-75
55. Rutkoski J.E. et al. Genomic selection for durable stem rust resistance in wheat. Euphytica. 2011; 179: 161-173
66. Jannink J.L. et al. Genomic selection in plant breeding: from theory to practice. Brief. Funct. Genomics. 2010; 9: 166-177
70. Friedman J. et al. Regularization paths for generalized linear models via coordinate descent. J. Stat. Softw. 2010; 33: 1-22
71. Li Z., Sillanpää M.J. Overview of LASSO-related penalized regression methods for quantitative trait mapping and genomic selection. Theor. Appl. Genet. 2012; 125: 419-435
72. Zou H., Hastie T. Regularization and variable selection via the elastic net. J. R. Stat. Soc. Ser. B. 2005; 67: 301-320
73. de los Campos G. et al. Prediction of complex human traits using the genomic best linear unbiased predictor. PLoS Genet. 2013; 9: e1003608
74. de los Campos G. et al. Predicting quantitative traits with regression models for dense molecular markers and pedigree. Genetics. 2009; 182: 375-385
75. Habier D. et al. Extension of the Bayesian alphabet for genomic selection. BMC Bioinformatics. 2011; 12: 186
76. de los Campos G. et al. Whole-genome regression and prediction methods applied to plant and animal breeding. Genetics. 2013; 193: 327-345
77. Gianola D., van Kaam J.B. Reproducing kernel Hilbert spaces regression methods for genomic assisted prediction of quantitative traits. Genetics. 2008; 178: 2289-2303
78. de los Campos G. et al. Semi-parametric genomic-enabled prediction of genetic values using reproducing kernel Hilbert spaces methods. Genet. Res. 2010; 92: 295-308
79. Holliday J.A. et al. Predicting adaptive phenotypes from multilocus genotypes in Sitka spruce (Picea sitchensis) using random forest. G3. 2012; 2: 1085-1093

The performance of GS depends on the prediction accuracy to select individuals whose phenotype is unknown. In GS, the GEBV can be computed from Equation 1 as (Equation 2):

$$\mathrm{GEBV} = x_{\mathrm{new}} \hat{\beta}_k, \qquad [2]$$

where $x_{\mathrm{new}}$ is a matrix comprising the allelic states of individuals in a BP, and $\hat{\beta}_k$ is the estimate of the regression coefficient $\beta_k$. Cross-validation is used to train and develop the prediction model in the TP (Figure 3A). Then, the best-fitted model can be used to further evaluate the GEBV in a BP (Figure 3B). Therefore, the prediction of GEBVs should mimic the alternatives of cross-validation strategies [13 (Pérez-Cabal M.A. et al. Accuracy of genome-enabled prediction in a dairy cattle population using different cross-validation layouts. Front. Genet. 2012; 3: 27)]. Prediction accuracy ($r_A$) is Pearson's correlation (r) between the selection criterion (GEBV) and the true breeding value (TBV) (Figure 3B). The expected prediction accuracy ($r_A$) can be computed as in [14 (Daetwyler H.D. et al. The impact of genetic architecture on genome-wide evaluation methods. Genetics. 2010; 185: 1021-1031)] (Equation 3):

$$r_A = \sqrt{\frac{h^2}{h^2 + M_e/N_p}}, \qquad [3]$$

where $h^2$ is the narrow sense heritability, $N_p$ is the number of individuals in a TP, and $M_e$ is the number of independent chromosome segments, which depends on both the effective population size ($N_e$) and the genome length in Morgans ($L$) and was derived in [15 (Goddard M. Genomic selection: prediction of accuracy and maximisation of long-term response. Genetica. 2009; 136: 245-257)] as $M_e \approx 2N_eL$. Ideally, $M_e$ is related to the effective number of QTL. The combined use of both $N_p$ and $h^2$, rather than their individual assessment, is key to regulating the expected prediction accuracy [14, 16 (Combs E., Bernardo R. Accuracy of genomewide selection for different traits with constant population size, heritability, and number of markers. Plant Genome. 2013; 6: 6)]. This is more pronounced when dealing with low trait heritability, where increasing the number of individuals in the TP may offset the reduction in the expected prediction accuracy. In this situation, a higher $N_p$ relative to $M_e$ reduces the value of $M_e/N_p$, thereby increasing prediction accuracy.

The response of GS is the output of various factors affecting the accuracy of GEBVs. These factors are interrelated in a complex and comprehensive manner.
They include model performances, sample size and relatedness, marker density, gene effects, heritability and genetic architecture, and the extent and distribution of LD between markers and QTL. Accuracy varies among GS models according to their assumptions and treatments of marker effects (Table 1). For example, it has been established that both Bayesian least absolute shrinkage and selection operator [Bayesian LASSO (BL)] and ridge regression (RR) models outperform support vector regression for predicting GEBVs for host plant resistance to wheat rusts [17 (Ornella L. et al. Genomic prediction of genetic values for resistance to wheat rusts. Plant Genome J. 2012; 5: 136-148)], because these traits are controlled by additive gene effects. Another study compared 11 GS models on wheat (Triticum aestivum), maize (Zea mays), and barley (Hordeum vulgare) and all models, except the support vector machine, recorded similar average prediction accuracies using cross-validation [18 (Heslot N. et al. Genomic selection in plant breeding: a comparison of models. Crop Sci. 2012; 52: 146-160)]. In this study, cluster analysis of the GS models using Euclidean distance led to separate groupings of nonparametric versus parametric regressions. Generally, prediction accuracy increases with sample size, although other influencing factors must also be considered. An increase in the TP of a biparental wheat population (TP comprising 96, 48, and 96) [19] and a multifamily wheat breeding program (TP comprising 96, 192, and 256) [20] resulted in an increase in the prediction accuracy.
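The effect of TP size on expected accuracy can be worked through numerically. The sketch below assumes that Equation 3 takes the square-root form r_A = sqrt(h² / (h² + Me/Np)) with Me ≈ 2NeL; the effective population size, genome length, heritability, and TP sizes used here are hypothetical values for illustration only, and the function names are not from any library.

```python
import math

def independent_segments(ne, genome_morgans):
    """Approximate number of independent chromosome segments, Me = 2 * Ne * L."""
    return 2.0 * ne * genome_morgans

def expected_accuracy(h2, n_train, me):
    """Expected accuracy under the assumed form of Equation 3."""
    return math.sqrt(h2 / (h2 + me / n_train))

# Hypothetical scenario: Ne = 50, genome length L = 10 Morgans, h2 = 0.3.
me = independent_segments(ne=50, genome_morgans=10.0)
tp_sizes = (96, 192, 384)
accuracies = [expected_accuracy(0.3, n_train, me) for n_train in tp_sizes]

for n_train, r_a in zip(tp_sizes, accuracies):
    print(f"Np = {n_train:4d}  Me/Np = {me / n_train:6.2f}  expected r_A = {r_a:.3f}")
```

Under these assumptions, each doubling of the TP halves Me/Np and raises the expected accuracy, which is the mechanism by which a larger TP can compensate for low heritability.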
Designing the composition of the TP in relation to the BP is important in maintaining a high degree of accuracy in GS. A few studies have shown that merging different groups of related populations enhances selection accuracy [21 (Schulz-Streeck T. et al. Genomic selection using multiple populations. Crop Sci. 2012; 52: 2453), 22 (Zhao Y. et al. Accuracy of genomic selection in European maize elite breeding populations. Theor. Appl. Genet. 2012; 124: 769-776)]. Combining multiple groups as part of the TP attained a maximum prediction accuracy that was significantly higher than that of a single group in both dent and flint heterotic groups of maize [23 (Technow F. et al. Genomic prediction of northern corn leaf blight resistance in maize with combined or separated training sets for heterotic groups. G3. 2013; 3: 197-203)]. Studies in oat (Avena sativa) [24 (Asoro F.G. et al. Accuracy and training population design for genomic selection on quantitative traits in elite North American oats. Plant Genome J. 2011; 4: 132-144)], maize [25 (Ogutu J.O. et al. Genomic selection using regularized linear regression models: ridge regression, lasso, elastic net and their extensions. BMC Proc. 2012; 6: S10)], and sugar beet (Beta vulgaris) [26 (Würschum T. et al. Genomic selection in sugar beet breeding populations. BMC Genet. 2013; 14: 85)] showed similar results, whereas a study with barley found that the use of combined groups did not respond as expected [27 (Lorenz A.J. et al. Potential and optimization of genomic selection for fusarium head blight resistance in six-row barley. Crop Sci. 2012; 52: 1609-1621)].
In a biparental-crossing maize breeding population, the incorporation of half-sib representatives from both parents in the TP, rather than increasing the number of individuals arbitrarily, led to an increase in the prediction accuracy [28 (Riedelsheimer C. et al. Genomic predictability of interconnected biparental maize populations. Genetics. 2013; 194: 493-593), 29 (Jacobson A. et al. General combining ability model for genomewide selection in a biparental cross. Crop Sci. 2014; 54: 895-905)]. Population structure can influence the performance of genome-wide predictions in stratified populations [24, 30 (Riedelsheimer C. et al. Genomic and metabolic prediction of complex heterotic traits in hybrid maize. Nat. Genet. 2012; 44: 217-220), 31 (Wientjes Y.C.J. et al. The effect of linkage disequilibrium and family relationships on the reliability of genomic prediction. Genetics. 2013; 193: 621-631), 32 (Windhausen V.S. et al. Effectiveness of genomic prediction of maize hybrid performance in different breeding populations and environments. G3. 2012; 2: 1427-1436)]. Research in a maize breeding program showed a very low prediction performance for dissimilar subpopulations [32]. Similarly, the accuracy of RR best linear unbiased predictor (RR-BLUP) declined as the genetic distance increased between TP and BP [26]. The presence of genetically dissimilar subpopulations in the TP resulted in insignificant prediction accuracy regardless of the high number of markers applied. GS has a dual effect by estimating trait-marker effects based on a relation matrix and generating predictions for the target population. Hence, in the presence of population structure, GS has the ability to identify the extent of relations of individuals within and between subpopulations. However, to secure accuracy across subpopulations, it is better to design the TP by pooling multiple subpopulations of stable LD between markers and QTL [21, 24].

Testing multiple traits across multiple environments is one impediment to the improved predictive ability of the GS models. Genotype × environment interaction (G×E) models based on phenotypes or markers may improve the prediction accuracy [17, 33 (Kumar S. et al. Towards genomic selection in apple (Malus × domestica Borkh.) breeding programmes: prospects, challenges and strategies. Tree Genet Genomes. 2011; 8: 1-14), 34 (Guo Z. et al. Accuracy of across-environment genome-wide prediction in maize nested association mapping populations. G3. 2013; 3: 263-272), 35 (Burgueño J. et al. Genomic prediction of breeding values when modeling genotype × environment interaction using pedigree and dense molecular markers. Crop Sci. 2012; 52: 707-719), 36 (Ly D. et al. Relatedness and genotype × environment interaction affect prediction accuracies in genomic selection: a study in cassava. Crop Sci. 2013; 53: 1312-1325)]. In this context, 2437 winter wheat lines genotyped with 1287 single nucleotide polymorphisms (SNPs) were evaluated in 44 environments to predict an unknown environment using cross-validation [37 (Heslot N. et al. Integrating environmental covariates and crop modeling into the genomic selection framework to predict genotype by environment interactions. Theor. Appl. Genet. 2014; 127: 463-480)]. The authors used 22 environments as a training set and 22 environments as a validation set, and the best model resulted in an average increase of 11.1% in accuracy. GEBVs cannot explain the impact of correlated environmental interactions when G×E interactions are ignored, especially in multienvironment trials. Therefore, the G×E effects could bias prediction accuracy in GS [11, 38 (Resende Jr, M.F. et al. Accelerating the domestication of trees using genomic selection: accuracy of prediction models across ages and environments. New Phytol. 2012; 193: 617-624), 39 (Resende M.D. et al. Genomic selection for growth and wood quality in Eucalyptus: capturing the missing heritability and accelerating breeding for complex traits in forest trees. New Phytol. 2012; 194: 116-128)].

Increasing marker density ensures the conservation of marker–QTL associations and achieves a high prediction accuracy. Marker density is mainly determined by the LD span and sample size.
Maize has a shorter LD span compared with barley or wheat and, therefore, a higher marker density is preferred for maize than for both these small grain cereals. Research in a biparental bread wheat population genotyped with 485 markers showed that accuracy plateaued with a minimum number of markers (128–256), beyond which accuracy started to decline [19]. Accuracy plateaued when 800 markers were used for genotyping elite maize populations [22]. The effect of marker density to secure optimal prediction accuracy follows similar trends across species. For example, prediction accuracy for height in humans (with a short LD span) increased rapidly with marker density (approximately 150 000 markers) but plateaued at between 200 000 and 400 000 markers. The minimum number of markers across a family can be determined by $N_e \times L$, where $N_e$ and $L$ are the effective population size and genome size in Morgans, respectively [40 (Meuwissen T. Accuracy of breeding values of ‘unrelated’ individuals predicted by dense SNP genotyping. Genet. Sel. Evol. 2009; 41: 35)]. Nevertheless, in biparental GS, fewer markers are needed than in multifamily GS [20]. Doubling the TP size, rather than increasing marker density, may be preferable when dealing with related individuals in the TP and BP. GS models should be assessed based on trait complexity and sample size.
In this regard, a recent study estimated prediction accuracy of various models for days to maturity and grain yield in 306 elite wheat lines [41 (Perez-Rodriguez P. et al. Comparison between linear and non-parametric regression models for genome-enabled prediction in wheat. G3 (Bethesda). 2012; 2: 1595-1605)]. The study included penalized regressions [Bayesian ridge regression (BRR), BL, Bayes A, and Bayes B] versus nonlinear regressions [reproducing kernel Hilbert spaces regression (RKHS), Bayesian regularized neural networks (BRNN), and radial basis function neural networks (RBFNN)]. The authors noticed that nonlinear models had a maximum prediction accuracy higher than that of penalized models. Similar results were obtained when using RKHS and RBFNN in maize against BL [42 (Gonzalez-Camacho J.M. et al. Genome-enabled prediction of genetic values using radial basis function neural networks. Theor. Appl. Genet. 2012; 125: 759-771)]. Nonlinear models could capture the nonadditive genetic effects (i.e., dominance and epistasis), making them suitable for improving the accuracy of GS models [33Kumar