**7. Genomic selection**

Genomic selection (GS), or genome-wide selection (GWS), is a form of marker-based selection. Meuwissen (2007) defined it as the simultaneous selection for many (tens or hundreds of thousands of) markers that cover the entire genome densely, so that all genes are expected to be in linkage disequilibrium with at least some of the markers. In GS, genotypic data (genetic markers) across the whole genome are used to predict complex traits with accuracy sufficient to allow selection on that prediction alone. Desirable individuals are selected on the basis of their genomic estimated breeding value (GEBV) (Nakaya and Isobe, 2012), a predicted breeding value calculated from genome-wide dense DNA markers (Meuwissen et al., 2001). GS does not require significance testing or the identification of a subset of markers associated with the trait (Meuwissen et al., 2001). In other words, QTL mapping with populations derived from specific crosses can be avoided in GS. However, GS models, i.e. the formulae for GEBV prediction, must first be developed (Nakaya and Isobe, 2012). In this training phase, phenotypes and genome-wide genotypes are analyzed in a training population (a subset of a population) to establish the relationships between phenotypes and genotypes using statistical approaches. Subsequently, in the breeding phase, GEBVs are used to select desirable individuals, instead of the marker genotypes used in traditional MAS. Accurate GEBVs and effective GS require genome-wide genotype data with a marker density high enough that all quantitative trait loci (QTLs) are in linkage disequilibrium with at least one marker.
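
The two phases described above can be illustrated with a minimal sketch on simulated data. Everything in it is an assumption made for illustration rather than a method prescribed by the cited authors: ridge regression stands in for the "statistical approaches", markers are coded 0/1/2 and simulated independently, the QTLs are placed among the markers themselves (a simplification of linkage disequilibrium), and the function names, population sizes and ridge penalty are hypothetical.

```python
import numpy as np


def fit_marker_effects(genotypes, phenotypes, ridge_penalty):
    """Training phase: estimate all marker effects jointly by ridge regression."""
    marker_means = genotypes.mean(axis=0)
    centered = genotypes - marker_means
    n_markers = centered.shape[1]
    # Solve (X'X + lambda * I) a = X'(y - mean(y)) for the marker effects a.
    lhs = centered.T @ centered + ridge_penalty * np.eye(n_markers)
    rhs = centered.T @ (phenotypes - phenotypes.mean())
    effects = np.linalg.solve(lhs, rhs)
    return marker_means, effects


def predict_gebv(genotypes, marker_means, effects):
    """Breeding phase: GEBV is the sum of estimated effects over all markers."""
    return (genotypes - marker_means) @ effects


def run_demo(seed=0, n_train=400, n_candidates=100, n_markers=200,
             n_qtl=20, n_selected=10, ridge_penalty=100.0):
    """Simulate a population, train a model, and select candidates by GEBV."""
    rng = np.random.default_rng(seed)
    n_total = n_train + n_candidates
    # Genotypes coded as the number of copies (0, 1 or 2) of one allele.
    genotypes = rng.binomial(2, 0.5, size=(n_total, n_markers)).astype(float)

    # Hypothetical true QTL effects: only a few markers affect the trait.
    true_effects = np.zeros(n_markers)
    qtl_positions = rng.choice(n_markers, size=n_qtl, replace=False)
    true_effects[qtl_positions] = rng.normal(0.0, 1.0, size=n_qtl)
    true_bv = genotypes @ true_effects

    # Training population: genotypes plus phenotypes (heritability near 0.5).
    train_geno = genotypes[:n_train]
    noise_sd = true_bv[:n_train].std()
    train_pheno = true_bv[:n_train] + rng.normal(0.0, noise_sd, size=n_train)

    # Selection candidates: genotypes only, no phenotypes are used.
    cand_geno = genotypes[n_train:]
    cand_true_bv = true_bv[n_train:]

    marker_means, effects = fit_marker_effects(train_geno, train_pheno,
                                               ridge_penalty)
    gebv = predict_gebv(cand_geno, marker_means, effects)

    # Select the candidates with the highest GEBV.
    selected = np.argsort(gebv)[::-1][:n_selected]
    accuracy = np.corrcoef(gebv, cand_true_bv)[0, 1]
    return accuracy, selected, cand_true_bv


if __name__ == "__main__":
    accuracy, selected, cand_true_bv = run_demo()
    print("Correlation between GEBV and true breeding value:", round(accuracy, 2))
    print("Selected candidates:", selected)
```

Note that no marker is tested for significance at any point: every marker contributes to the GEBV, and the ridge penalty shrinks the individual estimates instead of discarding markers. Real data would also require quality control, handling of missing genotypes, and cross-validation to choose the penalty.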

GS is possible only when high-throughput marker technologies, high-performance computing and appropriate new statistical methods are available. The approach has become feasible owing to the discovery and development of large numbers of single nucleotide polymorphisms (SNPs) by genome sequencing, together with new methods for efficiently genotyping large numbers of SNP markers. As suggested by Goddard and Hayes (2007), the ideal method of estimating the breeding value from genomic data is to calculate the conditional mean of the breeding value given the genotype at each QTL. This conditional mean can be calculated only by using a prior distribution of QTL effects, so establishing this prior should be part of the research needed to implement GS. In practice, this method of estimating breeding values is approximated by using the marker genotypes instead of the QTL genotypes, but the ideal method is likely to be approached more closely as more sequence and SNP data are obtained (Goddard and Hayes, 2007).
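
The conditional mean mentioned above can be written out under a simple additive model. The notation here is an assumption introduced for illustration and does not come from the cited papers: $x_{ij}$ is the genotype code of individual $i$ at locus $j$, $\mathbf{a}$ is the vector of locus effects with prior density $p(\mathbf{a})$, and $\mathbf{y}$ is the vector of phenotypes in the training population.

$$
\hat{g}_i = E(g_i \mid \mathbf{x}_i, \mathbf{y}) = \sum_j x_{ij}\, E(a_j \mid \mathbf{y}),
\qquad
E(\mathbf{a} \mid \mathbf{y}) = \frac{\int \mathbf{a}\, p(\mathbf{y} \mid \mathbf{a})\, p(\mathbf{a})\, d\mathbf{a}}{\int p(\mathbf{y} \mid \mathbf{a})\, p(\mathbf{a})\, d\mathbf{a}}
$$

The prior $p(\mathbf{a})$ appears in both integrals, which is why the conditional mean cannot be computed without it. As one special case, if the effects are assumed to be independent and normally distributed, $a_j \sim N(0, \sigma_a^2)$, with residual variance $\sigma_e^2$ and centered genotype matrix $\mathbf{X}$, the posterior mean reduces to the ridge-regression form $E(\mathbf{a} \mid \mathbf{y}) = (\mathbf{X}^{\top}\mathbf{X} + \lambda \mathbf{I})^{-1}\mathbf{X}^{\top}(\mathbf{y} - \bar{y}\mathbf{1})$ with $\lambda = \sigma_e^2 / \sigma_a^2$.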

For complex traits such as grain yield and resistance to biotic and abiotic stresses, MARS has been proposed for "forward breeding" of native genes and pyramiding of multiple QTLs (Ragot et al., 2000; Ribaut et al., 2000, 2010; Eathington, 2005; Crosbie et al., 2006). As defined by Ribaut et al. (2010), MARS is a recurrent selection scheme that uses molecular markers to identify and select multiple genomic regions involved in the expression of complex traits, in order to assemble the best-performing genotype within a single population or across related populations. Johnson (2004) presented an example demonstrating the efficiency of MARS for quantitative traits. In the maize MARS programs described, large-scale use of markers in biparental populations, first for QTL detection and then for MARS on yield (i.e. rapid cycles of recombination and selection based on markers associated with yield), could increase the efficiency of long-term selection by increasing the frequency of favorable alleles (Johnson, 2004). Eathington (2005) and Crosbie et al. (2006) also indicated that the genetic gain achieved through MARS in maize was about twice that of phenotypic selection (PS) in some reference populations. In upland cotton, Yi et al. (2004) reported that MARS was significantly effective for resistance to *Helicoverpa armigera*. The mean levels of resistance in improved populations after recurrent selection were significantly higher than those of preceding populations.


72 Plant Breeding from Laboratories to Fields

Since Meuwissen et al. (2001) proposed the application of GS to breeding populations, theoretical, simulation and empirical studies have been conducted, mostly in animals (Goddard and Hayes, 2007; Jannink et al., 2010). GS has been studied less in plants, and large-scale empirical studies are not available in the public sector for plant breeding (Jannink et al., 2010), but it has attracted increasing attention in recent years (Bernardo, 2010; Bernardo and Yu, 2007; Guo et al., 2011; Heffner et al., 2010, 2011; Lorenzana and Bernardo, 2009; Wong and Bernardo, 2008; Zhong et al., 2009). These studies indicated that, in all cases, the accuracies provided by GS were greater than might be achieved on the basis of pedigree information alone (Jannink et al., 2010). In oil palm, for a realistic yet relatively small population, GS was superior to MARS and PS in terms of gain per unit cost and time (Wong and Bernardo, 2008). The studies have demonstrated the advantages of GS, suggesting that it is a potential method for plant breeding and that it could be performed with realistic population sizes and marker numbers when the populations used are carefully chosen (Nakaya and Isobe, 2012).

GS has been highlighted as a new approach for MAS in recent years and is regarded as a powerful, attractive and valuable tool for plant breeding. However, GS has not become a popular methodology in plant breeding, and there may be a long way to go before it is used extensively in plant breeding programs. The major reason may be that sufficient knowledge of GS for practical use is not yet available (Nakaya and Isobe, 2012). The statistics and simulations in GS studies, typically discussed in terms of formulae, are most likely too specialized and too difficult for plant breeders to understand and apply in practical breeding programs. From a plant breeder's point of view, GS can be practicable for a few breeding populations with a specific purpose, but may be impractical for a whole breeding program dealing with hundreds or thousands of crosses/populations at the same time. Therefore, GS must shift from theory to practice, and its accuracy and cost-effectiveness must be evaluated in practical breeding programs to provide convincing empirical evidence and to warrant the addition of GS to a plant breeder's toolbox (Heffner et al., 2009). The development of easily understandable formulae for GEBVs and of user-friendly software packages for GS analysis would help facilitate and enhance the application of GS in plant breeding. Kumpatla et al. (2012) recently presented a comprehensive review of GS for plant breeding.
