**a.** Planting the breeding populations with potential segregation for traits of interest or polymorphism for the markers used.

**b.** Sampling plant tissues, usually at early stages of growth, e.g. emergence to young seedling stage.

**c.** Extracting DNA from the tissue sample of each individual or family in the populations, and preparing DNA samples for PCR and marker screening.

**d.** Running PCR or another amplification procedure for the molecular markers associated with or linked to the trait of interest.

**e.** Separating and scoring PCR/amplified products by means of appropriate separation and detection techniques, e.g. PAGE, AGE, etc.

**f.** Identifying individuals/families carrying the desired marker alleles.

**g.** Selecting the best individuals/families with both the desired marker alleles for target traits and desirable performance/phenotypes for other traits, by jointly using marker results and other selection criteria.

**h.** Repeating the above activities for several generations, depending upon the association between the markers and the traits as well as the status of marker alleles (homozygous or heterozygous), and advancing the selected individuals in the breeding program until stable superior or elite lines with improved traits are developed.

**4. Marker-assisted selection**

#### **4.1. MAS procedure and theoretical and practical considerations**

Marker-assisted selection (MAS) is a breeding procedure in which DNA marker detection and selection are integrated into a traditional breeding program. Taking a single cross as an example, the general procedure can be described as follows:

**a.** Select parents and make the cross; at least one parent (or both) possesses the DNA marker allele(s) for the desired trait of interest.

**b.** Plant the F1 population and detect the presence of the marker alleles to eliminate false hybrids.

**c.** Plant the segregating F2 population, screen individuals for the marker(s), and harvest the individuals carrying the desired marker allele(s).

**d.** Plant F2:3 plant rows, and screen individual plants with the marker(s). If the preceding F2 plant is homozygous for the markers, a bulk of F3 individuals within a plant row may be used for marker screening as further confirmation where needed. Select and harvest the individuals with the required marker alleles and other desirable traits.

**e.** In the subsequent generations (F4 and F5), conduct marker screening and make selections as for the F2:3s, but give more attention to superior individuals within lines/rows that are homozygous for the markers.

56 Plant Breeding from Laboratories to Fields


A frequently asked question about marker-assisted selection is "how many QTLs should be selected for MAS?" Theoretically, all the QTLs contributing to the trait of interest could be taken into account. For a quantitatively inherited character like yield, numerous QTLs or genes are usually involved. Because resources and facilities are limited, it is almost impossible to select for all QTLs or genes simultaneously so that the selected individuals incorporate every desired QTL: the population size needed to recover such individuals increases exponentially with the number of target loci. The relative efficiency of MAS decreases as the number of QTLs increases and their heritability decreases (Moreau et al., 1998). In other words, MAS will be less effective for a highly complex character governed by many genes than for a simply inherited character controlled by a few genes. The number of genes/QTLs affects not only the efficiency of MAS but also the breeding design and implementation scheme (discussed in detail below). Typically no more than three QTLs are regarded as an appropriate and feasible choice (Ribaut and Betran, 1999), although five QTLs were used in the improvement of fruit quality traits in tomato via marker-assisted introgression (Lecomte et al., 2004). With the development of SNP markers (especially rapid automated detection and genotyping technologies), selecting more QTLs at the same time may become preferable and practicable (Kumpatla et al., 2012).
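The growth in required population size with the number of target loci can be illustrated with a short calculation. The sketch below is only an illustration under stated assumptions and is not taken from the cited studies: it assumes an F2 population from a single cross, unlinked target loci with Mendelian segregation, and selection of plants homozygous for the favorable allele at every locus, so that the expected frequency of the target genotype is (1/4)^n. The function names and the 95% probability threshold are hypothetical choices.

```python
import math


def target_frequency_f2(n_loci: int) -> float:
    """Expected F2 frequency of plants homozygous for the favorable allele
    at all n_loci loci (assumes unlinked loci and Mendelian segregation)."""
    return 0.25 ** n_loci


def min_population_size(frequency: float, probability: float = 0.95) -> int:
    """Smallest N for which at least one target plant occurs with the given
    probability, i.e. 1 - (1 - frequency)**N >= probability."""
    return math.ceil(math.log(1.0 - probability) / math.log(1.0 - frequency))


# Print the minimum F2 size for one to six target loci.
for n in range(1, 7):
    f = target_frequency_f2(n)
    print(f"{n} loci: target frequency = {f:.6f}, "
          f"minimum F2 size = {min_population_size(f)}")
```

Under these assumptions the minimum F2 size rises from 11 plants for one locus to 191 plants for three loci, which is consistent with the practical advice to restrict the number of QTLs selected at once.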

For MAS targeting multiple genes/QTLs, it has been suggested to limit the number of genes under selection to three or four if they are QTLs selected on the basis of linked markers, and to five or six if they are known loci selected directly (Hospital, 2003). Only QTLs of medium to large effect that have been verified in multiple environments should be selected. First priority should be given to the major QTLs that explain the greatest proportion of phenotypic variation and/or are consistently detected across a range of environments and different populations. In addition, a selection index that weights markers differently, depending on their relative importance to the breeding objectives, could be constructed. Flint-Garcia et al. (2003) presented an example of such an index used to select for QTLs with different effect magnitudes.
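A weighted marker index of this kind can be sketched as a weighted sum of favorable-allele counts. The example below is a generic illustration and does not reproduce the index of Flint-Garcia et al. (2003); the marker names, weights and plant genotypes are hypothetical, and the weights would in practice be chosen by the breeder (for example, from the relative effect sizes of the QTLs).

```python
from typing import Dict, List

# Hypothetical marker names and weights; not values from any study.
WEIGHTS: Dict[str, float] = {"M_qtl1": 0.5, "M_qtl2": 0.3, "M_qtl3": 0.2}


def marker_index(genotype: Dict[str, int], weights: Dict[str, float]) -> float:
    """Weighted sum of favorable-allele counts (0, 1 or 2 per marker).
    A marker missing from the genotype is counted as 0 favorable alleles."""
    return sum(w * genotype.get(marker, 0) for marker, w in weights.items())


def rank_plants(plants: Dict[str, Dict[str, int]],
                weights: Dict[str, float]) -> List[str]:
    """Return plant names sorted from highest to lowest index value."""
    return sorted(plants,
                  key=lambda name: marker_index(plants[name], weights),
                  reverse=True)


# Hypothetical genotypes: number of favorable alleles at each marker.
plants = {
    "plant_1": {"M_qtl1": 2, "M_qtl2": 0, "M_qtl3": 1},
    "plant_2": {"M_qtl1": 1, "M_qtl2": 2, "M_qtl3": 2},
    "plant_3": {"M_qtl1": 0, "M_qtl2": 2, "M_qtl3": 2},
}

for name in rank_plants(plants, WEIGHTS):
    print(name, round(marker_index(plants[name], WEIGHTS), 3))
```

A simple additive index like this ignores epistasis and QTL x environment interaction, so it is best read as a ranking aid to be combined with phenotypic data.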

Another commonly asked question is "how many markers should be used in MAS?" The more markers associated with a QTL are used, the greater the chance of successfully selecting the QTL of interest. However, efficiency is also important for a breeding program, especially when resources and facilities are limited. Considering both effectiveness and efficiency, it is usually suggested to use two markers (i.e. flanking markers) that are tightly linked to a single QTL of interest. The markers should be close enough to the gene/QTL of interest (<5 cM) to ensure that only a minor proportion of the selected individuals will be recombinants. If a marker (e.g. the peak marker) is located within the gene sequence of interest, or so close to the QTL/gene that no recombination occurs between them, that marker alone is preferable. However, if a marker is not tightly linked to the gene of interest, recombination between the marker and the gene may reduce the efficiency of MAS, because a single crossover may alter the linkage association and lead to selection errors. The efficiency of MAS decreases as the recombination frequency (genetic distance) between the marker and the gene increases. Using two flanking markers rather than one may decrease the chance of such errors due to homologous recombination and increase the efficiency of MAS. In this case, only a double crossover (i.e. two single crossovers occurring simultaneously, one on each side of the gene/QTL in the region) may result in selection errors, and double crossovers are very rare. For instance, if two flanking markers with an interval of 20 cM or so between them are used, the probability of recovering the target gene (99%) is higher than when only one marker is used.
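The advantage of flanking markers can be checked with a small per-gamete calculation. The sketch below is an illustration under explicit assumptions that are not stated in the text: the target gene lies midway between two markers 20 cM apart, map distance is converted to recombination frequency with the Haldane mapping function, and crossovers on the two sides of the gene occur independently (no interference). The function names are hypothetical.

```python
import math


def haldane_r(distance_cm: float) -> float:
    """Recombination frequency for a map distance in cM (Haldane function)."""
    return 0.5 * (1.0 - math.exp(-2.0 * distance_cm / 100.0))


def recovery_single_marker(r: float) -> float:
    """Probability that a gamete carrying the desired marker allele also
    carries the target allele, with one marker at recombination frequency r."""
    return 1.0 - r


def recovery_flanking_markers(r1: float, r2: float) -> float:
    """Probability that a gamete carrying both desired flanking marker
    alleles also carries the target allele. The target is lost only through
    a double crossover (assumes no crossover interference)."""
    no_crossover = (1.0 - r1) * (1.0 - r2)
    double_crossover = r1 * r2
    return no_crossover / (no_crossover + double_crossover)


# Gene assumed midway in a 20 cM interval: 10 cM from each flanking marker.
r = haldane_r(10.0)
print(f"single marker at 10 cM: {recovery_single_marker(r):.3f}")
print(f"two flanking markers:   {recovery_flanking_markers(r, r):.3f}")
```

Under these assumptions the flanking pair recovers the target allele in about 99% of selected gametes, compared with about 91% for a single marker at 10 cM, in line with the figure quoted above.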

In practical MAS, a breeder is also concerned about how the markers should be detected, how many generations of MAS have to be conducted, and how large a population is needed. In general, marker polymorphism is detected at early stages of plant growth. This is especially true for marker-assisted backcrossing and marker-assisted recurrent selection, because only the individuals that carry the preferred marker alleles are expected to be used in backcrossing to the recurrent parent and/or in inter-mating between selected individuals/progenies. The number of generations of MAS required varies with the number of markers used, the degree of association between the markers and the QTLs/genes of interest, and the status of the marker alleles. In many cases, marker screening is performed for two to four consecutive generations in a segregating population. If fewer markers are used and the markers are in close proximity to the QTL or gene of interest, fewer generations are needed. If the marker alleles of interest are found to be homozygous in two consecutive generations, marker screening of their progenies may be unnecessary. Bonnett et al. (2005) discussed strategies for the efficient implementation of MAS, addressing issues such as breeding systems or schemes, population sizes and number of target loci. Their strategies include F2 enrichment, backcrossing, and inbreeding.
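One reason a few generations of screening often suffice in self-pollinated crops is that heterozygosity halves with each generation of selfing. The sketch below illustrates this under simplifying assumptions that go beyond the text: strict selfing from a single cross, unlinked marker loci, and no selection. The function names are hypothetical.

```python
def heterozygous_fraction(generation: int) -> float:
    """Expected fraction of plants heterozygous at one locus in generation
    F<generation> under selfing (F1 = 1.0, F2 = 0.5, F3 = 0.25, ...)."""
    if generation < 1:
        raise ValueError("generation must be >= 1")
    return 0.5 ** (generation - 1)


def fixed_fraction(generation: int, n_loci: int) -> float:
    """Expected fraction of plants homozygous (for either allele) at all
    n_loci unlinked loci, in the absence of selection."""
    return (1.0 - heterozygous_fraction(generation)) ** n_loci


# Compare one marker locus with three marker loci from F2 to F6.
for gen in range(2, 7):
    print(f"F{gen}: heterozygous at one locus = {heterozygous_fraction(gen):.4f}, "
          f"fixed at 3 loci = {fixed_fraction(gen, 3):.4f}")
```

With marker-based selection of homozygous plants, fixation is reached faster than these unselected expectations suggest, and the more loci are tracked, the longer some of them remain segregating.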

In MAS, phenotypic evaluation and selection are still very helpful where conditions permit, and even necessary when the QTLs selected for MAS are not very stable across environments and the association between the selected markers and QTLs is not very close. Moreover, the impact of genetic background should also be taken into consideration. The presence of a QTL or marker does not necessarily guarantee the expression of the desired trait. QTL data derived from multiple environments and different populations help in understanding QTL x environment, QTL x QTL and QTL x genetic background interactions, and thus in making better use of MAS. In addition to genotypic (marker) and phenotypic data for the trait of interest, a breeder often pays considerable attention to other important traits, unless the trait of interest is the only breeding objective.

There are several indications for adoption of molecular markers in the selection for the traits of interest in practical breeding. The situations favorable for MAS include:


#### **4.2. MAS for major genes or improvement of qualitative traits**


In crop plants, many economically important characteristics are controlled by major genes/QTLs. Such characteristics include resistance to diseases/pests, male sterility, self-incompatibility and others related to the shape, color and architecture of whole plants and/or plant parts. These traits are often mono- or oligogenic in inheritance. Even for some quality traits, one or a few major QTLs or genes can account for a very high proportion of the phenotypic variation (Bilyeu et al., 2006; Pham et al., 2012). Transfer of such a gene to a specific line can lead to tremendous improvement of the trait in the cultivar under development. Marker loci that are tightly linked to major genes can be used for selection and are sometimes more efficient than direct selection for the target genes. In some cases, this advantage in efficiency may be due to higher expression of the marker mRNA when the marker is actually within a gene. Alternatively, when the two alleles of the target gene differ by a difficult-to-detect SNP, an external marker whose polymorphism is easier to detect may be the most realistic option.

Soybean cyst nematode (SCN) (*Heterodera glycines* Ichinohe) may be taken as an example of MAS for major genes. This pathogen is the most economically significant soybean pest. The principal strategy to reduce or eliminate damage from this pest is the use of resistant cultivars (Cregan et al., 1999). However, identifying resistant segregants in breeding populations is a difficult and expensive process. A widely used phenotypic assay takes five weeks and requires a large greenhouse space and about 5 to 10 h of labor for every 100 plant samples processed (Young, 1999). Fortunately, the SSR marker Satt309 has been found to be located only 1–2 cM from the resistance gene *rhg1* (Cregan et al., 1999), and this marker forms the basis of many public and commercial breeding efforts. In a direct comparison, genotypic selection with Satt309 was 99% accurate in predicting lines that were susceptible in subsequent greenhouse assays for two test populations, and 80% accurate in a third population, each with a different source of SCN resistance (Young, 1999). Also in soybean, Shi et al. (2009) reported that, using molecular markers in the cross J05 x V94-5152, they developed five F4:5 lines that were homozygous for all eight marker alleles linked to the genes/loci for resistance to soybean mosaic virus (SMV). These lines exhibited resistance to SMV strains G1 and G7 and presumably carried all three resistance genes (*Rsv1*, *Rsv3* and *Rsv4*), which would potentially provide broad and durable resistance to SMV.

#### **4.3. MAS for improvement of quantitative traits**

Most of the important agronomic traits are polygenic or controlled by multiple QTLs. MAS for the improvement of such traits is a complex and difficult task because it involves many genes or QTLs, QTL x environment (QTL x E) interaction and epistasis. Usually, each of these genes has a small effect on the phenotypic expression of the trait, and expression is affected by environmental conditions. Consequently, phenotyping of quantitative traits becomes a complex endeavor, and determining marker-phenotype associations becomes difficult as well. Therefore, repeated field tests are required to accurately characterize the effects of the QTLs and to evaluate their stability across environments. QTL x E interaction reduces the efficiency of MAS, and epistasis can result in a skewed QTL effect on the trait.

Despite a tremendous number of QTL mapping experiments over the past decade, the application and utilization of QTL mapping information in plant breeding have been constrained by a number of factors (Collard and Mackill, 2008):


To improve the efficiency of MAS for quantitative traits, appropriate field experimental designs and approaches have to be employed. Attention should be given to replication over both time and space, consistency in experimental techniques, sampling and evaluation, and robust data processing and statistical analysis. For example, composite interval mapping (CIM) allows data from different locations to be integrated for joint analysis to estimate QTL x environment interaction, so that QTLs that are stable across environments can be identified. A saturated linkage map enables accurate identification of both targeted QTLs and linked QTLs in coupling and repulsion linkage phases. In practical breeding for the improvement of a quantitative trait, usually only a few major QTLs, rather than many minor QTLs, are used in MAS. When many QTLs, especially minor-effect QTLs, are involved, a breeder would prefer to consider the strategy of gene pyramiding (see the later section).

Fusarium head blight (FHB), caused by *Fusarium* species, is one of the most destructive diseases of wheat and barley worldwide. To combat this disease, a great effort across multiple fields, including plant breeding and genetics, molecular genetics and genomics, plant pathology, and integrated management, has been made since the 1990s. Resistance to FHB in both wheat and barley is quantitatively inherited, and many QTLs have been identified from different germplasm sources (Buerstmayr et al., 2009). The use of MAS to improve resistance has become the choice of many breeding programs. In wheat, a major QTL designated *Fhb1* was consistently detected across multiple environments and populations, and explained 20-40% of phenotypic variation in most cases (Buerstmayr et al., 2009; Jiang et al., 2007a, 2007b). Wheat breeders would therefore especially prefer to use this major QTL to develop new cultivars with FHB resistance. Pumphrey et al. (2007) compared 19 pairs of near-isogenic lines (NILs) for *Fhb1* derived from an ongoing breeding program and found that the average reduction between NIL pairs was 23% for disease severity and 27% for kernel infection. A later investigation from the group also demonstrated successful implementation of MAS for this QTL (Anderson et al., 2007).

In addition, researchers have also tried to incorporate multiple QTLs by MAS. Miedaner et al. (2006) demonstrated that MAS for three FHB resistance QTLs simultaneously was highly effective in enhancing FHB resistance in German spring wheat. FHB resistance was highest in recombinant lines with multiple QTLs combined, especially 3B plus 5A. Jiang et al. (2007a) compared multiple-locus combinations in a recombinant inbred line (RIL) population derived from the cross "Veery x CJ 9306". For three loci, the average levels of resistance of the genotypes, from low to high, were: no favorable allele < one favorable allele < two favorable alleles < three favorable alleles, except for the non-reciprocal comparisons. When four or five loci carrying favorable alleles from the resistant parent CJ 9306 were considered simultaneously, the coefficients of determination between the accumulated effects of alleles for different combinations and the average number or percentage of diseased spikelets for the corresponding RILs were 0.33-0.41 (P<0.01) (Jiang et al., 2007a). The authors therefore concluded that the effects of FHB resistance QTLs could be accumulated and that resistance could feasibly be enhanced by selecting favorable marker alleles at multiple loci in breeding programs.

In the U.S., the Coordinated Agricultural Projects (CAPs), which aim to encourage collaborative efforts in applied plant genomics and molecular research, have been implemented in several crops, such as rice, wheat, barley, beans, potato and tomato. An important strategy of the CAPs is to apply marker-assisted selection to plant breeding and to make efficient use of the available genetic resources and facilities, including thousands to tens of thousands of DNA markers and plant introductions, to facilitate the development of crop cultivars with improved yield, resistance and quality.
