**6. Breeding for protein content**

The genetic variability available for soybean food-grade traits offers scope for improvement through breeding. Breeding cultivated soybean varieties with high protein or high oil content is an extremely important and promising objective. High protein and low oil content add nutritional value to soy foods. Germplasm covering a wide range of protein content (33.1–55.9%) and oil content (13.6–23.6%) is available to breeders for modifying the seed protein-to-oil ratio in a breeding program. The negative correlation between protein and oil facilitates the development of high-protein, low-oil lines. High protein content, however, is generally associated with low yield, which makes it difficult to develop lines that combine high protein and high yield; high yield is mostly achieved when selecting for moderately high protein content (43–45%) [13].

Seed protein and oil content are two valuable quality traits, each controlled by multiple genes in soybean. Protein content has been reported to range from 34.1 to 56.8% of seed dry mass and oil content from 8.3 to 27.9% [49], suggesting great potential for the genetic improvement of both traits. The negative correlation between oil and protein content, however, makes simultaneous improvement of both traits challenging with conventional breeding [50]. Therefore, identifying molecular markers associated with the quantitative trait loci (QTLs) that control protein and oil content is a prerequisite for breaking the negative correlation between the two traits [51].
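The negative correlation between protein and oil content described above is commonly quantified with a Pearson correlation coefficient. The following is a minimal sketch of that calculation, not a method taken from the cited studies. The function name `pearson_r` and the five protein/oil values are hypothetical and chosen only for illustration; they are not measurements from any germplasm collection.

```python
import math


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of cross-products and sums of squares about the means
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical seed composition (% of seed dry mass) for five breeding lines.
# These values are invented for illustration only.
protein_pct = [36.0, 39.5, 42.0, 45.0, 48.5]
oil_pct = [22.5, 21.0, 19.5, 18.0, 16.0]

r = pearson_r(protein_pct, oil_pct)
print(f"r = {r:.2f}")
```

A coefficient close to -1 would indicate that lines with more protein tend to have proportionally less oil, which is the pattern that makes simultaneous improvement of both traits difficult.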

In the SoyBase database, 241 QTLs for protein content and 315 QTLs for oil content have been reported, distributed across the 20 soybean chromosomes [52]. Most of these QTLs were identified by linkage mapping in biparental populations, an approach limited by relatively small phenotypic variation and by the fact that only two alleles per locus can be studied at a time. The broad chromosome regions covered by these QTLs make it difficult to identify putative candidate genes of interest [53]. With advances in genetic map construction and the availability of a well-annotated reference genome, resources for association mapping, and whole-genome resequencing (WGRS) data, a large number of QTLs for seed protein content have been identified (**Table 1**).
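To illustrate why the width of a QTL region matters for candidate-gene identification, a QTL can be treated as a chromosome interval and gene positions tested against it. The sketch below is an assumption-laden illustration, not a pipeline from the cited work. The two intervals are the Chr20 and Chr15 regions reported in this section; the class `QtlInterval`, the function `genes_in_qtls`, and the gene names and positions are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QtlInterval:
    """A QTL region on one chromosome, with bounds in megabase pairs (Mbp)."""
    chrom: str
    start_mbp: float
    end_mbp: float

    @property
    def width_mbp(self):
        return self.end_mbp - self.start_mbp

    def contains(self, chrom, pos_mbp):
        return chrom == self.chrom and self.start_mbp <= pos_mbp <= self.end_mbp


# Intervals as reported in this section for the Chr20 and Chr15 QTLs
QTLS = [
    QtlInterval("Chr20", 29.8, 31.6),
    QtlInterval("Chr15", 38.1, 39.7),
]

# Hypothetical gene models: names and positions are invented for illustration
genes = [
    ("geneA", "Chr20", 30.4),
    ("geneB", "Chr20", 33.0),
    ("geneC", "Chr15", 38.9),
]


def genes_in_qtls(genes, qtls):
    """Return the names of genes whose position falls inside any QTL interval."""
    return [
        name
        for name, chrom, pos in genes
        if any(q.contains(chrom, pos) for q in qtls)
    ]


candidates = genes_in_qtls(genes, QTLS)
print(candidates)
```

The wider an interval, the more gene models pass this filter, which is why narrowing QTL regions is a precondition for shortlisting candidate genes.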

Several genome-wide association studies [50, 66, 68] and QTL analyses [53, 56] have identified similar genomic loci (e.g., on Chrs 20, 15, and 5) for protein and oil, indicating negative pleiotropic effects or linkage (larger LD). The QTL on Chr20 was
most likely in the genomic region of 29.8–31.6 Mbp, as supported by integrated GWAS, transcriptome, and QTL mapping analyses (**Table 1**) [68]. Gene order was conserved, and 18 identified genes were tandemly duplicated on Chr10 and showed similar gene ontology [83]. Three putative candidate genes identified on Chr20 were not duplicated, suggesting that they might be related to protein content [68]. Similarly, the Chr15 QTL (38.1–39.7 Mbp) showed an inversely duplicated genomic block on Chr8. The QTL on Chr15 comprises 18 putative genes, 13 of which were duplicated with similar gene function. Syntenic analysis provided a basis for the divergence of QTL regions during the recent genome duplication and suggested the retention or loss of several genes that might be responsible for oil and protein content in soybean. In addition to the pleiotropic effects of protein on oil and yield, variation in seed protein concentration significantly affects seed size, crop growth, and development [84]. High-protein genotypes showed lower leaf area and harvest index than high-yielding genotypes, whereas high-protein, small-seeded genotypes showed higher leaf area at the beginning of seed fill, greater canopy biomass production, and low levels of assimilate per seed [84]. Therefore, it is necessary to break the undesirable genetic linkage among protein-, oil-, and yield-related loci through repeated recombination and random mating.

**7. Breeding for 11S/7S ratio**

Consumers prefer a firmer tofu texture, which depends partly on protein composition. Genotypic variation in this trait is partly due to the ratio of the 11S to 7S protein fractions in the seed. The 11S fraction generally has greater gelling potential than the 7S fraction; hence, a high 11S-to-7S ratio is desirable because it results in harder tofu than a low ratio. The 11S-to-7S ratio is reported to range from 0.3 to 4.9. However, genotypes with the same 11S-to-7S ratio do not always produce the same firmness because of differences in 11S subunit composition. In general, both a high 11S-to-7S ratio and a suitable 11S composition are important for good tofu firmness. The selection and manipulation of specific subunit compositions will play a major role in the development of improved protein quality.

Molecular markers linked to the various subunits of glycinin and β-conglycinin have been reported previously. PCR-based markers were reported for the identification of β-conglycinin genes [85, 86]. An RFLP marker associated with the *Scg-1* (suppressor of β-conglycinin) gene was developed by using the α-subunit gene as a probe [87]. SNPs in the β-subunit genes were used to map the *Scg-1* gene, and the chromosomal region associated with β-conglycinin deficiency was located on linkage group I of the soybean genetic map [86]. Hayashi et al. [88] reported AFLP markers linked to the recessive allele, cgdef, which conditions the absence of the 7S globulin subunits (α, α′, β) in a mutant line. Markers linked to the glycinin genes have also been reported. RFLP markers were identified for both *Gy*4 and *Gy*5 and mapped to linkage groups O and F on the public soybean linkage map [31, 89]. *Gy*1, *Gy*2, and *gy*6 are linked in tandem to one another on linkage group N, while *Gy*3 and *Gy*7 are linked to one another on linkage group L [90]. KASP-SNP markers linked to the 7S α′ and 11S A1, A3, and A4 subunits have been reported [91]. Three SSR markers (Satt461, Satt292, and Satt156) were found to be associated with glycinin QTLs distributed on linkage groups D2, I, and L, whereas two β-conglycinin QTL-associated SSRs (Satt461 and Satt249) were distributed on linkage groups D2 and J [35]. Functional markers (FMs) have advantages over linked markers because their polymorphic sites are derived from the genes involved in phenotypic trait variation [92]. Glycinin genes are highly conserved within the subgenus *Soja* but show more variation within the subgenus *Glycine* [93]. Despite the high degree of similarity among the subunits in Group I and Group II, gene

#### **Table 1.**

*Major QTLs for seed protein, sucrose and oligosaccharide content reported in soybean.*

#### *Food Grade Soybean Breeding, Current Status and Future Directions DOI: http://dx.doi.org/10.5772/intechopen.92069*

