
*A General Overview of Sweet Sorghum Genomics DOI: http://dx.doi.org/10.5772/intechopen.98539*


*Biotechnological Applications of Biomass*

Sweet sorghum types occur in several different races [18], which complicates any account of their origin, selection and genetics. It also suggests high genetic variability between sweet and grain sorghum, which could be exploited for the genetic improvement of sweet sorghum. The genome sequence of the grain sorghum line BTx623 is currently available [10] and provides a genomic basis for comparative studies. Despite this achievement, the hidden variability among genomes of the same species remains difficult to access. Zheng *et al*. [19] resequenced two sweet and one grain sorghum genomes to identify sequence polymorphism patterns and structural variations, using BTx623 as the reference genome. The study revealed large differences in the number of SNPs, indels, copy number variations and structural variations (SVs) among these genomes. Comparison of this genetic variation defined candidate genomic regions and metabolic pathways associated with sweet sorghum and with traits such as sugar production. **Table 1** presents phenotypic and genotypic differences between grain and sweet sorghum.

## **4. Sorghum genetic mapping**

Building a linkage map is a fundamental step in any detailed study of crop genetic improvement by marker-assisted selection. Mapping of the sorghum genome with DNA markers started in the 1990s, and several genetic maps are now available. Sorghum, particularly *S. bicolor*, has a base number of 10 chromosomes and has been classified as a diploid (2n = 2x = 20) [23]. However, a tetraploid origin has been proposed because of the large number of complementary gene *loci* and because of studies of meiotic chromosome pairing, as in *S. halepense*, which is 2n = 4x = 40. Other studies using fluorescence *in situ* hybridization (FISH) reached the same conclusion [24]. Nonetheless, the study in [25] used the same FISH technique together with other structural genomic resources, including large-insert genomic clones in bacterial artificial chromosomes (BACs), and identified the 10 chromosomes simultaneously. Years later, Paterson *et al*. [7] found identities and homologies among the linkage groups at metaphase, which established the diploidy of *S. bicolor* (2n = 20) as well as a genome length of 730 megabases (Mb).

The first genetic maps were based on heterologous DNA probes from the maize genome (Binelli *et al*. [26]; Whitkus *et al*. [27]; Melake-Berhan *et al*. [28]; Pereira *et al*. [27]). Later maps were built from genomic DNA probes (Chittenden *et al*. [29]; Ragab *et al*. [30]; Xu *et al*. [31]). Another published map was based on probes from sugarcane and maize [32]. All these maps were built using restriction fragment length polymorphisms (RFLPs), and most of them used F2 populations, whereas Dufour *et al*. [32] used two populations of recombinant inbred lines (RILs). This last map was extended by Boivin *et al*. [33] with a large number of additional RFLPs and amplified fragment length polymorphisms (AFLPs). In addition, [34] reports a sorghum map built on a RIL population with a variety of probes, including sorghum genomic DNA, maize and sugarcane genomic DNA and cDNA, probes from other cereals, and 8 simple sequence repeat (SSR) microsatellite *loci*. Subudhi and Nguyen [35] completed the alignment of the 10 linkage groups using RFLPs on a RIL population, thereby completing the maps of Chittenden *et al*. [29] and Ragab *et al*. [30, 31], as well as those of maize and other cereals.

Kong *et al*. [36] mapped 31 polymorphic SSR *loci* in a RIL population. The *loci* came from 51 clones isolated from a *S. bicolor* genomic library that was screened with four radioactively labeled di- and trinucleotide oligomers. Haussmann *et al*. [37] mapped molecular markers related to resistance to the hemiparasite *Striga hermonthica* in two recombinant inbred populations (RIP-1 and RIP-2) of F3.5 lines. The RIP-1 and RIP-2 maps covered 1,498 cM and 1,599 cM, respectively, with 157 markers distributed among the linkage groups.
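Map lengths in this section are reported in centimorgans (cM). As a minimal sketch of how a pairwise recombination fraction is converted into such a distance, the Haldane and Kosambi mapping functions are shown below. The source does not state which mapping function each cited map used, so the choice of functions and the example value `r = 0.10` are assumptions made purely for illustration.

```python
import math


def haldane_cm(r: float) -> float:
    """Haldane map distance in cM; assumes no crossover interference."""
    if not 0.0 <= r < 0.5:
        raise ValueError("recombination fraction must be in [0, 0.5)")
    return -50.0 * math.log(1.0 - 2.0 * r)


def kosambi_cm(r: float) -> float:
    """Kosambi map distance in cM; allows for partial interference."""
    if not 0.0 <= r < 0.5:
        raise ValueError("recombination fraction must be in [0, 0.5)")
    return 25.0 * math.log((1.0 + 2.0 * r) / (1.0 - 2.0 * r))


if __name__ == "__main__":
    r = 0.10  # illustrative recombination fraction, not taken from any cited map
    print(f"Haldane: {haldane_cm(r):.2f} cM")
    print(f"Kosambi: {kosambi_cm(r):.2f} cM")
```

For small recombination fractions both functions are close to 100 × r, so the choice matters mainly for widely spaced markers.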

Apart from these linkage maps, integrated maps have also been built. Kong *et al*. [36] reported an integrated linkage map of SSRs and RFLPs built with different sorghum lines. The SSR *loci* were designed from clones isolated from two sorghum BAC libraries. The linkage map covered 1,406 cM and consisted of 147 SSR *loci* and 323 RFLP *loci*. Klein *et al*. [36] constructed an integrated physical and genetic map of the sorghum genome (750 Mb) using PCR-based methods for the creation of BAC libraries and the localization of BAC clones on sorghum genetic maps. Menz *et al*. [38] also built a genetic map using AFLPs. The 1,713 cM map comprised 2,926 *loci* distributed among the 10 linkage groups, of which 2,454 were AFLPs, 136 were SSRs previously mapped in sorghum and 203 were cDNA and genomic clones from rice, barley, oat and maize. Another map [39] consisted of 2,512 *loci* spaced at average intervals of 0.4 cM and was based on 2,050 RFLP probes, including 865 heterologous probes from sugarcane, maize, rice, *Pennisetum setaceum* and *Arabidopsis thaliana*.

More recently, Ji *et al*. [40] published a high-density genetic map based on specific length amplified fragment (SLAF) markers. The map used an F2 population of 130 individuals derived from a cross between a grain sorghum variety, J204, and a sweet sorghum variety, Keter. High-throughput sequencing generated 43 million reads, from which 52,928 SLAFs were obtained. Of these markers, 12% appeared to be polymorphic, and 2,246 of them were used to build a linkage map covering the 10 chromosomes. The total length was 2,158 cM, which is 50% more than the previously available maps, which were constructed using RFLPs.
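The map statistics quoted above lend themselves to a quick arithmetic check. The following is a minimal sketch, not part of the cited studies: it divides total map length by the number of loci to obtain a crude mean spacing (ignoring that each of the 10 linkage groups has one fewer interval than loci) and estimates the number of polymorphic SLAFs from the 12% figure. The class and variable names are illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LinkageMap:
    label: str
    length_cm: float  # total map length in centimorgans
    n_loci: int       # number of mapped loci

    @property
    def mean_spacing_cm(self) -> float:
        # Crude average: total length divided by number of loci.
        return self.length_cm / self.n_loci


# Figures as reported in the text for refs. [38] and [40].
maps = [
    LinkageMap("Menz et al. [38]", 1713.0, 2926),
    LinkageMap("Ji et al. [40]", 2158.0, 2246),
]

for m in maps:
    print(f"{m.label}: {m.mean_spacing_cm:.2f} cM per locus")

# Approximate number of polymorphic SLAFs implied by the 12% figure.
slafs_total = 52_928
polymorphic_fraction = 0.12
approx_polymorphic = round(slafs_total * polymorphic_fraction)
print(f"Approximate polymorphic SLAFs: {approx_polymorphic}")
```

Under these figures, the SLAF map is longer but less dense than the AFLP-based map, and only about a third of the polymorphic SLAFs ended up on the final map.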

Another approach is comparative genome mapping, which interests geneticists and evolutionary biologists because it helps elucidate the mechanisms that drive chromosome evolution. It is a powerful technique for studying how and when chromosomal evolution occurs [23]. The approach uses molecular markers, such as RFLPs, to map the genomes of two species with a shared set of markers (*loci*). Although it is expensive and labor-intensive, this method can determine the extent and nature of chromosomal rearrangement between species that cannot be crossed. The finding of small chromosomal regions that retain a similar gene order in sorghum and in two dicotyledonous species (*Arabidopsis* and *Gossypium hirsutum*) suggests that comparative mapping can span greater evolutionary distances than has been achieved so far.

Within the grass tribe *Andropogoneae*, comparative mapping facilitates the understanding of sorghum genetics. Several research groups have established the relationship between the sorghum and maize genomes [27, 28, 32, 41, 42]. The high degree of conservation of gene order between these two crops has limited the identification of chromosomal rearrangements between them. Sorghum has also been compared with rice, where some apparent collinearity was likewise found.

By 2015, more than 850 *loci* associated with traits relevant to biofuel production had been identified in sorghum. These traits concern plant architecture (roots, leaves and stem), flowering time, and the conversion rate of biomass into biofuels. These quantitative trait *loci* (QTLs) have been found in different mapping populations, which suggests plasticity of the traits across environments. The genes located in these QTL regions are therefore potential targets for improving sweet sorghum yield for biomass and biofuel production [43].

Despite the many QTLs already reported, very few studies have aimed at genetically improving these traits. In one of them, a quantitative gene (dw3), orthologous to brachytic2 (br2) from maize, was cloned with the intention of reducing plant height. This gene encodes a P-glycoprotein that modulates auxin transport in maize stems [44]. Another group cloned and sequenced, from the sweet cultivar Rio, homologous genes of the sucrose transporter proteins (SUTs) and compared them with the published sequence of the grain sorghum variety BTx623. Six SUTs were identified in BTx623, along with nine differences in the amino acid sequence of SbSUT5 between the two varieties. Of the five remaining SUTs, two (SbSUT1 and SbSUT2) showed unique variations in their amino acid sequences, whereas the rest were identical. It was also shown that sorghum SUTs can transport sucrose in a *Saccharomyces* mutant (SEY6210) that is unable to grow with sucrose as the only available carbon source [45]. These examples highlight how little is known about the genes underlying biofuel-related traits in sweet sorghum, and they underline the potential of sweet sorghum breeding for biofuel production through the exploitation of its genetic resources.

## **5. Genome sequencing and sorghum functional genomics**

Whole-genome sequencing of the line BTx623 is now complete, and approximately 10.5 million reads (8X coverage) have been deposited in the NCBI database. In the preliminary assembly, more than 97% of the protein-coding genes of sorghum (as represented by expressed sequence tags, ESTs) were found in 250 large contigs. Most of these could be joined, ordered and oriented using genetic and physical maps to reconstruct complete chromosomes. The preliminary alignment of the sorghum assembly was based on methyl-filtered sequences. In addition, alignment of sorghum, maize and sugarcane transcripts, as well as *Arabidopsis* and rice proteomes, confirmed the accuracy of the assembly at the base level and in local structure. This allowed an approximate prediction of 30,000 to 50,000 protein-coding *loci*. The conserved synteny with rice is evident, as expected from the earlier map comparisons [10].
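The read count and coverage quoted above can be related through the usual definition of fold coverage (number of reads × mean read length ÷ genome size). The sketch below is a back-of-the-envelope check rather than a figure from the source: it assumes the 730 Mb genome length given in Section 4 and derives the mean read length that the stated numbers imply. The function names are illustrative.

```python
def fold_coverage(n_reads: int, mean_read_len_bp: float, genome_size_bp: int) -> float:
    """Average sequencing depth: total sequenced bases over genome size."""
    return n_reads * mean_read_len_bp / genome_size_bp


def implied_read_length(n_reads: int, coverage: float, genome_size_bp: int) -> float:
    """Mean read length (bp) implied by a read count and a fold coverage."""
    return coverage * genome_size_bp / n_reads


N_READS = 10_500_000       # reads deposited, as stated in the text
COVERAGE = 8.0             # reported fold coverage
GENOME_BP = 730_000_000    # assumption: genome length taken from Section 4

read_len = implied_read_length(N_READS, COVERAGE, GENOME_BP)
print(f"Implied mean read length: {read_len:.0f} bp")
print(f"Round-trip coverage: {fold_coverage(N_READS, read_len, GENOME_BP):.1f}X")
```

The implied read length is in the range of several hundred base pairs; using the 750 Mb estimate quoted for the physical map instead would raise it slightly.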

The gene space of sorghum is represented by approximately 125,000 ESTs, which have been grouped into 22,000 unigenes and represent more than 20 diverse libraries from different genotypes [46]. Around 50,000 methyl-filtered reads, which provide an estimated coverage of 1X [47], have been assembled into contigs. Another strategy is Cot-based cloning and sequencing, which was first used in sorghum in 2001 [48]. This method has the potential to extend coverage beyond what can be achieved with ESTs and methyl-filtered reads, as demonstrated in maize.

Progress in transcriptome characterization has been paralleled by the identification of genes that are differentially expressed in response to biotic and abiotic factors, including insect damage, dehydration, high salt concentration, abscisic acid [49], methyl jasmonate, salicylic acid and aminocyclopropane carboxylic acid [50].

## **6. Post-transcriptional regulation by miRNAs in sorghum**

MicroRNAs (miRNAs) are small RNA molecules of approximately 21 nucleotides that play an important role in post-transcriptional gene regulation: they inhibit the translation of messenger RNAs (mRNAs) by blocking the translation machinery or by cleaving the mRNAs [51]. In plants, most miRNAs promote the degradation of their mRNA targets through perfect or near-perfect pairing of the complementary RNA strands [52]. miRNAs take part in a variety of biological processes, such as organ development and identity, metabolism and stress responses [53]. A substantial number of miRNAs have been identified in different plants, and studies in sorghum that identify miRNAs and their target genes have recently been increasing.

Recently, Katiyar *et al*. [54] showed the importance of studying miRNAs and other RNA molecules by sequencing RNA libraries prepared from a drought-tolerant genotype (M35-1) and a drought-susceptible one. The plants were grown under controlled conditions as well as under drought stress. Sequencing of the resulting RNA profiles identified 96 miRNAs regulated by drought stress. This opens new perspectives for genetic engineering, given the potential of miRNAs to improve resistance to drought and to other types of abiotic stress.

Along the same research line, in 2016 Hamza *et al*. examined 8 miRNAs deregulated by abiotic stress in 11 elite sorghum varieties under low water availability and drought [55]. The study showed that miR396, miR393, miR397-5p, miR166, miR167 and miR168 were significantly deregulated, with sbi-miR396 and sbi-miRNA398 showing the highest overexpression in all genotypes. The same research group has studied the effects of drought and salinity on the miRNA profiles of *S. bicolor* [56]. The results confirm that miRNA expression patterns depend on the dose of stress to which the plants are subjected; however, each miRNA responded in a unique way in each of the six genotypes.

Another important trait for sweet sorghum improvement is sugar accumulation, which has been studied by Yu *et al*. [57], who proposed mir-271 as a miRNA specific to the sweet sorghum variety Rio and related to cellulose synthesis and sugar accumulation. A detailed list of most of the miRNAs relevant to the genetic improvement of sorghum for biofuel production was published by Dhaka *et al*. [58].
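The perfect or near-perfect complementarity between plant miRNAs and their mRNA targets, described at the start of this section, can be illustrated with a short sketch. It is a deliberate simplification and not the method of any cited study: it counts only strict Watson-Crick mismatches between a candidate target site and the reverse complement of the miRNA, ignores G:U wobble pairs and position-dependent scoring, and uses a made-up 21-nucleotide sequence rather than a real sorghum miRNA.

```python
# Watson-Crick complements for RNA bases.
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA sequence written 5'->3'."""
    return rna.upper().translate(COMPLEMENT)[::-1]


def count_mismatches(mirna: str, target_site: str) -> int:
    """Count positions where a target site (5'->3') fails to pair with the miRNA.

    A perfectly complementary site equals the reverse complement of the miRNA.
    """
    if len(mirna) != len(target_site):
        raise ValueError("miRNA and target site must have the same length")
    perfect_site = reverse_complement(mirna)
    return sum(a != b for a, b in zip(perfect_site, target_site.upper()))


# Hypothetical 21-nt miRNA, for illustration only.
MIRNA = "AUGCAUGCAUGCAUGCAUGCA"

if __name__ == "__main__":
    site = reverse_complement(MIRNA)
    print(count_mismatches(MIRNA, site))  # a perfectly complementary site
```

A candidate site could then be retained when its mismatch count falls below a chosen threshold; any such threshold would be an additional assumption.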

## **7. Transformation and reverse genetics in sorghum**

Methods for sorghum transformation have been available since the early 1990s, initially through protoplasts [59] and cell culture [60] and subsequently *in planta* [61, 62]; protocols based on *Agrobacterium* and on microprojectiles are now available with substantially improved efficiencies [63–69]. Sorghum is a difficult crop to transform: it is recalcitrant to tissue culture, and the reported transformation protocols are scarce and not very reproducible. For sweet sorghum in particular, [70] proposed a transformation system based on optimized tissue culture conditions, using embryogenic callus with 90% regeneration in 12 weeks. The system combined particle bombardment with selection for hygromycin resistance conferred by the Ubi-*hpt* transgene. This method proved to be highly reproducible, with a transformation efficiency of 0.09% per embryo. In 2012, Liu and Godwin published a method with a better transformation efficiency in *S. bicolor*, using immature embryos (IEs) of the inbred line Tx430 and reaching an efficiency of 20.7% across three independent experiments [71]. The protocol, which involves microprojectiles and transgenes under the constitutive *ubi1* promoter, improves the culture medium conditions for the embryos as well as the parameters for microprojectile transformation. In this experiment, the transgenes were inherited by the T1 generation.

Later, Tien-Do *et al*. [72] developed a fast and efficient system for sorghum transformation using binary vectors and the AGL1 *Agrobacterium* strain instead of

