**3. Application of NGS technology in Solanaceae genetics and genomics studies**

NGS technologies have numerous potential applications in plant genetics and genomics, including the generation of genomic resources, complete decoding of a species' genome, differential gene expression studies, whole genome association studies (WGAS), and genomics-assisted breeding (GAB) (Figure 1).

250 Next Generation Sequencing - Advances, Applications and Challenges

**Figure 1.** Overview of NGS applications in plant genetics and genomics

**3.1. Transcriptome profiling of Solanaceae**

Transcriptome sequencing is the first step toward accessing the functionally active genes of a species. Whether performed by first-generation Sanger sequencing or by high-throughput NGS approaches, it provides insight into gene expression in a particular tissue or at different developmental stages. The large volume of sequence data also serves as a resource for identifying sequence variations and developing markers, which in turn enable the mapping of candidate genes/QTLs for important traits. These applications are discussed below for four important Solanaceae crops.

#### *3.1.1. Potato*

Potato (*S. tuberosum*) is the world's fourth largest crop after maize, rice, and wheat. Its ploidy levels range from diploid (2*n* = 24) to triploid, tetraploid, pentaploid, and hexaploid, and most cultivated varieties are autotetraploid (4*n* = 48). Potato is one of the world's most important food crops; its edible tubers develop from stolons under favorable environmental conditions. It is accepted worldwide as a cheap source of dietary starch, protein, vitamins, and antioxidants, especially for feeding large populations in developing countries. To date, only 420,074 ESTs are available in the NCBI database (http://www.ncbi.nlm.nih.gov/nucest/?term=potato); these have served as a valuable resource in various studies of gene discovery and expression analysis in potato germplasm [19–22]. In 2011, Massa et al. [23] reported a transcriptome sequence of *S. tuberosum* group Phureja clone DM1-3 516R44 generated on the Illumina GAII platform. A total of 22,704 transcripts were identified, 83% of which had a known function. Expression analysis across 32 tissues at various developmental stages revealed that more than 20,000 genes were expressed in normal potato tissue, some of them in a tissue-specific manner. In another study, weighted gene correlation network analysis (WGCNA) identified 18 gene co-expression modules comprising a total of 5400 genes [24]. These modules were classified according to the highly correlated expression profiles of their genes at particular developmental stages. Two modules contained mainly transcription factors that showed co-expression in fruit development (e.g., Leafy Cotyledon 1 and B3-domain transcription factors) and tuber-tissue-specific expression (e.g., APETALA and WRKY).
In another study, digital gene expression (DGE) profiling showed that five genes, encoding a DOF protein, a blue light receptor, a lectin, a syntaxin-like protein, and a protein of unknown function, were specifically associated with photoperiodic tuberization [25]. Hamilton et al. [26] sequenced the transcriptomes of three potato cultivars and identified a total of 55,340 SNPs using the Maq SNP filter. In 2013, a whole-genome transcript analysis of pollen mRNA from *Solanum tuberosum*, *S. demissum*, and their reciprocal F1 hybrids was performed on the Illumina GAII platform [27]. A total of 12.6 billion bases were obtained and assembled into 13,020 transcripts. The authors identified transcriptional differences among these samples, as well as nuclear genes that contributed to the differences observed in reciprocal crosses. More recently, a comparative transcriptome analysis of white and purple potato was reported using the Illumina HiSeq 2000 platform [28]. *De novo* assembly of the reads was performed for each cultivar using Trinity version r20131110 (http://trinityrnaseq.sourceforge.net/). A total of 209 million paired-end reads were assembled into 60,930 transcripts, and candidate genes encoding transcription factors involved in anthocyanin biosynthesis were identified. Aulakh et al. [29] used RNA-seq to compare global gene expression between wild-type potato (Bintje) and an activation-tagged mutant, *underperformer* (*up*), and identified approximately 1600 differentially expressed genes, suggesting that various biological pathways are modified in the mutant.
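
The idea behind co-expression modules can be illustrated with a small sketch. It is not the WGCNA procedure used in [24], which relies on topological overlap and hierarchical clustering. It only shows the core step of raising the absolute Pearson correlation between two expression profiles to a power (a soft threshold) and grouping genes whose resulting adjacency is high. The gene names, expression values, `beta`, and `min_adjacency` are hypothetical choices made for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat profile carries no co-expression signal
    return cov / sqrt(var_x * var_y)


def coexpression_modules(profiles, beta=6, min_adjacency=0.5):
    """Group genes whose soft-thresholded correlation reaches min_adjacency.

    profiles: dict mapping gene name -> expression values, one per sample.
    beta and min_adjacency are illustrative values, not tuned parameters.
    """
    genes = sorted(profiles)
    parent = {g: g for g in genes}  # union-find forest over genes

    def find(g):
        while parent[g] != g:
            parent[g] = parent[parent[g]]  # path compression
            g = parent[g]
        return g

    for i, g1 in enumerate(genes):
        for g2 in genes[i + 1:]:
            adjacency = abs(pearson(profiles[g1], profiles[g2])) ** beta
            if adjacency >= min_adjacency:
                parent[find(g1)] = find(g2)  # merge the two groups

    modules = {}
    for g in genes:
        modules.setdefault(find(g), []).append(g)
    return sorted(modules.values())


# Hypothetical expression values across four tissues.
example = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.1, 4.0, 6.2, 8.1],
    "geneC": [5.0, 1.0, 4.0, 2.0],
}
modules = coexpression_modules(example)
```

Here `geneA` and `geneB` rise together across the four tissues and fall into one module, while `geneC` stays on its own. Because the sketch merges any two genes linked by a single strong adjacency, it tends to chain modules together on real data, which is one reason dedicated network methods are preferred in practice.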

#### *3.1.2. Tomato*

Tomato is an important vegetable crop that supplies vitamins and nutrients and is consumed in different forms around the world. Whole transcriptome sequencing of six tomato accessions *Solanum pimpinellifolium* was performed by the sequencing-by-synthesis method on the Illumina GAII [30]. This generated 17 Gb of sequence data comprising 291,915,037 high-quality reads and represented an average of 32.5 Mb of transcriptomic sequence per accession. From these data, a large number of SNPs were identified for analyzing genetic variation in cultivated and wild populations. Leaf transcriptome data for tomato cv. Hongtaiyang 903 were generated using Illumina RNA-seq, which yielded 50,616 transcripts [31]. Eighty-four percent of these transcripts were functionally annotated against the NCBI nr database and 94.5% against the tomato reference genome [24]. Of these, 14,371 transcripts were assigned to 310 pathways. Expression analysis revealed that 2787 transcripts showed significant changes in expression after exogenous ABA treatment. These transcripts were related to the ABA signaling pathway, various transcription factors, heat shock proteins, and pathogen resistance. RNA-seq of one cultivated tomato (*Solanum lycopersicum* M82) and five wild species, two red-fruited (*Solanum pimpinellifolium* and *Solanum galapagense*) and three green-fruited (*Solanum habrochaites*, *Solanum chmielewskii*, and *Solanum pennellii*), was performed to study changes in gene expression and DNA sequence diversity among these six species [32]. This analysis identified several distinguishing polymorphic positions between the cultivated and wild genotypes. Further, to examine the effect of fungal symbiosis in tomato roots on fruit metabolism, Zouari et al. [33] performed RNA-Seq of *S. lycopersicum* cv. Moneymaker on the Illumina GA and profiled the transcriptome during fruit maturation.
A total of 712 genes were differentially expressed between fruits from mycorrhizal and control plants. The majority of the regulated genes were involved in functions such as photosynthesis, stress response, transport, amino acid synthesis, and carbohydrate metabolism. The study also found that arbuscular mycorrhizal (AM) fungi can replace exogenous fertilizer in growing tomato plants with nutrient-rich fruits. In addition, to examine the hormonal response in tomato roots, Gupta et al. [34] published a transcriptome atlas of tomato root using the Illumina RNA-Seq method. By mapping 165 million reads onto the tomato reference genome (*S. lycopersicum*), they identified differential expression patterns after various hormonal treatments. To investigate regulatory and metabolic pathways specific to fruit tissues, Matas et al. [35] reported a transcriptome study coupled with laser capture microdissection. Five fruit pericarp tissues were sequenced by pyrosequencing on the GS FLX platform (Roche), which identified 20,976 high-quality expressed unigenes, including genes with expression specific to a particular cell type or tissue. Very recently, Mou et al. [36] performed a global transcriptome analysis of cherry tomato (*Lycopersicon esculentum* var. *cerasiforme* "XinTaiyang") fruit after exogenous treatment with ABA and with nordihydroguaiaretic acid (NDGA, an inhibitor of ABA biosynthesis) to study their effect on the fruit ripening process. Of the total 25,728 genes, 10,388 were found to be differentially expressed. The data also revealed upregulation of pigment-related genes after exogenous ABA treatment and their downregulation after NDGA treatment. Moreover, candidate genes involved in photosynthesis were transcriptionally abundant when endogenous ABA was inhibited, which highlights the significance of ABA in regulating the ripening process in tomato fruit.
Further, to make use of the large amount of tomato transcriptome data for gene expression analysis, Bostan and Chiusano [37] recently presented a web-based platform, NexGenEx-Tom, that contains a collection of high-quality transcriptome data from several tissues at various developmental stages in different tomato genotypes. It serves as a useful tool for gene expression profiling and for comparisons across tissues and genotypes.
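
Several of the tomato studies above report counts of differentially expressed genes. As a rough illustration of what such a comparison involves, the sketch below normalizes raw read counts to counts per million (CPM) and computes a log2 fold change between a control and a treated sample. It is a minimal sketch under stated assumptions, not the method of any cited study: the gene names, read counts, pseudocount, and the twofold threshold are hypothetical, and real analyses use biological replicates and dedicated statistical tests rather than a fixed cutoff.

```python
from math import log2


def counts_per_million(counts):
    """Scale raw read counts so that each library sums to one million."""
    total = sum(counts.values())
    return {gene: value * 1_000_000 / total for gene, value in counts.items()}


def log2_fold_changes(control, treated, pseudocount=1.0):
    """Return log2(treated / control) per gene on CPM-normalized counts.

    Assumes both samples report the same set of genes. The pseudocount
    avoids division by zero for genes with no reads in one sample.
    """
    cpm_control = counts_per_million(control)
    cpm_treated = counts_per_million(treated)
    return {
        gene: log2(
            (cpm_treated[gene] + pseudocount) / (cpm_control[gene] + pseudocount)
        )
        for gene in control
    }


def flag_candidates(fold_changes, threshold=1.0):
    """Split genes into up- and down-regulated candidates by log2 fold change."""
    up = sorted(g for g, fc in fold_changes.items() if fc >= threshold)
    down = sorted(g for g, fc in fold_changes.items() if fc <= -threshold)
    return up, down


# Hypothetical read counts; gene names are placeholders.
control = {"gene1": 500, "gene2": 300, "gene3": 200}
treated = {"gene1": 100, "gene2": 300, "gene3": 600}
fold_changes = log2_fold_changes(control, treated)
up, down = flag_candidates(fold_changes)
```

With these made-up counts, `gene3` is flagged as an upregulated candidate, `gene1` as a downregulated candidate, and `gene2` is unchanged. A threshold of 1.0 on the log2 scale corresponds to a twofold change in either direction.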

#### *3.1.3. Pepper (Capsicum)*

*Capsicum* is a diploid (2*n* = 2*x* = 24), self-pollinating plant. It is closely related to other members of the Solanaceae family, such as potato, tomato, and tobacco, that originated in the New World. The genus contains 39 species, of which only six are cultivated: *C. annuum*, *C. baccatum*, *C. frutescens*, *C. chinense*, *C. pubescens*, and *C. assamicum* [38, 39]. *Capsicum* species are grouped as pungent (hot/spicy) or nonpungent (sweet) pepper based on the presence or absence of capsaicinoid compounds, respectively, and are used as a major ingredient in various cuisines around the world. The fruit contains beneficial metabolites such as carotenoids (provitamin A), vitamins C and E, flavonoids, and capsaicinoids. It is also used as a coloring agent in food and has several medicinal properties, and is therefore used in traditional medicine. Moreover, several studies have suggested that capsaicinoids are effective in inhibiting the growth of cancer [40–42], relieving arthritis pain, reducing appetite, and managing weight [43–45]. A large number of chili pepper varieties are available that are well adapted to diverse climatic conditions around the world [46]. Many studies have targeted the development of genetic and genomic resources for crop improvement [39]. A *Capsicum* transcriptome database (DB, http://www.bioingenios.ira.cinvestav.mx:81/Joomla/) was developed by sequencing the *C. annuum* transcriptome from different tissues [47]. The authors obtained 1,324,516 raw reads, from which 32,314 high-quality contigs and 51,118 singletons were assembled. Functional annotation of 75% of the contigs was carried out, resulting in 7481 novel sequences. Further, the transcriptome of red pepper (*C. annuum* L. TF68) was analyzed using the 454 GS-FLX pyrosequencing platform [48], which yielded approximately 30.63 Mb of EST data with 9818 contigs and 23,712 singletons.
In another study, Nicolai et al. [49] performed transcriptome analysis using Roche 454 pyrosequencing, producing an assembly of 23,748 contigs and 60,370 singletons. Using these data, they identified a total of 11,849 SNPs and 853 SSRs. In a separate study, Ashrafi et al. [50] used three chili genotypes, namely, Maor, Early Jalapeno, and Criollo de Morelos-334 (CM334), for transcriptome sequencing. From the first assembly, they identified a total of 4236 SNPs and 2489 SSRs, while the second transcriptome assembly, based on Illumina GAII, resulted in 22,000 high-quality putative SNPs and 10,398 SSRs. Recently, an Affymetrix Pepper GeneChip array for polymorphism detection and expression analysis in *Capsicum* was reported [51]. Hybridization of genomic DNA from 40 diverse *C. annuum* lines and a few lines from other cultivated species, such as *C. frutescens*, *C. chinense*, and *C. pubescens*, generated 33,401 single-position polymorphism (SPP) markers from 13,323 unigenes. Liu et al. [52] constructed a *de novo* transcriptome assembly in *C. frutescens* and obtained 54,045 high-quality unigenes, in which a total of 4072 SSRs were identified. They also identified three candidate genes involved in the capsaicinoid biosynthesis pathway: dihydroxyacid dehydratase (DHAD), Thr deaminase (TD), and prephenate aminotransferase (PAT). Additionally, a total of 9150 putative SNPs in 3349 contigs were identified between *C. frutescens* and *C. annuum*. In another study, high-throughput transcriptome profiling of two *C. annuum* varieties produced 279,221 and 316,357 sequenced reads, totaling 120.44 and 142.54 Mb of sequence data, respectively. A total of 9701 and 12,741 potential SNPs were identified in the two varieties [53].
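
The SSR counts reported in these pepper studies come from scanning assembled transcripts for short tandem repeats. The sketch below illustrates that kind of scan for perfect di-, tri-, and tetranucleotide repeats using a regular expression with a backreference. It is only an illustration under stated assumptions: the minimum repeat numbers and the example sequence are hypothetical and are not the criteria or data of any study cited here.

```python
import re


def find_ssrs(sequence, min_repeats=None):
    """Return (start, motif, repeat_count) for perfect microsatellites.

    min_repeats maps motif length -> minimum number of tandem copies.
    The defaults are illustrative thresholds, not those of any cited study.
    """
    if min_repeats is None:
        min_repeats = {2: 6, 3: 5, 4: 4}
    sequence = sequence.upper()
    hits = []
    for motif_len, min_count in sorted(min_repeats.items()):
        # ([ACGT]{k}) captures one motif; \1{n,} demands n or more extra copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (motif_len, min_count - 1))
        for match in pattern.finditer(sequence):
            motif = match.group(1)
            # Skip motifs that are themselves repeats of a shorter unit
            # (e.g. AA, or AGAG, which is a dinucleotide repeat in disguise).
            if any(
                motif == motif[:k] * (motif_len // k)
                for k in range(1, motif_len)
                if motif_len % k == 0
            ):
                continue
            hits.append((match.start(), motif, len(match.group(0)) // motif_len))
    return sorted(hits)


# Hypothetical transcript fragment containing (AG)7 and (CTT)5.
transcript = "GGC" + "AG" * 7 + "TCA" + "CTT" * 5 + "GA"
ssrs = find_ssrs(transcript)
```

For the made-up fragment, `ssrs` holds two entries: an `AG` repeat of 7 units starting at position 3 and a `CTT` repeat of 5 units starting at position 20 (0-based). Imperfect and compound repeats are deliberately ignored here to keep the example short.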
