**6. Single Cell Genome Analyses of the extremophilic microbial diversity**

A recent concept in culture-independent approaches for identifying and characterizing microbial genetic and metabolic diversity is "Single Cell Genome Analyses (SCGA)" [106]. This approach accesses genomes one cell at a time and therefore allows microbial genetic and metabolic diversity to be analyzed at the level of the most fundamental biological unit. The central technical step is the separation of individual cells from a complex environmental matrix using a cell sorting method such as fluorescence-activated cell sorting (FACS). Cell separation is followed by cell lysis and recovery of femtogram levels of DNA from a single cell. The recovered single-cell genomic DNA is amplified using multiple displacement amplification (MDA), such that the quantity of DNA increases to 100s of nanograms to 10s of micrograms (a 10³ to 10⁶-fold increase) [107, 108]. The single amplified genomes (SAGs) are subsequently screened by PCR amplification and NGS sequencing. The taxonomic identity of the extremophilic microbial cell is ascertained by 16S rRNA gene sequencing, whereas subsequent shotgun or NGS sequencing, assembly and annotation are carried out on the SAGs of interest identified through this preliminary phylotype characterization [106–109].

Despite its tremendous scientific capabilities, SCGA is yet to make a far-reaching impact on microbial genomics in general and extremophilic microbiology in particular. The technical procedure used for SCGA faces many challenges that are not yet completely addressed. The most critical challenges include: (i) technical limitations in the precise and reproducible separation of single bacterial cells with available methodologies; (ii) low amounts of starting DNA recoverable from a single bacterial cell; (iii) the requirement for a high degree of amplification; (iv) the possibility of cross-contamination; (v) introduction of chimeric artifacts and biases in genomic coverage during single genome amplification; and (vi) poor post-sequencing quality control, data analysis and sequence assembly [110]. Due to these limitations, the composite assemblies resulting from SCGA can often represent incomplete or inaccurately characterized genomes for a given strain or species [107, 111]. However, several technological improvements are being made to circumvent these limitations, which should soon enable highly accurate data generation and physiological interpretation based on the absence as well as the presence of genes and pathways [108].
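The coverage bias named in challenge (v) can be made concrete with a simple evenness score. The sketch below is illustrative only and is not taken from the cited studies: it assumes per-window read depths are already available as a list of numbers, and it uses the Gini coefficient (a generic inequality measure, chosen here as an assumption) where 0 means perfectly even coverage and values approaching 1 mean that a few windows hold most of the reads. The function name `gini` and the example depth values are hypothetical.

```python
def gini(values):
    """Gini coefficient of non-negative numbers.

    0.0 means all values are equal (even coverage); values closer to 1.0
    mean that coverage is concentrated in a few windows.
    """
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        raise ValueError("need at least one non-zero coverage value")
    # Standard closed form on sorted data:
    # G = 2 * sum(i * x_i) / (n * sum(x)) - (n + 1) / n, with i starting at 1
    weighted = sum(i * x for i, x in enumerate(xs, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n


# Hypothetical per-window read depths (made-up numbers for illustration).
even_depths = [30, 31, 29, 30]      # roughly uniform coverage
skewed_depths = [0, 2, 5, 113]      # most reads pile up in one window

print(gini(even_depths))
print(gini(skewed_depths))
```

A higher score for the skewed profile than for the even one is the pattern one would look for when comparing an amplified single-cell library against an unamplified control; any threshold for "too biased" would need to be chosen empirically.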

*Extremophilic Microbes and Metabolites - Diversity, Bioprospecting and Biotechnological...*

resulted in identification of multiple predicted glycopeptide-encoding gene clusters from the soil metagenomic libraries. In the follow-up studies, the novel glycopeptide synthesis related gene(s) and gene cluster(s) identified from the metagenomic DNA were transformed and heterologously expressed in a *Streptomyces* expression host [95, 96]. Such technical intervention resulted in several new derivative glycopeptide antibiotics (with methyl, sulfur and sugar substitutions) being synthesized.

**5.2 Identification and characterization of cyanobactins**

Cyanobactins are a family of small, cyclic peptides produced by cyanobacteria and consist of an N-to-C macrocyclized chain of 6–20 amino acids. They are generally assembled through the cleavage and modification of short precursor proteins. Many of these peptides show antimalarial or antitumor activity [97]. It is speculated that close to 30% of all cyanobacterial strains contain genes for the synthesis of cyanobactins [98, 99]. It is also speculated that bacteria other than cyanobacteria may harbor gene(s) and gene cluster(s) for the synthesis of cyanobactins [98]. However, access to such cyanobactin gene cluster(s) is limited due to the non-cultivability of the vast microbial majority. A few metagenomic studies have reported cloning and heterologous expression of biosynthetic gene clusters for cyanobactins. In one such study, the gene cluster for 'patellamide' was cloned and heterologously expressed from metagenomic libraries of uncultured cyanobacterial symbionts associated with marine sponge [100, 101]. In other studies, the structural diversity of cyanobactins was enriched by making subtle changes in the gene encoding the precursor peptide and combining these with multiple strategies, e.g. (i) orthogonal loading of unnatural amino acids; (ii) mutagenesis of the precursor peptide; (iii) generation of a library of hybrid cyanobactins [90].

**5.3 Identification and characterization of Type II polyketides**

Type II polyketides are a group of small molecules with aromatic rings that contain alternating carbonyl and methylene groups (-CO-CH2-). Many of the Type II polyketides (e.g. tetracycline and doxorubicin) are well documented for antimicrobial and anticancer activities [90]. The gene clusters involved in synthesis of these small molecules are rather divergent and exhibit low levels of DNA sequence homology, yet each of them contains at least a 'polyketide synthase', encoded by three highly conserved genes, i.e. two genes for ketosynthases (KSs) and one gene for an acyl carrier protein. These three genes are referred to as the 'minimal PKS synthesis gene cluster'. Studies carried out with metagenomes in general and extremophilic metagenomes in particular have shown a rich diversity of novel 'minimal PKS synthesis gene clusters' [102]. In subsequent studies, gene clusters with minimal PKS synthesis genes were identified in soil metagenomes [103]. Their transformation and heterologous expression in different strains belonging to the genus *Streptomyces* led to the synthesis and identification of several new polyketide metabolites with previously unknown and rare carbon skeletons [93].

**5.4 Identification and characterization of trans-acyltransferase polyketides**

This class of small molecule polyketides is biosynthesized through the activity of freestanding acyltransferases and constitutes one of the most important groups of pharmacologically interesting polyketides. Considering their pharmaceutical

### **6.1 Combining single cell genomics and metagenomics**

Despite the individual technical limitations of both approaches, the combined application of single-cell genomics and metagenomics is regarded as offering great opportunities, since the advantages of the two techniques are complementary. On the one hand, metagenomics does not suffer from the problems associated with chimera generation during strand displacement and genome amplification, or with the separation of individual microbial cells from a complex heterogeneous mixture. On the other hand, single-cell genomics overcomes a key limitation of metagenomics by providing a direct and unambiguous association between phylogeny and metabolic functions. Information obtained from SCGA can be used to assign taxonomy to individual metagenome contigs with high accuracy [107, 112–114]. SCGA may also be used to retrieve complete genomes of candidate taxa from metagenomic data. Similarly, metagenomic reads can be mapped back to the scaffolds of closely related SAGs and thereby significantly improve their annotation.
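One simple way to picture how SAGs can guide the taxonomic assignment of metagenome contigs is composition-based matching. The sketch below is a toy illustration under stated assumptions and is not the method of the cited studies, which may rely on alignment or read mapping instead: it compares tetranucleotide frequency profiles by cosine similarity and assigns a contig to the most similar SAG reference. The function names, the toy sequences and the 0.5 similarity cutoff are all hypothetical choices made for this example.

```python
from collections import Counter
from itertools import product
from math import sqrt

K = 4  # tetranucleotides; an assumed, commonly used word size
KMERS = ["".join(p) for p in product("ACGT", repeat=K)]


def kmer_profile(seq):
    """Relative frequency of every A/C/G/T 4-mer in seq (ambiguous bases are skipped)."""
    seq = seq.upper()
    counts = Counter(seq[i:i + K] for i in range(len(seq) - K + 1))
    total = sum(counts[m] for m in KMERS)
    if total == 0:
        raise ValueError("sequence has no unambiguous 4-mers")
    return [counts[m] / total for m in KMERS]


def cosine(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b)))


def assign_contig(contig, sag_refs, min_similarity=0.5):
    """Return (label, score) of the most similar SAG, or (None, score) if below the cutoff.

    min_similarity is an arbitrary illustrative threshold, not a recommended value.
    """
    profile = kmer_profile(contig)
    best_label, best_score = None, -1.0
    for label, ref_seq in sag_refs.items():
        score = cosine(profile, kmer_profile(ref_seq))
        if score > best_score:
            best_label, best_score = label, score
    if best_score < min_similarity:
        return None, best_score
    return best_label, best_score


# Toy, made-up sequences standing in for SAG assemblies and metagenome contigs.
sag_refs = {
    "SAG_AT_rich": "ATATTTAAATTA" * 20,
    "SAG_GC_rich": "GCGGCCGCGGGC" * 20,
}
label, score = assign_contig("TTAAATATATTT" * 10, sag_refs)
print(label, round(score, 3))
```

Real contigs are far longer and noisier than these toy strings, so in practice such a composition signal would typically be combined with sequence alignment or coverage information before a taxonomic label is trusted.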

The synergistic application of metagenomics and single-cell genomics is regarded as having far-reaching implications for harnessing the biotechnological potential of extremophilic microbial diversity. Indeed, extreme environments have already featured prominently in studies implementing both metagenomics and single-cell genomics. The most noteworthy set of studies was performed on acidophilic biofilms of Richmond Mine, California, USA, wherein initial metagenomic studies led to the identification of the dominant microbial communities, while subsequent single-cell genomics studies identified novel, low-abundance archaeal lineages that were later named Archaeal Richmond Mine Acidophilic Nanoorganisms (ARMAN) [115, 116]. These nanoorganisms have since been a subject of investigation worldwide. In the same vein, the synergistic application of metagenomics and single-cell genomics to samples collected from the hypersaline Santa Pola salterns, Alicante, Spain, has led to the identification of three previously uncultivated and uncharacterized halophilic phylotypes that represent the candidate phylum Nanohaloarchaeota. Apart from the taxonomic and phylogenetic characterization of novel extremophiles, this combined approach also led to the identification of their critical metabolic functions, e.g. the presence of rhodopsin and genes for a photoheterotrophic lifestyle.
