**2. Metagenomic sequencing approaches for investigating intestinal microbiomes**

#### **2.1. Bacterial identification by 16S rRNA gene sequencing**

Next-generation sequencing technologies can process large amounts of DNA in a relatively short time, and they are widely applied to 16S ribosomal RNA (16S rRNA) gene sequences. Several high-throughput platforms, such as the 454 Roche GS FLX, the Applied Biosystems SOLiD System, the Illumina HiSeq and MiSeq systems and the Ion Torrent Personal Genome Machine (PGM), have been used for this kind of metagenomic approach [7, 8]. Molecular taxonomic investigation of bacteria relies on direct sequencing of short PCR-amplified segments of the 16S rRNA gene from extracted DNA, generally using universal primers that anneal to conserved nucleotides and amplify one or more of the variable regions. Because a few base pairs can change in a very short period of evolutionary time, amplicons of around 300 bp are frequently enough for taxonomic assignment [9]. Sequencing reads that share a predefined level of identity are grouped into clusters known as operational taxonomic units (OTUs), each corresponding to a group of very similar 16S sequences. Reference databases (GreenGenes, myRDP, NCBI) are then used to classify the OTUs, providing the taxonomy, relative frequencies and diversity of the community in samples obtained from a given ecosystem [10, 11]. This approach allows the identification of new species and the investigation of low-abundance bacteria, and even of uncultivated gut microbial communities, in a single analysis. In addition, these technologies are faster and more accurate than classical identification methods (cloning and culture) [12]. However, this approach provides limited information about microbiome function, for two main reasons. First, many bacterial species have not yet been characterized. Second, given the great variability found among individuals, microbiota function is expected to be highly redundant, with different species able to occupy the same niche in the gut [13].
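The grouping of reads into OTUs and the summary of relative frequencies and diversity can be illustrated with a minimal Python sketch. It is not the method of any particular pipeline: the greedy seed-based clustering, the 0.97 identity threshold, the assumption that reads are pre-aligned to equal length, the Shannon index as the diversity measure and all function names are illustrative assumptions, and the reads are synthetic toy strings.

```python
import math


def identity(a: str, b: str) -> float:
    """Fraction of matching positions between two equal-length, pre-aligned reads."""
    if len(a) != len(b):
        raise ValueError("reads must be pre-aligned to the same length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def cluster_otus(reads, threshold=0.97):
    """Greedy clustering (an assumed, simplified scheme).

    Each read joins the first OTU whose seed it matches at or above the
    threshold; otherwise it becomes the seed of a new OTU.
    """
    seeds, counts = [], []
    for read in reads:
        for i, seed in enumerate(seeds):
            if identity(read, seed) >= threshold:
                counts[i] += 1
                break
        else:
            seeds.append(read)
            counts.append(1)
    return seeds, counts


def relative_abundance(counts):
    """Share of all reads assigned to each OTU."""
    total = sum(counts)
    return [c / total for c in counts]


def shannon_index(counts):
    """Shannon diversity H = -sum(p * ln p) over OTU relative abundances."""
    return -sum(p * math.log(p) for p in relative_abundance(counts) if p > 0)


# Synthetic 100 bp toy reads (not real 16S sequences).
base = "ACGT" * 25
near = "TT" + base[2:]   # 2 mismatches: 98% identical to base
other = "TTGCA" * 20     # unrelated sequence

seeds, counts = cluster_otus([base, near, base, other, other])
print(len(seeds), "OTUs with read counts", counts)
print("relative abundance:", relative_abundance(counts))
print("Shannon index:", round(shannon_index(counts), 3))
```

In practice the seed of each OTU would then be compared against a reference database for taxonomic classification, a step this sketch leaves out.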

#### **2.2. Shotgun sequencing for predictive functional analyses**

Whole-metagenome sequencing can be performed using a shotgun approach. In this genomic survey strategy, total purified genomic DNA is fragmented and sequenced, and the resulting reads are assembled into continuous overlapping sequences (contigs), which are used to identify genes through alignment with bacterial reference genomes and databases (KEGG, SEED and NCBI) [6]. The shotgun approach is quite versatile: samples can be randomly fragmented by various methods, including nebulization, endonuclease digestion or sonication, followed by sequencing, contig assembly and annotation. Furthermore, advanced computational methods applying different algorithms are continually being developed for more accurate assembly and annotation of genes, thus allowing functional characterization of complex environments such as the human gut [14, 15]. This method also identifies variants and polymorphisms and gives a more comprehensive understanding of the functions of microbial communities, for example by reconstructing metabolic pathways *in silico* [12]. A major limitation of this strategy is that metagenomic sequencing of multiple individuals is far more expensive than 16S rRNA sequencing, and the large amount of data generated demands intensive computational analysis, most often performed by bioinformatics specialists [16].
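The idea of fragmenting DNA and rebuilding contigs from overlapping reads can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not a real assembler: the fixed-step `shear` function is a deterministic stand-in for random fragmentation, the greedy suffix-prefix merge ignores sequencing errors, reverse complements and repeats, and the 20 bp "genome" and all function names are hypothetical.

```python
def shear(genome: str, read_len: int = 10, step: int = 6):
    """Cut overlapping windows from a sequence (stand-in for random fragmentation)."""
    reads, start = [], 0
    while True:
        reads.append(genome[start:start + read_len])
        if start + read_len >= len(genome):
            return reads
        start += step


def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of `a` that equals a prefix of `b`."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(reads, min_len: int = 3):
    """Repeatedly merge the pair of fragments with the largest overlap."""
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while True:
        best_n, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_n:
                        best_n, best_i, best_j = n, i, j
        if best_n == 0:
            return contigs  # no remaining overlaps: these are the final contigs
        merged = contigs[best_i] + contigs[best_j][best_n:]
        contigs = [c for k, c in enumerate(contigs) if k not in (best_i, best_j)]
        contigs.append(merged)


# Hypothetical toy sequence chosen so that every 3-mer is unique.
genome = "ATGGCGTACCTTAGAGTCAA"
reads = shear(genome)
print(reads)
print(greedy_assemble(reads))
```

Real metagenomes contain repeats and many genomes at uneven coverage, which is why dedicated assembly algorithms and annotation against reference databases are needed in practice.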

#### **2.3. Metagenomic consortium**

in the human gut, where they occupy niches and contribute to nutrient processing, resistance to pathogen colonization and development of the mucosal immune system [1]. The intestinal microbiome is formed by hundreds of different bacterial species colonizing mucosal surfaces. Its compositional structure differs across human populations according to geographic region, method of delivery at childbirth, breast or bottle feeding, age, diet and medications [2]. The functional role of an individual's microbiota is defined by its repertoire of expressed genes, known as the metagenome. Impressively, it is estimated that humans possess 10 million extra genes from intestinal bacteria [3]. Significant perturbation of the gut microbiota can lead to a state of dysbiosis, which compromises important functions in host immunity and raises susceptibility to immune-mediated diseases [4]. Therefore, there has been great interest in identifying the metagenomic content of the gut microbiota that can be used to treat or prevent diseases. In this context, extensive efforts are being made to elucidate the gut ecosystem and the molecular mechanisms underlying the pathogenesis of several intestinal disorders [5]. Culture-independent methods, in particular next-generation sequencing technologies, have prompted a huge breakthrough in our knowledge of the microbial communities colonizing the human body and their functional benefits to host health [6].

The massive increase in data on the human gut microbiota, and in the genes and gene families deposited in databases, has prompted the creation of consortia. The Human Microbiome Project established a reference microbial genome database based on 16S profiling of 242 healthy adults from the United States [17]. The European Metagenomics of the Human Intestinal Tract (MetaHIT) project sought associations of the gut microbiome with obesity and inflammatory bowel disease (IBD), using 540 Gb of DNA sequence from stool samples of 124 healthy or sick individuals [18]. About 1000 bacterial species were found in that study; each individual was estimated to harbor at least 160 species, and 18 species were common to all 124 individuals [19]. Nevertheless, complementary approaches and integrative analyses are required to understand the complex interactions between the gut microbiota and the host, such as metatranscriptomics and metaproteomics for studying the functional aspects of the microbiota, and metabolomics [20, 21].
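Counts such as the number of species shared by every individual in a cohort amount to a set intersection over per-individual species lists. The Python sketch below shows that calculation on a made-up three-person cohort; the individual and species labels are placeholders, the function names are hypothetical, and nothing here reproduces the consortium data.

```python
from collections import Counter


def core_species(profiles):
    """Species detected in every individual's profile."""
    sets = [set(species) for species in profiles.values()]
    return set.intersection(*sets) if sets else set()


def prevalence(profiles):
    """Fraction of individuals in which each species was detected."""
    n = len(profiles)
    counts = Counter(sp for species in profiles.values() for sp in set(species))
    return {sp: c / n for sp, c in counts.items()}


# Placeholder cohort: labels are illustrative, not real observations.
profiles = {
    "individual_1": ["species_A", "species_B", "species_C"],
    "individual_2": ["species_A", "species_C", "species_D"],
    "individual_3": ["species_A", "species_C"],
}

print("core species:", sorted(core_species(profiles)))
print("prevalence:", prevalence(profiles))
```

Relaxing the core definition to a prevalence cutoff below 100% is a common variation, and the `prevalence` output supports it directly.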
