**2. Bioinformatic tools in metagenomics and metatranscriptomics in different samples**

The first step of an NGS-based study is the extraction of nucleic acids in sufficient quantity and quality for sequencing, so that the resulting data give an unbiased picture of the microbial diversity present in a sample [6, 12, 13]. DNA samples (environmental and host) can be processed by recovering cells through gradient centrifugation in differential media and then purifying the DNA on silica columns [6]. An alternative is *in situ* lysis of the sample with enzymes (proteinase K and lysozyme), followed by removal of cell debris by centrifugation and recovery of the DNA by solvent precipitation or silica adsorption [14]. The main advantage of *in situ* lysis is that it yields more DNA than cell recovery techniques; however, it carries a risk of co-extracting contaminants that may interfere with sequencing reactions [6].

For RNA, the main methods perform *in situ* lysis of the sample under RNase-free conditions, using guanidine salts and solvents to inactivate ribonucleases [15]. Samples should be frozen on dry ice or in liquid nitrogen and kept at −80°C to prevent degradation.

Nucleic acid quality can be assessed by visualization on agarose gels, by spectrophotometry (NanoDrop), and by microfluidic electrophoresis (Bioanalyzer). The last of these has been widely used because it allows nucleic acids to be visualized and quantified simultaneously [8, 16–18]. For RNA, these systems report the RNA Integrity Number (RIN), a score from 1 (degraded) to 10 (intact) derived from the electrophoretic profile, including the ratio between the large and small rRNA subunits; samples should have a RIN greater than 8.0.

#### **2.1 NGS shared tools for metagenomics and metatranscriptomics**

Once a sample of sufficient quality and quantity has been sequenced, a series of files with the ".fastq" extension is obtained, which contain the sequence and the quality of each base. This format is read by programs such as FastQC and PRINSEQ to perform quality control of the sequencing, reporting basic statistics such as the total number of bases, read length, GC content, and per-base quality in Phred+33 or Phred+64 encoding, as well as the presence of overrepresented sequences [8, 19–23]. The files are then passed to trimming programs (Trimmomatic, Trim Galore, and Cutadapt) that trim the reads in the ".fastq" file according to the quality of each nucleotide, removing bases with a Phred score below 20 and discarding reads shorter than a minimum length defined by the user [19, 20, 22, 24].
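The quality-control and trimming logic described above can be illustrated with a minimal Python sketch. It is not the algorithm used by FastQC, Trimmomatic, or any other tool named here; it assumes well-formed four-line FASTQ records in Phred+33 encoding, and the function names (`parse_fastq`, `trim_3prime`, `quality_filter`) and the example reads are hypothetical. A Phred score Q corresponds to an error probability of 10^(−Q/10), so the threshold of 20 means at most a 1% chance that a base call is wrong.

```python
from typing import Iterable, Iterator, List, Tuple

Read = Tuple[str, str, str]  # (read_id, sequence, quality_string)


def parse_fastq(lines: Iterable[str]) -> Iterator[Read]:
    """Yield reads from well-formed 4-line FASTQ records (assumption)."""
    it = iter(lines)
    for header in it:
        header = header.strip()
        if not header:
            continue  # skip blank lines between records
        seq = next(it).strip()
        next(it)  # '+' separator line, ignored
        qual = next(it).strip()
        yield header[1:], seq, qual  # drop the leading '@'


def phred_scores(qual: str, offset: int = 33) -> List[int]:
    """Decode a quality string; offset is 33 (Phred+33) or 64 (Phred+64)."""
    return [ord(c) - offset for c in qual]


def error_probability(q: int) -> float:
    """Probability that a base call with Phred score q is wrong."""
    return 10 ** (-q / 10)


def gc_content(seq: str) -> float:
    """Fraction of G and C bases in a sequence."""
    if not seq:
        return 0.0
    s = seq.upper()
    return (s.count("G") + s.count("C")) / len(s)


def trim_3prime(seq: str, qual: str, min_q: int = 20, offset: int = 33) -> Tuple[str, str]:
    """Remove bases from the 3' end until one with score >= min_q is found."""
    scores = phred_scores(qual, offset)
    end = len(seq)
    while end > 0 and scores[end - 1] < min_q:
        end -= 1
    return seq[:end], qual[:end]


def quality_filter(records: Iterable[Read], min_q: int = 20,
                   min_len: int = 5, offset: int = 33) -> Iterator[Read]:
    """Trim each read, then drop reads shorter than the user-defined min_len."""
    for read_id, seq, qual in records:
        t_seq, t_qual = trim_3prime(seq, qual, min_q, offset)
        if len(t_seq) >= min_len:
            yield read_id, t_seq, t_qual


# Two made-up reads: 'I' encodes Q40, '#' encodes Q2, '!' encodes Q0.
EXAMPLE = """@read1
ACGTGGCCAT
+
IIIIIIII#!
@read2
ACGT
+
!!!!
"""

if __name__ == "__main__":
    for read_id, seq, qual in quality_filter(parse_fastq(EXAMPLE.splitlines())):
        print(read_id, seq, f"GC={gc_content(seq):.2f}")
```

In this example, `read1` loses its last two low-quality bases and `read2` is discarded because nothing remains after trimming. Real trimmers offer more elaborate strategies, such as sliding-window averaging, which this sketch leaves out.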

These programs can also remove primer and sequencing adapter sequences, which are supplied by the user, usually in a separate file. Their output is again in ".fastq" format; for paired-end data, read pairs in which both mates pass the filters are written to paired files, while reads whose mate was discarded are written to separate unpaired files [19, 20].
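Adapter removal and the sorting of surviving reads into output files can be sketched as follows. This is a simplified illustration, not the method of any specific program. It assumes paired-end input and exact adapter matches (real tools tolerate mismatches), and it follows the convention of Trimmomatic's paired-end mode, in which a read whose mate is discarded goes to a separate "unpaired" output. The function names, the example adapter, and the reads are assumptions made for illustration.

```python
from typing import List, Tuple

Read = Tuple[str, str, str]  # (read_id, sequence, quality_string)


def clip_adapter(seq: str, qual: str, adapter: str,
                 min_overlap: int = 3) -> Tuple[str, str]:
    """Cut a 3' adapter and everything after it (exact matching only)."""
    pos = seq.find(adapter)
    if pos == -1:
        # The read may end partway through the adapter: look for the
        # longest adapter prefix (>= min_overlap) at the 3' end of the read.
        max_k = min(len(adapter) - 1, len(seq))
        for k in range(max_k, min_overlap - 1, -1):
            if seq.endswith(adapter[:k]):
                pos = len(seq) - k
                break
    if pos == -1:
        return seq, qual  # no adapter detected
    return seq[:pos], qual[:pos]


def split_pairs(fwd: List[Read], rev: List[Read], adapter: str,
                min_len: int = 5):
    """Clip both mates, then sort them into paired and unpaired outputs."""
    paired, unpaired_fwd, unpaired_rev = [], [], []
    for f, r in zip(fwd, rev):
        f_seq, f_qual = clip_adapter(f[1], f[2], adapter)
        r_seq, r_qual = clip_adapter(r[1], r[2], adapter)
        f_new, r_new = (f[0], f_seq, f_qual), (r[0], r_seq, r_qual)
        f_ok, r_ok = len(f_seq) >= min_len, len(r_seq) >= min_len
        if f_ok and r_ok:
            paired.append((f_new, r_new))   # both mates survive
        elif f_ok:
            unpaired_fwd.append(f_new)      # reverse mate was discarded
        elif r_ok:
            unpaired_rev.append(r_new)      # forward mate was discarded
    return paired, unpaired_fwd, unpaired_rev


# Example adapter and reads, chosen only for illustration.
ADAPTER = "AGATCGGAAGAGC"
FWD = [("p1/1", "ACGTACGTAGATCGGAAGAGC", "I" * 21),
       ("p2/1", "AGATCGGAAGAGCACGT", "I" * 17)]
REV = [("p1/2", "TTGGCCAATT", "I" * 10),
       ("p2/2", "GGCCTTAAGG", "I" * 10)]

if __name__ == "__main__":
    paired, only_fwd, only_rev = split_pairs(FWD, REV, ADAPTER)
    print(len(paired), "pair(s) kept;",
          len(only_fwd), "forward-only;", len(only_rev), "reverse-only")
```

Keeping surviving mates in the same order in the paired outputs matters because downstream tools generally expect the two files of a paired-end run to list mates in matching positions.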
