#### *2.1.3 Gene insertion or deletion*

Numerous studies have documented the effect of gene knockout or knock-in on BGC expression or on levels of SM production. Conventional gene editing methods, however, are time-intensive, whereas the CRISPR-Cas9-based approach allows much faster and more efficient gene editing [30]. The emergence of CRISPR-Cas9 has opened a new era in gene editing [31]. Recently, CRISPR gene editing has been used to insert promoters in order to activate SM production in microorganisms [32].

CRISPR-Cas9 is now used to introduce promoters at multiple BGCs simultaneously, activating those BGCs and leading to the production of SMs [32]. Multiplexed site-specific genome engineering (MSGE) has also been used to edit multiple BGCs [33], and it has led to a significant increase in secondary metabolite production.

While gene editing approaches provide a significant platform for manipulating the genetic machinery of microbes toward the production of novel, natural secondary metabolites, the identification of those metabolites is equally important. Metabolomics plays a significant role in the identification and characterization of secondary metabolites produced by native or genetically modified microorganisms.

### **2.2 Identification and characterization of secondary metabolites**

Unlike other omics techniques, metabolomics often requires a broad array of instrumentation, such as coulometric array detectors for redox-active compounds, fluorescence spectrometers for aromatic compounds, and evaporative light scattering detectors (ELSD) for lipids, whereas genomics, proteomics, or transcriptomics measurements are often conducted on a single instrument.

In general, microbial secondary metabolites are investigated using two different approaches: targeted and untargeted metabolite identification [34]. Targeted experiments aim to detect a specific group of compounds (about 20 compounds) that have already been identified, whereas untargeted investigations aim to detect and identify a wide range of the metabolites produced by microorganisms, including both known and novel metabolites [35].

Over the past decade, two general technologies have emerged as the primary tools in metabolomics: nuclear magnetic resonance (NMR) spectroscopy and mass spectrometry (MS) [36]. Common MS-based analyses include gas chromatography-MS (GC-MS), capillary electrophoresis-MS (CE-MS), and liquid chromatography-MS (LC-MS) [37, 38]. These high-throughput tools provide broad coverage of many classes of secondary metabolites, including amino acids, lipids, sugars, organic acids, and others.

#### *2.2.1 Detection of secondary metabolites*

Mass spectrometry (MS) is a technique that measures the mass-to-charge ratio of ionized molecules. It is commonly coupled to chromatography, which separates the constituents of a sample because they travel at different speeds under specific conditions. The constituents therefore take different times to pass from the inlet to the detector of the chromatography system, and each is recorded at its own retention time [36].
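As a small worked illustration of the mass-to-charge ratio, the sketch below computes the m/z expected for a protonated ion, assuming positive-mode ionization in which the molecule gains one proton per unit of charge. The function name and the choice of glucose as the example compound are assumptions made here for illustration and do not come from the cited studies.

```python
# Minimal sketch: m/z of a protonated ion [M + zH]^z+ (assumed ionization mode).
PROTON_MASS = 1.007276  # mass of a proton in Da


def mz_protonated(neutral_mass: float, charge: int = 1) -> float:
    """Return m/z for a neutral monoisotopic mass (Da) carrying `charge` protons."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return (neutral_mass + charge * PROTON_MASS) / charge


# Example compound chosen for illustration only: glucose, C6H12O6.
glucose = 180.06339
mz_single = mz_protonated(glucose)      # singly charged ion
mz_double = mz_protonated(glucose, 2)   # doubly charged ion
print(round(mz_single, 4), round(mz_double, 4))
```

The same neutral molecule appears at a different m/z for each charge state, which is why the instrument reports a ratio rather than a mass.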

Nuclear magnetic resonance (NMR) spectroscopy uses the magnetic properties of atomic nuclei to determine the chemical and physical properties of the atoms or molecules that contain them. Magnetic nuclei placed in a magnetic field absorb and then re-emit electromagnetic radiation at a specific resonance frequency, which depends on the magnetic properties of the isotope as well as on the strength of the magnetic field [36].

Both MS and NMR can be used in targeted and untargeted metabolomics, and the two techniques are often complementary. While NMR can differentiate between structural isomers, MS provides information on the molecular formula [39]. Compared with NMR, MS is more sensitive and can detect a larger number of metabolites. NMR spectroscopy, on the other hand, is highly quantitative and reproducible, but it requires a larger amount of sample for analysis [40, 41].
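The dependence of the resonance frequency on the isotope and on the field strength can be made concrete with the Larmor relation, in which the frequency equals the reduced gyromagnetic ratio of the nucleus multiplied by the field strength. The following is a minimal sketch; the function name and the 11.74 T example field are assumptions for illustration, and the gyromagnetic ratios are rounded standard values.

```python
# Minimal sketch of the Larmor relation: frequency = (gamma / 2*pi) * B0.
# Reduced gyromagnetic ratios in MHz per tesla (rounded standard values).
GAMMA_BAR_MHZ_PER_T = {"1H": 42.577, "13C": 10.708}


def larmor_frequency_mhz(isotope: str, b0_tesla: float) -> float:
    """Resonance frequency in MHz of `isotope` in a field of `b0_tesla` tesla."""
    return GAMMA_BAR_MHZ_PER_T[isotope] * b0_tesla


# Example field strength chosen for illustration only.
for isotope in ("1H", "13C"):
    print(isotope, round(larmor_frequency_mhz(isotope, 11.74), 1), "MHz")
```

In the same magnet, the two isotopes resonate at clearly different frequencies, and doubling the field doubles both.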

#### *2.2.2 Data analysis*

*Enhancement and Identification of Microbial Secondary Metabolites*

*DOI: http://dx.doi.org/10.5772/intechopen.93489*

The complexity and sheer volume of the information obtained from either NMR spectroscopy or MS are considered one of the major challenges in metabolomics experiments [7]. Extracting the important information generated by MS or NMR spectroscopy depends on computer software that organizes the vast amount of data [40]. First, the raw data acquired from NMR spectroscopy or MS must be converted into formats compatible with the software packages. The goal of metabolomics data analysis is to compare and identify the differences between hundreds or thousands of SMs. It is impractical to visualize changes between groups of metabolites by analyzing metabolites individually; therefore, univariate and multivariate statistical techniques are used to interpret the data. One of the most widely used statistical methods is principal component analysis (PCA) [39, 42, 43]. PCA simplifies the data without losing their main features: it reduces the dimensionality of the data set while keeping the characteristics that contribute most to the variance. PCA thus provides information on multivariate differences among metabolites, and it is usually conducted at the early stages of data analysis.
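The dimensionality reduction performed by PCA can be sketched in a few lines. The example below is a minimal illustration using NumPy's singular value decomposition on a mean-centered samples-by-metabolites matrix; the synthetic data, the group sizes, and the function name are assumptions made here, and real workflows typically add scaling and normalization steps that are not shown.

```python
import numpy as np


def pca(X, n_components=2):
    """PCA via SVD of the mean-centered data matrix (rows = samples)."""
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)                      # center each metabolite
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = U[:, :n_components] * S[:n_components]   # sample coordinates
    loadings = Vt[:n_components]                      # metabolite weights
    explained = (S ** 2) / np.sum(S ** 2)             # variance fractions
    return scores, loadings, explained[:n_components]


# Synthetic intensities: 6 control and 6 treated samples, 50 metabolites.
rng = np.random.default_rng(0)
control = rng.normal(0.0, 1.0, size=(6, 50))
treated = rng.normal(0.0, 1.0, size=(6, 50))
treated[:, :5] += 4.0        # five metabolites respond to the treatment
X = np.vstack([control, treated])

scores, loadings, explained = pca(X)
print(scores.shape, explained)
```

Each sample is reduced from 50 values to 2 scores, and the first component is expected to capture most of the difference between the two groups because that difference dominates the variance.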

Individual metabolites can also be analyzed with univariate statistical tests, such as ANOVA, the parametric Student's t-test, and the nonparametric Wilcoxon signed-rank and Kruskal-Wallis tests [44]. Furthermore, because many metabolites are tested at once, multiple-testing corrections such as false discovery rate calculations or the Bonferroni correction can be used to validate the analysis [44].
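To illustrate how such corrections act on a set of per-metabolite p-values, the sketch below implements the Bonferroni correction and one common false discovery rate procedure, the Benjamini-Hochberg method. The choice of Benjamini-Hochberg and the example p-values are assumptions made here for illustration; the source does not specify which FDR procedure is used.

```python
import numpy as np


def bonferroni(pvals):
    """Multiply each p-value by the number of tests, capped at 1."""
    p = np.asarray(pvals, dtype=float)
    return np.minimum(p * p.size, 1.0)


def benjamini_hochberg(pvals):
    """Benjamini-Hochberg adjusted p-values (one way to control the FDR)."""
    p = np.asarray(pvals, dtype=float)
    m = p.size
    order = np.argsort(p)                          # ascending p-values
    ranked = p[order] * m / np.arange(1, m + 1)    # p * m / rank
    # Enforce monotonicity, working down from the largest p-value.
    ranked = np.minimum.accumulate(ranked[::-1])[::-1]
    adjusted = np.empty(m)
    adjusted[order] = np.minimum(ranked, 1.0)      # restore input order
    return adjusted


# Hypothetical p-values, one per metabolite.
pvals = [0.001, 0.008, 0.039, 0.041, 0.60]
print(bonferroni(pvals))
print(benjamini_hochberg(pvals))
```

The Bonferroni correction is the more conservative of the two, so fewer metabolites remain below a given significance threshold after adjustment.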

#### *2.2.3 Metabolites' identification*

*Extremophilic Microbes and Metabolites - Diversity, Bioprospecting and Biotechnological...*


Owing to the development of various bioinformatics software packages, most metabolites can be identified. Two types of metabolite identification are applied: (a) putative identification and (b) definitive identification [7]. In putative identification, one or two molecular properties are used. In definitive identification, two properties, such as the retention time and accurate mass, and/or the fragmentation mass spectrum and/or the NMR spectrum, are compared with those of an authentic chemical standard. Definitive identification is therefore more accurate than putative identification, because it relies on the authentic chemical standard. Usually, definitive identification is performed after putative identification.
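The difference between the two levels can be sketched as a simple decision rule. In the minimal example below, an accurate-mass match against a reference value alone gives a putative identification, while an additional match of mass and retention time against an authentic standard gives a definitive one. The function names, the 5 ppm and 0.1 min tolerances, and the example values are all hypothetical and chosen only for illustration.

```python
def ppm_error(observed_mz, reference_mz):
    """Mass error in parts per million."""
    return (observed_mz - reference_mz) / reference_mz * 1e6


def identification_level(obs_mz, obs_rt, ref_mz,
                         std_mz=None, std_rt=None,
                         ppm_tol=5.0, rt_tol_min=0.1):
    """Classify a detected feature (tolerances are assumed values)."""
    # Property 1: accurate mass against a database reference value.
    if abs(ppm_error(obs_mz, ref_mz)) > ppm_tol:
        return "no match"
    # Properties compared against an authentic standard, if one was measured.
    if std_mz is not None and std_rt is not None:
        mass_ok = abs(ppm_error(obs_mz, std_mz)) <= ppm_tol
        rt_ok = abs(obs_rt - std_rt) <= rt_tol_min
        if mass_ok and rt_ok:
            return "definitive"
    return "putative"


# Hypothetical feature: m/z 181.0710 eluting at 2.31 min.
print(identification_level(181.0710, 2.31, ref_mz=181.0707))
print(identification_level(181.0710, 2.31, ref_mz=181.0707,
                           std_mz=181.0708, std_rt=2.35))
```

A feature only reaches the definitive level after it has passed the putative check, which mirrors the order in which the two identifications are usually performed.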

A variety of metabolomics databases are now available online for metabolite identification [45, 46]. Some are spectral-based and others are chemical structure-based. Generally, the spectra generated during analysis are compared with those of reference compounds in the databases, and a similarity score is assigned to each match. Even though metabolome databases are updated daily, a significant number of secondary metabolites in biological systems remain unidentified.
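One common way to assign a similarity between an acquired spectrum and a reference spectrum is the cosine score between binned intensity vectors. The sketch below is a minimal illustration of that idea; the bin width, the tiny in-memory library, and the compound names are hypothetical, and real databases use their own scoring schemes that are not reproduced here.

```python
import numpy as np


def bin_spectrum(peaks, mz_max=500.0, bin_width=1.0):
    """Convert (m/z, intensity) pairs into a fixed-length intensity vector."""
    n_bins = int(np.ceil(mz_max / bin_width))
    vec = np.zeros(n_bins)
    for mz, intensity in peaks:
        idx = int(mz / bin_width)
        if 0 <= idx < n_bins:
            vec[idx] += intensity
    return vec


def cosine_similarity(a, b):
    """Cosine of the angle between two intensity vectors (0 if either is empty)."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0


def best_match(query_peaks, library):
    """Return the library entry with the highest cosine score."""
    q = bin_spectrum(query_peaks)
    scores = {name: cosine_similarity(q, bin_spectrum(peaks))
              for name, peaks in library.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]


# Hypothetical reference library and query spectrum.
library = {
    "compound_A": [(85.0, 40), (127.0, 100), (163.1, 60)],
    "compound_B": [(91.1, 100), (119.1, 30)],
}
query = [(85.0, 35), (127.0, 100), (163.1, 55), (200.2, 5)]
name, score = best_match(query, library)
print(name, round(score, 3))
```

A query with no sufficiently similar library entry stays unidentified, which is one reason many secondary metabolites cannot yet be assigned.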

Some of the common databases used in NMR spectroscopy are the Human Metabolome Database (HMDB, http://www.hmdb.ca) and the Biological Magnetic Resonance Data Bank (BMRB, http://www.bmrb.wisc.edu/metabolomics/), whereas commonly used databases for mass spectrometry are METLIN (http://metlin.scripps.edu), NIST (http://www.nist.gov/srd/nist1a.htm), MassBank (http://www.massbank.jp), the Golm Metabolite Database (GMD, http://csbdb.mpimp-golm.mpg.de/csbdb/gmd/gmd.html), and MMCD (http://mmcd.nmrfam.wisc.edu) [45].

### **3. Conclusion**

Microorganisms are a rich source of secondary metabolites, which have significant pharmaceutical, biomedical, and food applications. The development of gene editing tools, especially CRISPR-Cas9 (gene cloning, gene refactoring, and gene insertion or deletion), and their integration with metabolomics provide a successful platform for detecting and identifying known and novel SMs and for increasing SM production. However, some challenges remain in the application of metabolomics and gene editing. Complete identification of novel SMs requires a combination of different methods, which increases the screening cost. A comprehensive and sensitive technique that can provide full information on any SM under any conditions is therefore urgently needed. The off-target effect of CRISPR-Cas9 is also a significant problem. Nevertheless, the integration of metabolomics and CRISPR-Cas9-based gene editing tools may improve the efficiency of microbial secondary metabolite discovery.
