**1. Introduction**

Who in biology has not heard of *Escherichia coli*? Known to many as the fundamental model microbe, and perhaps the model organism, *E. coli* is the cornerstone of many important findings in molecular biology and other areas of cell physiology. Perhaps even the first chemo-organoheterotroph had a mass composition similar to that of *E. coli*, providing the hints necessary to understand the evolution of modern bacteria [1]. Also called the "workhorse" of molecular biology for its fast growth rate in chemically defined media and the extensive molecular tools available for different purposes, *E. coli* is considered the most important model organism of them all. Important findings in biology, including work recognized with Nobel Prizes, were made in *E. coli*: for instance, cracking the genetic code [2], unveiling the nature of DNA replication [3], the groundbreaking advances on gene organization and regulation, or as we love to call it, 'the operon' [4, 5], important evidence for the basis of mutations and ultimately the evolution of organisms [6, 7], and finally, the achievement of a genetically modified organism [8], which launched numerous applications of the enormous capacity for manipulating this organism and made *E. coli* a key player in biotechnology (for an excellent review, see Ref. [9]).

© 2017 The Author(s). Licensee InTech. This chapter is distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


*Escherichia coli* as a Model Organism and Its Application in Biotechnology

http://dx.doi.org/10.5772/67306


*Escherichia coli* **databases**

| **Name** | **URL** | **Description\*** |
|---|---|---|
| PortEco | http://www.porteco.org | A resource for knowledge and data of the biology of *E. coli* including plasmids, mobile genetic elements, and phages |
| EcoliWiki | http://ecoliwiki.net/ | Community-based pages about everything related to the biology of the nonpathogenic *E. coli* |
| EcoGene 3.0 | http://ecogene.org/ | A database dedicated to analyzing and comparing genomic and transcriptomic data |
| EcoCyc: Encyclopedia of *E. coli* Genes and Metabolic Pathways | https://ecocyc.org | A comprehensive database joining together genomic information with biochemical features of *E. coli* |


The main reasons why *E. coli* is the organism of choice include, but are not limited to: fast growth in chemically defined media; relatively cheap culture media; no aggregate formation; industrial scalability; several molecular tools for manipulation; extensive knowledge of its genetics and genomics; extensive knowledge of its transcriptome, proteome, and metabolome; and the classification of several strains as biosafety level 1, which renders it ideal even for teaching and school demonstrations (for example, Bio-Rad transformation kits, URL: http://www.bio-rad.com/es-mx/category/pglo-plasmid-gfp-kits).

A typical *E. coli* cell, a rod-shaped Gram-negative bacillus, measures only about 1 μm long by 0.35 μm wide, although this can vary considerably depending on the strain and its growth conditions, and several studies describe mutations that affect its size and length. A recent paper showed that this bacterium can grow up to 750 μm in length. Using random Tn10 insertion, the authors found that this particular mutant carries an insertion at genes *ybdN* and *ybdM*, whose function remains uncertain, and another mutation that remains to be identified [10]. Interestingly, this mutant strain gives rise to extremely long cells that are viable (capable of cell division) and retain metabolic activity, which can be useful for studies involving intracellular localization or for optimizing cell-surface contact.

In terms of ecology, *E. coli* is a facultative aerobe (it respires in the presence of oxygen and ferments in its absence) that bears a sensor for oxygen (the redox state of the quinone pool) and can activate or repress the required metabolic enzymes depending on oxygen levels [11–13]. This capacity also allows *E. coli* and other Enterobacteriaceae to be among the first organisms to colonize the human infant intestine: these facultative bacteria consume the remaining oxygen in this environment so that strict anaerobes can then colonize the intestine, giving rise to the normal microbiota found in humans, of which *E. coli* remains a part [14].

Phylogenetically, *E. coli* is a member of the Enterobacteriaceae and is closely related to pathogens such as *Salmonella*, *Klebsiella*, *Serratia*, and the infamous *Yersinia pestis*, which causes plague. Although *E. coli* is mostly harmless, pathogenicity islands have been identified and associated with pathogenesis in *E. coli* resulting in strains that colonize different tissues [15].

The building blocks of *E. coli* consist of about 55% protein, 25% nucleic acids, 9% lipids, 6% cell wall, 2.5% glycogen, and 3% other metabolites [16–18]. This composition matters for biotechnological applications, since carbon flux is often a problematic issue to address when generating a novel metabolic pathway or enhancing an existing one. Carbon flux is also tightly regulated by sophisticated regulatory networks, so manipulating it to achieve a desired goal requires modeling and a basic understanding of the regulatory mechanisms [19–21].

*E. coli* is part of the normal microbiota of mammals, where it is the predominant facultative microbe of the gastrointestinal tract, and its impact on the establishment of the normal microflora and its role in disease are currently a matter of active debate [22].


*Escherichia coli* - Recent Advances on Physiology, Pathogenesis and Biotechnological Applications


This organism lacks many features of interest for biotechnology, such as growth at extreme temperatures or pH and the capacity to degrade toxic compounds, pollutants, or difficult-to-degrade polymers [23]. But as we will see later, this bacterium is capable of doing amazing things, and its only limit is our imagination.

*E. coli* harbors a genome with particular features, such as a strikingly organized structure, remnants of many phages, insertion sequences (IS), and a high transport capacity toward the cytoplasm. In 1997, the complete genome sequence of the K-12 strain was obtained, and a myriad of research was catapulted from that moment on [24]. The complete genome consists of a single circular duplex molecule of 4,639,221 bp. Regarding its structure, protein-coding regions correspond to 87.8% of the genome, 0.8% encodes stable RNAs, and 0.7% consists of noncoding repeats. The remaining 11% encodes regulatory and other functions. Nevertheless, nearly 34% of the proteins (1431) are considered orphans, that is, without a defined molecular function. A recent study, however, demonstrated through homology with phylogenetically distant organisms that they may play a role in defined molecular pathways or processes [25]. Of the orphan set in *E. coli*, at least 446 contain some molecular signature that can be used to assess their molecular role. The fact that such a vast portion of the genome remains uncharacterized paves the way for future basic research that may have a strong impact on applied research and development. For biotechnological applications, growth optimization is also a required feature of genetically modified organisms. For *E. coli*, it has recently been shown that growth can be manipulated by removing nonessential genomic sequences, an approach that can also lead to the design of important biotechnological strains for industrial purposes [26].
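To make the genome composition figures above concrete, the following minimal Python sketch converts the stated percentages into approximate base-pair counts. The function name `composition_bp` and the category labels are illustrative assumptions, not part of the cited work; only the genome size, the three percentages, and the orphan counts come from the text.

```python
# Figures quoted in the text for the E. coli K-12 genome [24].
GENOME_BP = 4_639_221

# Percent of the genome per category, as stated in the text.
STATED_PERCENT = {
    "protein-coding": 87.8,
    "stable RNA": 0.8,
    "noncoding repeats": 0.7,
}


def composition_bp(genome_bp, percent_by_category):
    """Convert stated percentages into approximate base-pair counts.

    The remainder (regulatory and other functions) is whatever is left
    after the stated categories, so the parts always sum to genome_bp.
    """
    bp = {
        name: round(genome_bp * pct / 100)
        for name, pct in percent_by_category.items()
    }
    bp["regulatory and other"] = genome_bp - sum(bp.values())
    return bp


parts = composition_bp(GENOME_BP, STATED_PERCENT)
for name, n in parts.items():
    print(f"{name:>22}: {n:>9,} bp ({100 * n / GENOME_BP:.1f}%)")

# Share of the 1431 orphan proteins that carry a recognizable signature [25].
print(f"orphans with a signature: {446 / 1431:.1%}")
```

Under these figures the remainder works out to about 10.7% of the genome, which is consistent with the rounded value of 11% given in the text.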

The advancement of genomic, transcriptomic, and proteomic technologies has led to the development of several online resources for the analysis of molecular and physiological aspects of *E. coli*. Some examples are given in **Table 1**.



Why are plasmids the basis for genetic engineering? A survey of the literature and of commercial catalogs reveals a myriad of applications for plasmids: cloning, mutagenesis, protein fusion and overexpression, and shuttle vectors from bacteria to a diverse range of hosts, among others. We first introduce plasmids and then describe some features that are relevant to their importance in molecular biology and biotechnology.


Plasmids are extrachromosomal, self-replicating molecules that sometimes provide interesting features to their host. The term was coined by Joshua Lederberg in 1952 to refer to genetic elements in bacteria that remain independent of the chromosome at any stage of their replication cycle [28]. The definition was later refined to autonomously replicating DNA molecules, to avoid including viruses. These molecules are present not only in eubacteria but also in Archaea and some lower eukaryotic organisms [29]. In nature, many bacteria contain self-replicating DNA molecules that can be harnessed for molecular biology applications. *E. coli* plasmids were the first to be extensively modified for such purposes [30].

In the 1970s, the first generation of cloning plasmids was created, and from that moment on, biological research was enriched with a powerful tool. To be used in research, plasmids must contain several important features: a size suitable for easy transformation or transfection, selection markers, a replication origin, regulatory elements to control expression, and transcription terminators. All of these features matter when designing a plasmid vector for a given application. Whatever goal the reader can imagine, there will always be a way to create the molecular tool for achieving it, thanks to the basic structure of most plasmids used in molecular biology and to their modularity [31]. In **Table 2**, we summarize some of the most important features (modules) that plasmids must have in order to serve different applications. We point out that sequence composition and structure, copy number, selection marker, and special features such as reporter proteins or regulatory elements are the most important features of a plasmid and can influence the outcome of the desired application.

*Escherichia coli* **plasmids**

| **Name** | **Type of element** | **Characteristics** |
|---|---|---|
| ColE1 | Replication origin | Generates 15–20 copies of each plasmid molecule. Colicin production. Related to plasmids that confer immunity to phage infections [32]. Found in low copy plasmids such as pBR322 [33]. There are mutations in this replication origin that lead to high copy number plasmids, such as the pUC series, which can render up to 700 copies per cell [34, 35] |
| p15A | Replication origin | Low copy number replication origin, estimated at 18–22 copies per cell [36]. This type of replication origin is often found in pACYC and its derivative vectors |
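The modular view of plasmids described above can be illustrated with a small data structure. This is only a sketch under stated assumptions: the class names `ReplicationOrigin` and `PlasmidDesign`, the `missing_modules` helper, and the placeholder module labels are hypothetical and introduced here for illustration. Only the copy-number ranges for ColE1 and p15A are taken from Table 2.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ReplicationOrigin:
    """A replication origin and its copy-number estimate (low, high)."""
    name: str
    copies_per_cell: tuple


# Copy-number ranges as quoted in Table 2 of the text.
COLE1 = ReplicationOrigin("ColE1", (15, 20))
P15A = ReplicationOrigin("p15A", (18, 22))


@dataclass
class PlasmidDesign:
    """Hypothetical container for the modules a cloning vector needs."""
    name: str
    origin: ReplicationOrigin
    selection_markers: list = field(default_factory=list)
    regulatory_elements: list = field(default_factory=list)
    terminators: list = field(default_factory=list)

    def missing_modules(self):
        """Return the required module classes that are still empty."""
        required = {
            "selection marker": self.selection_markers,
            "regulatory element": self.regulatory_elements,
            "transcription terminator": self.terminators,
        }
        return [label for label, items in required.items() if not items]


# Usage: a draft vector with an origin and a marker but nothing else yet.
draft = PlasmidDesign(
    name="example-vector",                      # placeholder name
    origin=P15A,
    selection_markers=["placeholder marker"],   # placeholder label
)
print(draft.missing_modules())
```

Treating each feature as a separate field mirrors the idea that modules can be swapped independently. A real design tool would also need to track sequence, size, and compatibility between origins, which this sketch leaves out.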

**Table 1.** Relevant *E. coli* resources and databases.

Recently, using the Genome Conformation Capture technique, it was revealed that the linear organization of the genome also holds for its 3D structure: neighboring genes form small factories that are coregulated or coexpressed and show a higher probability of forming protein-protein interactions. This organization reflects two important aspects of bacterial genomes: first, compactness in 3D space, with pathways distributed nonrandomly, and second, the tendency of genes that are closer to each other to be coexpressed and to form protein-protein interactions, which favors the concept of transcription factories even in microbial cells [27].

In the following sections, we review what we consider the modern tools for the genetic engineering of *E. coli* and discuss the future of this microbe, which can be considered the toolbox of molecular biology and may be the answer to many problems that humanity may face in the future.
