**2. Plasmid and the** *E. coli* **revolution**

Plasmids are arguably the most important tools for manipulating *E. coli*. They are also the foundation for the genetic engineering of many other organisms, for cloning and sequencing, for the generation of mutants, and for many other applications in molecular biology. This section briefly surveys the wide range of plasmids available, highlighting their most important features and the current technologies for plasmid manipulation and application.

Why are plasmids the basis for genetic engineering? A survey of the literature and of commercial catalogs reveals a myriad of applications for plasmids: cloning, mutagenesis, protein fusion and overexpression, and shuttle vectors that move from bacteria to a diverse range of hosts, among others. We first introduce plasmids and then describe the features that make them so important in molecular biology and biotechnology.

Plasmids are self-replicating extrachromosomal molecules that sometimes confer useful traits on their host. The term was coined by Joshua Lederberg in 1952 to refer to genetic elements in bacteria that remain independent of the chromosome at any stage of their replication cycle [28]. The definition was later narrowed to autonomously replicating DNA molecules in order to exclude viruses. These molecules are present not only in eubacteria but also in Archaea and some lower eukaryotic organisms [29]. In nature, many bacteria contain self-replicating DNA molecules that can be harnessed for molecular biology applications. *E. coli* plasmids were the first to be extensively modified for such purposes [30].

In the 1970s, the first generation of cloning plasmids was created, and from that moment on biological research was enriched with a powerful tool. To be useful in research, a plasmid must have several features: a size suitable for transformation or transfection, selection markers, a replication origin, regulatory elements to control expression, and transcription termination signals. All of these features matter when designing a plasmid vector for a given application. Whatever the goal, there is almost always a way to build a molecular tool to achieve it, thanks to the basic structure shared by most plasmids used in molecular biology and to their modularity [31]. **Table 2** summarizes some of the most important features (modules) that plasmids must have to serve different applications. Sequence composition and structure, copy number, selection marker, and special features such as reporter proteins or regulatory elements are the most important characteristics of a plasmid and can influence the outcome of the desired application.
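
The modular view of a plasmid described above can be illustrated with a small data structure. The following is only a minimal sketch: the module kinds mirror the feature list in the text, while the labels, the class names, and the plasmid name `pDraft` are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Set

# Module kinds taken from the feature list in the text; the labels are illustrative.
REQUIRED_KINDS = {"origin", "selection_marker", "regulatory_element", "terminator"}


@dataclass(frozen=True)
class Module:
    kind: str   # one of the kinds above, or an optional kind such as "reporter"
    name: str   # e.g. "pMB1", "AmpR", "GFP"


@dataclass
class PlasmidDesign:
    name: str
    modules: List[Module] = field(default_factory=list)

    def missing_kinds(self) -> Set[str]:
        """Return the required module kinds that the design does not yet contain."""
        present = {m.kind for m in self.modules}
        return REQUIRED_KINDS - present


# Hypothetical draft design with only an origin and a selection marker.
draft = PlasmidDesign(
    "pDraft",
    [Module("origin", "pMB1"), Module("selection_marker", "AmpR")],
)
print(sorted(draft.missing_kinds()))
```

A check of this kind only confirms that each type of module is present; it says nothing about whether the chosen modules work well together in a given host.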

*Escherichia coli* - Recent Advances on Physiology, Pathogenesis and Biotechnological Applications

Recently, using the Genome Conformation Capture technique, it was revealed that the linear organization of the genome is also true for the 3D structure, rendering neighboring genes to form small factories that are coregulated or coexpressed and showed a higher probability of forming protein-protein interactions. This organization represents two important aspects of bacterial genomes, first, the compactness in the 3D space, containing pathways in a nonrandom distribution, and second, genes that are closer to each other tend to be coexpressed and form protein-protein interactions favoring the concept of transcription factor even in microbial cells [27].

In the following sections, we will review what we consider the modern tools for genetic engineering *E. coli*, and the future for this microbe that can be considered the toolbox for molecular biology and may be the answer to many problems that humanity may face in the future.

*Escherichia coli* **databases**

| Name | URL | Description\* |
|---|---|---|
| *E. coli* Genetic Stock Center | http://cgsc.biology.yale.edu/ | The CGSC Database of *E. coli* genetic information includes genotypes and reference information for the strains in the CGSC collection. An excellent resource for acquiring information and strains for research |
| RegulonDB | http://regulondb.ccg.unam.mx | Database on transcriptional regulation in *E. coli* K-12 containing knowledge manually curated from original scientific publications, complemented with high throughput datasets and comprehensive computational predictions |
| Kegg | http://www.genome.jp/kegg-bin/show\_organism?org=eco | A powerful resource for understanding molecular datasets in different contexts, also easy access to gene information |

\* *Source*: databases website.

**Table 1.** Relevant *E. coli* resources and databases.




*Escherichia coli* as a Model Organism and Its Application in Biotechnology

http://dx.doi.org/10.5772/67306


Plasmids have several exceptional features. Plasmids with compatible replication origins (check compatibility first) and different selection markers can coexist in the same cell, which is extremely useful for coexpressing several proteins at once (e.g., the four mutually compatible plasmids developed by Dykxhoorn et al. allow the coexpression of four different proteins in the same cell [41]). Other features include the broad diversity of selection markers and partitioning control elements [30]; cloning capacity [31], which is important for cloning the large fragments required for synthetic biology applications or metabolic engineering; reporter proteins useful for selecting positive clones; recombination or assembly technologies that simplify cloning [42, 43]; and the ability to be transferred from one host to another, as in the case of the OriV from the RK2 plasmid [44]. For further information about the replication control of plasmids, we refer the reader to del Solar et al. [45].
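
The advice to check compatibility before combining plasmids in one cell can be expressed as a simple pairwise test. This is a minimal sketch under stated assumptions: the plasmid names and the origin group labels are hypothetical, and the user is expected to supply the group of each origin from the literature or the vector documentation.

```python
from itertools import combinations
from typing import List, NamedTuple, Tuple


class Plasmid(NamedTuple):
    name: str
    origin_group: str  # compatibility group label supplied by the user (assumption)
    marker: str        # selection marker, e.g. "AmpR"


def find_conflicts(plasmids: List[Plasmid]) -> List[Tuple[str, str, str]]:
    """List every pair that shares an origin group or a selection marker."""
    problems = []
    for a, b in combinations(plasmids, 2):
        if a.origin_group == b.origin_group:
            problems.append((a.name, b.name, "same origin group"))
        if a.marker == b.marker:
            problems.append((a.name, b.name, "same selection marker"))
    return problems


# Hypothetical three-plasmid set; pB and pC deliberately share a marker.
candidates = [
    Plasmid("pA", "group1", "AmpR"),
    Plasmid("pB", "group2", "KanR"),
    Plasmid("pC", "group3", "KanR"),
]
for first, second, reason in find_conflicts(candidates):
    print(f"{first} and {second}: {reason}")
```

An empty result means only that no pair shares a group or a marker under the labels provided; it does not replace an experimental test of stable coexistence.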

Linear plasmids are common in bacteria (particularly in actinobacteria), but thus far only the N15 plasmid prophage has been isolated from *E. coli* [46], which is an impediment for generating knockouts using linear DNA (see Section 3). Recently, a linear plasmid was created to clone unstable fragments, such as those bearing repetitive sequences, active promoter sequences, or high A + T content, without showing size bias during cloning [47].

One of the biggest impediments for plasmid segregation is the insert size. As discussed in Section 5, we are now facing not only the most exciting age of molecular biology but also the most challenging. The goal is to develop "bugs to order," i.e., microbes that are "trained" to perform specific tasks [48], ranging from simple pathways that synthesize a specific metabolite to complex genetic circuits that can be controlled to produce novel environmental responses. In all these cases, plasmids play an important role, from generating site-specific integration of chromosomal fusions to serving as large plasmids that can hold complex constructs for further modification. Another challenging area is the generation of strains capable of producing metabolic products required by the pharmaceutical industry, where metabolic pathways and cell metabolism can be a strong impediment to the proper synthesis and purification of relevant precursors [49]. *E. coli* provides an important platform for tackling the impediments to the synthesis of novel compounds.

Databases and repositories are also valuable for selecting and manipulating the right plasmid for a given application. A powerful plasmid repository is Addgene (https://www.addgene.org/vector-database/), from which any plasmid in the collection can be obtained for a small shipping and handling fee. It also provides information on plasmid sequences, special features, creators, and availability. Harvard University is also building a plasmid repository, which is still under development (https://plasmid.med.harvard.edu/PLASMID/Home.xhtml). These repositories can accelerate research because constructs that have already been generated may be directly useful for ongoing projects. The *E. coli* Genetic Stock Center (**Table 1**) is a good source of strains and plasmids for different applications; for a small fee, strains and plasmids can be shipped worldwide, and their characteristics can be consulted. The American Type Culture Collection (www.atcc.org) is another source of strains with desired characteristics, as well as of information on their features.

In plasmid biology, strain selection is fundamental for plasmid stability and proper propagation, as well as for the correct function of special features such as protein expression [50]. For example, DNA methylation is an important impediment in certain applications such as eukaryotic transfection, so DAM<sup>−</sup> and DCM<sup>−</sup> strains are required for plasmid isolation. In protein expression experiments, the repertoire of hosts is large; most applications require lysogenic strains harboring the required RNA polymerase. As stated in **Table 2**, some plasmids require the product of the *pir* gene, so strains must be selected carefully so that plasmids replicate efficiently.
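
The strain selection reasoning above (methylation-deficient hosts for transfection-grade DNA, lysogens carrying the required RNA polymerase for expression, and *pir* hosts for plasmids that depend on it) can be sketched as a lookup from needs to host traits. Everything named here is hypothetical: the need labels, the trait labels, and the hosts `host_A` to `host_C` are placeholders, not real strain genotypes.

```python
from typing import Dict, List, Set

# Hypothetical labels linking a plasmid or application need to a host trait.
NEED_TO_HOST_TRAIT = {
    "pir_dependent_origin": "pir+",
    "unmethylated_dna": "dam-/dcm-",
    "phage_polymerase_promoter": "polymerase_lysogen",
}


def required_traits(needs: Set[str]) -> Set[str]:
    """Translate needs into host traits; an unknown need raises KeyError."""
    return {NEED_TO_HOST_TRAIT[need] for need in needs}


def suitable_hosts(needs: Set[str], hosts: Dict[str, Set[str]]) -> List[str]:
    """Return, in alphabetical order, the hosts that carry every required trait."""
    wanted = required_traits(needs)
    return sorted(name for name, traits in hosts.items() if wanted <= traits)


# Placeholder hosts with made-up trait sets.
hosts = {
    "host_A": {"pir+"},
    "host_B": {"dam-/dcm-"},
    "host_C": {"polymerase_lysogen", "pir+"},
}
print(suitable_hosts({"pir_dependent_origin"}, hosts))
```

In practice the trait sets would be filled in from the genotype information provided by the strain repository, and the rule table would need to be extended for each new plasmid feature.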

*Escherichia coli* **plasmids**

| Name | Type of element | Characteristics |
|---|---|---|
| pMB1 | Replication origin | Versatile replication origin. The original sequence generates 15–20 copies per cell, but a mutant version can lead up to 700 copies per cell [37]. This plasmid contains the *Eco*RI restriction-modification system |
| pSC101 | Replication origin | Five copies per cell [38] |
| R6K | Replication origin | 15–20 copies per cell. Requires the π protein from the gene *pir* for replication [39]. This origin of replication is functional in diverse bacterial species |
| Amp<sup>r</sup>, Kan<sup>r</sup>, Cm<sup>r</sup>, Tet<sup>r</sup>, among others | Selection markers | Elements required for the selection and maintenance of plasmids in bacterial hosts. Here are listed the resistance cassettes for Ampicillin, Kanamycin, Chloramphenicol, and Tetracycline, which are the most common selection markers. For additional markers, RAC database contains the information regarding antibiotic resistance traits and their sequence [40] or iGEM website for sequence modules bearing the proper syntax for synthetic constructs |
| LacZ, CcdB, Green Fluorescent Protein (GFP), etc. | Additional elements required for positive clone selection, reporter protein fusions, among others | Plasmids have been modified so that they contain multiple cloning sites with diverse unique restriction sites, counter selection for positive clone selection. Additional elements such as filamentous origins for single-stranded DNA generation for sequencing of high G+C templates or site-directed mutagenesis |

Additional sources: [31], http://blog.addgene.org/plasmid-101-origin-of-replication, https://www.neb.com/, https://www.thermofisher.com/mx/es/home/brands/invitrogen.html, and http://parts.igem.org/Help:Synthetic\_Biology, which contains a repository of parts that can be needed for plasmid construction using synthetic methods.

**Table 2.** Common elements in plasmids used in molecular biology applications.

Novel methods such as Gibson assembly, Golden Gate assembly, and AQUA (advanced quick assembly) [43, 51, 52] have greatly expanded the possibility of assembling any plasmid with the desired characteristics. These methods are based on designed modules that can either be amplified by Polymerase Chain Reaction (PCR) or generated as a complete synthetic construct and then assembled in the desired combination, either by an enzymatic process (Gibson and Golden Gate) or by enzyme-free methods such as AQUA.
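
Overlap-based methods such as Gibson assembly join modules whose ends share a short identical sequence, so a design can be checked in silico before ordering primers or synthetic fragments. The sketch below is a simplified illustration, not the published protocol: it assumes exact, same-strand overlaps of a fixed length, and the toy sequences and the function name are hypothetical.

```python
from typing import List


def assemble(fragments: List[str], overlap: int, circular: bool = False) -> str:
    """Join fragments whose adjacent ends share an exact overlap of the given length."""
    if overlap <= 0 or not fragments:
        raise ValueError("need at least one fragment and a positive overlap length")
    product = fragments[0]
    for nxt in fragments[1:]:
        # The end of the growing product must match the start of the next fragment.
        if product[-overlap:] != nxt[:overlap]:
            raise ValueError(f"no {overlap}-base overlap before fragment {nxt!r}")
        product += nxt[overlap:]
    if circular:
        # To close a plasmid, the last fragment must overlap the first one.
        if product[-overlap:] != product[:overlap]:
            raise ValueError("ends do not overlap, cannot circularize")
        product = product[:-overlap]
    return product


# Toy sequences with 4-base overlaps; real designs use longer overlaps.
print(assemble(["AAAACCCC", "CCCCGGGG", "GGGGAAAA"], overlap=4, circular=True))
```

A fuller design check would also consider reverse-complement orientation, melting temperature of the overlaps, and repeated sequences that could cause misassembly.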

many of them provide support for fitness and evolution has preserved them, therefore full genome engineering is far more complicated than previously thought.

Larger genomic editions are needed to understand how far we can delete redundant or nonessential sequences. By using Cre/lox recombination, substantial genomic fragments can be deleted or sequentially removed, rendering the nonessential regions (regardless of the genes present) from the genome [59–61].

Studies of genome size based on deletions of specific genes or complete genomic regions have led to thinking about the minimal genome. In the case of *E. coli*, several pieces of evidence (reviewed in Ref. [62]) point out that at least 23% of the genome can be eliminated, gaining genomic stability and normal growth. Also, eliminating insertion sequences can enhance the capacity of *E. coli* to synthesize proteins due to the decrease of insertions on plasmids, and strains exhibit normal growth plus increased genome stability [63].

All these methods rely on basic bacterial genetics founded with *E. coli*, such as transposon-based integration of recombination sequences, λ-recombination of PCR products integrating deletion module cassettes, and the gene-specific knockout methods [62]. Mutations can then be transferred from one strain to the other to generate multiple deletions at once, and other technologies are still limited to either whole genome synthesis with previous knowledge on the structure of the genome.

The most relevant study revealed that genome size has an impact on *E. coli* cell growth, where it is shown that apparently dispensable sequences are needed under restrictive conditions, providing a hint of the still far future of fully functioning cells with all the desired characteristics for biotechnological applications [26]. We envision that genome reduction is a worthy effort, regardless of the method used to generate them. Another important aspect that we have to consider is that all conditions of the mutant strains are exposed to laboratory conditions rendering a behavior close to the ancestor or original strains. Nevertheless, there are also hidden features that must be exploited in order to understand fully the behavior of the genome and the essentiality of genes [62]. Thus far, *E. coli* remains restricted to the use of classic genetic tools and transposon or plasmid-based techniques. We encourage the *E. coli* research community to join efforts to enter the synthetic biology era, toward the generation of a fully synthetic *E. coli*. Our excitement is based on the following:

After 6 years, in 2016, the first bacteria operating under a "minimal chemically synthesized genome" was created after the first fully functioning synthetic genome [64, 65].

This research we believe has an impact in the following areas. First, both studies settled the basis for whole genome synthetic biology, which will lead to important findings in many research areas. Second, the extensive transposon-based mutagenesis studies on the genome of *Mycoplasma mycoides* led to the knowledge of the basis of essential genes or quasi-essential genes that have an important impact on cell fitness. Third, all this knowledge led to the design of a complete chemically synthesized genome with all the basic functions, and we now have the basic information for mining existing genomes to look for core modules in the bacterial genomes and design genomes with specific functions. Taking together all the

Finally, plasmid biology remains under scrutiny because of the involvement of plasmids in the mobility of traits that are important for human health, such as antibiotic resistance, in the distribution of pathogenicity islands, and in genome evolution. Recently, novel plasmid-mining tools have been developed that recover plasmids from Next Generation Sequencing data so that they can be analyzed and further characterized [53, 54].

We are still learning the true potential of *E. coli*. Novel tools generated through plasmids, in combination with other molecular strategies, will lead to new findings that make this organism the basis for important discoveries. In the next section, we will discuss some aspects of gene knockouts and the knowledge we have gained from this versatile organism.
