**Abstract**

*Escherichia coli* has been the most common host for recombinant protein expression. In addition to its well-characterized genetics, *E. coli* grows fast, is relatively cheap to cultivate, and is easy to handle. These properties, together with the early success in transforming plasmid DNA into *E. coli* and the advent of various genetic engineering techniques in the 1970s, have made *E. coli* the most favored host for genetic manipulation. Recent advances in the understanding of the regulatory controls of gene expression, and the availability of novel approaches to successful expression of target proteins, both intracellular (e.g., intein-mediated expression and self-cleavage) and extracellular (e.g., the use of secretion signals), further support the view that *E. coli* is the most promising host for heterologous protein expression.

**Keywords:** *Escherichia coli*, *E. coli*, recombinant protein expression, heterologous protein, authentic structures, fusion protein, affinity tags, secretion, excretion, inteins

#### **1. Introduction**

The elucidation of the structure of DNA, the deciphering of the genetic code, advances in understanding gene expression and regulation, and the discovery of extrachromosomal DNA (plasmids), restriction endonucleases, and DNA ligases in the 1950s and 1960s laid the groundwork for the construction of the first chimeric (recombinant) DNA molecule [1]. In 1973, Cohen and Boyer reported their success in creating the first *Escherichia coli* transformant into which a recombinant plasmid molecule was introduced [2].

The possibility of inserting foreign DNA into *E. coli* has allowed not only the development of a vast number of molecular biology techniques for genetic manipulation (e.g., construction and characterization of cDNA libraries, DNA splicing and amplification, hybridization and sequencing, site-specific mutagenesis, research on and applications of bacteriophages and DNA-modifying enzymes, and studies of the regulation of gene expression) but also the exploitation of *E. coli* as a surrogate host for the expression of heterologous proteins. Despite the presence of restriction systems in *E. coli* [3, 4], foreign DNA from a wide variety of sources, ranging from a simple virus to a complex organism such as the human being, has been shown to be stably maintained in it. Therefore, even when *E. coli* is not the final host cell for target protein expression, it is routinely recruited for DNA manipulations, including selection, characterization, and amplification of recombinant DNA constructs, that are difficult to perform directly in the final host cell. In this regard, *E. coli* acts as an indispensable stepping stone facilitating molecular genetic studies undertaken in eukaryotic and other bacterial host cells.

The favorable properties of *E. coli*, including its fast growth rate (with a doubling time of less than 30 min), ease of manipulation, relatively low cost of cultivation and protein expression, and well-characterized genetics, are the key reasons why this bacterium has been so commonly used in the wide range of studies mentioned above. Thus, *E. coli* has become the most popular host for heterologous protein expression.
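As a rough illustration of what a doubling time of less than 30 min implies, the sketch below applies the standard idealized exponential-growth relation N(t) = N0 · 2^(t / t_d). The function names, the 20 min and 30 min doubling times, and the 4 h window are illustrative assumptions, not values taken from the cited studies, and the calculation ignores lag and stationary phases.

```python
def cells_after(n0: float, minutes: float, doubling_min: float) -> float:
    """Idealized exponential growth: N(t) = N0 * 2 ** (t / t_d)."""
    if doubling_min <= 0:
        raise ValueError("doubling time must be positive")
    return n0 * 2 ** (minutes / doubling_min)


def fold_increase(minutes: float, doubling_min: float) -> float:
    """Fold increase in cell number after `minutes` of unrestricted growth."""
    return cells_after(1.0, minutes, doubling_min)


if __name__ == "__main__":
    # Assumed doubling times (min); 30 min is the upper bound quoted in the text.
    for t_d in (20, 30):
        print(f"t_d = {t_d} min: {fold_increase(240, t_d):.0f}-fold in 4 h")
```

Under these assumptions, shortening the doubling time from 30 min to 20 min changes the theoretical 4 h expansion from 2^8 to 2^12, which is one way to see why a fast-growing host shortens the turnaround of cloning and expression experiments.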

Being a Gram-negative bacterium, *E. coli* possesses two layers of cell envelope, the inner/cytoplasmic membrane and the outer membrane, between which lies the narrow periplasmic space (**Figure 1**). Except for specific strains, e.g., a colicin-producing strain, *E. coli* does not release proteins beyond the outer membrane into its environment. Therefore, in the early days, *E. coli* was essentially employed for cytoplasmic/intracellular expression of recombinant proteins.

However, the advancement of recombinant DNA technology in the 1970s–1980s not only enabled *E. coli* to serve as an efficient host system for protein expression but also transformed it into a highly versatile expression platform, with which heterologous proteins may also be produced in the culture supernatant. In addition, despite being expressed in the cytoplasm, target proteins resulting from a recently developed process, the intein-mediated expression approach [5, 6], have been shown to possess primary structures precisely the same as expected [7]. This new recombinant DNA approach may prove to be a practical process for the recombinant production of medically valuable proteins, which are preferred to share authentic structures with their native counterparts.

#### **Figure 1.**

*Schematic representation of the subcellular compartments of E. coli available for expression of heterologous proteins: the cytoplasm, the periplasm, and the culture medium. A secretory protein is initially formed as a preprotein, which carries a signal peptide. The preprotein is directed to the SecYEG pathway. Before reaching the secretion channel, the preprotein is unfolded by the SecB protein and then interacts with the SecA protein, which helps hydrolyze ATP to support the active transport of the preprotein. The signal peptide is cleaved off from the preprotein subsequent to the passage of the latter through the SecYEG pathway.*

*Escherichia coli: A Versatile Platform for Recombinant Protein Expression*

Proteins expressed in eukaryotic cells are subject to post-translational modifications (PTM), many of which do not appear to occur in *E. coli* cells. Despite this deficiency, *E. coli* is still the most common choice for the expression of eukaryotic proteins. For example, about 30% of the medically valuable proteins produced using recombinant DNA approaches are expressed employing *E. coli* as the host [8]. The successful expression of eukaryotic proteins in *E. coli* suggests that many target proteins may not be post-translationally modified and that, even if some of them are, PTM may not have a direct effect on their functional activities. The observation also supports the view that *E. coli* will continue to play an indispensable role in heterologous protein expression. In choosing the most appropriate tactic for the expression of a heterologous protein in *E. coli*, it is important to understand both the target protein and the available methods well. *E. coli* is recognized as a "versatile" host in that it may facilitate heterologous protein expression in all three of its compartments: (i) the cytoplasm, (ii) the periplasm, and (iii) the culture medium (**Figure 1**). In this communication, we discuss how these compartments may be employed to express foreign proteins with widely different biochemical properties, under the condition that the presence of PTM is not a prerequisite.

**2. Expression of recombinant proteins in the cytoplasm of** *E. coli*

The breakthrough achievements in the construction of recombinant DNA molecules [1] and the transformation of chimeric DNA constructs into *E. coli* [2] in the early 1970s paved the way for rapid advances in the development of recombinant DNA approaches to the expression of a wide collection of useful and valuable proteins of various origins. Owing to the aforementioned properties of *E. coli*, this Gram-negative bacterium has been extensively studied and exploited as a host to facilitate expression of heterologous proteins (**Table 1**).

**2.1 Fusion protein approach**

A common strategy in the expression of heterologous proteins is to fuse the target protein with a fusion partner, a familiar example of which is the enzyme β-galactosidase (β-Gal), expressed intracellularly in *E. coli*. Because its structure and the regulation of its expression were well characterized [27, 28], β-Gal was, in the early days, one of the few *E. coli* products employed as a reporter protein for which convenient detection assays [29] were available. In 1977, Itakura et al. fused the short mammalian somatostatin (Som), comprising only 14 amino acids (aa), to β-Gal and demonstrated, for the first time, successful expression of bioactive recombinant somatostatin in *E. coli* [9] (**Figure 2**). In that work, Som was fused to β-Gal by means of oligonucleotide assembly. Expression of the two proteins, under the regulatory control of the *lac* operon, would thus initially yield a β-Gal-Som precursor. The β-Gal component played two important roles in the fusion: first, it offered a facile screening assay for the selection of potentially positive clones expressing β-Gal-Som; second, it served as a guardian protecting Som from proteolytic attack at its N-terminus.
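The layout of such a fusion can be sketched as simple sequence concatenation. The following is a minimal illustration only: the `Fusion` class, its field names, and the short partner string are hypothetical (the placeholder is not the β-Gal sequence), while the 14-residue string is the sequence commonly given for somatostatin in one-letter code.

```python
from dataclasses import dataclass

# Somatostatin-14 in one-letter amino acid code.
SOMATOSTATIN = "AGCKNFFWKTFTSC"

# Hypothetical stand-in for the fusion partner; NOT the real beta-Gal sequence.
PARTNER_PLACEHOLDER = "MKPARTNERSEQ"


@dataclass(frozen=True)
class Fusion:
    """Partner at the N-terminal side, target at the C-terminal side."""
    partner: str
    target: str
    linker: str = ""  # optional residues placed between partner and target

    @property
    def sequence(self) -> str:
        return self.partner + self.linker + self.target

    @property
    def target_start(self) -> int:
        """0-based index of the target's first residue within the precursor."""
        return len(self.partner) + len(self.linker)


if __name__ == "__main__":
    precursor = Fusion(PARTNER_PLACEHOLDER, SOMATOSTATIN)
    # The target's first residue is internal to the precursor, not its free N-terminus.
    print(precursor.sequence)
    print("target starts at index", precursor.target_start)
```

Because the target sits downstream of the partner, its first residue is embedded in the precursor rather than exposed as the free N-terminus, which is the arrangement the protective role described above relies on. The `linker` field is left empty here; a construct with residues inserted between the two parts would pass them explicitly.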

Since Som is a short polypeptide of only 14 aa residues [30], none of which is Met, a Met residue was intentionally inserted precisely between β-Gal and Som in engineering the aforementioned fusion, resulting in a β-Gal-Met-Som precursor [9]. In vitro cleavage with

*DOI: http://dx.doi.org/10.5772/intechopen.82276*


*The Universe of Escherichia coli*

