**3. Genome sequencing enables a great leap forward in bacterial redox biology research**

The eponymous Sanger DNA sequencing method was developed by Frederick Sanger and colleagues in 1977 [25]. This method is based on selective incorporation of chain-terminating dideoxynucleotides by DNA polymerases during *in vitro* DNA replication [26]. Sanger sequencing was the most widely utilized DNA sequencing technology until relatively recently. Gene sequencing became reasonably attainable with the 1986 release of a fully automated DNA sequencer made by Applied Biosystems. Around the same time, Kary Mullis of Cetus Corporation developed polymerase chain reaction (PCR) technology, which led to the first commercial PCR enzyme and thermal cycler systems available to scientists in 1987 [27]. Together, Sanger sequencing and the development of PCR technology ushered in the gene sequencing era and revolutionized molecular biology.

With the ability to sequence genes, in conjunction with the already rich field of bacterial genetics and its corresponding techniques, the stage was set for identifying genes involved in redox biology. Along these lines, a genetic selection in *Escherichia coli* designed to identify factors involved in protein translocation led to the serendipitous discovery of mutations in the *dsbA* gene that affected disulfide-bond formation [28]. The DsbA protein was isolated and demonstrated to catalyze disulfide-bond reduction using insulin as a substrate *in vitro* [28, 29]. Later studies revealed DsbA to be a potent and sequential oxidant [30]. Specifically, DsbA forms disulfide bonds between sequential cysteines in proteins as they are translocated to the periplasm [31] (**Figure 1**). Collectively, these studies identified DsbA as the first periplasmic protein involved in disulfide-bond formation and paved the way for elucidating the disulfide-bond forming machinery in *E. coli*.

A second protein involved in disulfide-bond formation was identified through genetic screens of resistance or sensitivity to strong reducing agents. In these screens, Tn*10* insertion mutants sensitive to DTT and benzylpenicillin were mapped to a second gene, which was named *dsbB* [30, 32]. The *dsbB* gene product was later confirmed to specifically oxidize DsbA [33]. Since then, research in several laboratories has elucidated the electron transfer pathway through which approximately 40% of cell envelope proteins in *E. coli* obtain disulfide bonds [34–38] (see **Figure 1**). Specifically, the DsbA protein transfers disulfide bonds to substrate proteins in the periplasm by accepting electrons from the substrate's cysteine residues. As a result, the cysteine residues of DsbA become reduced and the protein must be reoxidized before it can catalyze another round of disulfide-bond transfer [28, 29]. This oxidation reaction is carried out by DsbB, an inner membrane protein with two pairs of redox-active cysteines [30, 32]. The electrons received by DsbB in its oxidation of DsbA are transferred to the pool of quinones within the inner membrane [37, 39–43]. Then, the reduced quinones are recycled by cytochrome and terminal oxidases of the electron transport chain [42, 44–46]. Together, DsbA and DsbB act as the oxidation system for disulfide-bond formation in the periplasm (**Figure 1**). These two proteins form one part of the periplasmic disulfide-bond forming pathway; additional proteins, DsbC and DsbD, among others, play downstream roles in the fidelity of native disulfide bonds.
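The direction of electron flow in this oxidative pathway can be made concrete with a small bookkeeping sketch. The following Python snippet is only an illustration under simplifying assumptions: each component is reduced to a single two-state flag (oxidized or reduced), one electron pair moves per step, and the quinone-recycling oxidases are collapsed into a generic terminal acceptor. The names `Carrier`, `transfer_electrons`, and `run_oxidative_cycle` are hypothetical and do not come from any published model.

```python
from dataclasses import dataclass


@dataclass
class Carrier:
    """A redox component reduced to a single two-state flag (toy model)."""
    name: str
    oxidized: bool  # True = disulfide/quinone form, False = dithiol/quinol form


def transfer_electrons(donor: Carrier, acceptor: Carrier) -> None:
    """Move one electron pair from a reduced donor to an oxidized acceptor."""
    if donor.oxidized:
        raise ValueError(f"{donor.name} is oxidized and has no electrons to donate")
    if not acceptor.oxidized:
        raise ValueError(f"{acceptor.name} is already reduced")
    donor.oxidized = True      # donor ends up in its disulfide (oxidized) form
    acceptor.oxidized = False  # acceptor now carries the electron pair


def run_oxidative_cycle(substrate: Carrier, dsba: Carrier,
                        dsbb: Carrier, quinone: Carrier) -> None:
    """One round of disulfide-bond formation, following the order in the text."""
    # Assumption: the oxidases are lumped into one fresh terminal acceptor per cycle.
    terminal_acceptor = Carrier("terminal electron acceptor", oxidized=True)
    transfer_electrons(substrate, dsba)          # substrate gains a disulfide; DsbA is reduced
    transfer_electrons(dsba, dsbb)               # DsbB reoxidizes DsbA
    transfer_electrons(dsbb, quinone)            # electrons enter the quinone pool
    transfer_electrons(quinone, terminal_acceptor)  # quinone is recycled


substrate = Carrier("substrate protein", oxidized=False)
dsba = Carrier("DsbA", oxidized=True)
dsbb = Carrier("DsbB", oxidized=True)
quinone = Carrier("quinone", oxidized=True)

run_oxidative_cycle(substrate, dsba, dsbb, quinone)
for component in (substrate, dsba, dsbb, quinone):
    print(component.name, "oxidized" if component.oxidized else "reduced")
```

In this toy model the substrate finishes the cycle oxidized while DsbA, DsbB, and the quinone all return to their oxidized starting states, which mirrors why DsbA can only act catalytically when DsbB and the quinone pool are available to regenerate it.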

The advent of genome sequencing and PCR in the late 1980s caused a shift from eukaryotic PDI studies to research centered on bacterial disulfide-bond formation, which is detailed in the following section. It should be noted that Anfinsen's idea that the amino acid sequence of a protein encodes all of the information necessary for its proper folding was not fully correct. Even though Anfinsen shared the 1972 Nobel Prize in Chemistry with Stanford Moore and William H. Stein, the following decades of his and others' research showed that disulfide-bond formation and protein folding are, in fact, catalyzed processes *in vivo*. The work surrounding RNase A refolding and the elucidation of PDI serves as an example wherein the true answers to fundamental questions often require far more research to unravel their complexities.

*Escherichia coli* - Recent Advances on Physiology, Pathogenesis and Biotechnological Applications


**Figure 1.** The disulfide-bond-forming pathways in the periplasm of *E. coli*. A protein containing four cysteines in their reduced (free thiol) states is translocated into the periplasm by the SecYEG translocon. (1) Oxidized DsbA catalyzes disulfide-bond formation either as the protein is translocated or after, resulting in sequential disulfide bonds in this protein. DsbA is then reoxidized to its active state by DsbB. DsbB is oxidized by ubiquinone or menaquinone under aerobic or anaerobic conditions, respectively (not shown). (2) If the disulfide bonds formed by DsbA are misoxidized, reduced DsbC catalyzes their isomerization to yield the properly folded protein. (3) DsbD then reduces DsbC to its active state. DsbD is reduced by an electron cascade originating from NADPH and mediated by thioredoxin reductase and thioredoxin in the cytoplasm (not shown).

The misoxidation of substrates by DsbA necessitates the existence of a system capable of isomerizing incorrect disulfide bonds to their correct linkages in prokaryotes. In *E. coli*, the isomerization of disulfide bonds in proteins is catalyzed by DsbC. The *dsbC* gene was discovered in 1994, shortly after the discovery of the *dsbB* gene, using the same genetic selection approach [47]. The *dsbC* gene product was characterized and was shown to contain two cysteines that reside in the CXXC motif generally found in oxidoreductases. Subsequently, DsbC was demonstrated to catalyze disulfide-bond isomerization of substrates containing nonconsecutive disulfide bonds [48–51] (**Figure 1**). This substrate preference of DsbC was illustrated with two nearly identical *E. coli* proteins, phytase (AppA) and glucose-1-phosphatase (Agp), which differ in that the former contains a nonconsecutive disulfide bond, while the latter has only consecutive disulfide bonds. AppA was shown to be dependent on DsbC for proper folding into its active conformation, whereas Agp exhibited no dependence on DsbC until a nonconsecutive disulfide bond, positioned like the one found in phytase, was introduced [48]. To date, no exceptions to the substrate preference of DsbC have been found. Taken together, these results suggested that DsbC is a protein disulfide isomerase that catalyzes the rearrangement of misoxidized disulfide bonds, in particular, the rearrangement of nonconsecutive disulfide bonds. Thus, DsbA and DsbC work in parallel in maintaining the correct disulfide bonds in the periplasmic *E. coli* proteome. DsbA catalyzes disulfide-bond formation as the protein is translocated into the periplasm, resulting in the formation of consecutive disulfide bonds. In those proteins requiring nonconsecutive disulfide bonds, DsbC catalyzes the isomerization of misoxidized bonds to yield active enzymes.

The exact details of substrate recognition and the *in vivo* mechanism of isomerization catalyzed by DsbC have yet to be elucidated. However, preliminary evidence suggests that certain correctly oxidized proteins may result not only from oxidation and isomerization but also from iterative cycles of reduction and oxidation by DsbA and DsbC [52]. Another protein, DsbG, shares 28% sequence identity with DsbC and exhibits protein disulfide isomerase activity, albeit on a narrower scope of yet-to-be-identified substrates [5, 53, 54].
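The distinction between consecutive and nonconsecutive disulfide bonds can be stated precisely in a few lines of code. The sketch below assumes the usual reading of the term: a bond is consecutive when it links two cysteines that are neighbors in the ordered list of cysteines in the primary sequence, with no other cysteine between them. The function names and the residue positions in the example are hypothetical and chosen only for illustration; they are not the actual cysteine positions of AppA or Agp.

```python
from typing import Dict, List, Sequence, Tuple

Bond = Tuple[int, int]  # residue positions of the two bonded cysteines


def nonconsecutive_bonds(cys_positions: Sequence[int],
                         bonds: Sequence[Bond]) -> List[Bond]:
    """Return the bonds that skip over at least one other cysteine in sequence order."""
    # Rank each cysteine by its order of appearance in the primary sequence.
    rank: Dict[int, int] = {pos: i for i, pos in enumerate(sorted(cys_positions))}
    flagged: List[Bond] = []
    for a, b in bonds:
        if a not in rank or b not in rank:
            raise ValueError(f"bond ({a}, {b}) refers to a position that is not a listed cysteine")
        # Neighboring cysteines differ in rank by exactly one.
        if abs(rank[a] - rank[b]) != 1:
            flagged.append((a, b))
    return flagged


def has_only_consecutive_bonds(cys_positions: Sequence[int],
                               bonds: Sequence[Bond]) -> bool:
    """True when every disulfide links sequence-adjacent cysteines."""
    return not nonconsecutive_bonds(cys_positions, bonds)


# Hypothetical protein with four cysteines (positions are illustrative only).
cysteines = [12, 40, 95, 130]

print(has_only_consecutive_bonds(cysteines, [(12, 40), (95, 130)]))
print(nonconsecutive_bonds(cysteines, [(12, 130), (40, 95)]))
```

Under this definition, the first pattern is the kind that sequential oxidation by DsbA would be expected to produce directly, whereas the second contains a bond that spans intervening cysteines and so corresponds to the class of substrates described above as DsbC dependent.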


From Biology to Biotechnology: Disulfide Bond Formation in *Escherichia coli*

http://dx.doi.org/10.5772/67393

Like DsbA, DsbC has a dedicated redox protein partner, named DsbD, which is responsible for maintaining it in its reduced state (**Figure 1**). The *dsbD* gene was discovered using the same genetic screens that led to the discoveries of both DsbB and DsbC [55]. The *dsbD* gene product consists of three domains: an N-terminal periplasmic domain, a transmembrane domain, and a C-terminal periplasmic thioredoxin-like domain that shares approximately 45% sequence homology with eukaryotic PDIs [56]. Each of the domains of DsbD contains a pair of conserved cysteine residues that are redox active and essential for its function [56]. To maintain DsbC in its reduced state, DsbD channels reducing equivalents through a cascade of disulfide-bond reductions that starts with the reduction of thioredoxin reductase by NADPH [57, 58]. Thioredoxin reductase reduces thioredoxin, which then reduces the cysteine pair in the transmembrane domain of DsbD [58, 59]. This reduced cysteine pair initiates the sequential reduction of disulfide bonds in the C-terminal and then the N-terminal DsbD domains [59]. The reduced N-terminal domain cysteines then reduce DsbC (**Figure 1**). Reduction of DsbC occurs only when it is dimeric [60, 61]. This substrate preference likely stems from the tertiary structure of the N-terminal domain of DsbD, which adopts an immunoglobulin-like fold and places the active site in the antigen-binding-like region [62]. The tertiary structure promotes the binding of the DsbC dimer and occludes the binding of the monomeric DsbA and DsbB proteins, thereby separating the oxidative and reductive pathways [63].

The formation of disulfide bonds is essential to the structural integrity and folding of proteins that are vital in many biological processes. *E. coli* and other prokaryotes have evolved a complex network of electron transport chains and quality control systems to facilitate and ensure proper disulfide-bond formation in the form of the Dsb proteins described above. The discovery of these Dsb proteins, and the subsequent revival of interest in disulfide-bond formation in eukaryotes, would not have been realized without the powerful combination of well-designed, selective genetic screens to produce mutants and the ability to sequence the resulting mutated genes. With the advent of next-generation sequencing, we should expect further elucidation of the biological and chemical processes that we do not yet understand or that have yet to be discovered.

**4. Disulfide-bond research in the post-genomic sequencing era**

Since 2008, the cost of genome sequencing has declined faster than predicted by Moore's Law [64]. Currently, the cost of sequencing a genome is ~\$1500, and the lofty \$1000/genome goal is within reach. Due to the radical drop in DNA sequencing costs, a multitude of laboratories and private and government institutions have completed the sequencing of approximately 30,000 bacterial genomes [65]. This wealth of data is currently being used for a variety of biotechnological and clinical purposes including diagnostics, public health benefits, and biosurveillance/epidemiological studies [66, 67]. Accordingly, we have termed this period the "post-genomic sequencing era" to represent research that uses sequenced genomes, metagenomes, and environmental samples to search for novel enzymes and pathways and to predict the redox biology of bacteria.

**4.1. Hunting for new disulfide-bond forming enzymes in the genomic landscape**

One of the first examples of the use of sequenced genomes to predict and identify novel disulfide-bond forming pathways was conducted by Todd Yeates and colleagues [68–70]. They hypothesized that organisms rich in disulfide-bonded proteins would have a propensity to encode proteins with an even number of cysteine residues, since an odd number might cause formation of aberrant disulfide bonds. This conjecture was based on the observation that the predicted open reading frames (ORFs) of the hyperthermophilic *Pyrobaculum aerophilum* and *Aeropyrum pernix* species are strongly biased toward an even number of cysteines [70]. Since then, they have expanded their analysis to show that hyperthermophilic members of the Crenarchaeota branch all contain a multitude of disulfide-bonded proteins [68]. Mass spectrometric analysis of the proteome of *Sulfolobus solfataricus* revealed the majority of cysteines to be disulfide bonded [71], and several disulfide-bonded proteins were identified using 2D gel analysis of lysates of *P. aerophilum* [72]. The presence of a high number of disulfide bond-containing proteins in hyperthermophilic Crenarchaeota suggested these archaea possess an undiscovered method of disulfide-bond maintenance. Indeed, experimental

