**6. Conclusions**

development. While the inhibitors described above exhibit relatively weak binding affinities, the observed phenotypes support their disruption of disulfide-bond formation in the cell. These "first-generation" molecules can serve as a foundation from which more potent compounds can be identified and developed.

There are 4306 predicted *E. coli* K12 protein sequences present in the UniProt proteome database (http://www.uniprot.org/proteomes/) [129]. An initial analysis of their compartmentalization within the cell using the prediction software TOPCONS2 (http://topcons.cbr.su.se/pred/) [130] allowed us to putatively assign each of these proteins to one of three subcellular compartments: cytoplasmic, transmembrane in the inner membrane (referred to as transmembrane hereafter), or secreted. To home in on proteins exhibiting possible oxidoreductase activity, the CXXC motif was used as a signature. This search identified 406 proteins, showing that approximately 10% of all predicted *E. coli* proteins contain the motif and demonstrating its relative ubiquity. Of these 406 proteins, ~75% are cytoplasmic, ~18% are transmembrane, and ~7% are secreted (see **Table 2**). The pool of non-CXXC-containing proteins comprises the remaining 3900 proteins, of which ~63% are cytoplasmic, ~23% are transmembrane, and ~14% are secreted (omitted from **Table 2**). The transmembrane and secreted compartments have a lower fraction of CXXC-containing proteins, in keeping with the exclusion of cysteine residues from these compartments in aerobes [77]. A comparison of the two sequence pools shows a slight enrichment of CXXC proteins in the cytoplasm (~75%) relative to non-CXXC proteins (~63%). The fractions of CXXC and non-CXXC proteins in the transmembrane compartment are similar (18 and 23%, respectively); however, the secreted fraction of non-CXXC proteins (14%) is about twice that of CXXC proteins (7%). Approximately 22% (90 of 406) of CXXC proteins are annotated in the UniProt data as binding metal ions or as iron-sulfur cluster-containing proteins. While 46% of all CXXC proteins have been functionally characterized, the remaining majority (54%) should be characterized to develop a better understanding of the reactions they catalyze, to determine how those identified as oxidoreductases may contribute to the redox biology of bacteria, and to identify novel targets for therapeutics.

| Compartment | Number of proteins | Contain CXXC | Known function | Unknown function | Metal binding |
| --- | --- | --- | --- | --- | --- |
| Cytoplasm | 2755 (64%) | 305 (75%) | 147 (78%) | 158 (72%) | 79 (88%) |
| Transmembrane | 970 (23%) | 72 (18%) | 33 (18%) | 39 (18%) | 7 (8%) |
| Secreted | 581 (13%) | 29 (7%) | 8 (4%) | 21 (10%) | 4 (4%) |
| Total | 4306 (100%) | 406 (100%) | 188 (100%) | 218 (100%) | 90 (100%) |

Secreted refers to proteins in the periplasm and proteins secreted outside of the cell. Percentages are shares of each column total. Compartment location was predicted using topological and signal sequence input data on the TOPCONS server. Gene ontology (GO) codes EXP and IDA were used to identify proteins with experimentally verified function from the UniProt database; those lacking these codes were defined as having unknown function. GO codes were also used to identify CXXC proteins annotated to bind metals [129].

**Table 2.** The *E. coli* proteome separated by compartment, the presence of CXXC motifs, and known function.
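The motif search and per-compartment tally behind Table 2 can be illustrated with a short Python sketch. The text does not say how the authors implemented the search, so this is only one plausible approach: the function names, the toy identifiers, and the toy sequences are all hypothetical, and the compartment labels are assumed to come from a separate prediction step such as TOPCONS2. The final call recomputes the "Contain CXXC" percentages from the counts reported in Table 2.

```python
import re
from collections import Counter

# Cys, any two residues, Cys. The lookahead form also reports overlapping hits.
# [A-Z] accepts any one-letter residue code, including a second cysteine.
CXXC = re.compile(r"(?=(C[A-Z]{2}C))")


def cxxc_positions(sequence):
    """Return 0-based start positions of every CXXC motif in a protein sequence."""
    return [m.start() for m in CXXC.finditer(sequence.upper())]


def tally_by_compartment(proteins):
    """Count CXXC-containing proteins per compartment.

    `proteins` maps an identifier to a (compartment, sequence) pair.
    """
    counts = Counter()
    for compartment, sequence in proteins.values():
        if cxxc_positions(sequence):
            counts[compartment] += 1
    return counts


def column_shares(counts):
    """Express each count as a rounded percentage of the column total."""
    total = sum(counts.values())
    return {name: round(100 * n / total) for name, n in counts.items()}


# Hypothetical toy input: identifiers, compartments, and sequences are made up.
toy = {
    "protA": ("cytoplasm", "MKTCGPCAAL"),
    "protB": ("secreted", "MKKLLAACPHCVV"),
    "protC": ("transmembrane", "MALLVVGGAAC"),
}

print(tally_by_compartment(toy))

# CXXC-containing protein counts per compartment, taken from Table 2.
print(column_shares({"cytoplasm": 305, "transmembrane": 72, "secreted": 29}))
```

A presence/absence test per protein is enough to reproduce the tally; the positions are returned only because they would be needed for any follow-up inspection of the flanking residues.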

**5. Future directions**


While more than 20 years of research have elucidated many of the Dsb proteins and their functions, many questions surrounding these proteins remain to be answered: What are the precise mechanisms by which PDI and DsbC catalyze disulfide-bond isomerization *in vivo*? How are electrons transported across the inner membrane by DsbD? What are the redox states and midpoint potentials of the cytoplasm of Crenarchaeota? Additionally, most of the characterization of Dsb proteins has been done in *E. coli*, which is not an appropriate model for all bacteria (e.g., *M. tuberculosis*, *Staphylococcus aureus*, and *Listeria monocytogenes*), so further characterization of the Dsb protein networks in other organisms is needed. Along these lines, Dsb proteins from pathogenic bacteria represent possible targets for antibiotic and vaccine development. Since several Dsb proteins have been structurally characterized, it is now possible to pursue antibiotics by structure-guided design. While broad-spectrum antibiotic molecules are unlikely to be developed, again because of the diversity of Dsb proteins and networks across bacterial species, those targeting specific pathogenic species are not out of reach.

As more disulfide-bonded proteins are characterized, our knowledge of the stability and structures these bonds confer, their likelihood of scrambling in multiply disulfide-bonded proteins, and their relative redox potentials will grow. This will allow researchers to better predict native disulfide bonds from sequence data and to better engineer disulfide bonds in proteins for desirable physicochemical properties, which will benefit both the biotechnological and pharmaceutical industries, especially in the development and production of antibodies. Ideally, both industries should aim to produce antibodies as quickly, cheaply, and effectively as possible. Engineering bacterial strains to overproduce correctly folded antibodies and engineering the antibodies themselves for desired properties are technically challenging but highly useful advances in the field of oxidative protein folding. Future research in these areas should lead to innovations in both industries that will improve the health and increase the knowledge of humankind.
