**Proposal for a Minimal DNA Auto-Replicative System**

Agustino Martinez-Antonio, Laura Espindola-Serna and Cesar Quiñones-Valles

Additional information is available at the end of the chapter

http://dx.doi.org/10.5772/51986

### **1. Introduction**


126 The Mechanisms of DNA Replication


DNA replication allows cell division and population growth of living organisms. Here we will focus on DNA replication in prokaryotic single-celled microorganisms. Several excellent reviews of the molecular processes that carry out DNA replication in bacteria already exist, *E. coli* being the model described in most detail (Langston LD et al., 2009; Quiñones-Valles et al., 2011). Briefly, the process begins when DnaA (the DNA replication initiator protein) in its activated form (DnaA-ATP) recognizes and binds *oriC* (the origin of replication on the bacterial chromosome). In the following step, the replisome is assembled and binds to the complex of DnaA-ATP at *oriC*. Next, the DNA strands are separated and synthesis of the complementary strands initiates, followed by elongation steps. The molecular mechanisms of elongation differ depending on the strand used as a template: the leading strand is replicated continuously starting from a unique RNA primer, whereas on the lagging strand DNA polymerase III must recognize several RNA primers, previously synthesized by DnaG, and then replicate each DNA fragment (Okazaki fragments). This is followed by the replacement of RNA primers by DNA polymerase I, and the sealing of nicks by a DNA ligase. The whole process concludes when replisomes reach the *ter* sites, almost opposite to *oriC* on the circular DNA molecule. Tus proteins are attached to the *ter* sites and when replisomes reach these complexes, they collide and are finally disassembled (see Figure 1 for an overview of the whole process).

From another perspective, one of the more challenging areas of Synthetic Biology is the design and construction of minimal cells. The accomplishment of this aim might contribute to answering basic questions about the minimal components necessary to sustain life systems, in addition to cell auto-organization, function and evolution. In a practical application, minimal cells can be used as a background chassis for the generation of dedicated biological systems designed for the synthesis or degradation of diverse compounds of interest.

© 2013 Martinez-Antonio et al.; licensee InTech. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

**Figure 1. Main steps of DNA replication in bacteria.** a) Initiation of DNA replication: the *datA* locus has a high affinity for binding DnaA (1). DnaA binds to ATP and homo-multimers of DnaA-ATP are formed (2). These homo-multimers bind to *oriC*; once replication is initiated, SeqA binds this region and prevents initiation of a new replication event (3). The SSB (single-strand binding) protein and DnaB assist the complex to open the DNA strands and release DnaC (4). A DNA topoisomerase helps to further unwind the DNA strands (5). b) The elongation phase: the replication fork is formed and the replisome is assembled (6). DNA polymerase III replicates the leading strand (7). DnaG synthesizes the RNA primers needed for replication of the lagging strand (8). Polymerase III can now replicate Okazaki fragments on the lagging strand (9). DNA polymerase I replaces RNA nucleotides with DNA nucleotides (10). A DNA ligase (LigA) seals the nicks between contiguous DNA fragments (11). c) Termination of DNA replication: the protein Tus binds to the *ter* sites and, when replisomes reach Tus, replication ceases (12). The recombinases XerC and XerD resolve the replicated DNA strands (13). Finally, FtsK translocates the DNA strands and each double-stranded DNA molecule can be liberated (14).

In recent years, the essential properties and capabilities necessary to develop minimal cells have been widely discussed (MacDonald et al., 2011). Among these characteristics it is evident that DNA replication should be a fundamental property of these biosystems. Many genes for DNA replication are found to be conserved when comparative analysis of bacterial genomes is carried out. These types of genes are considered informational genes, in charge of maintaining the genetic code, and are among the genes least frequently found to be horizontally transferred (Jain et al., 1999). Therefore, by genomic comparisons and functional analyses it is possible to propose a minimum core of genes capable of supporting the process of DNA replication.

From a genetic point of view, and for the purpose of this study, it is important to state our definition of a minimal DNA auto-replicative system (MiDARS) as: *a genetic system comprising the minimum number of DNA components, including regulatory elements and gene products necessary for the auto-replication of the DNA molecule on which they are encoded, functioning in an in vitro condition.*

In this chapter we will develop a proposal for the construction of such an auto-replicative DNA system. This system is designed to serve as a scaffold for the incorporation of additional biological functions such as transcription and translation. For the scaffold design we exploit information on genes necessary for replication in *E. coli* that are highly conserved in bacteria with extremely reduced genomes, and analyze their functional role in DNA replication in order to finally propose a minimal genetic system with a DNA auto-replicatory function.

### **2. Minimal cells and minimal genetic systems**

A minimal cell can be defined as a biological system that has the minimal number of genetic parts and molecular components for supporting life functions under defined growth conditions. In other words, it includes only the necessary number of genes and derived biomolecular machinery that are considered basic to support life functions (Jewett and Forster, 2010).

The concept of life is intrinsically complex; in biochemical terms it could be defined by three basic characteristics (Luisi et al., 2006):

**1.** auto-regulation of metabolism,



The design and synthesis of minimal cells depends on the environmental conditions the systems will be exposed to. Initially, we might consider that a minimal cell should be exposed to the most favorable conditions in order to facilitate its conception and function. These favorable conditions will require an environment where the cell is not suffering any kind of environmental stress. Nonetheless, even this ideal scenario makes the rational design of the components of a minimal cellular system challenging, since the genes for many cellular functions are not yet totally defined. What we could do is start by reconstructing minimal biological functions that are more or less well defined. These might be the processes relating to the central dogma of molecular biology: DNA replication, DNA transcription and mRNA translation (Figure 2). Some of these functions have been the object of different studies; e.g., transcription and translation were successfully recreated in the experiment of Asahara and Chong (2010) by separately expressing the components of the *E. coli* RNA polymerase, including the sigma70 factor, and reconstituting the function of the complete enzyme *in vitro*.

Since one of the fundamental characteristics of life systems is the replication of their own genetic material, we can consider the design of minimal genetic systems that sustain DNA auto-replication as an important starting point.

**Figure 2. Representation of a hypothetical minimal auto-replicative system.** One of the key features of the minimal cell is that it should perform basic functions such as transcription, translation and replication of the genetic information contained in its genome.

### **3. Approaches for the development of minimal genetic systems**

Currently there are two approaches for the study of minimal biological systems. These are the *top down* and *bottom up* strategies (Delaye L & Moya A, 2009; Murtas, 2009). The *top down* approach considers the analysis of existing biological systems and, by following a reductionist approach, looks to minimize the number of components either by searching for conserved genetic elements or by experimentally reducing the genome without losing functionality. This strategy was used to reduce the *E. coli* genome by 15% by deleting nonessential genes, recombinogenic and mobile DNA elements, and cryptic genes. The resulting cells had good growth profiles and showed improved performance for protein production (Pósfai et al., 2006). Another focus of this approach is to carry out comparative genomics and define a set of conserved genes such as those in charge of specific functions (Gil et al., 2004; Forster & Church, 2006).


On the other hand, the *bottom up* approach involves the construction of complex systems starting from relatively simple molecular precursors. A classical example is the experiment of Miller, who obtained amino acids from a mixture of simple organic and inorganic molecules (Miller, 1953).

Considering the design and construction of minimal genetic systems, benefits should be obtained by employing the complementary *top down* and *bottom up* approaches together.

### **4.** *Escherichia coli* **as a model organism for the design of a minimal DNA auto-replicative system**

*Escherichia coli* is a rod-shaped, Gram-negative, facultatively anaerobic, non-sporulating organism. It was discovered in 1885 by the physician Theodor Escherich and is now classified as part of the Enterobacteriaceae family of the Gammaproteobacteria (Blattner et al., 1997).

This bacterium lives in the intestine of mammals, and assists its hosts with assimilation of nutrients, providing some vitamins and preventing the establishment of bacterial pathogens. Since its discovery, *E. coli* has been widely used as a working model in the laboratory to study biochemistry and diverse molecular processes. In addition, it has been widely used in biotechnology as a vehicle for the expression of multiple recombinant proteins and whole metabolic pathways.

Arthur Kornberg was one of the most prominent investigators in molecular biology and a pioneer in the description of the replicative process using *E. coli* as a model. For his accomplishments in the field he was awarded the Nobel Prize in Physiology or Medicine in 1959. He discovered DNA polymerase I (Bessman et al., 1958; Lehman et al., 1958a), and described the synthesis of DNA as a process based on the use of a single strand of DNA as a template (Lehman et al., 1958b). Later, Kornberg and his collaborators discovered additional enzymes involved in DNA replication: DNA primase, DNA helicase, DnaA and PriA, among others. Nowadays, the replication process and the replicative enzymes of *E. coli* are the best understood and characterized of any organism.

From a biotechnological standpoint, *E. coli* shows three important characteristics that make it an ideal organism to serve as the platform for the design of a synthetic cellular program (Foley & Shuler, 2010):

**1.** functionally it is the organism best characterized at the molecular and biochemical levels in terms of components of metabolism,

**2.** it has proven to be a robust vehicle for the expression of multiple biotechnological processes,

**3.** it has a short growth cycle and is easy to manipulate genetically.

Additionally, the genome of *E. coli* serves as the principal source of standardized genetic parts for the construction of genetic circuits, the "BioBricks", in a project whose aim is to standardize genetic parts to facilitate biological engineering (http://partsregistry.org; Smolke, 2009). Most BioBricks are designed to function in *E. coli*; therefore, we think *E. coli* is the organism of choice for the design of a DNA auto-replicative system.

### **5. Comparison of the DNA replicative machinery of** *E. coli* **with that of bacteria with reduced genomes**

Comparative genomics is a powerful approach that allows the identification of genetic sequences sharing identity/similarity among different organisms. Through these comparisons it is possible to identify conserved genes and predict the components of the replicative machinery in several different organisms.

For our purpose, among the organisms of interest to consider in our design are those with extremely reduced genomes. A characteristic of these organisms is that they are incapable of growth in a free-living manner. The genomes of organisms with these characteristics correspond to those having the minimum number of genes possible in nature. From these we chose the 25 organisms with the most reduced genomes known to date (Table 1). All of these genomes contain less than 1,200 kbp of DNA and all are endosymbiotic bacteria, most of which are thought to survive at the expense of the host.

**Table 1.** Bacteria with genome sizes less than 1200 kbp

In these organisms, we searched for genes encoding enzymes involved in DNA replicative functions with orthology to the replicative machinery from *E. coli* (Table 2). To find orthologous genes we followed two complementary strategies: we looked for Clusters of Orthologous Groups (COGs, Tatusov et al., 2003) and also used bidirectional best BLAST hits (Moreno-Hagelsieb & Latimer, 2008). In Table 2 the blue cells indicate where genes orthologous to *E. coli* are present in the target organism. The table shows the orthologous genes that are present in at least fifteen of these bacteria. Remarkably, the bacteria with the most reduced genomes, *Carsonella ruddii* PV (Nakabachi et al., 2006; Tamames et al., 2007), *Hodgkinia cicadicola* Dsem (McCutcheon et al., 2009) and *Tremblaya princeps* PCIT (López-Madrigal et al., 2011; McCutcheon & Moran, 2011), had only 5, 3 and 5 genes related to replication, respectively, which were orthologous to *E. coli*. These three organisms are strict endosymbionts of insects, with the smallest genomes known to date (Table 2). The fact that these bacteria showed fewer genes related to DNA replication in comparison to bacteria with larger genomes (Figure 3) indicates that the minimal replicative machinery in these organisms might be composed of a small number of constituents. This observation raises many open questions, for instance:

**i.** Are these genes sufficient to sustain the process of replication of an entire chromosome?

**ii.** Does the host supply the missing elements for replication of the endosymbionts' DNA? and,

**iii.** Do these organisms use additional proteins in comparison to those currently described for the process of DNA replication?

**Table 2.** Conservation of replicative genetic machinery in bacteria with less than 1200 kbp

**Figure 3.** Conservation of DNA replicative machinery in bacteria with reduced genomes. Graph showing the relationship between the number of genes annotated with DNA replicative functions and genome size.

The apparent requirement of only a handful of genes for DNA replication in extremely reduced genomes, compared with the 228 annotated in *E. coli*, might suggest a parsimonious mechanism of DNA replication in endosymbiont bacteria, since they are always living in stable environments. The genes which are most highly conserved in both reduced genomes and *E. coli* are those whose products form the replisome (*dnaE, dnaB, dnaN, dnaG, dnaX, dnaQ, ssb, holA* and *holB*), the genes encoding DNA topoisomerase type II (*gyrA* and *gyrB*), and the gene for the NAD(+)-dependent DNA ligase, *ligA* (Figure 4).

### **6. Components of a Minimal DNA Auto-Replicative System (MiDARS)**

Of the three organisms with the most reduced genomes in nature, *Carsonella ruddii* is the most closely related phylogenetically to *E. coli* (Nakabachi et al., 2006). For this reason, in our design we used the information on the replicative machinery in *Carsonella ruddii* and the functions known in *E. coli*. For the physical construction of the systems, however, we will use genes from *E. coli* for two main reasons:


For the design of the minimal DNA auto-replicative system we will attempt to include the minimal elements present in *Carsonella* and, in a conservative manner, those we presume to be necessary to perform the process of DNA replication in *E. coli*. In addition to the coding sequences, it is also necessary to define the regulatory regions of the genes, and we propose to conserve the operative regions as defined for *E. coli*, with the future aim of expanding the minimal functions of *E. coli* to include the regulatory functions. Other important regions to include in the design are the DNA replication origin (*oriC*) and the signals for termination of replication. Below we propose the genetic components that would constitute a MiDARS.

**Figure 4. Conservation of genes for DNA replication** in 25 reduced genomes and in *E. coli*. Grey bars represent the genes proposed to be essential for DNA auto-replication in a minimal genetic system.
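The conservation profile summarized in Figure 4 depends on the ortholog calls described in Section 5, one criterion for which was bidirectional best BLAST hits. The following is only a minimal sketch of that criterion, not the pipeline actually used in this study: it assumes the BLAST results have already been parsed into `(query, subject, bit_score)` tuples, and the locus tags (`CR_001`, etc.) and scores are hypothetical toy values. Real analyses would normally also filter hits by E-value and alignment coverage.

```python
from typing import Dict, Iterable, List, Tuple

# One parsed BLAST hit: (query_id, subject_id, bit_score). Assumed input format.
Hit = Tuple[str, str, float]


def best_hits(hits: Iterable[Hit]) -> Dict[str, str]:
    """Return, for each query, the subject with the highest bit score."""
    best: Dict[str, Tuple[str, float]] = {}
    for query, subject, score in hits:
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def bidirectional_best_hits(
    forward: Iterable[Hit], reverse: Iterable[Hit]
) -> List[Tuple[str, str]]:
    """Pairs (a, b) where b is the best hit of a AND a is the best hit of b."""
    fwd = best_hits(forward)  # reference genome -> target genome
    rev = best_hits(reverse)  # target genome -> reference genome
    return sorted((a, b) for a, b in fwd.items() if rev.get(b) == a)


# Hypothetical toy data: E. coli genes searched against a reduced genome...
FORWARD_HITS: List[Hit] = [
    ("dnaE", "CR_001", 950.0),
    ("dnaE", "CR_002", 120.0),
    ("dnaQ", "CR_002", 210.0),
    ("holA", "CR_003", 45.0),
]
# ...and the reduced genome searched back against E. coli.
REVERSE_HITS: List[Hit] = [
    ("CR_001", "dnaE", 940.0),
    ("CR_002", "dnaQ", 205.0),
    ("CR_003", "holB", 50.0),
    ("CR_003", "holA", 44.0),
]

if __name__ == "__main__":
    for ecoli_gene, target_gene in bidirectional_best_hits(FORWARD_HITS, REVERSE_HITS):
        print(ecoli_gene, "<->", target_gene)
```

In this toy data set *holA* is not reported as an ortholog, because its best hit in the target genome (`CR_003`) points back to *holB* rather than to *holA*; only reciprocal pairs are retained.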

#### **The DNA initiator protein (***dnaA***)**

At the beginning of the replication process, check-point proteins have to recognize and unwind the initiation site for replication at *oriC*. In *E. coli* DnaA is the principal protein employed for this purpose and is highly conserved among bacteria with reduced genomes. Therefore, *dnaA* should be present in the MiDARS.

#### **The DNA helicase (***dnaB***)**

The next candidate gene is *dnaB,* which encodes a DNA helicase. The role of the product of this gene is to unwind the DNA strands, a very important process during the elongation stage of replication.

#### **The DNA primase (***dnaG***)**

The gene that encodes the primase (*dnaG*) should also be considered. It is important for the synthesis of the RNA primers that permit the elongation of new DNA strands.

#### **The single strand stabilization protein (***ssb***)**

Another important function is the stabilization of single strands, carried out by the SSB protein, encoded by the *ssb* gene.

#### **The core components of DNA polymerase III (***dnaE* **and** *dnaQ***)**

The gene for the α subunit (*dnaE*) of DNA polymerase III is present in all 25 organisms with reduced genomes and the gene for the ε subunit (*dnaQ*) in twenty-one. These proteins form part of the core of DNA polymerase III, which carries out the essential polymerization and proofreading activities during DNA synthesis.

#### **The clamp components (***dnaX***,** *holA***,** *holB, dnaN***)**

During the elongation stage, two very important structures are formed: the clamp loader and the sliding clamp. The first has the function of anchorage between DNA polymerase III and the DNA helicase (Reyes-Lamothe R. et al., 2010), allowing the synthesis of the DNA in a synchronized manner between the leading and lagging strands. It is composed of the following subunits (genes): τ (*dnaX*), γ (*dnaX*), δ (*holA*) and δ' (*holB*). The circular sliding clamp is constituted by two β-subunits (both products of the *dnaN* gene) that recognize and bind to DNA-RNA hybrids (Georgescu R. et al., 2010). The sliding clamp assists the core of DNA pol III to bind the lagging strand and allows the extension of the Okazaki fragments.

#### **The DNA ligase (***ligA***)**


The function of a ligase is needed for sealing the nicks formed when the RNA primers are removed and replaced by DNA in the Okazaki fragments on the lagging strand.

#### **Type II DNA topoisomerase (***gyrA* **and** *gyrB***)**

We consider that a system to relax the torsional stress produced by the DNA helicase may be necessary. This could be provided by the DNA gyrase complex (type II topoisomerase), composed of the A (*gyrA*) and B (*gyrB*) subunits.

#### **Protein for termination of replication (***tus***)**

Although there are several proteins that could contribute to termination of DNA replication, we think that in a minimal system the action of Tus could be enough to ensure this.

#### **Origin of DNA replication** *(oriC)*

This DNA sequence of around 245 bp in *E. coli* (Tabata et al., 1983) is needed to enable the DnaA protein to initiate the process of DNA replication.

#### **Termination of DNA replication (***terB* **and** *terC***)**

These sequences are used by the Tus proteins to form the trap which terminates DNA replication.

The proposed elements that constitute the auto-replicative system are also listed in Table 3. This proposal is somewhat similar to previous reports, where genes that could constitute a minimal cell based on a comparative genomics study among various endosymbionts are described (Gil et al., 2004). In the present study, however, we also considered the inclusion of the DNA regions for initiation and termination of replication, as well as the *dnaA*, *ssb* and *tus* genes.
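As a rough size estimate, the element lengths reported in Table 3 can simply be added up. The sketch below does this under stated assumptions: the sizes are copied from Table 3, the *ter* element is counted once as in that table (the text proposes two sites, *terB* and *terC*), and promoters, regulatory regions and intergenic spacers are ignored, so the result should be read as a lower bound rather than the size of an actual construct. The dictionary and function names are illustrative only.

```python
# Element sizes in bp, copied from Table 3 (E. coli values).
MIDARS_ELEMENTS = {
    "dnaA": 1404, "dnaB": 1416, "dnaG": 1746, "ssb": 537,
    "dnaE": 3483, "dnaN": 1101, "dnaQ": 732, "dnaX": 1932,
    "holA": 1032, "holB": 1005, "polA": 2787, "ligA": 2016,
    "gyrA": 2628, "gyrB": 2415, "tus": 930,
    "oriC": 245,  # replication origin (non-coding)
    "ter": 23,    # one termination site, counted once as in Table 3
}

# Elements that are DNA signals rather than protein-coding genes.
NON_CODING = {"oriC", "ter"}


def total_size(elements: dict) -> int:
    """Sum of all listed element sizes (bp), without regulatory regions."""
    return sum(elements.values())


def coding_size(elements: dict, non_coding: set = NON_CODING) -> int:
    """Sum of the protein-coding gene sizes only (bp)."""
    return sum(size for name, size in elements.items() if name not in non_coding)


if __name__ == "__main__":
    n_genes = len(MIDARS_ELEMENTS) - len(NON_CODING)
    print(f"{n_genes} genes, coding sequence: {coding_size(MIDARS_ELEMENTS)} bp")
    print(f"all listed elements: {total_size(MIDARS_ELEMENTS)} bp")
```

With the values of Table 3 this gives 25,164 bp for the 15 genes and 25,432 bp for all listed elements, i.e. roughly 25 kbp before any regulatory sequence is added.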


**Table 3.** Components of a minimal DNA auto-replicative system.

### **7. Expression of the replicative proteins of the MiDARS**

A primary condition for the operation of an auto replicative system is that the protein-ma‐ chinery encoded in it should be expressed. For transcription of the assembled group of genes, we propose use the *E. coli* RNA polymerase and its transcription factor sigma70 since all the genes of the system have a sigma70 factor promoter. The essential components of the RNA polymerase and their sigma70 factors have previously been successfully ex‐ pressed separately and their activity reconstituted as mentioned previously (Asahara & Chong, 2010). We propose these components can be assembled as an additional functional module whose activity can be assayed separately and subsequently integrated into the sys‐ tem. The resulting mRNA (16) could be translated in an *in vitro* system such as the Pure SystemTM (Ueda et al., 1992; Shimizu & Ueda, 2010); containing ribosomes, aminoacyltRNAs, chaperones and initiation, elongation and termination factors among other ele‐ ments essential for translation. Once protein synthesis is completed, the products could initiate replication of the DNA molecule for which the addition of deoxynucleotide triphos‐ phates (dNTPs) and the appropriate buffers will be necessary. The source of energy for the system will be creatine phosphate with the creatine kinase enzyme as the regenerator (Shi‐ mizu et al., 2006). An outline for the operation of the DNA auto-replicative system is shown in Figure 5.

### **8. Perspectives**

| Gene/DNA element | Product | Size (bp, *E. coli*) |
|---|---|---|
| *dnaA* | Chromosomal replication initiator protein DnaA | 1404 |
| *dnaB* | Replicative DNA helicase | 1416 |
| *dnaG* | DNA primase | 1746 |
| *ssb* | Single-stranded DNA-binding protein | 537 |
| *dnaE* | DNA polymerase III, α subunit | 3483 |
| *dnaN* | DNA polymerase III, β subunit | 1101 |
| *dnaQ* | DNA polymerase III, ε subunit | 732 |
| *dnaX* | DNA polymerase III, τ and γ subunits | 1932 |
| *holA* | DNA polymerase III, δ subunit | 1032 |
| *holB* | DNA polymerase III, δ' subunit | 1005 |
| *polA* | DNA polymerase I (5'-3' polymerase; 5'-3' and 3'-5' exonuclease) | 2787 |
| *ligA* | DNA ligase, NAD(+)-dependent | 2016 |
| *gyrA* | DNA gyrase (type II topoisomerase), subunit A | 2628 |
| *gyrB* | DNA gyrase, subunit B | 2415 |
| *tus* | Termination DNA replication protein | 930 |
| *oriC* | DNA region for initiation, origin of replication | 245 |
| *ter* | DNA region for termination of replication | 23 |

**Table 3.** Components of a minimal DNA auto-replicative system.
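The sizes listed in Table 3 can be tallied as a quick consistency check against the totals quoted elsewhere in this chapter. The following is only an illustrative sketch in Python; the variable and function names are our own and are not part of the proposal. It sums the 17 elements and compares the result with the 25432 bp given in the Figure 5 caption and the 27822 bp given in the Conclusions. Attributing the difference to the native operator and termination regions is an assumption based on the wording of the Conclusions.

```python
# Sizes (bp) of the elements exactly as listed in Table 3.
TABLE3_SIZES_BP = {
    "dnaA": 1404, "dnaB": 1416, "dnaG": 1746, "ssb": 537,
    "dnaE": 3483, "dnaN": 1101, "dnaQ": 732, "dnaX": 1932,
    "holA": 1032, "holB": 1005, "polA": 2787, "ligA": 2016,
    "gyrA": 2628, "gyrB": 2415, "tus": 930,
    "oriC": 245, "ter": 23,
}

# Totals quoted in the text (not computed here).
FIGURE5_TOTAL_BP = 25432      # mini-chromosome size in the Figure 5 caption
CONCLUSIONS_TOTAL_BP = 27822  # size in the Conclusions, "including operator regions"


def summarize(sizes):
    """Return (number of elements, summed size in bp)."""
    return len(sizes), sum(sizes.values())


n_elements, total_bp = summarize(TABLE3_SIZES_BP)

# Assumption: the gap between the two quoted totals corresponds to the
# native operator and termination regions kept with each gene.
regulatory_bp = CONCLUSIONS_TOTAL_BP - total_bp

print(f"elements: {n_elements}")
print(f"sum of Table 3 sizes: {total_bp} bp")
print(f"matches Figure 5 total: {total_bp == FIGURE5_TOTAL_BP}")
print(f"extra bp implied by the Conclusions total: {regulatory_bp}")
```

Under these assumptions, the coding and replication regions alone account for the figure quoted in the Figure 5 caption, and the larger figure in the Conclusions leaves room for the regulatory regions retained with each gene.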

Previous efforts have been made to propose the design of minimal cells; however, this objective is still far from being accomplished. From the standpoint of Synthetic Biology, biological systems that are robust, predictable in performance and highly efficient are desired (Jewett & Forster, 2010). In this work, we present a proposal to build an auto-replicative DNA system as the first step toward the development of synthetic biosystems. Additional cellular processes will need to be designed and constructed in a modular way, including transcriptional and translational functions and a minimal metabolism to maintain cell growth and produce energy.

Once this first prototype has been constructed and tested for performance, further reduced combinations of the proposed genes could be tested to determine the absolute minimum set of genes sufficient to sustain DNA auto-replication, e.g. the few genes present in *Carsonella ruddii* PV.

The system proposed in this work can be assembled using methodologies such as those used when working with BioBricks (Smolke, 2009). Once the mini-chromosome is assembled, it could function in cell-free systems, in anucleate mini-cells (Adler et al., 1967), in spores that lack DNA (Siccardi et al., 1975), in micelles or lipidic vesicles, and in some commercial systems. An important achievement in this sense has been reported by another research group, namely DNA replication inside a lipidic vesicle using the Phi29 DNA polymerase. In that report only one strand was linearly replicated and circularized (Kurihara et al., 2011).

The successful development of a DNA auto-replicative system as proposed here could be a very important platform for the development of synthetic biology and the potential for such a system is great:


**Figure 5. Proposal for the minimal components of a MiDARS and their function.** a) The genetic system is a simplified version of a prokaryotic DNA mini-chromosome (25432 bp). The system contains the initiation region (*oriC*) and termination (*ter*) sites for DNA replication as well as a set of genes from *E. coli* (Table 4). The genes can be organized in the same order as in the native chromosome and contain their native operator regions to control expression. b) Transcription and c) translation can be carried out in solution using commercial kits (e.g. Pure System), RNA polymerase and the *E. coli* sigma70 factor. d) The initiation of replication is regulated by DnaA-ATP, and the helicase will bind to the lagging strand to form the replication forks. The primase will bind to the helicase to carry out the synthesis of the RNA primers that permit the activity of DNA Pol III. The SSB protein stabilizes single strands of DNA. e) Two core subunits (α and ε) of DNA Pol III perform the elongation and proofreading of DNA. DNA Pol I replaces the RNA primers and DNA ligase seals the nicks between contiguous DNA fragments on the lagging strand. Topoisomerase II will relax the DNA template as the replication fork progresses. f) The Tus protein bound to the *ter* sites serves as a trap for the replicative machinery headed by the DNA helicase, stopping its movement and promoting the separation of the new MiDARS.

### **9. Conclusions**

Here we propose a design for the construction of a minimal genetic system for DNA auto-replication. This proposal is based on the latest knowledge of the mechanisms and controls of DNA replication in *E. coli* and takes into account the conservation of the replicative machinery in bacteria with extremely reduced genomes, particularly *Carsonella ruddii* PV.

The proposed auto-replicative device consists of 17 DNA elements (27822 bp including their operator regions) taken from the *E. coli* genome and incorporating the most conserved elements of the replicative machinery found in bacteria with extremely reduced genomes. These genetic elements will maintain their native operator and termination regions. They encode proteins encompassing the minimal number of predicted activities involved in DNA replication. Finally, we propose some conditions in which the system might function.

### **Acknowledgments**

The authors thank June Simpson for critical comments on the manuscript. This work was supported by CONACYT grants (102854 and 103686) given to AM-A. LE-S and CQ-V thank CONACYT for PhD scholarships (208153 and 206011).

### **Author details**

Agustino Martinez-Antonio, Laura Espindola-Serna and Cesar Quiñones-Valles

Departamento de Ingeniería Genética, Cinvestav, Irapuato-León, Irapuato Gto, México

### **References**



[4] Blattner, F. R., Plunkett, G., Bloch, C. A., Perna, N. T., Burland, V., Riley, M., Collado-Vides, J., Glasner, J. D., Rode, C. K., Mayhew, G. F., Gregor, J., Davis, N. W., Kirkpatrick, H. A., Goeden, M. A., Rose, D. J., Mau, B., & Shao, Y. (1997). The complete genome sequence of *Escherichia coli* K-12. Science, 277, 1453-1462.

[5] Delaye, L., & Moya, A. (2009). Evolution of reduced prokaryotic genomes and the minimal cell concept: Variations on a theme. BioEssays, 32, 281-287.

[6] Foley, P. L., & Shuler, M. L. (2010). Considerations for the design and construction of a synthetic platform cell for biotechnological applications. Biotechnol Bioeng, 105, 26-36.

[7] Forster, A. C., & Church, G. M. (2006). Towards synthesis of a minimal cell. Mol Syst Biol, 2:45.

[8] Georgescu, R., Yao, N. Y., & O'Donnell, M. (2010). Single-molecule analysis of the *Escherichia coli* replisome and use of clamps to bypass replication barriers. FEBS Letters, 584, 2596-2605.

[9] Gil, R., Silva, F. J., Peretó, J., & Moya, A. (2004). Determination of the core of a minimal bacterial gene set. Microbiol Mol Biol Rev, 68, 518-37.

[10] Jain, R., Rivera, M. C., & Lake, J. (1999). Horizontal gene transfer among genomes: The complexity hypothesis. PNAS, 96, 3801-3806.

[11] Jewett, M. C., & Forster, A. C. (2010). Update on designing and building minimal cells. Curr Opin Biotechnol, 21, 697-703.

[12] Keyamura, K., & Katayama, T. (2011). DnaA protein DNA-binding domain binds to Hda protein to promote inter-AAA+ domain interaction involved in regulatory inactivation of DnaA. J Biol Chem, 286, 29336-29346.

[13] Kurihara, K., Tamura, M., Shohda, K., Toyota, T., Suzuki, K., & Sugawara, T. (2011). Self-reproduction of supramolecular giant vesicles combined with the amplification of encapsulated DNA. Nature Chemistry, 3, 775-781.

[14] Langston, L. D., Indiani, C., & O'Donnell, M. (2009). Whither the replisome: emerging perspectives on the dynamic nature of the DNA replication machinery. Cell Cycle, 8, 2686-91.

[15] Lehman, I. R., Bessman, M. J., Simms, E. S., & Kornberg, A. (1958a). Enzymatic synthesis of deoxyribonucleic acid. I. Preparation of substrates and partial purification of an enzyme from *Escherichia coli*. J Biol Chem, 233, 163-170.

[16] Lehman, I. R., Zimmerman, S. B., Adler, J., Bessman, M. J., Simms, E. S., & Kornberg, A. (1958b). Enzymatic synthesis of deoxyribonucleic acid. V. Chemical composition of enzymatically synthesized deoxyribonucleic acid. Proc Natl Acad Sci U S A, 44, 1191-1196.

[17] López-Madrigal, S., Latorre, A., Porcar, M., Moya, A., & Gil, R. (2011). Complete genome sequence of "Candidatus *Tremblaya princeps*" strain PCVAL, an intriguing translational machine below the living-cell status. J Bacteriol, 193, 5587-8.

