**DNA Replication in Archaea, the Third Domain of Life**

Yoshizumi Ishino and Sonoko Ishino

Additional information is available at the end of the chapter

http://dx.doi.org/10.5772/53986

### **1. Introduction**

The accurate duplication and transmission of genetic information are essential for living organisms. The molecular mechanism of DNA replication has been one of the central themes of molecular biology, and continuous efforts to elucidate it have been made since the discovery of the double helix structure of DNA in 1953 [1]. The protein factors that function in the DNA replication process have been identified to date in the three domains of life (Figure 1).

**Figure 1.** Stages of DNA replication

© 2013 Ishino and Ishino; licensee InTech. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


**Table 1.** The proteins involved in DNA replication from the three domains of life

| Stage | Function | Archaea | Eukaryota | Bacteria |
|---|---|---|---|---|
| initiation | origin recognition | Cdc6/Orc1 | ORC | DnaA |
| initiation | DNA unwinding | Cdc6/Orc1, MCM, GINS | Cdc6, Cdt1, MCM, GINS, Cdc45 | DnaC, DnaB |
| initiation | primer synthesis | DNA primase | Pol α / primase | DnaG |
| elongation | DNA synthesis | family B DNA polymerase (Pol B), family D DNA polymerase (Pol D), clamp loader (RFC), clamp (PCNA) | family B DNA polymerase (Pol δ, Pol ε), clamp loader (RFC), clamp (PCNA) | family C DNA polymerase (Pol III), clamp loader (γ-complex), clamp (β-clamp) |
| elongation | maturation | Fen1, Dna2, DNA ligase | FEN1, DNA2, DNA ligase | Pol I, RNaseH, DNA ligase |

Archaea, the third domain of life, are very interesting organisms to study from the perspectives of molecular and evolutionary biology. Rapid progress in whole genome sequence analyses has allowed us to perform comparative genomic studies. In addition, recent microbial ecology has revealed that archaeal organisms inhabit not only extreme environments, but also more ordinary habitats. Under these circumstances, archaeal biology is among the most exciting of research fields. Archaeal cells have a unicellular ultrastructure without a nucleus, resembling bacterial cells, but the proteins involved in the genetic information processing pathways, including DNA replication, transcription, and translation, share strong similarities with those of eukaryotes. Accordingly, most of the archaeal replication proteins were identified as homologues of eukaryotic replication proteins, including ORC (origin recognition complex), Cdc6, GINS (Sld5-Psf1-Psf2-Psf3), MCM (minichromosome maintenance), RPA (replication protein A), PCNA (proliferating cell nuclear antigen), RFC (replication factor C), and FEN1 (flap endonuclease 1), in addition to the eukaryotic primase, DNA polymerase, and DNA ligase. These proteins are clearly different from the bacterial proteins (Table 1), and they have been biochemically characterized [2-4]. Their similarities indicate that the DNA replication machineries of Archaea and Eukaryota evolved from a common ancestor, which was different from that of Bacteria [5]. Therefore, archaeal organisms are good models for elucidating the functions of each component of the eukaryotic-type replication machinery. Genomic and comparative genomic research with archaea is made easier by the fact that the genome size and the number of genes of archaea are much smaller than those of eukaryotes. The archaeal replication machinery is probably a simplified form of that in eukaryotes.

On the other hand, it is also interesting that the circular genome structure is conserved in Bacteria and Archaea and is different from the linear form of eukaryotic genomes. These features have encouraged us to study archaeal DNA replication, in the hope of gaining fundamental insights into this molecular mechanism and its machinery from an evolutionary perspective. The study of bacterial DNA replication at the molecular level started in about 1960, and eukaryotic studies followed from 1980. Because Archaea were recognized as the third domain of life later, research on archaeal DNA replication became active after 1990. With the increasing number of available complete genome sequences, progress in research on archaeal DNA replication has been rapid, and the depth of our knowledge of archaeal DNA replication has almost caught up with that of the bacterial and eukaryotic research fields. In this chapter, we summarize the current knowledge of DNA replication in Archaea.

### **2. Replication origin**

The basic mechanism of DNA replication was predicted as the "replicon theory" by Jacob et al. [6]. They proposed that an initiation factor recognizes the replicator, now referred to as a replication origin, to start replication of the chromosomal DNA. The replication origin of *E. coli* DNA was subsequently identified as *oriC* (origin of chromosome). The first archaeal replication origin was identified in *Pyrococcus abyssi* in 2001. The origin was located just upstream of the gene encoding the Cdc6- and Orc1-like sequence in the *Pyrococcus* genome [7]. We discovered a gene encoding an amino acid sequence that bore similarity to those of both eukaryotic Cdc6 and Orc1, which are the eukaryotic initiators. After confirming that this protein actually binds to the *oriC* region on the chromosomal DNA, we named the gene product Cdc6/Orc1, due to its roughly equal homology with regions of eukaryotic Orc1 and Cdc6 [7]. The gene forms an operon with the gene encoding DNA polymerase D (originally called Pol II, as the second DNA polymerase from *Pyrococcus furiosus*) in the genome [8]. A characteristic of the *oriC* is the conserved 13 bp repeats, as predicted earlier by bioinformatics [9], and two of the repeats are longer and surround a predicted DUE (DNA unwinding element) with an AT-rich sequence in *Pyrococcus* genomes (Figure 2) [10]. The longer repeated sequence was designated as an ORB (Origin Recognition Box), and it was actually recognized by Cdc6/Orc1 in a *Sulfolobus solfataricus* study [11]. The 13 bp repeat is called a miniORB, as a minimal version of the ORB. A whole genome microarray analysis of *P. abyssi* showed that Cdc6/Orc1 binds to the *oriC* region with extreme specificity, and the specific binding of highly purified *P. furiosus* Cdc6/Orc1 to ORB and miniORB was confirmed *in vitro* [12]. It should be noted that multiple origins were identified in the *Sulfolobus* genomes. It is now well recognized that *Sulfolobus* has three origins and that they work at the same time in the cell cycle [11, 13-16]. How the multiple origins are utilized for genome replication is an interesting subject in the research field of archaeal DNA replication. The main questions are how the initiation of replication from multiple origins is regulated and how the replication forks progress after the collision of two forks from opposite directions.

**Figure 2. The** *oriC* **region in the** *Pyrococcus* **genome.** The region surrounding *oriC* is presented schematically. ORB1 and ORB2 are indicated by large arrows, and the miniORB repeats are indicated by small arrowheads. The DUE is indicated in red. The unwinding site, determined by *in vitro* analysis, is indicated in orange. The transition site is indicated by green arrows. The *cdc6/orc1* gene located downstream is drawn as a gray arrow.
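The origin architecture described above (conserved 13 bp repeats flanking an AT-rich DUE) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the function names, the toy sequence, the 20 bp window, and the 80% AT threshold are hypothetical, and exact string matching is a simplification, since real ORB and miniORB repeats are not perfectly identical copies. It is not the method used in the cited bioinformatics prediction [9].

```python
from collections import defaultdict


def find_repeats(seq, k=13, min_copies=2):
    """Return {k-mer: [start positions]} for exact k-mers seen at least min_copies times."""
    seq = seq.upper()
    positions = defaultdict(list)
    for i in range(len(seq) - k + 1):
        positions[seq[i:i + k]].append(i)
    return {kmer: pos for kmer, pos in positions.items() if len(pos) >= min_copies}


def at_fraction(window):
    """Fraction of A or T bases in a window."""
    return sum(base in "AT" for base in window) / len(window)


def at_rich_windows(seq, width=20, threshold=0.8):
    """Return start positions of windows whose AT fraction is at least threshold."""
    seq = seq.upper()
    return [
        i
        for i in range(len(seq) - width + 1)
        if at_fraction(seq[i:i + width]) >= threshold
    ]


# Toy sequence (invented, not a real origin): an arbitrary 13-mer repeated
# on both sides of a 20 bp AT-only stretch standing in for a DUE.
repeat = "GACCGTTCGGACC"
due = "ATATTTAATATTAAATTTAT"
toy = "CCG" + repeat + due + repeat + "GGC"

repeats = find_repeats(toy)
windows = at_rich_windows(toy)
print("repeated 13-mers:", repeats)
print("AT-rich window starts:", windows)
```

In a real analysis, approximate matching (for example, a position weight matrix built from known ORB sequences) would likely be needed in place of exact 13-mer identity.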

### **3. How does Cdc6/Orc1 recognize** *oriC***?**

An important step in characterizing the initiation of DNA replication in Archaea is to understand how the Cdc6/Orc1 protein recognizes the *oriC* region. Based upon amino acid sequence alignments, the archaeal Cdc6/Orc1 proteins belong to the AAA+ family of proteins. The crystal structures of the Cdc6/Orc1 protein from *Pyrobaculum aerophilum* [17] and of ORC2, one of the two Cdc6/Orc1 proteins from *Aeropyrum pernix* (the two homologs in this organism are called ORC1 and ORC2 by the authors) [18], were determined. These Cdc6/Orc1 proteins consist of three structural domains. Domains I and II adopt a fold found in the AAA+ family proteins. A winged helix (WH) fold, which is present in a number of DNA binding proteins, is found in domain III. There are four ORBs arranged in pairs on both sides of the DUE in the *oriC* region of *A. pernix*, and ORC1 binds to each ORB as a dimer. A mechanism was proposed in which ORC1 binds to all four ORBs to introduce a higher-order assembly for unwinding of the DUE, with alterations in both topology and superhelicity [19]. Furthermore, the crystal structures of *S. solfataricus* Cdc6-1 and Cdc6-3 (two of the three Cdc6/Orc1 proteins in this organism) forming a heterodimer bound to *ori2* DNA (one of the three origins in this organism) [20], and that of *A. pernix* ORC1 bound to an origin sequence [21], were determined. These studies revealed that both the N-terminal AAA+ ATPase domain (domains I+II) and the C-terminal WH domain (domain III) contribute to origin DNA binding. The structural information not only defined the polarity of initiator assembly on the origin, but also indicated that substantial distortion is induced in the DNA strands, which probably triggers the unwinding of the duplex DNA to start replication. These structural data also provided the detailed interaction mode between the initiator protein and the *oriC* DNA. Mutational analyses of the *Methanothermobacter thermautotrophicus* Cdc6-1 protein revealed the essential interaction between an arginine residue conserved in the archaeal Cdc6/Orc1 and an invariant guanine in the ORB sequence [22].

*P. furiosus* Cdc6/Orc1 is difficult to purify in a soluble form. A specific site in the *oriC* at which unwinding starts *in vitro* was recently identified, using the protein prepared by a denaturation-renaturation procedure [23]. As shown in Figure 2, the local unwinding site is about 670 bp away from the transition site between leading and lagging strand syntheses, which was determined earlier by an *in vivo* replication initiation point (RIP) assay [10]. Although the details of the replication machinery that must be established at the unwound site are not fully understood in Archaea, it is expected to minimally include MCM, GINS, primase, PCNA, DNA polymerase, and RPA, as described below. Subsequent *P. furiosus* studies revealed that the ATPase activity of the Cdc6/Orc1 protein was completely suppressed by binding to DNA containing the ORB. Limited proteolysis and DNase I footprinting experiments suggested that the Cdc6/Orc1 protein changes its conformation on the ORB sequence in the presence of ATP. The physiological meaning of this conformational change has not been clarified, but it should have an important function in starting the initiation process [24], as in the case of the bacterial DnaA protein. In addition, results from an *in vitro* recruiting assay indicated that MCM (Mcm protein complex), the replicative DNA helicase, is recruited onto the *oriC* region in a Cdc6/Orc1-dependent, but not ATP-dependent, manner [24], as described below. However, this recruitment is not sufficient for the unwinding function of MCM, and some other function remains to be identified for the functional loading of this helicase to promote the progression of the DNA replication fork.

### **4. MCM helicase**


After unwinding of the *oriC* region, the replicative helicase needs to remain loaded to provide continuous unwinding of double-stranded DNA (dsDNA) as the replication forks progress bidirectionally. The MCM protein complex, consisting of six subunits (Mcm2, 3, 4, 5, 6, and 7), is known to be the replicative helicase "core" in eukaryotic cells [25]. MCM further interacts with Cdc45 and GINS to form a ternary assembly referred to as the "CMG complex", which is believed to be the functional helicase in eukaryotic cells (Figure 3) [26]. However, this idea is not yet universally accepted for the eukaryotic replicative helicase.

**Figure 3. DNA-unwinding complex in eukaryotes and archaea.** The CMG complex is the replicative helicase for the template DNA unwinding reaction in eukaryotes. The archaeal genomes contain homologs of the Mcm and Gins proteins, but a Cdc45 homolog has not been identified. Recent research suggests that a RecJ-like exonuclease, GAN, which has weak sequence homology to Cdc45, may work as a helicase complex with MCM and GINS.

Most archaeal genomes appear to encode at least one Mcm homologue, and the helicase activities of these proteins from several archaeal organisms have been confirmed *in vitro* [27-31]. In contrast to the eukaryotic MCM, the archaeal MCMs consist of a homohexamer or homo double hexamer and have distinct DNA helicase activity by themselves *in vitro*; therefore, these MCMs on their own may function as the replicative helicase *in vivo*. The structure-function relationships of the archaeal Mcms have been intensively studied using purified proteins and site-directed mutagenesis [32]. An early report using the ChIP method showed that the *P. abyssi* Mcm protein preferentially binds to the origin *in vivo* in exponentially growing cells [7, 12]. The *P. furiosus* MCM helicase does not display significant helicase activity *in vitro*. However, its DNA helicase activity was clearly stimulated by the addition of GINS (the Gins23-Gins51 complex), which is the homolog of the eukaryotic GINS complex (described below in more detail). This result suggests that MCM works with other accessory factors to form a core complex in *P. furiosus*, similar to the eukaryotic CMG complex described above [31].

Some archaeal organisms have two or more Cdc6/Orc1 homologs. It was found that the two Cdc6/Orc1 homologs, Cdc6-1 and Cdc6-2, both inhibit the helicase activity of MCM in *M. thermautotrophicus* [33, 34]. Similarly, Cdc6-1 inhibits MCM activity in *S. solfataricus* [35]. In contrast, the Cdc6-2 protein stimulates the helicase activity of MCM in *Thermoplasma acidophilum* [36]. Functional interactions between Cdc6/Orc1 and Mcm proteins need to be investigated in greater detail to achieve a more comprehensive understanding of the conservation and diversity of the initiation mechanism in archaeal DNA replication.

Another interesting feature of DNA replication initiation is that several archaea have multiple genes encoding Mcm homologs in their genomes. Based on recent comprehensive genomic analyses, thirteen archaeal species have more than one *mcm* gene. However, many of the *mcm* genes in the archaeal genomes seem to reside within mobile elements originating from viruses [37]. For example, two of the three genes in the *Thermococcus kodakarensis* genome are located in regions where genetic elements have presumably been integrated [38]. The genetic manipulation system established for *T. kodakarensis*, the first for a hyperthermophilic euryarchaeon [39, 40], is advantageous for investigating the function of these Mcm proteins. Two groups have recently performed gene disruption experiments for each *mcm* gene [41, 42]. These experiments revealed that knock-out strains for *mcm1* and *mcm2* were easily isolated, but *mcm3* could not be disrupted. Mcm3 is relatively abundant in *T. kodakarensis* cells. Furthermore, an *in vitro* experiment using purified Mcm proteins showed that only Mcm3 forms a stable hexameric structure in solution. These results support the contention that Mcm3 is the main helicase core protein in the normal DNA replication process in *T. kodakarensis*.

The functions of the other two Mcm proteins remain to be elucidated. The genes for Mcm1 and Mcm2 are stably inherited, and their gene products may perform some important functions in DNA metabolism in *T. kodakarensis*. The DNA helicase activity of the recombinant Mcm1 protein is strong *in vitro*, and a distinct amount of the Mcm1 protein is present in *T. kodakarensis* cells. Moreover, Mcm1 functionally interacts with the GINS complex from *T. kodakarensis* [42]. These observations strongly suggest that Mcm1 does participate in some aspect of DNA transactions, and that it may be substituted by Mcm3. Our immunoprecipitation experiments showed that Mcm1 co-precipitated with Mcm3 and GINS, although they did not form a heterohexameric complex [42], suggesting that Mcm1 is involved in the replisome or repairsome and shares some function in *T. kodakarensis* cells. Although western blot analysis could not detect Mcm2 in the extract from exponentially growing *T. kodakarensis* cells [42], an RT-PCR experiment detected the transcript of the *mcm2* gene in the cells (Ishino et al., unpublished). The recombinant Mcm2 protein also has ATPase and helicase activities *in vitro* [41]. Therefore, the *mcm2* gene is expressed under normal growth conditions, and its product may work in some process with a rapid turnover. Further experiments to measure the efficiency of *mcm2* gene transcription by quantitative PCR, as well as to assess the stability of the Mcm2 protein in the cell extract, are needed. Phenotypic analyses investigating the sensitivities of the Δ*mcm1* and Δ*mcm2* mutant strains to DNA damage caused by various mutagens, as reported for other DNA repair-related genes in *T. kodakarensis* [43], may provide a clue to elucidate the functions of these Mcm proteins.

*Methanococcus maripaludis* S2 harbors four *mcm* genes in its genome, three of which seem to be derived from phage. A shotgun proteomics study detected peptides originating from three of the four *mcm* gene products [44]. Furthermore, the four gene products co-expressed in *E. coli* cells were co-purified in the same fraction [45]. These results suggest that multiple Mcm proteins are functional in *M. maripaludis* cells.

### **5. Recruitment of Mcm to the** *oriC* **region**


Another important question is how MCM is recruited onto the unwound region of *oriC.* The detailed loading mechanism of the MCM helicase has not been elucidated. It is believed that archaea utilize divergent mechanisms of MCM helicase assembly at the *oriC* [46].

An *in vitro* recruiting assay showed that *P. furiosus* MCM is recruited to the *oriC* DNA in a Cdc6/Orc1-dependent manner [24]. This assay revealed that preloading Cdc6/Orc1 onto the ORB DNA resulted in a clear reduction in MCM recruitment to the *oriC* region, suggesting that free Cdc6/Orc1 is the preferable helicase recruiter, associating with MCM and bringing it to *oriC*. It would be interesting to understand how the two tasks, origin recognition and MCM recruiting, are performed by the Cdc6/Orc1 protein, because the WH domain, which primarily recognizes and binds the ORB, also has strong affinity for the Mcm protein. The assembly of the Mcm protein onto the ORB DNA by the Walker A-motif mutant of *P. furiosus* Cdc6/Orc1 occurred with the same efficiency as with the wild type Cdc6/Orc1. The DNA binding of *P. furiosus* Cdc6/Orc1 was not drastically different in the presence and absence of ATP, as in the case of the initiator proteins from *Archaeoglobus fulgidus* [28], *S. solfataricus* [11], and *A. pernix* [19]. Therefore, it is still not known whether the ATP binding and hydrolysis activity of Cdc6/Orc1 regulates the Mcm protein recruitment onto *oriC* in the cells.

One more important issue is the very low efficiency of the Mcm protein recruitment in the reported *in vitro* assay [24]. Quantification of the recruited Mcm protein in this assay showed that less than one Mcm hexamer was recruited to the ORB. The linear DNA containing ORB1 and ORB2 that was used in the recruiting assay may not be suitable for reconstituting the archaeal DNA replication machinery, and a template that more closely mimics the chromosomal DNA may be required. Additionally, it may be that as yet unidentified proteins, present in *P. furiosus* cells, are required to achieve efficient helicase loading *in vitro*. Finally, it will ultimately be necessary to construct a more defined *in vitro* replication system to analyze the regulatory functions of Cdc6/Orc1 precisely during replication initiation.

In *M. thermautotrophicus*, the Cdc6-2 protein can dissociate the Mcm multimers [47]. This activity of Cdc6-2 might be required for it to act as the MCM helicase loader in this organism. The interaction between Cdc6/Orc1 and Mcm is probably general. However, the effect of Cdc6/Orc1 on the MCM helicase activity differs among organisms, as described above. Some other protein factors may function in various archaea. For example, a protein that is distantly related to eukaryotic Cdt1, which plays a crucial role during MCM loading in Eukaryota, exists in some archaeal organisms, although its function has not been characterized yet [14].

### **6. GINS**

The eukaryotic GINS complex was originally identified in *Saccharomyces cerevisiae* as an essential protein factor for the initiation of DNA replication [48]. GINS consists of four different proteins, Sld5, Psf1, Psf2, and Psf3 (GINS is an acronym for the Japanese go-ichi-ni-san, meaning 5-1-2-3, after these four subunits). The amino acid sequences of the four subunits in the GINS complex share some conservation, suggesting that they are ancestral paralogs [49]. However, most of the archaeal genomes have only one gene encoding a protein of this family, and more interestingly, the Crenarchaeota and Euryarchaeota (the two major subdomains of Archaea) characteristically have two genes, one with sequence similarity to Psf2 and Psf3 and the other to Sld5 and Psf1, referred to as Gins23 and Gins51, respectively [31, 49]. A Gins homolog, designated as Gins23, was detected in *S. solfataricus* as the first Gins protein in Archaea, in a yeast two-hybrid screening for interaction partners of the Mcm protein. Another subunit, designated as Gins15, was identified by mass spectrometry analysis of an immunoaffinity-purified native GINS from an *S. solfataricus* cell extract [50]. The *S. solfataricus* GINS, composed of the two proteins Gins23 and Gins15, forms a tetrameric structure with a 2:2 molar ratio [50]. The GINS from *P. furiosus*, a complex of Gins23 and Gins51 with a 2:2 ratio, was identified as the first euryarchaeal GINS [31]. The name Gins51 was preferred over Gins15 because it follows the order of the numbers in the name GINS.


The MCM2-7 hexamer was copurified in complex with Cdc45 and GINS from *Drosophila melanogaster* embryo extracts and *S. cerevisiae* lysates, and the "CMG (Cdc45-MCM2-7-GINS) complex" (Figure 3), as described above, should be important for the function of the replicative helicase. The CMG complex was also associated with the replication fork in *Xenopus laevis* egg extracts, and a large molecular machine, containing Cdc45, GINS, and MCM2-7, was proposed as the unwindosome to separate the DNA strands at the replication fork [51]. Therefore, GINS must be a critical factor for not only the initiation process, but also the elongation process in eukaryotic DNA replication. *S. solfataricus* GINS interacts with MCM and primase, suggesting that GINS is involved in the replisome. The concrete function of GINS in the replisome remains to be determined. No stimulation or inhibition of either the helicase or primase activity was observed by the interaction with *S. solfataricus* GINS *in vitro* [50]. On the other hand, the DNA helicase activity of *P. furiosus* MCM is clearly stimulated by the addition of the *P. furiosus* GINS complex, as described above [31].

In contrast to *S. solfataricus* and *P. furiosus*, which each express a Gins23 and Gins51, *Thermoplasma acidophilum* has a single Gins homolog, Gins51. The recombinant Gins51 protein from *T. acidophilum* was confirmed to form a homotetramer by gel filtration and electron microscopy analyses. Furthermore, a physical interaction between *T. acidophilum* Gins51 and Mcm was detected by a surface plasmon resonance (SPR) analysis. Although the *T. acidophilum* Gins51 did not affect the helicase activity of its cognate MCM when equal ratios of the two molecules were tested *in vitro* [52], an excess amount of Gins51 clearly stimulated the helicase activity (Ogino et al., unpublished). In the case of *T. kodakarensis*, the ATPase and helicase activities of MCM1 and MCM3 were clearly stimulated by *T. kodakarensis* GINS *in vitro*. It is interesting that the helicase activity of MCM1 was stimulated more than that of MCM3. Physical interactions between the *T. kodakarensis* Gins and Mcm proteins were also detected [53]. These reports suggested that the MCM-GINS complex is a common part of the replicative helicase in Archaea (Figure 3).

Recently, the crystal structure of the *T. kodakarensis* GINS tetramer, composed of Gins51 and Gins23, was determined, and the structure was conserved with the reported human GINS structures [53]. Each subunit of human GINS shares a similar fold, and assembles into the heterotetramer of a unique trapezoidal shape [54-56]. Sld5 and Psf1 possess the α-helical (A) domain at the N-terminus and the β-stranded (B) domain at the C-terminus (AB-type). On the other hand, Psf2 and Psf3 are the permuted version (BA-type). The backbone structure of each subunit and the tetrameric assembly of *T. kodakarensis* GINS are similar to those of human GINS. However, the location of the C-terminal B domain of Gins51 is remarkably different between the two GINS structures [53]. A homology model of the homotetrameric GINS from *T. acidophilum* was built using the *T. kodakarensis* GINS crystal structure as a template. The Gins51 protein has a long disordered region inserted between the A and B domains, and this allows the conformation of the C-terminal domains to be more flexible. This domain arrangement leads to the formation of an asymmetric homotetramer, rather than a symmetrical assembly, of the *T. kodakarensis* GINS [53].

The Cdc45 protein is ubiquitously distributed from yeast to human, supporting the notion that the formation of the CMG complex is universal in the eukaryotic DNA replication process. However, no archaeal homologue of Cdc45 has been identified. A recent report of bioinformatic analysis showed that the primary structure of eukaryotic Cdc45 and prokaryotic RecJ share a common ancestry [57]. Indeed, a homolog of the DNA binding domain of RecJ has been co-purified with GINS from *S. solfataricus* [50]. Our experiment detected the stimulation of the 5'-3' exonuclease activity of the RecJ homologs from *P. furiosus* and *T. kodakarensis* by the cognate GINS complexes (Ishino et al., unpublished). The RecJ homolog from *T. kodakarensis* forms a stable complex with the GINS, and the 5'-3' exonuclease activity is enhanced *in vitro*; therefore, the RecJ homolog was designated as GAN, from GINS-Associated Nuclease, in a very recent paper [58]. Another related report found that the human Cdc45 structure obtained by the small angle X-ray scattering analysis (SAXS) is consistent with the crystallographic structure of the RecJ family members [59]. These current findings will promote further research on the structures and functions of the higher-order unwindosome in archaeal and eukaryotic cells (Figure 3).

### **7. Primase**

To initiate DNA strand synthesis, a primase is required for the synthesis of a short oligonucleotide, as a primer. The DnaG and p48-p58 proteins are the primases in Bacteria and Eukaryota, respectively. The p48-p58 primase is further complexed with p180 and p70, to form the DNA polymerase α-primase complex. The catalytic subunits of the eukaryotic (p48) and archaeal primases share a slight but distinct sequence homology with those of the family X DNA polymerases [60]. The first archaeal primase was identified from *Methanococcus jannaschii*, as an ORF with a sequence similar to that of the eukaryotic p48. The gene product exhibited DNA polymerase activity and was able to synthesize oligonucleotides on the template DNA [61]. We characterized the p48-like protein (p41) from *P. furiosus*. Unexpectedly, the archaeal p41 protein did not synthesize short RNA by itself, but preferentially utilized deoxynucleotides to synthesize DNA strands up to several kilobases in length [62]. Furthermore, the gene neighboring the p41 gene encodes a protein with very weak similarity to the p58 subunit of the eukaryotic primase. The gene product, designated p46, actually forms a stable complex with p41, and the complex can synthesize a short RNA primer, as well as DNA strands of several hundred nucleotides *in vitro* [63]. Short RNA primers, but not DNA primers, were identified in *Pyrococcus* cells, and therefore, some mechanism to dominantly use RNA primers exists in the cells [10].

Further research on the primase homologs from *S. solfataricus* [64-66], *Pyrococcus horikoshii* [67-69], and *P. abyssi* [70] showed similar properties *in vitro*. Notably, p41 is the catalytic subunit, and the large one modulates the activity in the heterodimeric archaeal primases. The small and large subunits are also called PriS and PriL, respectively. The crystal structure of the N-terminal domain of PriL complexed with PriS of *S. solfataricus* primase revealed that PriL does not directly contact the active site of PriS, and therefore, the large subunit may interact with the synthesized primer, to adjust its length to a 7-14 mer. The structure of the catalytic center is similar to those of the family X DNA polymerases. The 3'-terminal nucleotidyl transferase activity, detected in the *S. solfataricus* primase [64, 66], and the gap-filling and strand-displacement activities in the *P. abyssi* primase [70] also support the structural similarity between PriS and the family X DNA polymerases.

A unique activity, named PADT (template-dependent Polymerization Across Discontinuous Template), in the *S. solfataricus* PriSL complex was published very recently [71]. The activity may be involved in double-strand break repair in Archaea.

The archaeal genomes also encode a sequence similar to the bacterial type DnaG primase. The DnaG homolog from the *P. furiosus* genome was expressed in *E. coli*, but the protein did not show any primer synthesis activity *in vitro*, and thus the archaeal DnaG-like protein may not act as a primase in *Pyrococcus* cells (Fujikane et al., unpublished). The DnaG-like protein was shown to participate in RNA degradation, as an exosome component [72, 73]. However, a recent paper reported that a DnaG homolog from *S. solfataricus* actually synthesizes primers with a 13 nucleotide length [74]. It would be interesting to investigate if the two different primases share the primer synthesis for leading and lagging strand replication, respectively, in the *Sulfolobus* cells, as the authors suggested [74]. A proposed hypothesis about the evolution of PriSL and DnaG from the last universal common ancestor (LUCA) is interesting [71].

The *Sulfolobus* PriSL protein was shown to interact with Mcm through Gins23 [50]. This primase-helicase interaction probably ensures the coupling of DNA unwinding and priming during the replication fork progression [50]. Furthermore, the direct interaction between PriSL and the clamp loader RFC (described below) in *S. solfataricus* may regulate the primer synthesis and its transfer to DNA polymerase in archaeal cells [75].

### **8. Single-stranded DNA binding protein**

The single-stranded DNA binding protein, which is called SSB in Bacteria and RPA in Archaea and Eukaryota, is an important factor to protect the unwound single-stranded DNA from nuclease attack, chemical modification, and other disruptions during the DNA replication and repair processes. SSB and RPA have a structurally similar domain containing a common fold, called the OB (oligonucleotide/oligosaccharide binding)-fold, although there is little amino acid sequence similarity between them [76]. The common structure suggests that the mechanism of single-stranded DNA binding is conserved in living organisms despite the lack of sequence similarity. *E. coli* SSB is a homotetramer of a 20 kDa peptide with one OB-fold, and the SSBs from *Deinococcus radiodurans* and *Thermus aquaticus* consist of a homodimer of the peptide containing two OB-folds. The eukaryotic RPA is a stable heterotrimer, composed of 70, 32, and 14 kDa proteins. RPA70 contains two tandem repeats of an OB-fold, which are responsible for the major interaction with a single-stranded DNA in its central region. The N-terminal and C-terminal regions of RPA70 mediate interactions with RPA32 and also with many cellular or viral proteins [77, 78]. RPA32 contains an OB-fold in the central region [79-81], and the C-terminal region interacts with other RPA subunits and various cellular proteins [77, 78, 82, 83]. RPA14 also contains an OB-fold [77]. The eukaryotic RPA interacts with the SV40 T-antigen and the DNA polymerase α-primase complex, and thus forms part of the initiation complex at the replication origin [84]. The RPA also stimulates Pol α-primase activity and PCNA-dependent Pol δ activity [85, 86].

The RPAs from *M. jannaschii* and *M. thermautotrophicus* were reported in 1998, as the first archaeal single-stranded DNA binding proteins [87-89]. These proteins share amino acid sequence similarity with the eukaryotic RPA70, and contain four or five repeated OB-folds and one zinc-finger motif. The *M. jannaschii* RPA exists as a monomer in solution, and has single-stranded DNA binding activity. On the other hand, *P. furiosus* RPA forms a complex consisting of three distinct subunits, RPA41, RPA32, and RPA14, similar to the eukaryotic RPA [90]. The *P. furiosus* RPA strikingly stimulates the RadA-promoted strand-exchange reaction *in vitro* [90].

While the euryarchaeal organisms have a eukaryotic-type RPA homologue, the crenarchaeal SSB proteins appear to be much more related to the bacterial proteins, with a single OB-fold and a flexible C-terminal tail. However, the crystal structure of the SSB protein from *S. solfataricus* showed that the OB-fold domain is more similar to that of the eukaryotic RPAs, supporting the close relationship between Archaea and Eukaryota [91].

The RPA from *Methanosarcina acetivorans* displays a unique property. Unlike the multiple RPA proteins found in other archaea and eukaryotes, the three *M. acetivorans* RPAs, RPA1, RPA2, and RPA3, have 4, 2, and 2 OB-folds, respectively, and each can act as a distinct single-stranded DNA-binding protein. Furthermore, each of the three RPA proteins, as well as their combinations, clearly stimulates the primer extension activity of *M. acetivorans* DNA polymerase BI *in vitro*, as shown previously for bacterial SSB and eukaryotic RPA [92]. The architectures of SSB and RPA suggested that they are composed of different combinations of the OB-fold. Bacterial and eukaryotic organisms contain one type of SSB or RPA, respectively. In contrast, archaeal organisms have various RPAs, composed of different organizations of OB-folds. A hypothesis that homologous recombination might play an important role in generating this diversity of OB-folds in archaeal cells was proposed, based on experiments characterizing the engineered RPAs with various OB-folds [93].

### **9. DNA polymerase**

DNA polymerase catalyzes phosphodiester bond formation between the terminal 3'-OH of the primer and the α-phosphate of the incoming triphosphate to extend the short primer, and is therefore the main player of the DNA replication process. Based on the amino acid sequence similarity, DNA polymerases have been classified into seven families, A, B, C, D, E, X, and Y (Table 2) [94-98].

The fundamental ability of DNA polymerases to synthesize a deoxyribonucleotide chain is widely conserved, but more specific properties, including processivity, synthesis accuracy, and substrate nucleotide selectivity, differ depending on the family. The enzymes within the same family have basically similar properties. *E. coli* has five DNA polymerases, and Pol I, Pol II, and Pol III belong to families A, B, and C, respectively. Pol IV and Pol V are classified in family Y, as the DNA polymerases for translesion synthesis (TLS). In eukaryotes, the replicative DNA polymerases, Pol α, Pol δ, and Pol ε, belong to family B, and the translesion DNA polymerases, η, ι, and κ, belong to family Y [99].

The most interesting feature discovered at the inception of this research area was that the archaea indeed have the eukaryotic Pol α-like (family B) DNA polymerases [100-102]. Members of the Crenarchaeota have at least two family B DNA polymerases [103, 104]. On the other hand, there is only one family B DNA polymerase in the Euryarchaeota. Instead, the euryarchaeal genomes encode a family D DNA polymerase, proposed as Pol D, which seems to be specific for these archaeal organisms and has never been found in other domains [95, 105]. The genes for family Y-like DNA polymerases are conserved in several, but not all, archaeal genomes. The role of each DNA polymerase in the archaeal cells is still not known, although the distribution of the DNA polymerases is getting clearer (Table 2) [106].


\* plasmid-encoded

\*\* mitochondrial

**Table 2.** Distribution of DNA polymerases from seven families in the three domains of life.

The first family D DNA polymerase was identified from *P. furiosus*, by screening for DNA polymerase activity in the cell extract [107]. The corresponding gene was cloned, revealing that this new DNA polymerase consists of two proteins, named DP1 and DP2, and that the deduced amino acid sequences of these proteins were not conserved in the DNA polymerase families [8]. *P. furiosus* Pol D exhibits efficient strand extension activity and strong proofreading activity [8, 108]. Other family D DNA polymerases were also characterized by several groups [109-115]. The Pol D genes had been found only in Euryarchaeota. However, recent environmental genomics and cultivation efforts revealed novel phyla in Archaea: Thaumarchaeota, Korarchaeota, and Aigarchaeota, and their genome sequences harbor the genes encoding Pol D.

A genetic study on *Halobacterium* sp. NRC-1 showed that both Pol B and Pol D are essential for viability [116]. An interesting issue is to elucidate whether Pol B and Pol D work together at the replication fork for the synthesis of the leading and lagging strands, respectively. According to the usage of an RNA primer and the presence of strand displacement activity, Pol D may catalyze lagging strand synthesis [106, 114].

Thaumarchaeota and Aigarchaeota harbor the genes encoding Pol D and crenarchaeal Pol BII [117, 118], while Korarchaeota encodes Pol BI, Pol BII, and Pol D [119]. Biochemical characterization of these gene products will contribute to research on the evolution of DNA polymerases in living organisms. A hypothesis that the archaeal ancestor of eukaryotes encoded three DNA polymerases, two distinct family B DNA polymerases and a family D DNA polymerase, which all contributed to the evolution of the eukaryotic replication machinery, consisting of Pol α, δ, and ε, has been proposed [120].

A protein is encoded in the plasmid pRN1 isolated from a *Sulfolobus* strain [121]. This protein, ORF904 (named RepA), has primase and DNA polymerase activities in the N-terminal domain and helicase activity in the C-terminal domain, and is likely to be essential for the replication of pRN1 [122, 123]. The amino acid sequence of the N-terminal domain lacks homology to any known DNA polymerases or primases, and therefore, family E is proposed. Similar proteins are encoded by various archaeal and bacterial plasmids, as well as by some bacterial viruses [124]. Recently, one protein, tn2-12p, encoded in the plasmid pTN2 isolated from *Thermococcus nautilus*, was experimentally identified as a DNA polymerase in this family [125]. This enzyme is likely responsible for the replication of the plasmids. Further investigations of this family of DNA polymerases will be interesting from an evolutionary perspective.

### **10. PCNA and RFC**

The sliding clamp with the doughnut-shaped ring structure is conserved among living organisms, and functions as a platform or scaffold for proteins to work on the DNA strands. The eukaryotic and archaeal PCNAs form a homotrimeric ring structure [126, 127], which encircles the DNA strand and anchors many important proteins involved in DNA replication and repair (Figure 4). PCNA works as a processivity factor that retains the DNA polymerase on the DNA by binding it on one surface (front side) of the ring for continuous DNA strand synthesis in DNA replication (Figure 5). To introduce the DNA strand into the central hole of the clamp ring, a clamp loader is required to interact with the clamp and open its ring. The archaeal and eukaryotic clamp loader is called RFC (Figure 5). The most studied archaeal PCNA and RFC molecules to date are *P. furiosus* PCNA [128-132] and RFC [133-136]. The PCNA and RFC molecules are essential for DNA polymerase to perform processive DNA synthesis. The molecular mechanism of the clamp loading process has been actively investigated [137] (Figure 5). An intermediate PCNA-RFC-DNA complex, in which the PCNA ring is opened in an out-of-plane mode, was detected by a single particle analysis of electron microscopic images using *P. furiosus* proteins (Figure 6) [138]. The crystal structure of the complex, including the ATP-bound clamp loader, the ring-opened clamp, and the template-primer DNA, using proteins from bacteriophage T4, has recently been published [139], and our knowledge about the clamp loading mechanism is continuously progressing.

**Figure 4.** PCNA-interacting proteins

After clamp loading, DNA polymerase accesses the clamp and the polymerase-clamp complex performs processive DNA synthesis. Therefore, structural and functional analyses of the DNA polymerase-PCNA complex are the next target to elucidate the overall mechanisms of replication fork progression. The PCNA interacting proteins contain a small conserved sequence motif, called the PIP box, which binds to a common site on PCNA [140]. The PIP box consists of the sequence "Qxxhxxaa", where "x" represents any amino acid, "h" represents a hydrophobic residue (e.g. L, I or M), and "a" represents an aromatic residue (e.g. F, Y or W). Archaeal DNA polymerases have PIP box-like motifs in their sequences [141]. However, only a few studies have experimentally investigated the function of the motifs. The crystal structure of *P. furiosus* Pol B complexed with a monomeric PCNA mutant was determined, and a convincing model of the polymerase-PCNA ring interaction was constructed [142]. This study revealed that a novel interaction is formed between a stretched loop of PCNA and the thumb domain of Pol B, in addition to the authentic PIP box. A comparison of the model structure with the previously reported structures of a family B DNA polymerase from RB69 phage, complexed with DNA [143, 144], suggested that the second interaction site plays a crucial role in switching between the polymerase and exonuclease modes, by inducing a PCNA-polymerase complex configuration that favors synthesis over editing. This putative mechanism for the fidelity control of replicative DNA polymerases is supported by experiments, in which mutations at the second interaction site enhanced the exonuclease activity in the presence of PCNA [144]. Furthermore, the three-dimensional structure of the DNA polymerase-PCNA-DNA ternary complex was analyzed by electron microscopic (EM) single particle analysis. This structural view revealed the entire domain configuration of the trimeric ring of PCNA and DNA polymerase, including the protein-protein or protein-DNA contacts. This architecture provides clearer insights into the switching mechanism between the editing and synthesis modes [145].
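The PIP box consensus "Qxxhxxaa" described above can be written as a simple sequence pattern. The following is only a minimal sketch for illustration, not a validated motif-search tool: the function name `find_pip_boxes` and the test sequences are hypothetical, and the sketch assumes that the example residues given in the text (L, I, M for "h"; F, Y, W for "a") are the complete residue classes, which real PIP box-like motifs may not respect.

```python
import re

# Consensus "Qxxhxxaa": Q, two arbitrary residues, one hydrophobic residue,
# two arbitrary residues, two aromatic residues.
# Assumption: "h" is limited to L/I/M and "a" to F/Y/W, the examples from the text.
PIP_BOX_PATTERN = r"Q..[LIM]..[FYW][FYW]"

# A lookahead is used so that overlapping candidate motifs are also reported.
_PIP_BOX_RE = re.compile(r"(?=(" + PIP_BOX_PATTERN + r"))")


def find_pip_boxes(protein_seq):
    """Return a list of (start, motif) tuples, with 0-based start positions."""
    seq = protein_seq.upper()
    return [(m.start(), m.group(1)) for m in _PIP_BOX_RE.finditer(seq)]


if __name__ == "__main__":
    # Made-up sequence for demonstration only; not a real protein.
    print(find_pip_boxes("MKQAKLDSFFGG"))
```

A strict consensus match of this kind would miss the "PIP box-like" variants mentioned in the text, so a hit or a miss from such a pattern is only a starting point for the experimental analyses described above.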

**Figure 5. Mechanisms of processive DNA synthesis** The clamp loader (RFC) tethers the clamp (PCNA) onto the primer terminus of the DNA strand. The clamp loader is then replaced by DNA polymerase, which can synthesize the DNA strand processively without falling off.

In contrast to most euryarchaeal organisms, which have a single PCNA homolog forming a homotrimeric ring structure, the majority of crenarchaea have multiple PCNA homologs, which are capable of forming heterotrimeric rings for their functions [146, 147]. It is especially interesting that the three PCNAs, PCNA1, PCNA2, and PCNA3, specifically bind PCNA binding proteins, including DNA polymerases, DNA ligases, and FEN-1 endonuclease [147, 148]. Detailed structural studies of the heterotrimeric PCNA from *S. solfataricus* revealed that the interaction modes between the subunits are conserved with those of the homotrimeric PCNAs [149, 150].

*T. kodakarensis* is the only euryarchaeal species that has two genes encoding PCNA homologs on the genome [38]. These two genes from the *T. kodakarensis* genome, and the highly purified gene products, PCNA1 and PCNA2, were characterized [151]. PCNA1 stimulated the DNA synthesis reactions of the two DNA polymerases, Pol B and Pol D, from *T. kodakarensis in vitro*. PCNA2, however, had an effect only on Pol B. A *T. kodakarensis* strain with a *pcna2* disruption was isolated, whereas gene disruption of *pcna1* was not possible. These results suggested that PCNA1 is essential for DNA replication, and that PCNA2 may play a different role in *T. kodakarensis* cells. The sensitivities of the Δ*pcna2* mutant strain to ultraviolet irradiation (UV), methyl methanesulfonate (MMS) and mitomycin C (MMC) were indistinguishable from those of the wild type strain. Both PCNA1 and PCNA2 form a stable ring structure and work as a processivity factor for *T. kodakarensis* Pol B *in vitro*. The crystal structures of the two PCNAs revealed different interactions at the subunit-subunit interfaces [152].

**Figure 6. Electron Microscopic Analysis of** *P. furiosus* **DNA polymerase-PCNA-DNA complex.** The DNA polymerase-PCNA-DNA ternary complex is shown in the editing mode. (A) The electron microscopic (EM) map of the complex is depicted as a gray surface. DNA polymerase and PCNA are shown in a ribbon representation colored purple and blue, respectively. The DNA is colored white. The exonuclease active site is shown as a green ribbon. (B) Schematic view of the complex.

The RFC molecule is conserved as a pentameric complex in Eukaryota and Archaea. However, the eukaryotic RFC is a heteropentameric complex, consisting of five different proteins, RFC1 to 5, in which RFC1 is larger than the other four RFCs. On the other hand, the archaeal RFC consists of two proteins, RFCS (small) and RFCL (large), in a 4 to 1 ratio. A different form of RFC, consisting of three subunits, RFCS1, RFCS2, and RFCL, in a 3 to 1 to 1 ratio, was also identified from *M. acetivorans* [153]. The three subunits of RFC may represent an intermediate stage in the evolution of the more complex RFC in Eukaryota from the less complex RFC in Archaea [153, 154]. The subunit organization and the spatial distribution of the subunits in the *M. acetivorans* RFC complex were analyzed and compared with those of the *E. coli* γ-complex, which is also a pentamer consisting of three different proteins. These two clamp loaders adopt similar subunit organizations and spatial distributions, but the functions of the individual subunits are likely to be diverse [154].

### **11. DNA ligase**

DNA ligase is essential to connect the Okazaki fragments of the discontinuous strand synthesis during DNA replication, and therefore, it universally exists in all living organisms. This enzyme catalyzes phosphodiester bond formation via three nucleotidyl transfer steps [155, 156]. In the first step, DNA ligase forms a covalent enzyme-AMP intermediate, by reacting with ATP or NAD+ as a cofactor. In the second step, DNA ligase recognizes the substrate DNA, and the AMP is subsequently transferred from the ligase to the 5'-phosphate terminus of the DNA, to form a DNA-adenylate intermediate (AppDNA). In the final step, the 5'-AppDNA is attacked by the adjacent 3'-hydroxy group of the DNA and a phosphodiester bond is formed. DNA ligases are grouped into two families, according to their requirement for ATP or NAD+ as a nucleotide cofactor in the first step reaction. ATP-dependent DNA ligases are widely found in all three domains of life, whereas NAD+-dependent DNA ligases exist mostly in Bacteria. Some halophilic archaea [157] and eukaryotic viruses [158] also have NAD+-dependent enzymes.
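The three nucleotidyl transfer steps described above can be summarized as a reaction scheme. This is a schematic sketch only: the symbols E (the ligase), PP_i (pyrophosphate), and NMN (nicotinamide mononucleotide) are notation introduced here for illustration, and the leaving groups of the first step are standard biochemistry rather than details stated in the text.

```latex
\begin{aligned}
\text{Step 1 (ATP-dependent):}\quad & \mathrm{E} + \mathrm{ATP} \longrightarrow \mathrm{E\text{-}AMP} + \mathrm{PP_i} \\
\text{Step 1 (NAD}^{+}\text{-dependent):}\quad & \mathrm{E} + \mathrm{NAD^{+}} \longrightarrow \mathrm{E\text{-}AMP} + \mathrm{NMN} \\
\text{Step 2:}\quad & \mathrm{E\text{-}AMP} + 5'\text{-}\mathrm{P\text{-}DNA} \longrightarrow \mathrm{E} + \mathrm{AppDNA} \\
\text{Step 3:}\quad & \mathrm{DNA\text{-}3'\text{-}OH} + \mathrm{AppDNA} \longrightarrow \mathrm{DNA\text{-}(phosphodiester)\text{-}DNA} + \mathrm{AMP}
\end{aligned}
```

Only the first step differs between the two ligase families, which is why the cofactor requirement in that step is used for classification.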

Three genes (*LIG1*, *LIG3* and *LIG4*) encoding ATP-dependent DNA ligases have been identified in the human genome to date, and DNA ligase I (Lig I), encoded by *LIG1*, is a replicative enzyme that joins Okazaki fragments during DNA replication [156]. The first gene encoding a eukaryotic-like ATP-dependent DNA ligase was found in the thermophilic archaeon, *Desulfurolobus ambivalens* [159]. Subsequent identifications of the DNA ligases from archaeal organisms revealed that these enzymes primarily use ATP as a cofactor. However, this classification may not be so strict. The utilization of NAD+, as well as ATP, as a cofactor has been observed in several DNA ligases, including those from *T. kodakarensis* [160], *T. fumicolans, P. abyssi* [161], *Thermococcus* sp. NA1 [162], *T. acidophilum, Picrophilus torridus,* and *Ferroplasma acidophilum,* although ATP is evidently preferable in all of the cases [163] (Table 3). The dual co-factor specificity (ATP/NAD+) is an interesting feature of these DNA ligase enzymes and it will be enlightening to investigate the structural basis for this. Another dual co-factor specificity exists in the archaeal DNA ligases, which use ADP as well as ATP, as found in the enzymes from *A. pernix* [164] and *Staphylothermus marinus* [165], and in the case of *Sulfophobococcus zilligii,* GTP is also a functional cofactor [166]. The DNA ligases from *P. horikoshii* [167] and *P. furiosus* [168] have a strict ATP preference (Table 3). Sufficient biochemical data have not been obtained to resolve the issue of dual co-factor specificity, and further biochemical and structural analyses are required.


The crystal structure of *P. furiosus* DNA ligase [169] was solved and the physical and functional interactions between the DNA ligase and PCNA were shown [168]. The detailed interaction mode between human Lig I and PCNA is somewhat unclear, because of several controversial reports [170-172]. The stimulatory effect of *P. furiosus* PCNA on the enzyme activity of the cognate DNA ligase was observed at a high salt concentration, at which the DNA ligase alone cannot bind to a nicked DNA substrate. Interestingly, the PCNA-binding site is located in the middle of the N-terminal DNA binding domain (DBD) of the *P. furiosus* DNA ligase, and the binding motif, QKSFF, which is proposed as a shorter version of the PIP box, is actually looped out from the protein surface [168]. This motif is thus located in the middle of the protein chain, rather than in the N- or C-terminal region, where the PIP boxes are usually located. To confirm that this motif is conserved in the archaeal/eukaryotic DNA ligases, the physical and functional interactions between *A. pernix* DNA ligase and PCNA were analyzed, and the interaction was shown to depend mainly on the phenylalanine 132 residue, which is located in the region predicted from the multiple sequence alignment of the ATP-dependent DNA ligases [173].

| ATP | ATP and ADP | ATP and NAD+ | ATP, ADP, and GTP |
|---|---|---|---|
| *Acidithiobacillus ferrooxidans* | *Aeropyrum pernix* | *Ferroplasma acidophilum* | *Sulfophobococcus zilligii* |
| *Ferroplasma acidarmanus* | *Staphylothermus marinus* | *Picrophilus torridus* | |
| *Methanothermobacterium thermoautotrophicum* | | *Thermococcus fumicolans* | |
| *Pyrococcus furiosus* | | *Thermococcus kodakarensis* | |
| *Pyrococcus horikoshii* | | *Thermococcus* sp. | |
| *Sulfolobus acidocaldarius* | | *Thermococcus* sp. 1519 | |
| *Sulfolobus shibatae* | | *Thermoplasma acidophilum* | |

**Table 3.** Cofactor dependency of the archaeal DNA ligases

The crystal structure of the human Lig I, complexed with DNA, was solved as the first structure of an ATP-dependent mammalian DNA ligase, although the ligase was an N-terminal truncated form [174]. The structure comprises the N-terminal DNA binding domain, the middle adenylation domain, and the C-terminal OB-fold domain. The crystal structure of Lig I (residues 233 to 919) in complex with a nicked, 5'-adenylated DNA intermediate revealed that the enzyme redirects the path of the dsDNA, to expose the nick termini for the strand-joining reaction. The N-terminal DNA-binding domain works to encircle the DNA substrate like PCNA and to stabilize the DNA in a distorted structure, positioning the catalytic core on the nick. The crystal structure of the full length DNA ligase from *P. furiosus* revealed that the architecture of each domain resembles those of Lig I, but the domain arrangements differ strikingly between the two enzymes [168]. This domain rearrangement is probably derived from the "domain-connecting" role of the helical extension conserved at the C-termini in the archaeal and eukaryotic DNA ligases. The DNA substrate in the open form of Lig I is replaced by motif VI at the C-terminus, in the closed form of *P. furiosus* DNA ligase. Both the shapes and electrostatic distributions are similar between motif VI and the DNA substrate, suggesting that motif VI in the closed state mimics the incoming substrate DNA. The subsequently solved crystal structure of *S. solfataricus* DNA ligase is the fully open structure, in which the three domains are highly extended [175]. In this work, the *S. solfataricus* ligase-PCNA complex was also analyzed by SAXS. *S. solfataricus* DNA ligase bound to the PCNA ring still retains an open, extended conformation.

The closed, ring-shaped conformation observed in the Lig I structure, as described above, is probably the active form to catalyze a DNA end-joining reaction. Therefore, it is proposed that the open-to-closed movement occurs for ligation, and that the conformational switch is accommodated by a malleable interface with PCNA, which serves as an efficient platform for DNA ligation [175]. After the publication of these crystal structures, the three-dimensional structure of the ternary complex, consisting of DNA ligase-PCNA-DNA, was obtained by EM single particle analysis using the *P. furiosus* proteins [176]. In the complex structure, the three domains of the crescent-shaped *P. furiosus* DNA ligase surround the central DNA duplex, encircled by the closed PCNA ring. The relative orientations of the ligase domains differ remarkably from those of the crystal structures, and therefore, a large domain rearrangement occurs upon ternary complex formation. In the EM image model, the DNA ligase contacts PCNA at two sites, the conventional PIP box and a novel second contact in the middle adenylation domain. It is also interesting that a substantial DNA tilt from the PCNA ring axis is observed. Based on these structural analyses, a mechanism was proposed in which the PCNA binding proteins are bound and released sequentially. In fact, most of the PCNA binding proteins share the same binding sites in the interdomain connecting loop (IDCL) and the C-terminal tail of the PCNA. The structural features exclude the possibility that the three proteins contact the single PCNA ring simultaneously, because DNA ligase occupies two of the three subunits of the PCNA trimer. In the case of the RFC-PCNA-DNA complex structure obtained by the same EM technique, RFC entirely covers the PCNA ring, thus blocking the access of other proteins [138]. These ternary complexes appear to favor a mechanism involving the sequential binding and release of replication factors.

### **12. Flap endonuclease 1 (FEN1)**

Efficient processing of Okazaki fragments to make a continuous DNA strand is essential for the lagging strand synthesis in asymmetric DNA replication. The primase-synthesized RNA/DNA primers need to be removed to join the Okazaki fragments into an intact continuous strand DNA. Flap endonuclease 1 (FEN1) is mainly responsible for this task. Okazaki fragment maturation is highly coordinated with continuous DNA synthesis, and the interactions of DNA polymerase, FEN1, and DNA ligase with PCNA allow these enzymes to act sequentially during the maturation process, as described above.

FEN1, a structure-specific 5'-endonuclease, specifically recognizes a dsDNA with an unannealed 5'-flap [177, 178]. In the eukaryotic Okazaki fragment processing system, 5'-flap DNA structures are formed by the strand displacement activity of DNA polymerase δ. Lig I seals the nick after the flapped DNA is cleaved by FEN1. These processing steps are facilitated by PCNA [179]. The interactions between eukaryotic FEN1 and PCNA have been well characterized [140, 171], and the stimulatory effect of PCNA on the FEN1 activity was also shown [180]. The crystal structure of the human FEN1-PCNA complex revealed three FEN1 molecules, one bound to each PCNA subunit of the trimeric ring, in different configurations [181]. Based on these structural analyses, together with the description in the DNA ligase section, a flip-flop transition mechanism, which enables proteins to internally switch for different functions on the same DNA clamp, is currently being considered.

Homologs of eukaryotic FEN1 were found in Archaea [182]. The crystal structures of FEN1 from *M. jannaschii* [183], *P. furiosus* [184], *P. horikoshii* [185], *A. fulgidus* [186], and *S. solfataricus* [150] have been determined. In addition, detailed biochemical studies were performed on *P. horikoshii* FEN1 [187, 188]. Thus, studies of the archaeal FEN1 proteins have provided important insights into the structural basis of the cleavage reaction of the flapped DNA. Our recent research showed that the flap endonuclease activity of *P. furiosus* FEN1 was stimulated by PCNA. Furthermore, the stimulatory effect of PCNA on the sequential action of FEN1 and DNA ligase was observed *in vitro* (Kiyonari et al., unpublished). Based on these results, a model of the molecular switching mechanisms of the last steps of Okazaki fragment maturation was constructed. The quaternary complex of FEN1-Lig-PCNA-DNA was also isolated for EM single particle analysis. These studies will provide a more concrete image of the molecular mechanism.

### **13. Summary and perspectives**

Research on the molecular mechanism of DNA replication has been a central theme of molecular biology. Archaeal organisms became popular in the total genome sequencing age, as described above, and most of the DNA replication proteins have now been characterized biochemically. In addition, archaeal studies are especially interesting for understanding the mechanisms by which cells live in extreme environmental conditions. Furthermore, it is also noteworthy that the proteins from the hyperthermophilic archaea are more stable than those from mesophilic organisms, and they are advantageous for the structural and functional analyses of higher-ordered complexes, such as the replisome. Studies on the higher-ordered complexes, rather than single proteins, are essential for understanding each of the events involved in DNA metabolism, and archaeal research will continue to contribute to the development and advancement of the DNA replication research field, as summarized in part in recent reviews [189, 190].

In addition to basic molecular biology research, DNA replication proteins from thermophiles have been quite useful reagents for gene manipulations, including genetic diagnosis, forensic DNA typing, and detection of bacterial and virus infections, as well as basic research. Numerous enzymes have been commercialized around the world, and are utilized daily. An example of the successful engineering of an archaeal DNA polymerase for PCR is the creation of the fusion protein between *P. furiosus* Pol B and a nonspecific dsDNA binding protein, Sso7d, from *S. solfataricus,* by genetic engineering techniques [191]. The fusion DNA polymerase overcame the low processivity of the wild type Pol B through the high affinity of Sso7d for the DNA strand. As another example, we successfully developed a novel processive PCR method, using the archaeal Pol B with the help of a mutant PCNA [192, 193]. Several DNA sequencing technologies, referred to as "next-generation sequencing", have been developed [194, 195], and are now commercially available. Single-molecule detection, using dye-labeled modified nucleotides and longer read lengths, is now known as "third-generation DNA sequencing" [196]. These technologies apply DNA polymerases or DNA ligases from various sources, indicating that these DNA replication enzymes are indispensable for the development of DNA manipulation technology. These facts show that progress in basic research on the molecular biology of archaeal DNA replication will promote the development of new technologies for genetic engineering.

#### **Author details**

Yoshizumi Ishino and Sonoko Ishino

Department of Bioscience and Biotechnology, Graduate School of Bioresource and Bioenvironmental Sciences, Kyushu University, Japan

### **References**


[1] Watson JD, Crick FH. Molecular structure of nucleic acids; a structure for deoxyribose nucleic acid. Nature, 1953;171(4356) 737-738.

[2] Kelman Z, White MF. Archaeal DNA replication and repair. Curr Opin Microbiol, 2005;(6) 669-676.

[3] Barry ER, Bell SD. DNA replication in the archaea. Microbiol Mol Biol Rev, 2006;70(4) 876-887.

[4] Wigley DB. ORC proteins: marking the start. Curr Opin Struct Biol, 2009;19(1) 72-78.

[5] Leipe DD, Aravind L, Koonin EV. Did DNA replication evolve twice independently? Nucleic Acids Res, 1999;27(17) 3389-3401.

[6] Jacob F, Brenner S. C R Hebd Seances Acad Sci, 1963;256 298-300.

[7] Matsunaga F, Forterre P, Ishino Y, Myllykallio H. In vivo interactions of archaeal Cdc6/Orc1 and minichromosome maintenance proteins with the replication origin. Proc Natl Acad Sci USA, 2001;98(20) 11152-11157.

[8] Uemori T, Sato Y, Kato I, Doi H, Ishino Y. A novel DNA polymerase in the hyperthermophilic archaeon, *Pyrococcus furiosus*: gene cloning, expression, and characterization. Genes Cells, 1997;2(8) 499-512.

[9] Lopez P, Philippe H, Myllykallio H, Forterre P. Identification of putative chromosomal origins of replication in Archaea. Mol Microbiol, 1999;32(4) 883-886.

[10] Matsunaga F, Norais C, Forterre P, Myllykallio H. Identification of short 'eukaryotic' Okazaki fragments synthesized from a prokaryotic replication origin. EMBO Rep, 2003;4(2) 154-158.

[11] Robinson NP, Dionne I, Lundgren M, Marsh VL, Bernander R, Bell SD. Identification of two origins of replication in the single chromosome of the archaeon *Sulfolobus solfataricus*. Cell, 2004;116(1) 25-38.

[12] Matsunaga F, Glatigny A, Mucchielli-Giorgi MH, Agier N, Delacroix H, Marisa L, et al. Genomewide and Biochemical Analyses of DNA-binding activity of Cdc6/Orc1 and Mcm proteins in *Pyrococcus* sp. Nucleic Acids Res, 2007;35(10) 3214-3222.

[13] Lundgren M, Andersson A, Chen L, Nilsson P, Bernander R. Three replication origins in *Sulfolobus* species: synchronous initiation of chromosome replication and asynchronous termination. Proc Natl Acad Sci USA, 2004;101(18) 7046-7051.


[27] Chong JP, Hayashi MK, Simon MN, Xu RM, Stillman B. A double-hexamer archaeal minichromosome maintenance protein is an ATP-dependent DNA helicase. Proc Natl Acad Sci USA, 2000;97(4) 1530-1535.

[28] Grainge I, Scaife S, Wigley DB. Biochemical analysis of components of the pre-replication complex of *Archaeoglobus fulgidus.* Nucleic Acids Res, 2003;31(16) 4888-4898.

[29] Kelman Z, Lee JK, Hurwitz J. The single minichromosome maintenance protein of *Methanobacterium thermoautotrophicum* ΔH contains DNA helicase activity. Proc Natl Acad Sci USA, 1999;96(26) 14783-14788.

[30] Shechter DF, Ying CY, Gautier J. The intrinsic DNA helicase activity of *Methanobacterium thermoautotrophicum* ΔH minichromosome maintenance protein. J Biol Chem, 2000;275(20) 15049-15059.

[31] Yoshimochi T, Fujikane R, Kawanami M, Matsunaga F, Ishino Y. The GINS complex from *Pyrococcus furiosus* stimulates the MCM helicase activity. J Biol Chem, 2008;283(3) 1601-1609.

[32] Sakakibara N, Kelman LM, Kelman Z. Unwinding the structure and function of the archaeal MCM helicase. Mol Microbiol, 2009;72(2) 286-296.

[33] Shin JH, Grabowski B, Kasiviswanathan R, Bell SD, Kelman Z. Regulation of minichromosome maintenance helicase activity by Cdc6. J Biol Chem, 2003;278(39) 38059-38067.

[34] Kasiviswanathan R, Shin JH, Kelman Z. Interactions between the archaeal Cdc6 and MCM proteins modulate their biochemical properties. Nucleic Acids Res, 2005;33(15) 4940-4950.

[35] De Felice M, Esposito L, Pucci B, Carpentieri F, De Falco M, Rossi M, et al. Biochemical characterization of a CDC6-like protein from the crenarchaeon *Sulfolobus solfataricus*. J Biol Chem, 2003;278(47) 46424-46431.

[36] Haugland GT, Shin JH, Birkeland NK, Kelman Z. Stimulation of MCM helicase activity by a Cdc6 protein in the archaeon *Thermoplasma acidophilum*. Nucleic Acids Res, 2006;34(21) 6337-6344.

[37] Krupovic M, Gribaldo S, Bamford DH, Forterre P. The evolutionary history of archaeal MCM helicases: a case study of vertical evolution combined with hitchhiking of mobile genetic elements. Mol Biol Evol, 2010;27(12) 2716-2732.

[38] Fukui T, Atomi H, Kanai T, Matsumi R, Fujiwara S, Imanaka T. Complete genome sequence of the hyperthermophilic archaeon *Thermococcus kodakarensis* KOD1 and comparison with *Pyrococcus* genomes. Genome Res, 2005;15(3) 352-363.

[39] Sato T, Fukui T, Atomi H, Imanaka T. Targeted gene disruption by homologous recombination in the hyperthermophilic archaeon *Thermococcus kodakaraensis* KOD1. J Bacteriol, 2003;185(1) 210-220.

[40] Sato T, Fukui T, Atomi H, Imanaka T. Improved and versatile transformation system allowing multiple genetic manipulations of the hyperthermophilic archaeon *Thermococcus kodakaraensis*. Appl Environ Microbiol, 2005;71(7) 3889-3899.


[53] Oyama T, Ishino S, Fujino S, Ogino H, Shirai T, Mayanagi K, et al. Architectures of archaeal GINS complexes, essential DNA replication initiation factors. BMC Biol, 2011;9 28.

[54] Kamada K, Kubota Y, Arata T, Shindo Y, Hanaoka F. Structure of the human GINS complex and its assembly and functional interface in replication initiation. Nat Struct Mol Biol, 2007;14(5) 388-396.

[55] Choi JM, Lim HS, Kim JJ, Song OK, Cho Y. Crystal structure of the human GINS complex. Genes Dev, 2007;21(11) 1316-1321.

[56] Chang YP, Wang G, Bermudez V, Hurwitz J, Chen XS, et al. Crystal structure of the GINS complex and functional insight into its role in DNA replication. Proc Natl Acad Sci USA, 2007;104(31) 12685-12690.

[57] Sanchez-Pulido L, Ponting CP. Cdc45: the missing RecJ ortholog in eukaryotes? Bioinformatics, 2011;27(14) 1885-1888.

[58] Li Z, Pan M, Santangelo TJ, Chemnitz W, Yuan W, Edwards JL, et al. A novel DNA nuclease is stimulated by association with the GINS complex. Nucleic Acids Res, 2011;39(14) 6114-6123.

[59] Krastanova I, Sannino V, Amenitsch H, Gileadi O, Pisani FM, Onesti S, et al. Structural and functional insights into the DNA replication factor Cdc45 reveal an evolutionary relationship to the DHH family of phosphoesterases. J Biol Chem, 2011;287(6) 4121-4128.

[60] Kirk BW, Kuchta RD. Arg304 of human DNA primase is a key contributor to catalysis and NTP binding: primase and the family X polymerases share significant sequence homology. Biochemistry, 1999;38(31) 7727-7736.

[61] Desogus G, Onesti S, Brick P, Rossi M, Pisani FM. Identification and characterization of a DNA primase from the hyperthermophilic archaeon *Methanococcus jannaschii.* Nucleic Acids Res, 1999;27(22) 4444-4450.

[62] Bocquier A, Liu L, Cann I, Komori K, Kohda D, Ishino Y. Archaeal primase: bridging the gap between RNA and DNA polymerase. Curr Biol, 2001;11(6) 452-456.

[63] Liu L, Komori K, Ishino S, Bocquier AA, Cann IK, Kohda D, et al. The archaeal DNA primase: biochemical characterization of the p41-p46 complex from *Pyrococcus furiosus.* J Biol Chem, 2001;276(48) 45484-45490.

[64] Lao-Sirieix SH, Bell SD. The heterodimeric primase of the hyperthermophilic archaeon *Sulfolobus solfataricus* possesses DNA and RNA primase, polymerase and 3'-terminal nucleotidyl transferase activities. J Mol Biol, 2004;344(5) 1251-1263.

[65] Lao-Sirieix SH, Nookala RK, Roversi P, Bell SD, Pellegrini L, et al. Structure of the heterodimeric core primase. Nat Struct Mol Biol, 2005;12(12) 1137-1144.

[66] De Falco M, Fusco A, De Felice M, Rossi M, Pisani FM. The DNA primase is activated by substrates containing a thymine-rich bubble and has a 3'-terminal nucleotidyltransferase activity. Nucleic Acids Res, 2004;32(17) 5223-5230.


[79] Bochkarev A, Bochkareva E, Frappier L, Edwards AM. The crystal structure of the complex of replication protein A subunits RPA32 and RPA14 reveals a mechanism for single-stranded DNA binding. EMBO J, 1999;18(16) 4498-4504.

[80] Bochkareva E, Frappier L, Edwards AM, Bochkarev A. The RPA32 subunit of human replication protein A contains a single-stranded DNA-binding domain. J Biol Chem, 1998;273(7) 3932-3936.

[81] Philipova D, Mullen JR, Maniar HS, Lu J, Gu C, Brill SJ, et al. A hierarchy of SSB protomers in replication protein A. Genes Dev, 1996;10(17) 2222-2233.

[82] Gomes XV, Wold MS. Structural analysis of human replication protein A. Mapping functional domains of the 70-kDa subunit. J Biol Chem, 1995;270(9) 4534-4543.

[83] Mer G, Bochkarev A, Gupta R, Bochkareva E, Frappier L, Ingles CJ. Structural basis for the recognition of DNA repair proteins UNG2, XPA, and RAD52 by replication factor RPA. Cell, 2000;103(3) 449-456.

[84] Dornreiter I, Erdile LF, Gilbert IU, von Winkler D, Kelly TJ, Fanning E. Interaction of DNA polymerase alpha-primase with cellular replication protein A and SV40 T antigen. EMBO J, 1992;11(2) 769-776.

[85] Tsurimoto T, Stillman B. Multiple replication factors augment DNA synthesis by the two eukaryotic DNA polymerases, alpha and delta. EMBO J, 1989;8(12) 3883-3889.

[86] Kenny MK, Lee SH, Hurwitz J. Multiple functions of human single-stranded-DNA binding protein in simian virus 40 DNA replication: single-strand stabilization and stimulation of DNA polymerases alpha and delta. Proc Natl Acad Sci USA, 1989;86(24) 9757-9761.

[87] Chedin F, Seitz EM, Kowalczykowski SC. Novel homologs of replication protein A in archaea: implications for the evolution of ssDNA-binding proteins. Trends Biochem Sci, 1998;23(8) 273-277.

[88] Kelly TJ, Simancek P, Brush GS. Identification and characterization of a single-stranded DNA-binding protein from the archaeon *Methanococcus jannaschii.* Proc Natl Acad Sci USA, 1998;95(25) 14634-14639.

[89] Kelman Z, Pietrokovski S, Hurwitz J. Isolation and characterization of a split B-type DNA polymerase from the archaeon *Methanobacterium thermoautotrophicum* ΔH. J Biol Chem, 1999;274(40) 28751-28761.

[90] Komori K, Ishino Y. Replication protein A in *Pyrococcus furiosus* is involved in homologous DNA recombination. J Biol Chem, 2001;276(28) 25654-25660.

[91] Kerr ID, Wadsworth RI, Cubeddu L, Blankenfeldt W, Naismith JH, White MF. Insights into ssDNA recognition by the OB fold from a structural and thermodynamic study of *Sulfolobus* SSB protein. EMBO J, 2003;22(11) 2561-2570.

[92] Robbins JB, Murphy MC, White BA, Mackie RI, Ha T, Cann IK, et al. Functional analysis of multiple single-stranded DNA-binding proteins from *Methanosarcina acetivorans* and their effects on DNA synthesis by DNA polymerase BI. J Biol Chem, 2004;279(8) 6315-6326.

[93] Lin Y, Lin LJ, Sriratana P, Coleman K, Ha T, Spies M, et al. Engineering of functional replication protein A homologs based on insights into the evolution of oligonucleotide/oligosaccharide-binding folds. J Bacteriol, 2008;190(17) 5766-5780.


[107] Imamura M, Uemori T, Kato I, Ishino Y. A non-α-like DNA polymerase from the hyperthermophilic archaeon *Pyrococcus furiosus.* Biol Pharm Bull, 1995;18(12) 1647-1652.

[108] Ishino Y, Ishino S. Novel DNA polymerases from Euryarchaeota. Meth Enzymol, 2001;334 249-260.

[109] Gueguen Y, Rolland JL, Lecompte O, Azam P, Le Romancer G, Flament D, et al. Characterization of two DNA polymerases from the hyperthermophilic euryarchaeon *Pyrococcus abyssi.* Eur J Biochem, 2001;268(22) 5961-5969.

[110] Shen Y, Musti K, Hiramoto M, Kikuchi H, Kawarabayashi Y, Matsui I. Invariant Asp-1122 and Asp-1124 are essential residues for polymerization catalysis of family D DNA polymerase from *Pyrococcus horikoshii.* J Biol Chem, 2001;276(29) 27376-27383.

[111] Shen Y, Tang X, Matsui I. Subunit interaction and regulation of activity through terminal domains of the family D DNA polymerase from *Pyrococcus horikoshii.* J Biol Chem, 2003;278(23) 21247-21257.

[112] Tang XF, Shen Y, Matsui E, Matsui I. Domain topology of the DNA polymerase D complex from a hyperthermophilic archaeon *Pyrococcus horikoshii.* Biochemistry, 2004;43(37) 11818-11827.

[113] Jokela M, Eskelinen A, Pospiech H, Rouvinen J, Syväoja JE. Characterization of the 3' exonuclease subunit DP1 of *Methanococcus jannaschii* replicative DNA polymerase D. Nucleic Acids Res, 2004;32(8) 2430-2440.

[114] Henneke G, Flament D, Hübscher U, Querellou J, Raffin JP. The hyperthermophilic euryarchaeota *Pyrococcus abyssi* likely requires the two DNA polymerases D and B for DNA replication. J Mol Biol, 2005;350(1) 53-64.

[115] Castrec B, Laurent S, Henneke G, Flament D, Raffin JP. The glycine-rich motif of *Pyrococcus abyssi* DNA polymerase D is critical for protein stability. J Mol Biol, 2010;396(4) 840-848.

[116] Berquist BR, DasSarma P, DasSarma S. Essential and non-essential DNA replication genes in the model halophilic Archaeon, *Halobacterium* sp. NRC-1. BMC Genet, 2007;8 31.

[117] Brochier-Armanet C, Boussau B, Gribaldo S, Forterre P. Mesophilic Crenarchaeota: proposal for a third archaeal phylum, the Thaumarchaeota. Nat Rev Microbiol, 2008;6(3) 245-252.

[118] Nunoura T, Takaki Y, Kakuta J, Nishi S, Sugahara J, Kazama H, et al. Insights into the evolution of Archaea and eukaryotic protein modifier systems revealed by the genome of a novel archaeal group. Nucleic Acids Res, 2011;39(8) 3204-3223.

[119] Elkins JG, Podar M, Graham DE, Makarova KS, Wolf Y, Randau L, et al. A korarchaeal genome reveals insights into the evolution of the Archaea. Proc Natl Acad Sci USA, 2008;105(23) 8102-8107.

[120] Tahirov TH, Makarova KS, Rogozin IB, Pavlov YI, Koonin EV. Evolution of DNA polymerases: an inactivated polymerase-exonuclease module in Pol ε and a chimeric origin of eukaryotic polymerases from two classes of archaeal ancestors. Biol Direct, 2009;4 11.


[134] Mayanagi K, Miyata T, Oyama T, Ishino Y, Morikawa K. Three-dimensional electron microscopy of clamp loader small subunit from *Pyrococcus furiosus.* J Struct Biol, 2001;134(1) 35-45.

[135] Ishino S, Oyama T, Yuasa M, Morikawa K, Ishino Y. Mutational Analysis of *Pyrococcus furiosus* Replication Factor C based on the Three-Dimensional Structure. Extremophiles, 2003;7(3) 169-175.

[136] Oyama T, Ishino Y, Cann I, Ishino S, Morikawa K. Atomic structure of the clamp loader small subunit from *Pyrococcus furiosus*. Mol Cell, 2001;8(2) 455-463.

[137] Indiani C, O'Donnell M. The replication clamp-loading machine at work in the three domains of life. Nat Rev Mol Cell Biol, 2006;7(10) 751-761.

[138] Miyata T, Suzuki H, Oyama T, Mayanagi K, Ishino Y, Morikawa K. Open clamp structure in the clamp-loading complex visualized by electron microscopic image analysis. Proc Natl Acad Sci USA, 2005;102(39) 13795-13800.

[139] Kelch BA, Makino DL, O'Donnell M, Kuriyan J. How a DNA polymerase clamp loader opens a sliding clamp. Science, 2011;334(6063) 1675-1680.

[140] Warbrick EM. The puzzle of PCNA's many partners. BioEssays, 2000;22(11) 997-1006.

[141] Vivona JB, Kelman Z. The diverse spectrum of sliding clamp interacting proteins. FEBS Lett, 2003;546(2-3) 167-172.

[142] Nishida H, Mayanagi K, Kiyonari S, Sato Y, Oyama T, Ishino Y, et al. Structural determinant for switching between the polymerase and exonuclease modes in the PCNA-replicative DNA polymerase complex. Proc Natl Acad Sci USA, 2009;106(49) 20693-20698.

[143] Shamoo Y, Steitz TA. Building a replisome from interacting pieces: sliding clamp complexed to a peptide from DNA polymerase and a polymerase editing complex. Cell, 1999;99(2) 155-166.

[144] Franklin MC, Wang J, Steitz TA. Structure of the replicating complex of a pol α family DNA polymerase. Cell, 2001;105(5) 657-667.

[145] Mayanagi K, Kiyonari S, Nishida H, Saito M, Kohda D, Ishino Y, et al. Architecture of the DNA polymerase B-proliferating cell nuclear antigen (PCNA)-DNA ternary complex. Proc Natl Acad Sci USA, 2011;108(5) 1845-1849.

[146] Daimon K, Kawarabayasi Y, Kikuchi H, Sako Y, Ishino Y. Three proliferating cell nuclear antigen-like proteins found in the hyperthermophilic archaeon *Aeropyrum pernix*: interactions with the two DNA polymerases. J Bacteriol, 2002;184(3) 687-694.

[147] Dionne I, Nookala RK, Jackson SP, Doherty AJ, Bell SD. A heterotrimeric PCNA in the hyperthermophilic archaeon *Sulfolobus solfataricus.* Mol Cell, 2003;11(1) 275-282.

[148] Imamura K, Fukunaga K, Kawarabayasi Y, Ishino Y. Specific interactions of three proliferating cell nuclear antigens with replication-related proteins in *Aeropyrum pernix.* Mol Microbiol, 2007;64(2) 308-318.


[162] Kim YJ, Lee HS, Bae SS, Jeon JH, Yang SH, Lim JK, et al. Cloning, expression, and characterization of a DNA ligase from a hyperthermophilic archaeon *Thermococcus* sp. Biotechnol Lett, 2006;28(6) 401-407.

[163] Ferrer M, Golyshina OV, Beloqui A, Böttger LH, Andreu JM, Polaina J, et al. A purple acidophilic di-ferric DNA ligase from *Ferroplasma*. Proc Natl Acad Sci USA, 2008;105(26) 8878-8883.

[164] Jeon SJ, Ishikawa K. A novel ADP-dependent DNA ligase from *Aeropyrum pernix* K1. FEBS Lett, 2003;550(1-3) 69-73.

[165] Seo MS, Kim YJ, Choi JJ, Lee MS, Kim JH, Lee JH, et al. Cloning and expression of a DNA ligase from the hyperthermophilic archaeon *Staphylothermus marinus* and properties of the enzyme. J Biotechnol, 2007;128(3) 519-530.

[166] Sun Y, Seo MS, Kim JH, Kim YJ, Kim GA, Lee JI, et al. Novel DNA ligase with broad nucleotide cofactor specificity from the hyperthermophilic crenarchaeon *Sulfophobococcus zilligii*: influence of ancestral DNA ligase on cofactor utilization. Environ Microbiol, 2008;10(12) 3212-3224.

[167] Keppetipola N, Shuman S. Characterization of a thermophilic ATP-dependent DNA ligase from the euryarchaeon *Pyrococcus horikoshii*. J Bacteriol, 2005;187(20) 6902-6908.

[168] Kiyonari S, Takayama K, Nishida H, Ishino Y. Identification of a novel binding motif in *Pyrococcus furiosus* DNA ligase for the functional interaction with proliferating cell nuclear antigen. J Biol Chem, 2006;281(38) 28023-28032.

[169] Nishida H, Kiyonari S, Ishino Y, Morikawa K. The closed structure of an archaeal DNA ligase from *Pyrococcus furiosus.* J Mol Biol, 2006;360(5) 956-967.

[170] Levin DS, Bai W, Yao N. An interaction between DNA ligase I and proliferating cell nuclear antigen: implications for Okazaki fragment synthesis and joining. Proc Natl Acad Sci USA, 1997;94(24) 12863-12868.

[171] Jónsson ZO, Hindges R, Hübscher U. Regulation of DNA replication and repair proteins through interaction with the front side of proliferating cell nuclear antigen. EMBO J, 1998;17(8) 2412-2425.

[172] Tom S, Henricksen LA, Park MS, Bambara RA. DNA ligase I and proliferating cell nuclear antigen form a functional complex. J Biol Chem, 2001;276(27) 24817-24825.

[173] Kiyonari S, Kamigochi T, Ishino Y. A single amino acid substitution in the DNA-binding domain of *Aeropyrum pernix* DNA ligase impairs its interaction with proliferating cell nuclear antigen. Extremophiles, 2007;11(5) 675-684.

[174] Pascal JM, O'Brien PJ, Tomkinson AE, Ellenberger T. Human DNA ligase I completely encircles and partially unwinds nicked DNA. Nature, 2004;432(7016) 473-478.

[175] Pascal JM, Tsodikov OV, Hura GL, Song W, Cotner EA, Classen S, et al. A flexible interface between DNA ligase and PCNA supports conformational switching and efficient ligation of DNA. Mol Cell, 2006;24(2) 279-291.

[176] Mayanagi K, Kiyonari S, Saito M, Shirai T, Ishino Y, Morikawa K. Mechanism of replication machinery assembly as revealed by the DNA ligase-PCNA-DNA complex architecture. Proc Natl Acad Sci USA, 2009;106(12) 4647-4652.

