434 DNA Repair

addition to initial loading of RecA, RecOR further stimulate homologous recombination by preventing the dissociation of the RecA\* filament from ssDNA in *E. coli* (Bork et al., 2001). Somewhat different properties were reported for *Bacillus subtilis* RecO, which does not require RecR to initiate RecA\* formation (Manfredi et al., 2008; Manfredi et al., 2010). Crystal structures of all three proteins and of the RecOR complex from *D. radiodurans* have been reported (Koroleva et al., 2007; Lee et al., 2004; Leiros et al., 2005; Makharashvili et al., 2004; Timmins et al., 2007). The RecR structure resembles a DNA clamp-like tetramer (Lee et al., 2004). However, the role of a potential DNA clamp in the RMP-mediated reaction is unknown. Moreover, in the crystal structure of the RecOR complex, RecO occupies a large portion of the clamp's inner space. Such a conformation makes it challenging to predict functionally relevant interactions of the complex with DNA. Another intriguing fact is that the crystal structure of RecO does not resemble any structural features of its functional eukaryotic analog Rad52 (Leiros et al., 2005; Makharashvili et al., 2004; Singleton et al., 2002), which supports two identical reactions.

### **2.3 Ambiguities of RecF function**

In contrast to genetic data, initial biochemical studies did not reveal the function of RecF in recombination initiation (Umezu et al., 1993). RecF binds both ss- and dsDNA in the presence of ATP, and it is a weak DNA-dependent ATPase (Griffin and Kolodner, 1990; Madiraju and Clark, 1991, 1992). It interacts with RecR in the presence of ATP and DNA (Webb et al., 1999). Surprisingly, however, RecF was initially shown to play an inhibitory role during RecOR-mediated loading of RecA on SSB-protected ssDNA (Umezu et al., 1993). The UV sensitivity of a *recF* mutant can be suppressed by RecOR overexpression, suggesting that RecF plays a regulatory role (Sandler and Clark, 1994). In agreement with this hypothesis, RecF dramatically increases the efficiency of RecOR-mediated RecA loading at ds/ssDNA junctions with a 3' ssDNA extension under specific conditions (Morimatsu and Kowalczykowski, 2003). RecF was suggested to recognize a specific DNA junction structure in order to direct RecA loading at the boundary of SSGs. While initial experiments demonstrated such a preference (Hegde et al., 1996), later work did not support a binding preference of RecF for DNA junctions (Webb et al., 1999). Purified RecF tends to aggregate gradually in solution (Webb et al., 1999). Apparently, nonspecific high molecular weight RecF aggregates interact with DNA, resulting in the inhibitory effect of RecF or in false positive interactions of RecF with specific DNA substrates (Hegde et al., 1996). In addition, the RecFR complex limits the extension of RecA\* beyond SSGs, an observation that indirectly supports RecF specificity towards the boundaries of SSGs while in complex with other proteins (Webb et al., 1997). RecF is co-transcribed with the replication initiation protein DnaA and with DnaN, the β-clamp subunit of DNA polymerase III. However, its open reading frame is usually shifted by one or two nucleotides relative to that of DnaN (Villarroya et al., 1998). The *E. coli recF* gene also has multiple rare codons. Thus, expression of RecF is likely to be downregulated at the translational level. Consequently, there are only a few copies of RecF in an *E. coli* cell.

How RecF promotes recombination remains an open question. The ability of the RecFR complex to limit extension of the RecA\* filament beyond the SSGs suggests that the RecFR complex may specifically interact with RecA\*. However, no direct observation of such interactions has been reported so far. RecF also binds the RecX protein (Lusetti et al., 2006). RecX is a negative regulator of presynaptic complex formation, which inhibits filament extension by binding to RecA. RecF scavenges RecX from solution through direct interaction, thus diminishing the negative regulatory effect of RecX (Drees et al., 2004; Lusetti et al., 2006). Additional

### **3.1 RecF is an ABC ATPase**

The amino acid sequence of RecF contains three conserved motifs characteristic of ATP-binding cassette (ABC) ATPases: Walker A, Walker B, and a "signature" motif. Walker A, or the P-loop, is a nucleotide-binding site found in a variety of ATPases (Walker et al., 1982). The Walker B motif provides acidic amino acids important for coordinating a water molecule and a metal ion during hydrolysis of the nucleotide triphosphate bound to the Walker A motif. The signature motif is a unique feature of ABC ATPases, a diverse family of proteins ranging from membrane transporters to DNA-binding proteins (reviewed in Hopfner and Tainer, 2003). ATP-dependent dimerization is a common feature of this class of proteins. Signature motif residues interact with the nucleotide bound to the opposite monomer (Hopfner et al., 2000). This motif is important for both ATP-dependent dimerization and subsequent ATP hydrolysis. ABC ATPases are not motor proteins; they use ATP binding and hydrolysis as a switch or sensor mechanism that regulates diverse signaling pathways and reactions.
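
As an illustration of how a Walker A motif can be located in a protein sequence, the sketch below scans for the widely used P-loop consensus `[AG]-x(4)-G-K-[ST]`. The function name `find_walker_a` and the test peptide are assumptions made for this example; the peptide is a made-up string, not the RecF sequence, and a consensus match alone does not establish that a protein is an ABC ATPase.

```python
import re

# Consensus for the Walker A (P-loop) motif: [AG]-x(4)-G-K-[ST].
# This is a generic pattern; it is not specific to RecF or to ABC ATPases.
WALKER_A = re.compile(r"[AG].{4}GK[ST]")


def find_walker_a(sequence: str) -> list[tuple[int, str]]:
    """Return (1-based start position, matched text) for each non-overlapping hit."""
    return [(m.start() + 1, m.group()) for m in WALKER_A.finditer(sequence.upper())]


# Hypothetical peptide used only for illustration (NOT the real RecF sequence).
toy_peptide = "MSLTRLELRNFRNLGPNGAGKTSLLEAIYL"

for start, motif in find_walker_a(toy_peptide):
    # The conserved lysine is the seventh residue of the match.
    print(f"Walker A candidate {motif} at position {start}, lysine at {start + 6}")
```

The conserved lysine reported by the loop corresponds to the kind of residue targeted by Walker A mutations such as the K39 substitutions discussed for DrRecF.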

DNA-binding ABC ATPases include DNA mismatch and nucleotide excision repair enzymes (Ban and Yang, 1998; Junop et al., 2001; Obmolova et al., 2000; Tessmer et al., 2008), the structural maintenance of chromosomes (SMC) proteins cohesin and condensin (Strunnikov, 1998), and the DSB repair enzyme Rad50 (Hirano et al., 1995). SMCs and Rad50 are characterized by a long coiled-coil structural domain inserted between the N- and C-terminal halves of the globular head domain (Haering et al., 2002). RecF lacks a coiled-coil region, but it does exhibit ATP-dependent DNA binding and a slow DNA-dependent ATP hydrolysis activity (Hegde et al., 1996; Madiraju and Clark, 1992; Webb et al., 1995). However, the SMC-like properties of RecF and their role in recombinational repair have not been addressed. Previously, only the Walker A motif had been shown to be critical for RecF function (Sandler et al., 1992; Webb et al., 1999). All known ABC-type ATPases function as heterooligomeric complexes in which a sequence of inter- and intramolecular interactions is triggered by ATP-dependent dimerization and dimer-dependent ATP hydrolysis (Deardorff et al., 2007; Dorsett, 2011; Hopfner and Tainer, 2003; Junop et al., 2001; Moncalian et al., 2004; Smith et al., 2002). Thus, RecF may function in recombination initiation through a multistep pathway of protein-protein and DNA-protein interactions regulated by ATP-dependent RecF dimerization.

### **3.2 Structural similarity of RecF with Rad50 head domain**

The diversity of ABC ATPases makes it difficult to predict from sequence comparison alone to which subfamily RecF belongs. RecF is a globular protein lacking the long coiled-coil domains of Rad50 and SMC proteins. However, beyond the three major motifs it has no significant sequence similarity with globular DNA-binding proteins like MutS. We crystallized and solved a high-resolution structure of RecF from *D. radiodurans* (DrRecF) (Fig. 2) (Koroleva et al., 2007). The structure was solved at a resolution of 1.6 Å using native and selenomethionine-derivative protein crystals. The structure comprises two domains. The ATPase domain I is formed by two β-sheets wrapped around the central α-helix A and is similar to the corresponding subdomain of the Rad50 head domain (Figure 2, right). Structures of nucleotide-binding domains are similar in all ABC ATPases. In contrast, the structure of the subdomain containing the signature motif (Lobe II in Rad50) is highly diverse, even among DNA-binding ABC ATPases. However, all structural elements of RecF domain II are present in the Rad50 Lobe II subdomain, and these domains are structurally more similar to each other than the ATP-binding domains are. The only difference lies in the two long α-helices of RecF, which are connected at the apical part of this "arm-like" domain. In Rad50, the analogous α-helices extend into an extremely long coiled-coil structure that is absent in RecF. The high degree of structural similarity unequivocally places RecF in the same family as Rad50 and SMC proteins. Therefore, RecF represents the only known globular protein with a structure highly homologous to that of the head domains of Rad50, cohesin and condensin.

ATP-Binding Cassette Properties of Recombination Mediator Protein RecF 437

Fig. 2. Cartoon representation of **A**) RecF and **B**) Rad50 head domain structures. α-helices are shown in red and β-sheets in yellow. In RecF, α-helices are lettered and β-strands are numbered. Walker A, Walker B, and signature motifs are highlighted in green and labeled. In RecF, the ATP-binding domain is designated Domain I and the signature motif domain Domain II. In Rad50, the corresponding domains are referred to as the Lobe I and Lobe II subdomains.

### **3.3 The model of ATP-dependent dimer suggests mechanism of DNA binding**

RecF was crystallized as a monomer. The ATP-dependent dimer was modeled on the basis of known intersubunit interactions conserved in ABC ATPases and, specifically, on the known structure of the Rad50 dimer (Fig. 3) (Hopfner et al., 2000). In all proteins of this family, a conserved serine of the signature motif interacts with the γ-phosphate group of ATP. The ATP bound to the Walker A motif was modeled according to its highly conserved conformation in all Walker A- and B-containing structures. These constraints unambiguously dictate a single conformation of the potential RecF dimer (Fig. 3A). The model suggests a potential DNA-binding site located on top of the two nucleotide-binding domains, in a conformation similar to the proposed DNA-binding site of Rad50 (Figs. 3B-D). The resulting RecF dimer forms a semi-clamp, or a symmetrical crab claw, with two arms extending in directions similar to those of the coiled-coil regions of the Rad50 dimer (Hopfner et al., 2001). The claw structure contains sufficient space to accommodate and cradle dsDNA. In this model, the majority of conserved residues map to the dimerization interface and the pocket region of the claw, where DNA binding is expected to occur.


The proposed model explains the ATP dependence of RecF DNA binding. First, RecF is an acidic protein with a mostly negatively charged surface. In the model of the ATP-dependent dimer, small patches of positively charged surface are aligned on the top of the dimer, creating an extended basic surface area. Second, the arms of domain II form a deep cleft, sufficient to engulf a DNA helix. The constraints imposed by the interaction of the signature motif with the γ-phosphate group of ATP do not allow the distance between these arms to be altered in the model without significant structural clashes between surface-exposed residues of the two monomers. Thus, ATP-dependent dimerization leads to a favorable juxtaposition of the surface charges and to surface complementarity, both of which stimulate DNA binding.

Fig. 3. A model of the RecF dimer. **A**) Domains I and II of one RecF monomer are color-coded in yellow and orange, and those of the other monomer in grey and blue. Signature motif residues are shown in stick representation in cyan, and ATP in stick representation with nitrogen, oxygen, carbon and phosphorus atoms colored in blue, red, yellow and orange, respectively. **B**) The same dimer representation with bound dsDNA shown in stick representation in green. **C**) Orthogonal view of the dimer shown in B). **D**) Surface representation of the DrRecF dimer in the same orientation as in C), color-coded according to the surface electrostatic potential.

Proving ATP-dependent dimerization of RecF in solution was quite challenging because of poor solubility and the tendency of RecF to form nonspecific soluble aggregates (Webb et al., 1999). Initial attempts with size exclusion chromatography (SEC) yielded the monomeric form of *E. coli* RecF in the presence of ATP (Webb et al., 1999). The caveats of such an experiment are the low protein solubility, which means that only solutions of limited protein concentration can be run through the column, and the non-equilibrium nature of SEC, which may lead to dissociation of weak dimers. Later, it was shown that DrRecF interacts nonspecifically with the column resin even in a 1 M KCl buffer (Koroleva et al., 2007). Therefore, a combination of SEC with static light scattering was used to determine the true molecular weight of eluted fractions. DrRecF does form an ATP-dependent dimer, though a relatively unstable one, which could dissociate on the column under non-equilibrium conditions at low protein concentration. The dimerization of the wild-type protein and of specific mutants under equilibrium conditions was tested with dynamic light scattering (DLS). DrRecF dimerizes only in the presence of ATP and not with ADP. The signature motif mutation S276R abolished dimerization, as did the Walker A motif mutation K39M, which prevents ATP binding. The Walker A motif mutant K39R, which binds but does not hydrolyze ATP, forms a dimer, as does the Walker B motif mutant D300N. Surprisingly, non-hydrolyzable ATP analogs did not support dimerization in initial experiments, suggesting that RecF dimerization is highly sensitive to the specific ATP-bound conformation. While the DLS method is not suitable for quantitative analysis, it is highly sensitive to the presence of high molecular weight protein aggregates, and it was used to optimize RecF solution conditions for other experiments.

reactions were performed for a relatively short time (10-15 min) and with an excess of ATP, taking advantage of RecF being a slow ATPase (Fig. 6C, below). Alternatively, the rate of ATP hydrolysis was measured over 1 or 2 hours upon titration of RecF with different DNA oligonucleotides (Fig. 6B). The binding of all DNA substrates was relatively weak, with apparent dissociation constants greater than 15 µM (Fig. 5). Neither wild-type DrRecF in the presence of ADP nor the signature motif mutant S279R in the presence of ATP was able to bind DNA (Fig. 5), suggesting that ATP-dependent dimerization is essential for RecF interaction with all DNA substrates.

Fig. 5. ATP-dependent binding of DrRecF to different DNA substrates (top) and DNA-dependent ATP hydrolysis rates (bottom). DNA substrates are schematically represented above each plot: **A**) ssDNA, **B**) dsDNA and **C**) ds/ssDNA junction. Solid isotherms correspond to binding in the presence of ATP, dashed black isotherms to binding in the presence of ADP, and dotted isotherms to binding of the signature motif mutant S279R in the presence of ATP. Red isotherms correspond to DrRecF binding in the presence of ATP and 50 μM DrRecR. The maximum estimated ATP hydrolysis rates of DrRecF (Fig. 6A) are shown at the bottom, with the top lane corresponding to reactions without DrRecR and the bottom lane to reactions with DrRecR.

### **4.2 RecR-dependent DNA specificity of RecF**

DNA binding by DrRecF is drastically altered in the presence of DrRecR (red isotherms in Fig. 5). DrRecR significantly increases the affinity of DrRecF for dsDNA (Fig. 5B), with an estimated binding affinity at least two orders of magnitude higher than without DrRecR. DrRecR does not alter DrRecF ssDNA binding according to the DNA binding assay. However, the ATPase assay clearly demonstrated an interaction of DrRecR with DrRecF in the presence of ssDNA. ssDNA alone does not stimulate ATP hydrolysis by DrRecF, while the presence of both DrRecR and ssDNA results in the highest ATPase rate. This suggests that DrRecR stimulates the ATPase rate of DrRecF bound to ssDNA, potentially destabilizing dimerization and ssDNA binding. In the case of dsDNA, maximum ATPase rates were similar with and without DrRecR. Therefore, DrRecR stabilizes the DrRecF complex with dsDNA without increasing its ATPase rate. Owing to this stabilizing effect of RecR, we were able to measure DNA binding and dimerization of DrRecF in the presence of ATP analogs (Fig. 6B). Curiously, weak dimerization is observed at the highest DrRecF concentration even in the presence of ADP. Therefore, DrRecR selectively stimulates binding of the DrRecF dimer to dsDNA, while potentially destabilizing the DrRecF complex with ssDNA. Both dimerization and DNA binding reactions were also measured as a function of time to verify that under

DrRecF concentration is 10 μM, DNA 20 nM, ATP 2 mM.
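
To put the reported binding strengths in rough numbers, the sketch below evaluates a simple single-site binding isotherm, fraction bound = [P] / (Kd + [P]). This is an assumption made purely for illustration: it treats protein as being in large excess over DNA and ignores the coupled ATP-dependent dimerization, so it is not the model used to fit the data in Fig. 5. The function name `fraction_bound` and the example Kd values are likewise hypothetical; 15 µM is used only because the text gives it as a lower bound on the apparent dissociation constant, and 0.15 µM represents a hundredfold tighter interaction.

```python
def fraction_bound(protein_uM: float, kd_uM: float) -> float:
    """Fraction of DNA bound for a single-site isotherm with protein in excess.

    Assumes free protein ~ total protein (DNA concentration << Kd and << protein).
    """
    if protein_uM < 0 or kd_uM <= 0:
        raise ValueError("protein concentration must be >= 0 and Kd must be > 0")
    return protein_uM / (kd_uM + protein_uM)


# Illustrative values only: Kd = 15 uM (the lower bound quoted in the text)
# versus a hypothetical 100-fold tighter Kd of 0.15 uM.
for kd in (15.0, 0.15):
    for protein in (1.0, 10.0):
        theta = fraction_bound(protein, kd)
        print(f"Kd = {kd:>5} uM, [protein] = {protein:>4} uM -> fraction bound = {theta:.3f}")
```

Under these assumptions, a Kd of 15 µM leaves most DNA unbound at 10 µM protein, whereas a hundredfold tighter Kd gives near-saturating binding at the same protein concentration.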
