**4. Duplicated genes: The case of** *rad3*

376 Selected Topics in DNA Repair

endonuclease activity in a *P. falciparum* cell-free lysate. The authors provided evidence for the presence of class II, Mg2+-dependent and Mg2+-independent AP endonucleases in the extracts. Moreover, they found that the *Plasmodium* AP endonuclease(s) possessed a 3´ phosphodiesterase activity similar to that described for other class II AP endonucleases [Demple et al., 1986]. In a related study, a *P. falciparum* lysate was reported to contain uracil DNA glycosylase, AP endonuclease, DNA polymerase, flap endonuclease, and DNA ligase activities [Haltiwanger et al., 2000]. In contrast, DNA repair activities in cell lysates have not been detected in *Entamoeba*, *Giardia* and *Trichomonas* parasites. These data highlight the utility of cell-free lysates for understanding DNA repair pathways, and point to the urgent need to investigate endogenous DNA repair activities using whole cell extracts in parasites where no data are available.

**3. Functional categorization of** *Entamoeba histolytica* **DNA repair genes**

To define the putative functions of *E. histolytica* DNA repair genes in unrelated DNA repair processes, we investigated the functional diversity of genomic maintenance pathways using Gene Ontology (GO) annotations. Functionally related gene groups were predicted with the DAVID bioinformatics resources (http://david.abcc.ncifcrf.gov/gene2gene.jsp), using a functional classification tool that generates a gene-to-gene similarity matrix based on shared functional annotation, drawing on over 75,000 terms from 14 functional annotation sources, which allows highly related genes to be classified into functionally related groups. This analysis revealed that a large number of DNA repair genes (43%) were misannotated in parasite genome databases. However, our analysis clearly showed that the majority of these genes seem to participate in DNA repair-related processes. In addition, 57% of genes were predicted to function in DNA repair-related processes. Of the genes, 11% participate in DNA damage repair, and 18% and 8% have helicase and endonuclease functions, respectively (**Fig. 2**).

Fig. 2. **Functional categorization of** *E. histolytica* **DNA repair genes.** Biological processes and molecular functions were determined using DAVID software (http://david.abcc.ncifcrf.gov/gene2gene.jsp). The percentage of genes included in individual categories is given.
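The gene-to-gene similarity matrix used for the functional classification can be illustrated with a minimal sketch. It assumes that each gene is described by a set of annotation terms and that agreement between two genes is scored with Cohen's kappa over the pooled term list. The gene labels and terms below are hypothetical, and the code is not DAVID's own implementation.

```python
from itertools import combinations

def kappa(terms_a, terms_b, universe):
    """Cohen's kappa between two genes' present/absent annotation profiles."""
    n = len(universe)
    both = len(terms_a & terms_b)
    only_a = len(terms_a - terms_b)
    only_b = len(terms_b - terms_a)
    neither = n - both - only_a - only_b
    observed = (both + neither) / n            # fraction of terms on which the genes agree
    p_a = (both + only_a) / n                  # fraction of terms carried by gene A
    p_b = (both + only_b) / n                  # fraction of terms carried by gene B
    expected = p_a * p_b + (1 - p_a) * (1 - p_b)  # agreement expected by chance
    if expected == 1.0:                        # degenerate case: both profiles constant
        return 1.0
    return (observed - expected) / (1 - expected)

def similarity_matrix(annotations):
    """Symmetric gene-to-gene kappa matrix, keyed by (gene, gene) pairs."""
    universe = set().union(*annotations.values())
    genes = sorted(annotations)
    matrix = {(g, g): 1.0 for g in genes}
    for a, b in combinations(genes, 2):
        matrix[(a, b)] = matrix[(b, a)] = kappa(annotations[a], annotations[b], universe)
    return matrix

# Hypothetical annotation sets, for illustration only.
annotations = {
    "geneA": {"DNA repair", "helicase activity", "ATP binding"},
    "geneB": {"DNA repair", "helicase activity", "ATP binding"},
    "geneC": {"DNA repair", "endonuclease activity"},
    "geneD": {"translation"},
}
matrix = similarity_matrix(annotations)
```

Genes with identical annotation sets score 1, and the score falls as fewer terms are shared. Grouping genes whose pairwise score exceeds a chosen cut-off would then yield functionally related groups.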

Gene duplicates account for 8-20% of the genes in eukaryotic cells, and the rate of gene duplication is estimated at between 0.2% and 2% per gene per million years. Gene duplication is one of the major drivers of the evolution of genetic systems and may arise through homologous recombination, retrotransposition events, or duplication of an entire chromosome [Zhang, 2003]. Duplicated genes are believed to be a main mechanism for establishing new gene functions and generating evolutionary novelty [Long & Langley, 1993; Gilbert et al., 1997].

A detailed examination of **Table 1** revealed that several DNA repair genes are duplicated in protozoan parasites, whereas yeast has only one copy. For example, the HRR machinery includes two *rad51* genes in *P. falciparum*, two *rad54* and *mre11* genes in *E. histolytica* [Lopez-Casamichana et al., 2008], two *rpa1* genes in *T. vaginalis*, and two *sgs1* genes in *G. lamblia* and *P. falciparum*. We also identified two *rad27* genes in the NHEJ pathway of *P. falciparum* and *G. lamblia*, two *E. histolytica ntg1* and *P. falciparum pcna* genes in the BER pathway, as well as two *msh2* genes for the MMR pathway in *E. histolytica* and *P. falciparum*. The most frequently duplicated gene was *rad3* of the NER pathway, with three genes in *E. histolytica*, two in *G. lamblia* and six in *T. vaginalis*, whereas *P. falciparum*, like yeast, has only one *rad3* gene. Remarkably, gene duplication is evident for many other genes in *T. vaginalis* and reflects the massive gene expansion within the large genome of this pathogen [Hartl & Wirth, 2006]. In yeast, the RAD3 protein is involved in mitotic recombination and spontaneous mutagenesis, and is essential for cell viability even in the absence of DNA damage. Furthermore, this protein participates in the repair of UV-irradiated DNA via NER, and constitutes a subunit of the RNA polymerase II initiation factor TFIIH [Moriel-Carretero & Aguilera, 2010]. *S. cerevisiae* RAD3 is related to *H. sapiens* XPD, also known as ERCC2. Defects in human XPD result in a wide range of diseases, including xeroderma pigmentosum (XP), Cockayne's syndrome, and trichothiodystrophy, which are characterized by a wide spectrum of symptoms ranging from cancer susceptibility to neurological and developmental defects [Liu et al., 2008].

To describe the inferred evolutionary relationships of the most frequently duplicated gene found in our analysis of the DNA repair machineries of the human pathogens studied here, we undertook a phylogenetic analysis of RAD3 helicase orthologues in *S. cerevisiae*, *E. histolytica*, *T. vaginalis*, *G. lamblia* and *P. falciparum*. We evaluated the minimum evolution of RAD3 proteins by constructing a Neighbor-Joining phylogenetic tree using MEGA version 5.05 [Tamura et al., 2011]. Robustness was assessed by a bootstrap test involving 500 replications of the data, based on the criterion of 50% majority-rule consensus (**Fig. 3**). Two main branches that derive from a common ancestor can be observed. On one branch, the *T. vaginalis* RAD3 paralogues are clustered into two pairs of sister proteins (A2E1B9 and A2ELX1, A2E4I6 and A2DDD4), each of which has evolved from the same ancestor. In addition, *E. histolytica* C4M6T8 is closer to *T. vaginalis* A2E4I6 and A2DDD4 than to its own paralogues. The other branch supports *T. vaginalis* A2G2G8, which is closely related to the yeast and *P. falciparum* RAD3 proteins that arise from the same node. Interestingly, these two organisms have only one *rad3* gene. This branch also includes the *E. histolytica* C4M8K7 and C4M8Q4 pair of sister proteins. Intriguingly, the two *Giardia* RAD3 proteins emerge from different nodes and appear to be more closely related to orthologues from other species than to each other. In particular, the branch supporting *Giardia* A8B495 also includes *Trichomonas* A2E1B9 and A2ELX1, while *Giardia* A8BYS3 is on the other branch, isolated from the other proteins, such as *Trichomonas* A2F1W2, which suggests that these proteins have evolved early.
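The joining step of the Neighbor-Joining method mentioned above can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure implemented in MEGA: it takes a ready-made pairwise distance matrix and covers neither the alignment, the distance estimation, nor the bootstrap test. The five taxa and their distances are an illustrative toy example, not RAD3 data, and the internal labels `node0`, `node1` are hypothetical.

```python
from itertools import combinations

def neighbor_joining(dist):
    """Return the leaf sets grouped by successive Neighbor-Joining joins.

    dist maps each taxon to a dict of distances to every other taxon.
    Joining stops at three remaining nodes, where an unrooted tree is resolved.
    """
    d = {a: dict(row) for a, row in dist.items()}   # working copy of the matrix
    clades = {a: frozenset([a]) for a in d}         # active node -> leaves below it
    joins = []
    while len(clades) > 3:
        nodes = list(clades)
        n = len(nodes)
        totals = {a: sum(d[a][b] for b in nodes if b != a) for a in nodes}
        # Select the pair minimising Q(a, b) = (n - 2) d(a, b) - r(a) - r(b).
        a, b = min(
            combinations(nodes, 2),
            key=lambda p: (n - 2) * d[p[0]][p[1]] - totals[p[0]] - totals[p[1]],
        )
        new = f"node{len(joins)}"                   # hypothetical internal label
        d[new] = {}
        for c in nodes:
            if c not in (a, b):
                # Distance from the new internal node to every remaining node.
                d[new][c] = d[c][new] = 0.5 * (d[a][c] + d[b][c] - d[a][b])
        clades[new] = clades[a] | clades[b]
        joins.append(clades[new])
        for gone in (a, b):                         # retire the two joined nodes
            del clades[gone], d[gone]
            for row in d.values():
                row.pop(gone, None)
    return joins

# Toy distances between five hypothetical taxa.
pairs = {
    ("a", "b"): 5, ("a", "c"): 9, ("a", "d"): 9, ("a", "e"): 8,
    ("b", "c"): 10, ("b", "d"): 10, ("b", "e"): 9,
    ("c", "d"): 8, ("c", "e"): 7, ("d", "e"): 3,
}
dist = {}
for (x, y), value in pairs.items():
    dist.setdefault(x, {})[y] = value
    dist.setdefault(y, {})[x] = value

joins = neighbor_joining(dist)
```

In a bootstrap analysis, the alignment columns would be resampled with replacement, the tree rebuilt for each replicate, and each grouping reported with the percentage of replicates in which it was recovered.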

DNA Repair in Pathogenic Eukaryotic Cells: Insights from Comparative Genomics of Parasitic Protozoan

Fig. 3. **Phylogenetic relationships between RAD3 from** *S. cerevisiae***,** *E. histolytica***,** *T. vaginalis***,** *G. lamblia* **and** *P. falciparum***.** The unrooted tree was created with the MEGA 5.05 program using the Neighbor-Joining algorithm based on a ClustalW alignment. Numbers above the tree nodes indicate the percentage of times that the branch was recovered in 500 replications.

In this chapter, we have identified *Mre11* and *Rad50* genes in the genomes of *E. histolytica*, *T. vaginalis*, *G. lamblia* and *P. falciparum*. However, all analyzed pathogenic eukaryotic cells, with the exception of *E. histolytica*, lack an *Xrs2* homologue. The absence of an NBS1/Xrs2 homologous sequence in the other parasites might seem at odds with the existence of an active MRN complex. However, we cannot rule out the possibility that these microorganisms use a very divergent NBS1 protein, or even that this third component is dispensable. To begin characterizing the components of the MRN complex in these parasites, we studied the structural and evolutionary relationships between MRE11, RAD50 and NBS1 through PSI-BLAST analysis in comparison with human and yeast orthologues. This program generates a weighted profile from the sequences detected in the first pass of a gapped BLAST search and iteratively searches the database using this profile as the query, allowing the inclusion of sequences with an e-value cut-off higher than 0.01 [Altschul et al., 1997]. Using the e-value threshold as a similarity measure, we found a close relationship between putative EhMRE11, HsMRE11, ScMRE11, TvMRE11 and PfMRE11. Conversely, GlMRE11 turned out to be less similar to the others, being closer to the *E. histolytica* and *T. vaginalis* proteins (**Fig. 4**). On the other hand, analysis of RAD50 orthologues revealed strong conservation of these proteins, since all e-value thresholds were <0.0001. As we have previously reported, EhNBS1 is closer to its human homologue than to yeast [Lopez-Casamichana et al., 2007].

Fig. 4. **Individual protein relationships of MRN complex in pathogenic eukaryotic cells.** Similarity was evaluated through PSI-BLAST analysis. The width of connecting lines indicates similarity level.
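The iterative profile search used for the MRN comparison can be illustrated with a toy sketch. It is written in the spirit of PSI-BLAST but is not that algorithm: it assumes ungapped sequences of equal length, a uniform background, a simple pseudocount, and a raw log-odds score cut-off in place of an e-value. All sequences and the threshold value are hypothetical.

```python
import math

ALPHABET = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids

def build_profile(seqs, pseudocount=1.0):
    """Per-position log-odds scores against a uniform background."""
    background = 1.0 / len(ALPHABET)
    profile = []
    for pos in range(len(seqs[0])):
        counts = {aa: pseudocount for aa in ALPHABET}
        for seq in seqs:
            counts[seq[pos]] += 1
        total = sum(counts.values())
        profile.append({aa: math.log(counts[aa] / total / background) for aa in ALPHABET})
    return profile

def score(profile, seq):
    """Sum of the position-specific scores along an ungapped sequence."""
    return sum(column[aa] for column, aa in zip(profile, seq))

def iterative_search(query, database, threshold, max_rounds=5):
    """Rebuild the profile from accepted hits until the hit set stops changing."""
    included = [query]
    for _ in range(max_rounds):
        profile = build_profile(included)
        hits = [s for s in database if s != query and score(profile, s) >= threshold]
        if set(hits) | {query} == set(included):
            break                      # converged: no sequence gained or lost
        included = [query] + hits
    return included

# Hypothetical six-residue sequences, for illustration only.
database = ["ACDEFH", "ACDKLH", "WWWWWW"]
result = iterative_search("ACDEFG", database, threshold=2.5)
first_pass = score(build_profile(["ACDEFG"]), "ACDKLH")
```

In this toy run the more divergent sequence `ACDKLH` scores below the cut-off against the query alone, but is recovered once the closer hit `ACDEFH` has been folded into the profile, which is the behaviour that makes profile iteration useful for distant homologues.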
