**1. Introduction**


This manuscript presents the methods used to test, and the resulting evidence that supports, the hypothesis that specialized transcription factor binding sites coordinate the expression of DNA repair genes. The seminal work of the Elnitski laboratory (Yang et al. 2007) identified the most complete set of human transcripts under the control of bidirectional promoters and the first putative regulatory networks that make use of the bidirectional promoter structure. Building on that work, the authors present additional details of these regulatory networks.

Much of the work on the regulation of DNA repair proteins focuses on protein-protein interactions and post-translational processing events (Hurley et al. 2007, Jensen et al. 2011, Shibata et al. 2010). However, transcriptional activation of DNA repair genes is likely to use shared factors, especially in cases of induced activation, and these factors have not been thoroughly evaluated. Yang, Koehly and Elnitski reported the discovery and characterization of 5,653 bidirectional promoters in the human genome (Yang et al. 2007). Before that report, bidirectional promoters were annotated only for protein-coding genes, and only 1,352 examples had been reported in the human genome. The work of Yang et al. also included evidence from all noncoding-RNA genes. Each bidirectional promoter regulates the expression of two genes that are oriented in opposite directions, with transcription start sites within 1000 bp of one another. The authors developed a novel approach to map all bidirectional promoters by analyzing the public expressed-sequence-tag (EST) data. The prevalence of this promoter structure led the authors to explore the hypothesis that it plays a role in the regulation of certain classes of genes. They discovered that many more DNA repair genes have bidirectional promoters than previously reported and that many genes with somatic mutations in cancer have bidirectional promoters. The relevance of DNA repair genes to cancers (Kinsella et al. 2009, Liang et al. 2009, Smith et al. 2010, Kelley et al. 2008, Li et al. 2009, Bellizii et al. 2009, Naccarati et al. 2007, Berwick et al. 2000) and the association of bidirectional promoters with DNA repair genes suggested that bidirectional promoters might indicate a higher-order type of regulatory structure that could be detected through common features at the DNA sequence level. If true, these features should discriminate between bidirectional and unidirectional promoters of genes with DNA repair functions.
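The definition of a bidirectional promoter given above (two genes in opposite orientations whose transcription start sites lie within 1000 bp of each other) can be illustrated with a minimal sketch. This is not the EST-based mapping procedure of Yang et al. (2007). The `Gene` record, the gene names and coordinates, and the requirement that the pair be divergent (the minus-strand start site lying at or before the plus-strand start site) are assumptions made only for illustration.

```python
from dataclasses import dataclass
from itertools import combinations

MAX_TSS_DISTANCE = 1000  # bp, the threshold stated in the text


@dataclass(frozen=True)
class Gene:
    name: str    # hypothetical gene label
    chrom: str
    strand: str  # "+" or "-"
    tss: int     # transcription start site coordinate


def bidirectional_pairs(genes, max_distance=MAX_TSS_DISTANCE):
    """Return (minus_gene, plus_gene) name pairs that look bidirectional."""
    pairs = []
    for a, b in combinations(genes, 2):
        # Both genes must be on the same chromosome and on opposite strands.
        if a.chrom != b.chrom or a.strand == b.strand:
            continue
        minus, plus = (a, b) if a.strand == "-" else (b, a)
        # Assumption: only divergent (head-to-head) pairs qualify, so the
        # minus-strand TSS must not lie downstream of the plus-strand TSS.
        gap = plus.tss - minus.tss
        if 0 <= gap <= max_distance:
            pairs.append((minus.name, plus.name))
    return pairs


# Made-up annotations used only to exercise the function.
genes = [
    Gene("geneA", "chr1", "-", 10_000),
    Gene("geneB", "chr1", "+", 10_400),  # divergent from geneA, 400 bp apart
    Gene("geneC", "chr1", "+", 50_000),  # too far from any minus-strand gene
    Gene("geneD", "chr2", "-", 10_200),  # different chromosome
]
print(bidirectional_pairs(genes))
```

The pairwise comparison is quadratic in the number of genes. For a genome-scale annotation, sorting start sites by chromosome and position and comparing only neighbouring entries would be the more practical choice.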

This chapter presents additional evidence of these regulatory networks. Specifically, it provides evidence that there are distinct regulatory signatures for (1) genes involved in certain types of cancers, (2) bidirectional versus unidirectional promoters, and (3) specific DNA repair pathways. The authors have identified transcription factor binding sites in bidirectional promoters of genes implicated in breast and ovarian (B/O) cancers. Additionally, they have discovered novel transcription factor binding sites that may serve as regulatory elements to distinguish DNA repair genes with bidirectional promoters from DNA repair genes with unidirectional promoters. Applications of this work extend to a collection of novel transcription factor binding sites shared among genes acting as checkpoint factors of DNA repair pathways. These findings provide evidence of novel regulatory mechanisms and new insights into cancer biology (i.e., genomic elements relevant to transcriptional regulation).

#### **2. Regulatory features of genes implicated in breast and ovarian cancers**

This section provides evidence to support the hypothesis that there are distinct regulatory control systems among bidirectional and unidirectional promoters. Additionally, this section presents transcription factor binding sites discovered in bidirectional promoters of genes implicated in breast and ovarian cancers.

As reported in Yang et al. 2007, we identified transcription factor binding sites for known factors in genes implicated in B/O cancers. The enrichment of bidirectional promoters in several cancer genes, and in additional genes having functions in DNA repair, suggests common mechanisms of regulation. We used expression clustering and enrichment of genes with bidirectional promoters to group the cancer genes into expression groups from the full genome, in order to address features common among the clusters that might indicate the presence of regulatory networks. The cancer-related genes that were identified and studied are listed in Table 1, along with their descriptions from GeneCards (Safran et al. 2010). The Elnitski group was the first to report that this set of genes has bidirectional promoters.

| Gene | Description from GeneCards (Safran et al. 2010) |
| --- | --- |
| *BARD1* | This gene encodes a protein which interacts with the N-terminal region of *BRCA1*. |
| *BRCA1* | This gene encodes a nuclear phosphoprotein that plays a role in maintaining genomic stability, and it also acts as a tumor suppressor. |
| *BRCA2* | Inherited mutations in *BRCA1* and this gene, *BRCA2*, confer increased lifetime risk of developing breast or ovarian cancer. |
| *CHK2* | In response to DNA damage and replication blocks, cell cycle progression is halted through the control of critical cell cycle regulators. The protein encoded by this gene is a cell cycle checkpoint regulator and putative tumor suppressor. |
| *ERBB2* | This gene encodes a member of the epidermal growth factor (EGF) receptor family of receptor tyrosine kinases. |
| *TP53* | This gene encodes tumor protein *p53*, which responds to diverse cellular stresses to regulate target genes that induce cell cycle arrest, apoptosis, senescence, DNA repair, or changes in metabolism. |
| *FANCA* | DNA repair protein that may operate in a post-replication repair or a cell cycle checkpoint function. May be involved in inter-strand DNA crosslink repair and in the maintenance of normal chromosome stability. |
| *FANCB* | DNA repair protein required for *FANCD2* ubiquitination. |
| *FANCD2* | Required for maintenance of chromosomal stability. Promotes accurate and efficient pairing of homologs during meiosis. Involved in the repair of DNA double-strand breaks, both by homologous recombination and single-strand annealing. May participate in S phase and G2 phase checkpoint activation upon DNA damage. Promotes *BRCA2/FANCD1* loading onto damaged chromatin. |
| *FANCF* | DNA repair protein that may operate in a postreplication repair or a cell cycle checkpoint function. May be implicated in interstrand DNA crosslink repair and in the maintenance of normal chromosome stability. |

Table 1. The B/O cancer-related genes that were studied.

For each gene, the most closely related gene expression profiles in the genome were identified using the Gene Sorter tool at the UCSC Genome Browser and expression data from the Novartis GNF Atlas2 (containing expression profiles for 96 tissues). Each resulting cluster was then compared to all the others to identify intersection points (by gene names) among the lists of co-expressed genes. Using a process of multidimensional scaling, the gene lists were compared and a putative regulatory network was generated (Figure 1). The *MLH1* gene appeared in several co-expression clusters and therefore occupied a central location with connections to 7 other genes (*BARD1, FANCA, BRCA1, CHK2, BRCA2, TP53* and *FANCF*). Two additional genes co-occupied the central position with *MLH1*: *COMMD3* (an uncharacterized protein) and *ITGB3BP*, a regulator of apoptosis in breast cancer cells.

#### **2.1 Network visualization**

The bidirectional promoters that are associated with the breast and ovarian cancer genes were treated as an affiliation network, or bipartite graph. In this example, nodes represent the genes in the co-expression clusters and edges connect the genes appearing in more than one list. The more often a gene appears among the ten co-expression lists, the more central its position in the network. Geodesic distances between genes were computed (i.e., the length of the shortest path between genes through promoters), and the geodesic distance matrix was scaled using a metric multidimensional scaling (MDS) algorithm (in UCINET 6; Borgatti et al., 2002). The distance between the 10 B/O cancer genes represents their similarity based on the number of shared genes found in the co-expression clusters. Genes in the center of the network were present in the largest number of gene clusters, seven out of 10, indicating that co-expression clusters intersect through common regulatory nodes.

#### **2.2 Transcription factor binding site analysis**

A systematic search for transcription factor binding sites in the list of bidirectional promoters was used to assess regulatory connections at the DNA level, and it revealed several sites in common. The search used a motif-finding algorithm to look for the motifs reported in Xie et al. (2005). Notably, identical *ELK1* binding sites were located at the same distance from the *ERBB2*, *FANCD2*, and *BRCA2* transcription start sites (Yang et al. 2007). *ETS* factor binding sites were present as a trio with SP1 and *PAX4/RXR* binding sites in the majority of the promoters. The transcription factors for which binding motifs were found in all of the promoters, along with their descriptions from GeneCards (Safran et al. 2010), are reported in Table 2.

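As a rough illustration of the network construction described in Section 2.1, the sketch below counts how often each gene appears across a set of co-expression lists and computes geodesic (shortest-path) distances between the query genes in the resulting bipartite graph. It is not the UCINET 6 procedure used by the authors. The gene names and list contents are invented, the graph model (query genes linked to the members of their own co-expression list) is an assumption, and the final MDS step is omitted.

```python
from collections import Counter, deque

# Hypothetical co-expression lists: query gene -> set of co-expressed genes.
coexpression = {
    "geneA": {"hub1", "x1"},
    "geneB": {"hub1", "x2"},
    "geneC": {"x2", "x3"},
    "geneD": {"x4"},
}


def appearance_counts(lists):
    """Number of co-expression lists in which each gene appears."""
    return Counter(g for members in lists.values() for g in members)


def geodesic_distances(lists):
    """Shortest path lengths between query genes in the bipartite graph.

    Each query gene is linked to every member of its co-expression list, so
    two query genes that share a co-expressed gene are two steps apart.
    Unreachable pairs are reported as None.
    """
    # Tag nodes so the two node sets of the bipartite graph stay distinct.
    adjacency = {}
    for query, members in lists.items():
        for member in members:
            adjacency.setdefault(("list", query), set()).add(("gene", member))
            adjacency.setdefault(("gene", member), set()).add(("list", query))

    distances = {}
    for source in lists:
        # Breadth-first search from the source query gene.
        seen = {("list", source): 0}
        queue = deque([("list", source)])
        while queue:
            node = queue.popleft()
            for neighbour in adjacency.get(node, ()):
                if neighbour not in seen:
                    seen[neighbour] = seen[node] + 1
                    queue.append(neighbour)
        distances[source] = {t: seen.get(("list", t)) for t in lists}
    return distances


counts = appearance_counts(coexpression)
dist = geodesic_distances(coexpression)
print(counts.most_common(2))
print(dist["geneA"])
```

Genes with the highest appearance counts correspond to the central nodes described in the text. The distance matrix is the kind of input an MDS routine would take to produce a two-dimensional layout, although unreachable pairs (reported here as `None`) would need a finite substitute value first.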
