Edwards *et al.*, 1998). CCAAT boxes are highly conserved within homologous genes across species in terms of position, orientation, and flanking nucleotides (Mantovani, 1998). In addition, the spacing between the CCAAT box and other promoter-specific *cis*-elements is also conserved among species (Dorn *et al.*, 1987a; Chodosh *et al.*, 1988; Maity and de Crombrugghe, 1998). The expression of genes under the control of promoters that contain CCAAT boxes may be ubiquitous or tissue/stage specific, suggesting that the gene expression pattern is also determined by other *cis* and *trans* elements (Stephenson *et al.*, 2007).

In *Saccharomyces cerevisiae*, CCAAT boxes are found in the promoters of cytochrome genes, in genes coding for proteins that are activated by non-fermentable carbon sources (McNabb *et al.*, 1995) and in genes involved in nitrogen metabolism (Dang *et al.*, 1996). In the filamentous fungus *Aspergillus nidulans*, CCAAT boxes are present in genes involved in penicillin biosynthesis (Steidl *et al.*, 1999). In higher eukaryotes, a multitude of promoters contain CCAAT boxes, including those of developmentally controlled and tissue-specific genes (Berry *et al.*, 1992), housekeeping and inducible genes (Roy and Lee, 1995) and cell-cycle regulated genes (Mantovani, 1998). In addition, many cell-cycle regulated promoters lack a recognizable TATA-box, but contain more than one CCAAT box in a position close to and sometimes overlapping with the start site of transcription (Zwicker and Muller, 1997).

**3. The CBF/NF-Y transcription factor**

Several CCAAT-binding proteins have been isolated and described, including CBF/NF-Y (CCAAT Binding Factor/Nuclear Factor of the Y box), CTF/NF1 (CCAAT Transcription Factor/Nuclear Factor 1), C/EBP (CCAAT/Enhancer Binding Protein) and CDP (CCAAT Displacement Protein) (Mantovani, 1999). Among them, NF-Y is the most ubiquitous and specific, acting as a key proximal promoter factor in the transcriptional regulation of an array of different eukaryotic genes. Unlike other CCAAT-binding proteins, NF-Y requires a high degree of conservation of the CCAAT pentanucleotide sequence and shows a strong preference for specific flanking sequences (Dorn *et al.*, 1987a; Stephenson *et al.*, 2007). Therefore, the NF-Y transcription factor can be distinguished from the other CCAAT-binding proteins based on its DNA sequence requirements (Maity and de Crombrugghe, 1998).

The CBF/NF-Y transcription factor, which will be referred to in this chapter as NF-Y, is a conserved oligomeric transcription factor found in all eukaryotes that is involved in the regulation of diverse genes (Maity *et al.*, 1992; McNabb *et al.*, 1995; Edwards *et al.*, 1998; Mantovani, 1998; Siefers *et al.*, 2009). NF-Y typically acts in concert with other regulatory factors to modulate gene expression in a highly controlled manner (Nelson *et al.*, 2007). In many eukaryotic promoters, the functional NF-Y-binding sites are relatively close to the TATA motif (Bucher, 1990) and are invariably flanked by at least one additional functionally important *cis*-element. Several reports have shown that various factors, including transcription factors, co-activators, and TATA-binding proteins, interact with NF-Y or its subunits in promoting transcriptional regulation (Mantovani, 1999; Yazawa and Kamada, 2007). NF-Y was originally identified as the protein that recognizes the MHC class II conserved Y box in Ea promoters (Dorn *et al.*, 1987a; Matuoka and Chen, 2002). It specifically recognizes the consensus sequence 5'-CTGATTGGYYRR-3' or 5'-YYRRCCAATCAG-3' (Y = pyrimidine and R = purine) present in the promoter region of eukaryotic genes. Bioinformatic analyses indicate that about 30% of mammalian promoters have predicted NF-Y binding sites (Bucher, 1990; Testa *et al.*, 2005), and chromatin immunoprecipitation data have demonstrated additional widespread NF-Y binding in nonpromoter sites.
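The NF-Y consensus quoted in this section, 5'-YYRRCCAATCAG-3' and its reverse complement 5'-CTGATTGGYYRR-3', uses the degenerate codes Y (pyrimidine, C or T) and R (purine, A or G). The following Python sketch shows one way such a consensus could be searched for in a sequence. It is only an illustration: the function names and the promoter fragment are invented here, and exact consensus matching is a simplification of how NF-Y sites are predicted in practice.

```python
import re

# Degenerate codes used in the consensus: Y = pyrimidine (C/T), R = purine (A/G).
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "Y": "[CT]", "R": "[AG]"}

# The two orientations of the consensus quoted in the text.
CONSENSUS_FORWARD = "YYRRCCAATCAG"
CONSENSUS_REVERSE = "CTGATTGGYYRR"


def consensus_to_regex(consensus: str) -> str:
    """Translate a degenerate consensus string into a regular expression."""
    return "".join(IUPAC[base] for base in consensus)


def find_sites(sequence: str) -> list[tuple[int, str, str]]:
    """Return (0-based start, orientation, matched text) for every exact match."""
    sequence = sequence.upper()
    hits = []
    for orientation, consensus in (("+", CONSENSUS_FORWARD), ("-", CONSENSUS_REVERSE)):
        pattern = consensus_to_regex(consensus)
        # The lookahead allows overlapping matches to be reported as well.
        for match in re.finditer(f"(?=({pattern}))", sequence):
            hits.append((match.start(), orientation, match.group(1)))
    return sorted(hits)


# Hypothetical promoter fragment, made up for illustration only.
promoter = "GGATCTGACCAATCAGTTTGCTGATTGGCTGACC"
sites = find_sites(promoter)
for start, orientation, text in sites:
    print(start, orientation, text)
```

In this made-up fragment the search reports one site in each orientation. Real site prediction would normally tolerate mismatches in the flanking positions, for example with a position weight matrix, which this sketch does not attempt.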

Suggesting the importance of binding context, NF-Y-regulated gene expression can be tissue specific, developmentally regulated, or constitutive (Maity and de Crombrugghe, 1998; Siefers *et al.*, 2009). The transcriptional activity of NF-Y can be regulated by differential expression, alternative splicing, protein–protein interactions, and cellular redox potential (Matuoka and Yu Chen, 1999).

NF-Y has been shown to be involved in the regulation of some G1/S genes whose expression is attenuated during senescence (Matuoka and Yu Chen, 1999). NF-Y plays a pivotal role in the cell-cycle regulation of the mammalian cyclin A, *cdc25C*, and *cdc2* genes in the S-phase of the cell cycle (Currie, 1998). Additionally, a number of genes involved in the cellular response to damage and stress, including the phospholipid hydroperoxide glutathione peroxidase genes (Huang *et al.*, 1999), are regulated by NF-Y, indicating its pivotal role in the removal of damaging agents from cells (Matuoka and Chen, 2002). Although NF-Y functions basically as a transactivator of gene expression, it is also involved, directly or indirectly, in the downregulation of transcription. For instance, NF-Y binds to the CCAAT box of the mouse renin enhancer and blocks the binding of positive regulatory elements (Shi *et al.*, 2001). In this case, NF-Y dysfunction would damage the systems that control blood pressure (Matuoka and Chen, 2002).

NF-Y is composed of three different subunits named NF-YA (also known as HAP-2 or CBF-B), NF-YB (HAP3 or CBF-A), and NF-YC (HAP5 or CBF-C) that interact to form a complex that can bind CCAAT DNA motifs and control the expression of target genes (Figure 1). Each subunit is required for DNA binding, subunit association and transcriptional regulation in both vertebrates and plants (Sinha *et al.*, 1995; Stephenson *et al.*, 2007). Yeast possesses a fourth subunit, called HAP4, which provides a transcriptional activation domain to the complex (Forsburg and Guarente, 1989; Lee *et al.*, 2003). The yeast HAP4 protein is not needed for DNA-binding but contains an acidic domain that is essential to promote transactivation when associated with the HAP2/HAP3/HAP5 complex (Olesen and Guarente, 1990; Serra *et al.*, 1998). In vertebrates, the function of this fourth domain was incorporated into other subunits (Forsburg and Guarente, 1989; Yazawa and Kamada, 2007). Despite the wide cellular distribution and functional variability of NF-Y-regulated genes, most eukaryotic genomes have only one or two genes encoding each NF-Y subunit (Maity and de Crombrugghe, 1998; Riechmann and Ratcliffe, 2000). Fungi and animals, for example, present single genes encoding each protein subunit. Thus, there is minimal combinatorial diversity in the subunit composition of the heterotrimeric NF-Y in these organisms (Siefers *et al.*, 2009). In contrast, the NF-Y complex in vascular plants is generally encoded by gene families (Riechmann and Ratcliffe, 2000).
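The contrast between single-copy subunit genes and plant gene families can be made concrete with a small counting exercise. The Python sketch below enumerates every NF-YA/NF-YB/NF-YC combination for a given set of subunit genes. It assumes that any family member can assemble with any other, which gives an upper bound rather than a demonstrated set of complexes. The gene names and family sizes are hypothetical and are not taken from any genome.

```python
from itertools import product


def possible_trimers(nf_ya, nf_yb, nf_yc):
    """List every NF-YA/NF-YB/NF-YC combination, assuming free assortment."""
    return list(product(nf_ya, nf_yb, nf_yc))


# One gene per subunit, as described for fungi and animals.
single_copy = possible_trimers(["NF-YA"], ["NF-YB"], ["NF-YC"])

# Hypothetical gene families (3, 4 and 5 members), for illustration only.
hypothetical_plant = possible_trimers(
    [f"NF-YA{i}" for i in range(1, 4)],
    [f"NF-YB{i}" for i in range(1, 5)],
    [f"NF-YC{i}" for i in range(1, 6)],
)

print(len(single_copy), len(hypothetical_plant))  # 1 60
```

Even these modest hypothetical family sizes give 60 potential heterotrimers against a single one for single-copy genes, which is the sense in which gene families increase combinatorial diversity.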

### **3.1 NF-Y subunits**

NF-Y is the only transcription factor thus far identified for which the interaction of three heterologous subunits creates the DNA binding domain (Maity and de Crombrugghe, 1992; McNabb *et al.*, 1995; Sinha *et al.*, 1995). All three NF-Y subunits are essential for the DNA binding activity and one molecule of each subunit forms the NF-Y-DNA complex (Maity and De Crombrugghe, 1996). Each NF-Y subunit contains a conserved domain with identities greater than 70% across species. This highly conserved domain is located at the *C*-terminus of NF-YA, in the central part of NF-YB, and at the *N*-terminus of NF-YC (Li *et al.*, 1992).

The NF-YA conserved domain can be divided into two functionally distinct regions: an *N*-terminal region that is required for NF-YB and NF-YC association and a *C*-terminal region required for DNA-binding (Maity and de Crombrugghe, 1992). Additionally, NF-YA usually contains glutamine (Q)-rich and serine/threonine (S/T)-rich regions. There are numerous variants of NF-YA due to alternative splicing at the Q-S/T domains (Li *et al.*, 1992) and, although the expression of these isoforms varies depending on tissue and cell type, they all seem intact in terms of transcriptional function (Matuoka and Chen, 2002).

Both NF-YB and NF-YC subunits possess the highly conserved histone-fold motif (HFM) and are structurally similar to core histone subunits H2B and H2A, respectively, and to the archaebacterial histone-like protein Hmf-2 (Arents and Moudrianakis, 1995; Baxevanis *et al.*, 1995; Mantovani, 1998). In terms of identity, NF-YB is 30% identical to H2B, 14% to H2A, 17% to H4 and 18% to H3; NF-YC is 21% identical to H2A, 15% to H4 and H3 and 20% to H2B (Liberati *et al.*, 1999). Other proteins showing a remarkable identity (25-30%) to both NF-YB and NF-YC are present in *Archaea*. These proteins homodimerize and associate with DNA, forming nucleosome-like structures (Sandman *et al.*, 1990). The NF-YB and NF-YC subunits also contain residues that are important for their contact with DNA (Romier *et al.*, 2003; Stephenson *et al.*, 2007). In contrast, the conserved segment of NF-YA has no homology with the histone-fold motif, or with any of the known dimerization motifs present in other heteromeric DNA-binding proteins (Maity and de Crombrugghe, 1992).

Some portions of NF-YA, NF-YB and NF-YC present a high degree of identity with yeast HAP2, HAP3 and HAP5, respectively. These HAP genes, which encode components of the yeast CCAAT-binding protein, are necessary for the expression of genes encoding components of the electron transport chain. Yeast strains mutated in any of the three genes fail to grow on media containing a nonfermentable carbon source such as lactate or glycerol, a characteristic respiratory-defect phenotype (McNabb *et al.*, 1995).

Assembly of the NF-Y heterotrimer in mammals (where this complex is better studied) follows a strict, stepwise pattern (Sinha *et al.*, 1995; Sinha *et al.*, 1996) (Figure 1). Initially, the NF-YB and NF-YC subunits form a tight heterodimer (Figure 1a) similar to those formed by histones via the HFM, a conserved protein–protein and DNA-binding interaction module (Luger *et al.*, 1997) consisting of a 65-amino-acid stretch that is common to all histones and required for nucleosome formation (Baxevanis *et al.*, 1995; Luger *et al.*, 1997; de Silvio *et al.*, 1999). This dimer then moves to the nucleus, where the third subunit (NF-YA, Figure 1b) is recruited to generate the complete, heterotrimeric NF-Y (Figure 1c). Interestingly, NF-YA is unable to interact with NF-YB or NF-YC alone, interacting only with the NF-YB-NF-YC heterodimer (Serra *et al.*, 1998). The complete NF-Y is able to bind promoters containing the core pentamer nucleotide sequence CCAAT (Figure 1d) with high specificity and affinity, resulting in either positive or negative transcriptional regulation (Figure 1e) (Peng and Jahroudi, 2002; 2003; Ceribelli *et al.*, 2008; Siefers *et al.*, 2009).

Because the NF-Y transcription factor contains H2B-like and H2A-like molecules (NF-YB and NF-YC, respectively), the complex presents all the core histone components and could mimic the interaction of the nucleosome core with genomic DNA (Struhl and Moqtaderi, 1998). In this scenario, it has been demonstrated that the NF-YA/NF-YB/NF-YC trimer or the NF-YB/NF-YC dimer can bind to the H3/H4 tetramer during nucleosome assembly (Caretti *et al.*, 1999). In addition, the NF-Y complex can also bind to chromatin even after nucleosome formation, indicating the ability of NF-Y to interact with genomic DNA assembled in the nucleosome. The interaction between the NF-Y transcription factor and the DNA molecule causes local disruption of the nucleosomal architecture (Coustry *et al.*, 2001). This disruption results in a partial dissociation of DNA from the histone core, which might enable the access of the general transcription machinery to initiate the transcription process.

Fig. 1. Assembly of NF-Y subunits and its binding to DNA. Initially, the NF-YB and NF-YC subunits form a tight heterodimer via protein–protein interactions **(a)**. The dimer then moves to the nucleus, where the third subunit (NF-YA) is recruited **(b)** to generate the complete, heterotrimeric NF-Y **(c)** that is able to bind promoters containing the core pentamer nucleotide sequence CCAAT **(d)**, resulting in either positive or negative transcriptional regulation **(e)**. Adapted from Mantovani (1999). White circles and oblong black circles within each NF-Y subunit represent the DNA-binding domain and the NF-Y interaction domain of NF-Y subunits, respectively.

**4. Gene duplication and evolution**

DNA duplication acts as one of the main forces driving the evolution of organisms by creating the raw genetic material that natural selection can subsequently modify. Gene duplications arise in eukaryotes at a rate of 0.01 paralogs per gene per million years (Lynch and Conery, 2000), the same order of magnitude as the mutation rate per nucleotide per year (De Grassi *et al.*, 2008). Duplication of individual genes, chromosomal segments, or entire genomes represents the primary source for the origin of evolutionary novelties, including new gene functions and expression patterns (Holland *et al.*, 1994; Sidow, 1996; Lynch and Conery, 2000). However, how duplicated genes successfully evolve from an initial state of complete redundancy, wherein one copy is likely to be expendable, to a stable situation in which both copies are maintained by natural selection, is unclear (Sidow, 1996; Lynch and Conery, 2000; Ober, 2010).

In the evolutionary history of plants, genome duplications have been relatively common, leading to the hypothesis that most angiosperms are to some extent polyploid (Soltis, 2005). The genome of Arabidopsis, for example, possesses traces of at least three polyploidy events (Vision *et al.*, 2000; Simillion *et al.*, 2002), followed by subsequent gene loss (Bowers *et al.*, 2003; Ober, 2010).

Similar to a point mutation, a duplication that occurs in an individual can be fixed or lost in the population. Compared with pre-existing alleles, if a new allele of the duplicate gene is selectively neutral, it has a small probability (1/2N) of being fixed in a diploid population (where N is the effective population size). This suggests that the majority of duplicated genes will be lost. For those duplicated genes that do become fixed, the fixation time averages 4N generations (Kimura, 1989; Zhang, 2003).
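The fixation argument in Section 4 (a selectively neutral duplicate fixes with probability 1/2N in a diploid population and, when it does, takes on average 4N generations) can be made concrete with a short calculation. The Python sketch below simply evaluates those two expressions. The function names are illustrative, and the effective population size used in the example is a hypothetical value, not an estimate for any species.

```python
def neutral_fixation_probability(effective_population_size: int) -> float:
    """Probability that a new neutral allele is fixed in a diploid population: 1/(2N)."""
    if effective_population_size <= 0:
        raise ValueError("effective population size must be positive")
    return 1.0 / (2 * effective_population_size)


def mean_fixation_time_generations(effective_population_size: int) -> int:
    """Average number of generations to fixation for alleles that do fix: 4N."""
    if effective_population_size <= 0:
        raise ValueError("effective population size must be positive")
    return 4 * effective_population_size


# Hypothetical effective population size, chosen only for illustration.
N = 10_000
probability = neutral_fixation_probability(N)
generations = mean_fixation_time_generations(N)
print(f"P(fixation) = {probability:.1e}; mean time to fixation = {generations} generations")
```

With N = 10,000 the fixation probability is 1 in 20,000, which illustrates why most neutral duplicates are expected to be lost, and the few that fix do so over roughly 40,000 generations.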
