**6. 4q35 genes in FSHD pathogenesis: Role of gene overexpression**

Although the exact molecular mechanism responsible for FSHD is unknown, it is commonly agreed that a reduction in the number of D4Z4 elements might cause up-regulation of gene(s) in *cis*. A few genes have been considered good candidates for FSHD based on their localization and/or function.

This section critically discusses the function and potential role of 4q35 candidate genes in FSHD development.

#### **6.1 FSHD region gene 2 (FRG2)**

FSHD Region Gene 2 (*FRG2*) was originally identified by *in silico* gene prediction (Van Geel et al., 1999) as a 3 kb region located 37 kb proximal to the D4Z4 repeat array. The predicted exons are preceded by a putative muscle-specific promoter. The *FRG2* gene is composed of four exons and encodes an mRNA of 2084 bp with two alternative polyadenylation sites. The *FRG2* ORF encodes a putative protein of 278 amino acids that does not show significant homology to known proteins (Rijkers et al., 2004). Alternative splicing creates an additional alanine codon (Rijkers et al., 2004). Although the FRG2 protein has been shown to be nuclear (Rijkers et al., 2004), its sequence contains not only two potential nuclear localization signals (NLS) but also a peroxisomal targeting signal (PTS1) at the carboxy-terminal end of the protein (Swinkels et al., 1992). Copies of *FRG2* are dispersed throughout the genome, mostly in subtelomeric or pericentromeric regions (Winokur et al., 1994; van Geel et al., 1999; van Geel et al., 2002). However, only the *FRG2* copies on chromosomes 4 and 10 show 98% identity, differing by just five nucleotides in the ORF (Rijkers et al., 2004). Experiments demonstrated that the *FRG2* promoter is sensitive to the presence of D4Z4 repeat units, making *FRG2* an interesting candidate gene for FSHD pathophysiology (Rijkers et al., 2004). Indeed, it has been shown that *FRG2* is overexpressed when the activity of the D4Z4 recognition complex (DRC) is suppressed (Gabellini et al., 2002). Moreover, data suggest that in muscle biopsies from FSHD patients, *FRG2* overexpression inversely correlates with D4Z4 repeat number (Gabellini et al., 2002).

However, the overexpression of *FRG2* in FSHD is still controversial. While it is generally agreed that *FRG2* mRNA is virtually absent in most human tissues, there is no consensus regarding the expression of *FRG2* in samples from FSHD patients. *FRG2* overexpression was reported in differentiating, but not proliferating, myoblasts of FSHD patients (Rijkers et al., 2004). The overexpression of *FRG2* in FSHD myotubes has not been fully confirmed in other works (Arashiro et al., 2009; Cheli et al., 2011; Masny et al., 2010; Osborne et al., 2007). The different outcomes of expression studies may be explained by the intrinsic difficulty of detecting *FRG2* mRNA, owing to its low expression level and to the presence of multiple copies of *FRG2* in the genome. Moreover, *FRG2* is not represented in the gene arrays currently used for RNA expression studies. Whether *FRG2* is involved in FSHD pathogenesis therefore remains under discussion. Indeed, muscle-specific overexpression of *FRG2* in mice does not result in an aberrant phenotype (Figure 6A) (Gabellini et al., 2006), and FSHD patients with a proximal deletion encompassing *FRG2* have been found (Lemmers et al., 2003). Nevertheless, it is worth mentioning that *FRG2* appeared late in evolution, together with the D4Z4 repeats, and is not present in the mouse genome, so the mouse model of *FRG2* overexpression is not conclusive. The function of this protein is still unknown.

et al., 2003b). Because D4Z4 can be regarded as a docking platform for protein factors, loss of repeats may generate a local imbalance in the availability of D4Z4 proteins in the cell, and/or lead to new interaction with different proteins at the disease allele.

**5. Direct role of D4Z4: The double homeobox gene 4 (DUX4)**

The D4Z4 unit has been completely sequenced (Hewitt et al., 1994; Winokur et al., 1994; Lee et al., 1995). Each D4Z4 repeat unit contains an ORF with a double homeobox putatively encoding the DUX4 protein. *DUX4* belongs to a family of highly homologous genes scattered throughout the genome. One almost identical copy, named *DUX4c*, is located 42 kb proximal to the repeat array. Based on the presence of the molecular signature 4A-PAS, which should allow stabilization of the mRNA from the distal copy of the *DUX4* gene, it has been proposed that FSHD arises through a toxic gain-of-function mechanism attributable to the pathological expression of *DUX4* mRNA (Lemmers et al., 2010a). More detailed analysis of *DUX4* expression shows that the *DUX4* pre-mRNA can be alternatively spliced and that FSHD muscle apparently expresses a different splice form of *DUX4* mRNA compared to control muscle (Snider et al., 2010). It is important to note that *DUX4* is a rare transcript, whose abundance has been estimated at one copy per cell. To explain how such a low expression level of *DUX4* can cause FSHD, it has been proposed that in FSHD muscle the DUX4 protein may exert its toxic effect in a small subset of nuclei, which express a relatively abundant amount of *DUX4* transcript. The possible toxic effect of DUX4 has been inferred on the basis of *in vitro* and *in vivo* studies (Kowaljow et al., 2007; Bosnakovski et al., 2008; Wallace et al., 2011) in which *DUX4* was expressed at very high levels. Thus, in order to explain FSHD pathogenesis, it is difficult to reconcile those experimental observations with the very limited amount of *DUX4* detected in human muscle cells.

#### **6.2 FSHD region gene 1 (***FRG1***)**

In the human genome, the *FRG1* gene is located 125 kb centromeric to the D4Z4 array on chromosome 4. As for many other genes from the 4q subtelomeric region, several copies of *FRG1* are present in the human genome (van Deutekom et al., 1996c). The *FRG1* copy on chromosome 4 encodes a 258-amino acid protein. Although the FRG1 protein does not share significant overall homology with any known protein, it contains two nuclear localization signals in the N-terminal region (NLSs, aa 22-25 and 29-32), a bipartite NLS in the C-terminal region (aa 253-261), a single fascin-like domain (aa 58-176), indicative of an actin-binding protein, and one potential RNA-binding domain (aa 22-35) homologous to several RNA-binding proteins (RBPs) (Figure 6B). The FRG1 protein is highly conserved among invertebrates and vertebrates: human FRG1 shares 42% identity with the *C. elegans* protein, 81% identity with the *Xenopus* protein and 97% identity with the mouse protein (Figure 6C). This high level of conservation across species suggests that FRG1 might have a very important function that has been preserved during evolution.

Since its discovery, FSHD Region Gene 1 (*FRG1*) has been considered a candidate gene for FSHD (van Deutekom et al., 1996c). Analysis of its expression level in muscle tissues obtained from FSHD patients and healthy subjects showed that *FRG1* was abnormally upregulated in FSHD-affected muscles. Significantly, in lymphocytes from FSHD patients, its expression was equivalent to that observed in normal tissue, indicating that this overexpression in FSHD is muscle-specific (Gabellini et al., 2002). Consistent with this evidence, transgenic mice over-expressing *FRG1* develop a muscular dystrophy with features of the human disease (Figure 6A). Importantly, the myopathic features of these mice are corrected by the use of RNA interference to target and reduce FRG1 levels in the affected muscles. Interestingly, the same result was obtained by two groups using two different experimental approaches (Wallace et al., 2011; Bortolanza et al., 2011). Furthermore, in muscles of *FRG1* transgenic mice and FSHD patients, specific pre-mRNAs undergo aberrant alternative splicing. Collectively, these results suggest that FSHD might result from inappropriate over-expression of *FRG1* in skeletal muscle, which leads to abnormal alternative splicing of specific pre-mRNAs (Gabellini et al., 2006).

Fig. 6. **FSHD Region Gene 1 (FRG1). A.** WT and FRG1, FRG2 and ANT1 transgenic mice. Only FRG1 transgenic mice develop a muscular dystrophy with features of the human disease. **B.** FRG1 protein domains. **C.** Alignment of FRG1 homologs: human FRG1 shares 42% identity with *C. elegans*, 81% identity with *Xenopus* and 97% identity with mouse.

Recent studies show the crucial role of FRG1 in maintaining proper muscle structure and function (Hanel et al., 2011; Hanel et al., 2009; Liu et al., 2010). In *C. elegans*, the frg1 protein localizes both in nuclei and in the dense bodies, which are homologous to the vertebrate Z-disk. Interestingly, *frg1* overexpression in this invertebrate model disrupts the body-wall musculature and the muscular organization (Liu et al., 2010). In *Xenopus*, both knockdown and overexpression of *frg1* resulted in defective growth and morphogenesis of the myotome, indicating that precise levels of *frg1* must be maintained for normal muscle morphology (Hanel et al., 2009). Together, these results strongly suggest an evolutionarily conserved function of *FRG1* in muscular development. Additional evidence supports the role of *FRG1* in muscle cell biology. *FRG1* expression increases during myogenic differentiation. Its activation is paralleled by chromatin remodeling at the *FRG1* promoter, with loss of the polycomb repressor complex and replacement of the H3K27 trimethylation (H3K27me3) repression marker with the H3K4 trimethylation (H3K4me3) activation marker (Bodega et al., 2009). Interestingly, a physical interaction between the *FRG1* promoter and the D4Z4 array has been shown (Petrov et al., 2008). In this context, the replacement of H3K27me3 by H3K4me3 during myoblast differentiation might indicate that chromatin structure undergoes dynamic changes during myogenic differentiation that lead to the loosening of the FRG1/4q-D4Z4 array loop in myotubes. Consistently, *FRG1* over-expression was detected in the early stages of differentiation in FSHD myoblasts in comparison with control myoblasts (Bodega et al., 2009).

The molecular function of *FRG1* has not yet been elucidated. Several observations suggest that it could be involved in RNA processing. FRG1 is a nuclear protein that localizes in Cajal bodies, in nucleoli and in nuclear speckles, sites where RNA processing takes place (van Koningsbruggen et al., 2004). Its interaction with RNA has been demonstrated *in vitro* and *in vivo* (Sun et al., 2011). Proteomic studies found FRG1 as a component of purified spliceosomes (Rappsilber et al., 2002; Bessonov et al., 2010). Moreover, in the muscle of *FRG1* over-expressing transgenic mice, specific pre-mRNAs undergo aberrant alternative splicing (Gabellini et al., 2006). Studies have also shown that FRG1 has both nuclear and cytoplasmic localizations. Interestingly, in human muscle sections FRG1 localizes to the Z-disc (Hanel et al., 2011), an element of the muscle sarcomere. In muscle cells, ribosomes are available at the sarcomere for local synthesis of contractile proteins, providing a mechanism to respond quickly to changes in the extracellular environment. It would be interesting to test whether FRG1 is involved in Z-line targeting and/or translation of specific mRNAs. Although interest in *FRG1* as a candidate for FSHD pathogenesis has diminished because expression studies failed to detect consistent *FRG1* overexpression in FSHD biopsies (Gabellini et al., 2002; Jiang et al., 2003; Winokur et al., 2003b; Dixit et al., 2007; Osborne et al., 2007; Arashiro et al., 2009), experimental evidence points to a critical role of *FRG1* in muscle development and indicates the presence of negative regulatory mechanisms on its expression, which are released in a myogenic-specific manner. On this basis, *FRG1* remains a very suitable candidate gene for FSHD pathophysiology.

#### **6.3 Adenine Nucleotide Translocator (ANT1)**


The Adenine Nucleotide Translocator gene (*ANT1*) is located approximately 4 Mb centromeric to the tandem array and encodes a 298-amino acid protein. This protein belongs to a family of integral membrane transport molecules that are among the most abundant constituents of the inner mitochondrial membrane (IMM). It is responsible for the transport of adenine nucleotides across the IMM, importing ADP for oxidative phosphorylation and exporting ATP to the cytosol (Klingenberg and Aquila, 1982). There are three different ANT genes in humans, *ANT1*, *ANT2*, and *ANT3*. These genes share 88% amino acid sequence identity and are characterized by distinct tissue-specific expression patterns (Levy et al., 2000; Stepien et al., 1992). *ANT1* is the predominant isoform expressed in the mitochondria of heart and skeletal muscle tissue. In addition to regulating adenine nucleotide pools, ANT1 functions as a component of the mitochondrial permeability transition pore (PTP), which is essential for the release of pro-apoptotic proteins during the activation of the intrinsic apoptosis pathway (Bauer et al., 1999; Sharer et al., 2002). *ANT1* overexpression seems critical in inducing programmed cell death in different eukaryotic cell lines (Bauer et al., 1999). Although mice overexpressing *ANT1* do not show an evident dystrophic phenotype (Figure 6A) (Gabellini et al., 2006), an increased amount of ANT1 protein was detected in both unaffected and affected FSHD muscles in comparison to healthy controls (Laoudj-Chenivesse et al., 2005). Even though both increased oxidative stress and *ANT1* overexpression are proposed to be early events in the development of FSHD, it remains unclear whether these are sequential or parallel processes (Winokur et al., 2003a).

is the nucleosome, which consists of 146 bp of DNA wrapped around an octamer of core histone proteins (Kornberg et al., 1974; Finch et al., 1977). Histone proteins may be post-translationally modified by acetylation, methylation, phosphorylation, ubiquitination, SUMOylation and ADP-ribosylation (Bernstein et al., 2007). Modified histones are likely to control the structure and/or function of the chromatin fiber, with different modifications yielding distinct functional consequences. Furthermore, recruitment of chromatin-associating proteins may depend upon the recognition of a specific histone modification pattern (Strahl and Allis, 2000; Peterson and Laniel, 2004). Extracellular and intracellular stimuli may change these patterns of modification, making the chromatin itself an integrator of various signaling pathways, ultimately affecting basic cellular processes such as transcription or replication (Cheung et al., 2000; Nightingale et al., 2006). *In vivo*, chromatin exists as fibers with differing degrees of compaction. The morphologically distinct classes of chromatin within the nucleus of higher eukaryotes are heterochromatin, which is more compacted and generally transcriptionally inactive, and euchromatin, which is less compacted and generally transcriptionally active (Frenster et al., 1963). Although the D4Z4 unit harbors two classes of repetitive DNA, hhspm3 and LSau, both of which are found predominantly in heterochromatic domains of the genome, the FSHD locus at 4qter does not share some of the common properties of heterochromatin. For instance, it does not colocalize with DAPI-intense loci, nor does it replicate in late S-phase. A recent study on D4Z4 histone modification seems to indicate that the repeat array may be organized in distinct domains, some characterized by transcriptionally repressive heterochromatin and others by transcriptionally permissive euchromatin (Zeng et al., 2009).

These results indicate that the D4Z4 locus might display a chromatin structure more similar to euchromatin and favor the hypothesis that this region might be more dynamic than expected. Interestingly, loss of marks of unexpressed heterochromatin such as histone H3K9me3 was observed in FSHD both with and without D4Z4 contraction. This phenomenon seems to be strictly associated with the FSHD phenotype; in fact, it was not found in ICF syndrome, despite its apparent similarity to FSHD with regard to D4Z4 DNA hypomethylation, or in the other types of muscular dystrophy tested (Zeng et al., 2009). H3K9 methylation at D4Z4 is specifically mediated by the histone methyltransferase SUV39H1 (Zeng et al., 2009), which interacts with MyoD to suppress MyoD-dependent muscle gene expression (Mal, 2006). Interestingly, the heterochromatin binding protein HP1, which mediates transcriptional silencing (Bannister et al., 2001; Bernard et al., 2001), and the sister chromatid cohesion complex, cohesin, bind to D4Z4 in an H3K9me3-dependent manner, and their recruitment is seriously compromised in FSHD (Zeng et al., 2009). These data support the indirect mechanism (Figure 5C), in which loss of repeats generates structural and functional modifications, possibly through epigenetic changes in the histone pattern, which in turn might affect transcriptional regulation in *cis* and/or in *trans*. It is reasonable to anticipate that future studies on the possible chromatin organization involving D4Z4 and its changes in FSHD may provide critical insight into the mechanism of FSHD pathogenesis.

#### **7.3 Long distance effect: A repressor complex binding D4Z4**

The alteration of 4q35 gene expression observed in FSHD-affected muscle (Gabellini et al., 2002) raised the question of whether D4Z4 was directly involved in the transcriptional control of 4q35 genes. The analysis of the interaction between D4Z4 and nuclear proteins revealed the presence of a 27 bp binding site (DBE, D4Z4 Binding Element) able to recruit a multi-protein
