Fig. 1. **Schematic representation of polymorphisms at the 4q and 10q subtelomeres**.

Schematic representation of the method used to calculate D4Z4 repeat numbers from *EcoR*I fragment sizes. Seven and eight D4Z4 repeats (31-36 kb *EcoR*I fragment size) were defined to be the upper diagnostic range for FSHD. D4Z4 repeat units on chromosomes 4 and 10 can be distinguished because all repeats on 10q contain *BlnI* restriction sites (**B**), while all D4Z4 repeats on 4q contain *XapI* restriction sites (**X**).

2004). The presence of somatic mosaicism for a rearrangement of D4Z4 was found in as much as 3% of the general population (van der Maarel et al., 2000), suggesting that the D4Z4 repeat is highly recombinogenic.

A complication of molecular testing by Southern analysis is represented by the presence of a polymorphic region recognized by probe p13E-11 at the subtelomeric region of chromosome 10q, which shares numerous homologies with the 4q subtelomere (Bakker et al., 1995; Deidda et al., 1996). The repeat element at 10q is 98% identical to D4Z4 at 4q, and the size of 10q EcoRI alleles varies between 11 and 300 kb (1-100 D4Z4 units). Moreover, 10% of these alleles are shorter than 35 kb (8 D4Z4 units) (Bakker et al., 1996; Bakker et al., 1995), overlapping the 4q alleles. Clearly, these overlapping features can interfere with the molecular diagnosis of FSHD. Nevertheless, the presence of a BlnI restriction site within the 3.3 kb element associated with chromosome 10q allows the discrimination between 4q and 10q alleles (Deidda et al., 1996). As a result, Southern blot hybridization of EcoRI and EcoRI/BlnI digested genomic DNA is used for the molecular diagnosis of FSHD (Lunt, 1998) (Figure 1).
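The discrimination rule stated in the Fig. 1 caption (10q repeats contain BlnI sites, 4q repeats contain XapI sites) can be sketched as a small lookup. This is an illustrative sketch only: the function name `repeat_type` and its boolean inputs are hypothetical, and the result describes the repeat type suggested by enzyme sensitivity, which need not coincide with the chromosome on which the array actually resides.

```python
def repeat_type(cut_by_blni: bool, cut_by_xapi: bool) -> str:
    """Classify a D4Z4 repeat from its restriction-enzyme sensitivity.

    Rule taken from the Fig. 1 caption: repeats on 10q contain BlnI
    sites, repeats on 4q contain XapI sites. The boolean inputs are
    hypothetical stand-ins for the observed digestion result.
    """
    if cut_by_blni and not cut_by_xapi:
        return "10q-type"
    if cut_by_xapi and not cut_by_blni:
        return "4q-type"
    # Both or neither enzyme cutting is not covered by the stated rule.
    return "indeterminate"


print(repeat_type(cut_by_blni=True, cut_by_xapi=False))
print(repeat_type(cut_by_blni=False, cut_by_xapi=True))
```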

Over the years, additional findings have emerged that need to be considered in the molecular diagnosis of FSHD. First, translocated 4-type repeats residing on chromosome 10q, as well as translocated 10-type repeats residing on chromosome 4q, are found in 10% of the population (van Deutekom et al., 1996b; van Overveld et al., 2000; Matsumura et al., 2002). Therefore, FSHD-sized D4Z4 alleles may be attributed incorrectly to chromosome 10 and vice versa. Second, deletions at 4q encompassing the genomic sequence recognized by probe p13E-11 have been detected in FSHD cases. Thus, short D4Z4 arrays might not be detected by the standard diagnostic procedure. The frequency of such extended deletions has been estimated at around 3% (Lemmers et al., 2003) and represents a possible caveat of FSHD molecular diagnosis. Third, 5-10% of subjects showing FSHD clinical features do not carry D4Z4 reduced alleles. Possible explanations for such anomalous cases include a different mechanism at 4q35, such as D4Z4 hypomethylation (De Greef et al., 2009), or the presence of other mutations not linked to the FSHD locus at 4q35. At present, no FSHD families linked to other chromosomal loci have been described. Figure 2 summarizes the diagnostic flow chart that should be used to study the 4q35 region in FSHD patients.

Fig. 2. **Schematic representation of the FSHD diagnostic approach at the 4q35 locus**. Abbreviations: **Ch.DNA**: chromosomal DNA embedded in plugs; **Gen.DNA**: genomic DNA in solution; **PFGE**: Pulsed-Field Gel Electrophoresis; **LGE**: Linear Gel Electrophoresis; **E**: restriction enzyme EcoRI; **B**: restriction enzyme BlnI; **X**: restriction enzyme XapI; **N**: restriction enzyme NotI; **H**: restriction enzyme HindIII.

Facioscapulohumeral Muscular Dystrophy: From Clinical Data to Molecular Genetics and Return 27


#### **2.3 FSHD clinical ascertainment and molecular diagnosis: Necessity of standardized clinical examination**

As noted above, genomic studies conducted on groups of FSHD patients and families revealed the numerous difficulties that can be encountered in the molecular characterization of the 4q35 locus. Over the years, the complexity of molecular diagnosis has been paralleled by the emerging complexity of FSHD clinical ascertainment. At present, the criteria established in 1991, before the advent of molecular diagnosis (Padberg et al., 1991), and in 1998, following the ENMC workshop on FSHD (Lunt, 1998), probably need to be reconsidered in light of the most recent observations.

#### **2.3.1 Penetrance**

Non-penetrance in FSHD was estimated to be less than 2% after the age of 50 years and more likely with allele sizes larger than 30 kb (Tawil et al., 1996). However, asymptomatic gene carriers seem to be more prominent in some families, and non-penetrance has even been found in carriers of 25 kb D4Z4 alleles (Ricci et al., 1999). In their work, Ricci et al. detected D4Z4 reduced alleles in several unaffected family members, termed non-penetrant carriers, who are capable of transmitting the disease to their offspring. In addition, reduced penetrance for D4Z4 reduced alleles was described in families in which patients heterozygous for FSHD alleles on both 4q chromosomes were present (Wohlgemuth et al., 2003; Tonini et al., 2004). Gender differences have also been described in FSHD, with males apparently more affected than females (Tonini et al., 2004). At present, the correlation between penetrance of FSHD, length of the repeat array, age and sex remains unsettled. Thus, the risk of developing the disease cannot be estimated from D4Z4 allele size, and no prognostic tools are available. In addition, several clinical reports describe patients displaying clinical and genetic features of FSHD associated with other documented muscle disorders, including mitochondrial diseases (Chuenkongkaew et al., 2005; Filosto et al., 2008), glycogenosis (Nadaj-Pakleza et al., 2009), and dystrophinopathies (Rudnik-Schoneborn et al., 2008). In all these cases the presence of the FSHD molecular defect seems to aggravate the clinical phenotype. Finally, phenotypic features of FSHD can be found in other myopathies (Oya et al., 2001; Saenz et al., 2005), and atypical phenotypes can be displayed by subjects carrying the FSHD molecular defect (Figueroa and Chapin, 2010; Tsuji et al., 2009; Zouvelou et al., 2009). Altogether, these observations suggest that the variable penetrance observed in the FSHD population may be the result of the interaction of several factors.
Indeed, the presence of low-penetrance alleles suggests that susceptibility to FSHD is determined not only by the intrinsic properties of the disease allele but also by additional genetic, epigenetic and/or environmental factors. Identification of the factors influencing FSHD clinical outcome remains one of the major challenges of FSHD research.

#### **2.3.2 Severity of the disease and repeats number: Does a linear correlation exist?**

An inverse relationship has been established between the D4Z4 repeat size and the severity and progression of the disease (Lunt et al., 1995; Ricci et al., 1999; Tawil et al., 1996; Zatz et al., 1998). In general, individuals with ≥11 repeats are healthy; in contrast, 1-3 D4Z4 repeats are associated with a severe form of disease that presents in childhood, 4-7 repeats with the most common form of FSHD, and 8-10 repeats with a milder disease and reduced penetrance. Nevertheless, great variability of clinical expression has been described among FSHD patients, even within the same family. Interestingly, it has been suggested that patients harboring D4Z4 alleles of ≥35 kb (≥8 repeats) were less likely to present the classic FSHD phenotype than patients with alleles of <35 kb (<8 repeats) (Felice and Whitaker, 2005). Several clinical reports described myopathic patients carrying alleles of 38 kb (9 repeats) or larger and showing typical and atypical FSHD phenotypes (Vitelli et al., 1999; Felice et al., 2000; Felice and Moore, 2001; Butz et al., 2003; Krasnianski et al., 2003). However, D4Z4 repeat arrays of 38-45 kb (9-11 repeats) were encountered in 3% of 200 control subjects in a Dutch study (van Overveld et al., 2000). These findings seem to indicate that in a substantial proportion of 38 to 45 kb repeat arrays penetrance may be close to zero, although in some families 38-45 kb alleles are associated with myopathy (Butz et al., 2003). Remarkably, D4Z4 repeat arrays of 21-34 kb (4-8 repeats) were found in 3% of 801 Italian and Brazilian samples from normal individuals unrelated to any FSHD patients, indicating that in this size range additional factors influence disease expression (Scionti et al., 2012b). In conclusion, the high variability in clinical expression makes it difficult to establish a prognostic correlation between the number of D4Z4 repeats and the severity of the disease.
Clinical and molecular studies on large cohorts of FSHD patients and families are needed to obtain significant information on FSHD development and to generate useful prognostic information.
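The general repeat-number bands quoted above can be expressed as a simple lookup, which makes the boundaries explicit. This is a minimal sketch: the function name `repeat_band` and the label strings are hypothetical, and, as the text stresses, these bands describe broad trends and do not predict the outcome for an individual.

```python
def repeat_band(n_repeats: int) -> str:
    """Return the general clinical band quoted in the text for a D4Z4 repeat count.

    Bands: 1-3 severe childhood form, 4-7 most common form,
    8-10 milder disease with reduced penetrance, >=11 generally healthy.
    Illustrative only; not a prognostic tool.
    """
    if n_repeats < 1:
        raise ValueError("repeat count must be at least 1")
    if n_repeats <= 3:
        return "severe form presenting in childhood"
    if n_repeats <= 7:
        return "most common form of FSHD"
    if n_repeats <= 10:
        return "milder disease with reduced penetrance"
    return "generally healthy"


for n in (2, 5, 9, 11):
    print(n, repeat_band(n))
```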

#### **2.3.3 A standardized clinical evaluation tool: FSHD score**

26 Neuromuscular Disorders


To obtain statistically significant observations from large cohort studies, a standardized clinical evaluation tool for patients with FSHD was created. The clinical protocol examines muscle groups specifically affected in FSHD. The test uses functional criteria, which allow clinical severity to be expressed in quantitative terms (Lamperti et al., 2010). The clinical examination results in an evaluation scale, which is divided into six independent sections that assess the strength and the functionality of (I) facial muscles (scored from 0 to 2); (II) scapular girdle muscles (scored from 0 to 3); (III) upper limb muscles (scored from 0 to 2); (IV) distal leg muscles (scored from 0 to 2); (V) pelvic girdle muscles (scored from 0 to 5); and (VI) abdominal muscles (scored from 0 to 1). The evaluation scale allows the functional quantification of muscle weakness in FSHD patients. This examination protocol, which is associated with a questionnaire that collects information on the clinical history of the subject, generates a disability score that is the sum of the six independent scores of separately evaluated muscle regions, including the facial and abdominal muscles, which are specifically affected by FSHD. The total score can range from 0, when no signs of muscle weakness are present, to 15, when all muscle groups tested are severely impaired. The protocol represents a robust evaluation procedure for FSHD patients that can be performed easily in the medical office and is not influenced by the tester.
The robustness of the clinical evaluation protocol provides a tool that can therefore be used by different neurologists in large cooperative clinical studies and that translates the so-called clinical impression of the progressive involvement of specific muscle groups into a number (Lamperti et al., 2010) (The FSHD clinical form and the FSHD evaluation scale form, as well as a visual guide to clinical assessment, are available online at www.fshd.it).
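The arithmetic of the score can be illustrated with a short sketch that sums the six section scores after checking each against the range given above. The section keys and the function name `fshd_score` are hypothetical choices made for this example; the per-section maxima are those listed in the text, and the sketch does not reproduce the clinical criteria used to assign each section score.

```python
# Maximum score per section, as listed in the text (Lamperti et al., 2010).
# The dictionary keys are hypothetical labels chosen for this example.
SECTION_MAX = {
    "facial": 2,
    "scapular_girdle": 3,
    "upper_limb": 2,
    "distal_leg": 2,
    "pelvic_girdle": 5,
    "abdominal": 1,
}


def fshd_score(section_scores: dict) -> int:
    """Sum the six section scores after validating names and ranges."""
    if set(section_scores) != set(SECTION_MAX):
        raise ValueError("exactly the six sections must be scored")
    for name, value in section_scores.items():
        if not isinstance(value, int) or not 0 <= value <= SECTION_MAX[name]:
            raise ValueError(f"{name} must be an integer from 0 to {SECTION_MAX[name]}")
    return sum(section_scores.values())


example = {
    "facial": 1,
    "scapular_girdle": 2,
    "upper_limb": 0,
    "distal_leg": 1,
    "pelvic_girdle": 3,
    "abdominal": 1,
}
print(fshd_score(example))
```

The section maxima add up to 15, which matches the upper bound of the total score stated in the text.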


Use of the FSHD score can support studies for defining the natural history of the disease over time. Importantly, expressing the clinical involvement of specific muscle groups as a number permits identification and characterization of atypical cases and supports the definition of clinical subcategories among FSHD patients.

By assessing the correlation between clinical severity, results of molecular analysis, and anamnestic records, the FSHD score can provide useful information for defining FSHD nosology.

### **3. Genomic characteristic of the 4q35 region: D4Z4 and role of specific polymorphisms**

Since the discovery of the FSHD molecular defect (Wijmenga et al., 1992b), many studies have suggested that reduction of D4Z4 repeat units on chromosome 4 alone is not sufficient for FSHD development (Weiffenbach et al., 1993; van Overveld et al., 2000; Lemmers et al., 2002; Lemmers et al., 2007). Thus, a detailed genomic characterization of the 4q35 region led to the identification of polymorphic regions flanking the D4Z4 repeat array which could contribute to FSHD onset.

### **3.1 Genetic variability and haplotypes**

D4Z4 is part of a family of 3.3-kb repeats that are dispersed throughout the human genome and are generally associated with regions of heterochromatin (Lyle et al., 1995). A homologous D4Z4 tandem array is present at the 10q telomere. This homology between the subtelomeric regions of 10q and 4q is not confined to the D4Z4 repeats but extends 42 kb proximally and also distally to include the telomere (van Geel et al., 2002). However, despite the high level of sequence similarity (>98% nucleotide identity) between the 10q and 4q subtelomeres, FSHD is associated only with chromosome 4q (Bakker et al., 1995; Deidda et al., 1996). In an attempt to explain the unique association between D4Z4 reduction on chromosome 4q and FSHD, a bi-allelic polymorphism was identified distal to the repeat array (van Geel et al., 2002). Two distinct polymorphic regions, named 4qA and 4qB, were observed at the distal end of chromosome 4q. Within 4qA there is a polymorphic 8-kb region of 68-bp satellite DNA immediately distal to D4Z4, and adjacent to this is a 1-kb divergent (TTAGGG)*n* array. None of these repeats is present within the 4qB sequence. In the 4qB polymorphism, the last 3.3-kb repeat contains only the first 570 bp of a complete unit, whereas in 4qA the terminal repeat is a divergent 3.3-kb repeat named pLAM (van Deutekom et al., 1993). Sequence alignment and subsequent phylogenetic analysis of the subtelomeric region of 10q showed a close relationship between 4qA sequences and 10q, suggesting that the 4q subtelomere has been transferred onto chromosome 10q (van Geel et al., 2002). The distribution of these allelic variants is heterogeneous and depends on the population studied.
Lemmers and colleagues (2004), analyzing 80 Dutch control individuals, observed almost equal frequencies of 4qA and 4qB alleles (42% and 58%, respectively), whereas another study, conducted on 66 Italian control individuals, reported an overrepresentation of 4qA telomeres (68% for 4qA and 32% for 4qB) (Rossi et al., 2007). Proximal to the D4Z4 repeat array a simple sequence-length polymorphism (SSLP) has been described, whose sequences range between 157 bp and 180 bp (Figure 3).


Fig. 3. The D4Z4 repeat array within the subtelomere of chromosomes 4q and 10q varies in size between 1 and 100 D4Z4 units (3.3–330 kb) and is indicated with triangles. Elements that distinguish subjects include: 1. The chromosomal localization of the D4Z4 repeat, chromosome 4q35 or 10q26. 2. The Simple Sequence Length Polymorphism (SSLP). It is a combination of five Variable Number Tandem Repeats, an 8-bp insertion/deletion, and two SNPs; it is localized 3.5 kb proximal to D4Z4 and varies in length between 157 and 182 bp. 3. Single nucleotide polymorphism AT(T/C)AAA (SNP) in the pLAM region. 4. A large sequence variation (termed 4qA or B) that is distal to D4Z4. In the 4qB variant the terminal 3.3-kb repeat contains only 570 bp of a complete repeat, whereas in the 4qA variant the terminal repeat is a divergent 3.3-kb repeat named pLAM. 4q chromosomes which do not hybridize to probes for A and B are termed "null" and their sequences vary from case to case.

A worldwide population analysis (including African, European and Asian HAPMAP panels) of the 4q subtelomeric polymorphisms flanking the D4Z4 array revealed 17 distinct haplotypes on chromosome 4q (Lemmers et al., 2010b). On the basis of sequence similarities, all haplotypes were categorized into two groups: major group 1 consists mainly of the haplotypes 4A159, 4A161 and 4B163, which are the most common in all three HAPMAP populations. Major group 2 contains other standard and nonstandard 4q haplotypes (4A166 and 4A168). Evolutionary studies showed that haplotypes 4A159 and 4A161 represent the oldest human D4Z4 haplotypes. Similarly, the 4A168 haplotype is most probably the oldest haplotype belonging to major group 2. It has been hypothesized that all other haplotypes originate from only four discrete sequence-transfer events during human evolution (Lemmers et al., 2010b).
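Haplotype labels such as 4A161 or 4B163 appear to combine three of the elements listed in Fig. 3: the chromosome (4 or 10), the distal variant (A or B), and the SSLP length in bp. The sketch below splits a label into those parts. This reading of the labels is an assumption inferred from the names used in the text; the `Haplotype` type and the `parse_haplotype` function are hypothetical, and labels with suffixes or nonstandard forms are deliberately rejected rather than guessed at.

```python
import re
from typing import NamedTuple


class Haplotype(NamedTuple):
    chromosome: int       # 4 or 10
    distal_variant: str   # "A" or "B"
    sslp_bp: int          # SSLP length in base pairs


# Assumed label layout: chromosome, distal variant letter, three-digit SSLP size.
_LABEL = re.compile(r"^(4|10)([AB])(\d{3})$")


def parse_haplotype(label: str) -> Haplotype:
    """Split a label such as '4A161' into chromosome, variant and SSLP size."""
    match = _LABEL.match(label)
    if match is None:
        raise ValueError(f"unrecognized haplotype label: {label!r}")
    chromosome, variant, sslp = match.groups()
    return Haplotype(int(chromosome), variant, int(sslp))


for label in ("4A161", "4B163", "4A168"):
    print(label, parse_haplotype(label))
```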

#### **3.2 Permissive and non-permissive genetic background**

Analysis of 4qter polymorphisms in 80 unrelated Dutch patients with FSHD revealed that D4Z4-reduced alleles are associated with the 4qA variant (Lemmers et al., 2002). Subsequently, a study of three families in which two D4Z4-reduced alleles segregate led to the proposal that 4qB chromosomes carrying short D4Z4 repeats do not cause FSHD, since only subjects carrying D4Z4-reduced alleles associated with the 4qA polymorphism had FSHD (Lemmers et al., 2004). The almost exclusive association between FSHD disease expression and D4Z4-reduced alleles of the 4qA type was confirmed in a large cohort of 164 unrelated patients with FSHD from Turkey and the UK, even though that study also described FSHD patients lacking the 4qA/4qB end (Thomas et al., 2007). Subsequent studies led to the hypothesis that D4Z4 contraction on a 4qA chromosome *per se* is not sufficient to cause disease. Analysis of the SSLP proximal to the repeat array in 86 FSHD patients and 222 healthy controls revealed a unique association of FSHD with the 161 allele and the 4qA sequence. In particular, the haplotype 4A166 associated with D4Z4-reduced alleles was detected in multiple unaffected relatives of two independent families, and the 4B163 haplotype was associated with 17 FSHD-sized alleles carried by healthy subjects (Lemmers et al., 2007). On this basis it has been hypothesized that FSHD can develop only in a specific "permissive" chromosomal background represented by the haplotype 4A161. Following this hypothesis, proximal and distal sequences of the 4A161 chromosome were compared to those of "non-permissive" ones, such as 4B163 and 10A166. This approach led to the identification of a single nucleotide polymorphism (SNP, AT(T/C)AAA) in the adjacent pLAM sequence, immediately distal to the D4Z4 array. In particular, 4A161 and two other uncommon permissive variants, 4A159 and 4A168, presented the ATTAAA variant, which has been interpreted as a polyadenylation signal able to stabilize the *DUX4* transcript (Figure 4a).

Fig. 4. **Schematic representation of the current view of permissive and non-permissive haplotypes. a**. Permissive haplotypes. **b.** Non-permissive haplotypes. The ATTAAA variant creates a polyadenylation signal (PAS) that stabilizes the *DUX4* transcript and has been postulated to be the critical factor causing FSHD.

By contrast, sequences associated with the non-permissive chromosomes 10A166 and 4B did not allow the expression of *DUX4* (Lemmers et al., 2010a). Analysis of more than 300 unrelated FSHD patients and 5 families with one or more FSHD patients carrying a D4Z4-reduced allele strongly supported the hypothesis that the last 4qA D4Z4 unit, together with the directly adjacent pLAM sequence including the ATTAAA variant, is necessary for FSHD development (Lemmers et al., 2010a). On this basis it has been proposed that FSHD arises through a toxic gain of function attributable to the stabilized distal *DUX4* transcript (Lemmers et al., 2010a) (Figure 4b). Despite the intriguing premise, the notion that FSHD is a fully penetrant autosomal dominant disorder caused by the reduction of D4Z4 repeat number associated with the 4A161PAS haplotype is challenged by recently published data. First, a study conducted on 750 unrelated FSHD families from Italy revealed that the frequency of individuals carrying two D4Z4-reduced alleles (compound heterozygotes) is 2.7%, much higher than expected for a fully penetrant autosomal dominant disorder with a prevalence of 1 in 20,000. Interestingly, in these families with compound heterozygosity, 25% of relatives carrying D4Z4-reduced alleles and 4A161PAS are healthy (Scionti et al., 2012a). Second, characterization of 253 unrelated FSHD probands from the Italian National Registry for FSHD showed that only 127 of them (50.1%) carry alleles with 1-8 D4Z4 repeats associated with 4A161PAS, whereas the remaining FSHD probands carry different haplotypes or alleles with a greater number of D4Z4 repeats (Scionti et al., 2012b). Third, molecular analysis of 801 healthy subjects from Italy and Brazil showed that 3% of individuals from the general population carry alleles with a reduced number (4-8) of D4Z4 repeats on chromosome 4q, and that one third of these alleles occur in combination with the 4A161PAS haplotype (Scionti et al., 2012b). All these findings challenge the hypothesis that the 4APAS structure is necessary and sufficient for the development of FSHD. This is not incompatible with evidence implicating DUX4 or other factors as important mediators of disease. Nonetheless, it does demonstrate that FSHD pathogenesis is more complex than currently thought.

**4. One disease (too) many theories**

After the genetic correlation between D4Z4 and FSHD was established, the most difficult task has been to explain the role of D4Z4 in disease development. D4Z4 can directly cause FSHD through *DUX4* expression; on the other hand, D4Z4 reduction might indirectly cause FSHD by exerting long-distance effects. None of the proposed models entirely explains the mechanism leading to disease, and the scientific community has not reached a consensus on this point.

Fig. 5. **Models for the molecular basis of FSHD. A**. Healthy individuals carry 11–150 units of D4Z4, whereas FSHD patients have fewer than 11 repeats. **B. DIRECT MECHANISM**: reduction of the D4Z4 repeat array leads to the synthesis of the DUX4 transcript, which is normally not produced, through changes in D4Z4 heterochromatin and/or stabilization of DUX4 mRNA. **C. INDIRECT MECHANISM:** the reduction of D4Z4 repeats leads to modifications of the spatial and structural organization of chromatin, generating changes in transcriptional control over the expression of candidate genes localized in *cis* or in *trans*.
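The population-frequency argument made in section 3.2 against a fully penetrant 4A161PAS model can be made concrete with some rough arithmetic. The sketch below is illustrative only and is not taken from the cited studies: it assumes Hardy-Weinberg proportions, a single class of pathogenic allele and full penetrance, and it treats the 2.7% figure reported for families as if it were a per-carrier fraction. The function names are hypothetical.

```python
# Back-of-the-envelope check of the population figures quoted in section 3.2.
# Assumptions (NOT from the source): Hardy-Weinberg proportions, one class of
# pathogenic allele, full penetrance, and percentages used as simple fractions.

PREVALENCE = 1 / 20_000  # FSHD prevalence quoted in the text


def allele_frequency(prevalence: float) -> float:
    """Solve 1 - (1 - q)**2 = prevalence for q (fully penetrant dominant trait)."""
    return 1 - (1 - prevalence) ** 0.5


def two_allele_fraction(q: float) -> float:
    """Fraction of carriers with two pathogenic alleles: q**2 / (1 - (1 - q)**2)."""
    return q / (2 - q)


q = allele_frequency(PREVALENCE)
expected_two_alleles = two_allele_fraction(q)
observed_two_alleles = 0.027  # compound heterozygotes (Scionti et al., 2012a)
excess = observed_two_alleles / expected_two_alleles

# 3% of healthy subjects carry 4-8 repeats; one third of those with 4A161PAS.
carrier_fraction = 0.03 / 3
# If every FSHD case arose from such carriers, penetrance could not exceed:
implied_penetrance = PREVALENCE / carrier_fraction

print(f"expected fraction with two reduced alleles: {expected_two_alleles:.2e}")
print(f"observed / expected: {excess:.0f}")
print(f"penetrance implied by carrier frequency: {implied_penetrance:.2%}")
```

Under these simplifying assumptions, the observed share of compound heterozygotes exceeds the expectation by roughly three orders of magnitude, and the carrier frequency in healthy subjects is compatible only with a penetrance well below 100%. Both results point in the same direction as the text: a reduced D4Z4 allele on a 4A161PAS background is unlikely to be sufficient for disease.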
