200700 (Faiyaz-Ul-

Homozygous nonsense or frameshift mutations in the pro- or mature part of *GDF5* result in a complete knockout of *GDF5*. However, heterozygous nonsense and frameshift mutations in *GDF5* also severely lower the level of intact protein: assuming equal transcriptional and translational efficiency from both alleles, statistically only 25% of the protein produced will be intact due to its dimeric nature. Hence the complete knockout or partial knockdown of *GDF5* achieved by this type of mutation leads to rather severe skeletal malformation phenotypes such as brachydactyly type C (BDC), symphalangism (SYM1) or multiple synostosis syndrome (SYNS1). One potentially underappreciated possibility is the formation of nonfunctional heterodimeric ligands if a cell produces more than one TGF-β factor at a time, and thus a possible influence of nonsense GDF-5 mutations on other BMP signals. In Drosophila the BMP-2 and BMP-7 orthologs Dpp and Screw can form heterodimers with unique functions required for proper development of certain tissues (Shimmi *et al.*, 2005, O'Connor *et al.*, 2006). In vertebrates, however, the existence of such BMP heterodimers has only been postulated, or recombinant proteins have been used in the analysis; their existence has not been proven *in vivo* (Schmid *et al.*, 2000, Butler & Dodd, 2003). A potential "cross"-influence of nonfunctional GDF-5 mutations on other BMPs can therefore only be hypothesized.
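The 25% figure quoted above follows from simple combinatorics. As a worked sketch, assume (an idealization, not a measured result) that both alleles contribute equal amounts of monomer, that monomers pair at random, and that a dimer is intact only if both subunits derive from the wildtype allele. The symbol $p_{\mathrm{wt}}$, the wildtype share of the monomer pool, is introduced here purely for illustration:

```latex
\begin{align*}
P(\text{wt/wt})   &= p_{\mathrm{wt}}^{2} = \left(\tfrac{1}{2}\right)^{2} = \tfrac{1}{4} \\
P(\text{wt/mut})  &= 2\,p_{\mathrm{wt}}\,(1 - p_{\mathrm{wt}}) = \tfrac{1}{2} \\
P(\text{mut/mut}) &= (1 - p_{\mathrm{wt}})^{2} = \tfrac{1}{4}
\end{align*}
```

Under these assumptions only the wt/wt fraction, one quarter of all dimers, is intact. Unequal allelic expression would shift $p_{\mathrm{wt}}$ and with it the intact fraction.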

Of the 14 missense mutations known in the *GDF5* gene, four are located within the pro-part of the GDF-5 protein. Whereas for the TGF-βs the pro-part fulfills an important regulatory role, termed latency, its role for the BMP and GDF subgroup of the TGF-β superfamily is much less clear. Latency was discovered for TGF-β1 in 1984, when it was shown that TGF-β proteins are secreted as large protein complexes that require activation for TGF-β signaling (Lawrence *et al.*, 1984). It is known today that upon secretion the pro-part of TGF-βs is cleaved in the Golgi apparatus by furin proteases at a site between the pro- and mature part containing a consensus RXXR motif (other proteases might substitute for furin proteases, but would yield TGF-β proteins with different N-termini) (Dubois *et al.*, 1995). The pro-part, also called latency-associated peptide (LAP), nevertheless remains non-covalently attached, thereby interfering with TGF-β signaling. Activation, i.e. release of the mature part from this intermediate latent complex, is achieved either by physicochemical changes in the environment, e.g. acidification, or by further proteolysis. Proteins specifically binding LAP have been identified (Miyazono *et al.*, 1988); these latent TGF-β binding proteins (LTBP) interact with the extracellular matrix and play an important role in the TGF-β activation process (for review see Annes *et al.*, 2003). For BMPs a process identical to latency as observed for TGF-βs is not known, but the pro-part of the BMPs possibly enhances the otherwise poor solubility of BMPs under physiological conditions and thus might provide for or enhance their long-range activity (Sengle *et al.*, 2008, Sengle *et al.*, 2011). The recent determination of the structure of the TGF-β1 pro-protein now provides insight into the regulatory mechanism of the pro-part at the atomic level (Shi *et al.*, 2011).

The pro-part embraces the mature part of TGF-β like a straitjacket: a long N-terminal α-helix binds into the type I receptor-binding site (in BMPs and GDFs called the wrist epitope), thereby blocking receptor access to this epitope. A proline-rich loop termed the latency lasso and a second α-helix encompass the fingertips and the back of the second finger of the mature part of TGF-β, hence also blocking the type II receptor epitope. The pro-domain monomers form a dimerization site in the C-terminal region called the bowtie, which is located above the butterfly-shaped dimeric TGF-β mature part. Two intermolecular disulfide bonds additionally stabilize the dimerization between the pro-domain subunits. Strikingly, the arrangement of the pro- and mature domain resembles the overall architecture found for the Noggin:BMP-7 interaction (Groppe *et al.*, 2002). Both receptor-binding epitopes are tightly blocked from receptor access, and the binding of the modulator/pro-domain is strongly enhanced through avidity by forming a covalently linked dimer. The importance of the covalent dimer linkage becomes obvious in the rare bone disorder Camurati-Engelmann disease, in which these cysteine residues in the TGF-β1 pro-part are mutated, resulting in disrupted dimerization and leading to increased ligand activation (Janssens *et al.*, 2003, Walton *et al.*, 2010).

Missense Mutations in GDF-5 Signaling: Molecular Mechanisms Behind Skeletal Malformation 35


**Figure 9.** Mutations in GDF-5 and their effects on structure or interactions. A) Homology model of pro-GDF-5 in ribbon representation, based on the structure of pro-TGF-β1 (Shi *et al.*, 2011). The mature part of GDF-5 (shown in blue and yellow) is embraced by the pro-part, with the N-terminal part resembling a straitjacket (in red and orange). This element, comprising two helices, blocks access to both type I and type II receptor binding epitopes. In contrast to the pro-part of TGF-βs, the pro-domains of BMPs and GDFs likely do not have intermolecular disulfides (the potential positions of Cys268 and Cys310 are shown), suggesting that the pro/mature part assembly of BMPs and GDFs might be less stable compared to TGF-βs. Four missense mutations in the pro-part are found to be associated with skeletal malformation diseases: M173V, S204R, R378Q, and R380Q. The first two mutations (marked by green spheres) possibly cause misfolding of the pro-domain, thereby weakening the pro-protein and leading to lower secretion efficiency. The latter two mutations are located in the furin protease site (marked as light-blue spheres) and were shown to lower or abrogate proteolytic processing of the pro-protein. B) Homology model of the Noggin:GDF-5 complex (Schwaerzer *et al.*, 2011) based on the crystal structure of the Noggin:BMP-7 complex (Groppe *et al.*, 2002). Noggin, by a similar mechanism but with a different structural architecture, embraces GDF-5, thereby blocking receptor binding of either subtype through its clip and finger domains. Three missense mutations in GDF-5 associated with symphalangism were shown to impair the GDF-5:Noggin interaction: N445T/K, S475N, and E491K. All three mutations are in close proximity to the Noggin clip region, suggesting that GDF-5 binding to Noggin is attenuated through loss of interaction with this element. C) Ribbon representation of the mature part of GDF-5 with the two monomeric subunits shown in blue and yellow. The architecture of a GDF-5 dimer resembles a left hand, the α-helix forming the palm, the two β-sheets depicting two fingers and the N-terminus marking the thumb. Consequently, the receptor binding epitopes were named wrist (type I receptor), formed by the dorsal side of the fingers and the palm, and knuckle (type II receptor), formed by the ventral side of fingers 1 and 2. The location of all known mutations associated with skeletal malformation diseases is depicted by spheres, color-coded as cystine knot mutations (red), pre-helix loop mutations (green) or mutations affecting Noggin binding (magenta). D) As in C but rotated clockwise around the x-axis by 90°. E) Ribbon representation of the complex of GDF-5 (in blue and yellow) bound to the extracellular domain of BMPR-IB (grey). The overview clearly shows that affected residues in the pre-helix loop are in contact with receptor elements, suggesting that these mutations alter type I receptor binding. F) Magnification of the interaction between residues in the pre-helix loop of GDF-5 and residues in the binding epitope of BMPR-IB. The complete pre-helix loop is tightly packed against residues in the three-stranded β-sheet of BMPR-IB. GDF-5 Arg438 is involved in hydrogen bonds to His24 located in the β1β2-loop of BMPR-IB. The tight turn structures at the N- and C-terminal ends of the pre-helix loop also indicate that the mutations involving the exchange of a proline (P436T) or the introduction of a proline (L441P) will likely destroy the conformation of the pre-helix loop, thereby affecting receptor binding even if these two residues do not form direct contacts with the receptor.

Although the sequence homology between the pro-domains of the various TGF-β members is certainly lower than that between their mature parts (and their lengths differ as well), alignments clearly show that all pro-domains will adopt a similar fold (Shi *et al.*, 2011). A homology model for pro-GDF-5 built on the basis of the pro-TGF-β1 structure immediately provides possible explanations for why the effect of latency is quite different between TGF-βs and members of the BMP subgroup. First, particularly for GDF-5 (and likewise for GDF-6 and -7), many loops in the pro-domain are extended, possibly creating further sites for proteolytic activation or degradation. Second, BMPs and GDFs lack the two cysteine residues in the pro-domain that are responsible for covalent linkage (see Fig. 9A). This suggests that the pro-domain association is much less stable for BMPs and GDFs (compare the mutations of these cysteines in Camurati-Engelmann disease) and that release of the mature growth factor domain is facilitated without need for further processing. The four mutations in the GDF-5 pro-domain are associated with three different skeletal malformation phenotypes (M173V: BDC; S204R: BDC; R378Q/P436T, compound heterozygous: acromesomelic dysplasia, DuPan syndrome; R380Q: BDA2), indicating a loss of GDF-5 function in all cases (Everman *et al.*, 2002, Schwabe *et al.*, 2004, Douzgou *et al.*, 2008, Ploger *et al.*, 2008). On the basis of our own model, methionine 173 is placed in close proximity to the first helix element blocking type I receptor binding, whereas serine 204 is placed in the so-called arm domain, which provides the structural scaffold for the straitjacket architecture. Both missense mutations likely lead to (local) unfolding and thus destabilize the pro-protein complex. This might subsequently lead to lower secretion efficiency and the observed loss-of-function phenotype.

The mutation R380Q targets the pro-domain cleavage site, destroying or attenuating proteolytic processing by furin proteases (Ploger *et al.*, 2008). The now covalent linkage of the pro- and mature part of GDF-5 R380Q very likely enhances the competition of the pro-domain with receptor binding and thus leads to loss or attenuation of GDF-5 activity (Ploger *et al.*, 2008). The mechanism by which the double mutation R378Q/P436T causes the skeletal malformation is more complex. As the mutation is compound heterozygous, three GDF-5 variants are potentially produced in the patient. Statistically, 50% of the GDF-5 protein would be a heterodimer carrying both exchanges, and the other 50% would consist of homodimers carrying either one of the two mutations. Heterozygous carriers of the individual missense mutations R378Q or P436T did not exhibit any skeletal phenotype, which makes it impossible to single out one of the two mutations as disease-causing if found in a homozygous background. For the mutation R378Q it can be assumed that processing of the pro-protein is at least impaired, and thus the portion of GDF-5 R378Q homodimer is likely to be inactive, as found for R380Q (see Fig. 9) (Ploger *et al.*, 2008). The missense mutation P436T is located in the mature part of GDF-5, in the so-called pre-helix loop of the GDF-5 type I receptor-binding epitope (Nickel *et al.*, 2005). Mutation of the equivalent proline residue in BMP-2 strongly decreased binding of this BMP-2 variant to both type I receptors, BMPR-IA and BMPR-IB, thus leading to a loss of BMP signaling (Kirsch *et al.*, 2000).
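The 50%/50% split stated for the compound heterozygous R378Q/P436T case can be reproduced with a short calculation. The sketch below is illustrative only: it assumes that both alleles contribute equal shares of monomer and that monomers pair at random, and the function name `dimer_distribution` is hypothetical, introduced here for the example.

```python
from fractions import Fraction
from itertools import product


def dimer_distribution(allele_shares):
    """Expected fraction of each dimer species under random pairing.

    allele_shares maps an allele label to its share of the monomer pool
    (assumption: shares are exact Fractions that sum to 1).
    Keys of the result are alphabetically sorted label pairs.
    """
    if sum(allele_shares.values()) != 1:
        raise ValueError("allele shares must sum to 1")
    dist = {}
    # Enumerate ordered pairs of subunits and merge A/B with B/A.
    for (a, pa), (b, pb) in product(allele_shares.items(), repeat=2):
        key = tuple(sorted((a, b)))
        dist[key] = dist.get(key, Fraction(0)) + pa * pb
    return dist


# Compound heterozygous case: each allele supplies half of the monomers.
shares = {"R378Q": Fraction(1, 2), "P436T": Fraction(1, 2)}
for species, fraction in sorted(dimer_distribution(shares).items()):
    print("/".join(species), fraction)
```

With equal shares this yields one half heterodimer and one quarter of each homodimer, matching the statistics given in the text. Unequal allelic expression can be explored by changing the shares.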

Of the other eight known disease-related amino acid exchanges in the mature part of GDF-5, several either exchange a cysteine residue participating in the formation of the cystine knot (e.g. C400Y, C429R, C498S) or introduce an additional cysteine residue (e.g. R399C, R438C). Both types of exchange will interfere with proper formation of the cystine knot, thereby leading to a misfolded, inactive protein. Several studies show that under conditions mimicking a homozygous background no secretion of the GDF-5 variant is observed (Everman *et al.*, 2002, Dawson *et al.*, 2006). However, mutations involving cysteines can also act in a dominant-negative manner (see Fig. 9). Thomas *et al.* tested the effect of the GDF-5 mutation C400Y, which is found in the homozygous state in chondrodysplasia Grebe type (Thomas *et al.*, 1997). Upon transfection of only the mutated gene into COS-7 cells, resembling a homozygous background, no GDF-5 protein could be detected in the cell supernatant; cotransfection of the genes for wildtype GDF-5 and the variant GDF-5 C400Y clearly attenuated GDF-5 protein levels in the supernatant. This effect was dose-dependent, indicating that heterozygous carriers could show a highly variable phenotype through differential allelic expression (Thomas *et al.*, 1997). Furthermore, this study also indicated that the mutation might act in a dominant-negative manner on other BMPs by selective heterodimerization: upon co-transfection of the gene encoding GDF-5 C400Y together with either BMP-2, BMP-3 or BMP-7, heterodimers could be isolated from the cell supernatant that will most likely be non-functional (Thomas *et al.*, 1997).

36 Mutations in Human Genetic Disease


#### **3.6. GDF-5 activity is tightly regulated by the BMP antagonist Noggin**

All other missense mutations in the *GDF5* gene cluster in two regions of the GDF-5 structure (see Fig. 9C/D). Three missense mutations cluster in close proximity to finger 2 of GDF-5: N445T/K (Seemann *et al.*, 2009), S475N (Akarsu *et al.*, 1999, Schwaerzer *et al.*, 2011) and E491K (Wang *et al.*, 2006). The heterozygous mutations N445T and N445K in GDF-5 were identified in patients suffering from multiple synostosis syndrome (SYNS1), characterized by fusion of carpal bones and proximal symphalangism in fingers II to V (Seemann *et al.*, 2009). Analysis of the recombinant GDF-5 variant in BMPR-IB-transfected myoblastic C2C12 cells indicated that the mutation did not lead to a loss of GDF-5 function. In fact, analyzing the expression of the osteogenic marker alkaline phosphatase (ALP) in non-transfected C2C12 cells even revealed a gain of activity: stimulation with GDF-5 N445T produced a small but measurable ALP induction, whereas wildtype GDF-5 induced no ALP expression. As this activating mutation is located within the wrist (type I receptor binding) epitope of GDF-5, differences in binding to the BMP type I receptors were assumed. However, competition assays using soluble receptor ectodomains showed that binding of the GDF-5 variant N445T to BMPR-IA as well as BMPR-IB is unaltered (Seemann *et al.*, 2009). Sequence comparison with other BMP factors indicated that one of the mutations found, the exchange of Asn445 to lysine, is native in BMP-9 and BMP-10. As the latter factors are insensitive to Noggin inhibition, Seemann *et al.* assumed that this mutation also renders GDF-5 insensitive to inhibition by Noggin. *In vitro* assays indeed confirmed that GDF-5 N445T is not antagonized by recombinant Noggin protein, leading to an increase in GDF-5 signaling activity during early stages of limb and joint development, where Noggin and *GDF5* expression patterns overlap (Seemann *et al.*, 2005, Seemann *et al.*, 2009).

Another mutation in GDF-5 leading to proximal symphalangism is E491K, discovered in two large Chinese families (Wang *et al.*, 2006). The skeletal malformation phenotype resembles the one seen in the aforementioned patients carrying either the mutation N445T/K (Seemann *et al.*, 2009) or R438L (Seemann *et al.*, 2005) in the *GDF5* gene. Nothing is known about receptor or modulator protein binding of this particular GDF-5 variant; however, in the GDF-5 structure Glu491 is in close proximity to Asn445. Moreover, the sidechain carboxamide group of Asn445 forms a hydrogen bond to the backbone carbonyl of Glu491, possibly suggesting a similar disease-causing molecular mechanism through the loss of inhibition by Noggin, as described above by Seemann *et al.* (2009). Modeling of a GDF-5:Noggin complex based on the structure of the BMP-7:Noggin interaction (Groppe *et al.*, 2002) does not, however, indicate that exchanging Glu491 for lysine directly interferes with the GDF-5:Noggin interaction (see Fig. 9).


*al.*, 2008). For BMP-2 and GDF-5 this segment contains the so-called main binding determinant, a highly conserved leucine residue whose polar main chain atoms make a pair of hydrogen bonds with a conserved glutamine residue present in the BMP type I receptors IA and IB. Mutation of either the leucine to a proline in BMP-2 or GDF-5, or of the glutamine residue in BMPR-IA or BMPR-IB, leads to a strongly reduced type I receptor affinity (Keller *et al.*, 2004, Kotzsch *et al.*, 2009). In the unbound state this pre-helix loop segment is also rather flexible, allowing for geometrical adaptability to different receptor surface geometries. This observation, together with the disordered and flexible ligand-binding epitope seen in the BMP type I receptors, provides a mechanism for the pronounced ligand-receptor promiscuity seen in the BMP/GDF subgroup of the TGF-β superfamily (Keller *et al.*, 2004, Allendorph *et al.*, 2007, Klages *et al.*, 2008, Kotzsch *et al.*, 2008, Saremba *et al.*, 2008). Although structural analyses showed that the pre-helix loop is flexible before receptor binding, the mutation L441P suggests that in the bound state a geometrically defined conformation is required for (high-affinity) binding of BMP type I receptors (Kotzsch *et al.*, 2009). Residue Leu441 is located at the C-terminal end of the pre-helix loop, forming a sharp turn together with Ser439 and His440 (see Fig. 9E/F). The sidechain of Leu441 is oriented into the interior of GDF-5, making it implausible that its exchange to proline affects type I receptor binding by altering direct interactions. However, the different backbone torsion angle restraints of a proline compared to a non-proline residue suggest that the L441P mutation alters the conformation of the C-terminal end of the pre-helix loop and that important non-covalent interactions between GDF-5 and its type I receptors are thereby strongly impaired.

Although earlier reports claim that the mutation L441P in GDF-5 affects binding to the BMP receptor IB (Faiyaz-Ul-Haque *et al.*, 2002b, Seemann *et al.*, 2005), our own data show that binding to both BMP type I receptors is strongly attenuated (Kotzsch *et al.*, 2009). A rather complex mutation was discovered by Szczaluba *et al.* in patients suffering from DuPan syndrome, who show shortening of all toes as well as of all fingers but the thumb (Szczaluba *et al.*, 2005). Here, residue Leu437 of the GDF-5 protein is deleted and the adjacent residues Ser439 and His440 are mutated to threonine and leucine, respectively (see Fig. 9). As these changes grossly alter the sequence as well as the conformation of the pre-helix loop, it is not surprising that this GDF-5 compound variant shows no type I receptor binding at all (Kotzsch *et al.*, 2009). Interestingly, although the mutation was found to be heterozygous in the carrier, it has a dominant-negative effect (Szczaluba *et al.*, 2005). Misfolding of the mutant protein, and hence impaired secretion, can be excluded as an explanation, as the protein could be produced recombinantly and exhibits wildtype-like affinity for BMP type II receptors. One possible explanation for the quite strong skeletal phenotype might be that this GDF-5 variant is not only inactive but possibly still retains its Noggin-binding capability and can therefore act as a Noggin scavenger, similar to what was described for the BMP-2 variant L51P (Keller *et al.*, 2004).

Probably the most interesting mutation in GDF-5 is the exchange of Arg438 to leucine, found in patients suffering from proximal symphalangism (Seemann *et al.*, 2005). Based on a structure-function analysis to determine the GDF-5 type I receptor specificity this amino acid position – 438 if the complete pre-pro-protein is considered and position 57 if

The mutation S475N is another mutation in the mature part of GDF-5, which causes multiple synostosis syndrome (SYNS1), a phenotypic description of these heterozygous missense mutations was first reported by Akarsu *et al.* (1999). The phenotype again suggests a gain-offunction in GDF-5 signaling. A detailed analysis of the signaling properties of this GDF-5 variant indeed revealed that GDF-5 S475N is significantly more potent in the chondrogenic differentiation in chicken micromass culture compared to wildtype GDF-5 (Schwaerzer *et al.*, 2011). The mutation is located in the knuckle (type II receptor) epitope of GDF-5 (see Fig. 9C/D). Although no direct structural data is currently available for GDF-5 bound to type I and type II receptors, structure data available on ternary complexes of BMP-2 (Allendorph *et al.*, 2006, Weber *et al.*, 2007) indicated that this highly conserved serine residue is at the center of the BMP/GDF type II receptor interaction. Despite its location exchange of this residue in BMP-2 affected type II receptor binding only marginally (Weber *et al.*, 2007) suggesting that other residues in the BMP-type II receptor interface are more important for the ligand-receptor interaction. However, in GDF-5 Ser475 seems more important for the binding of BMPR-II as indicated by a 7-fold decrease in the binding affinity upon mutation to asparagine, which seems surprising given the fact that this mutant shows an elevated activity compared to wildtype GDF-5 (Schwaerzer *et al.*, 2011). As the BMP type II receptor epitope overlaps heavily with that of Noggin, also the change in binding to Noggin was determined showing that also Noggin binding affinity is similarly decreased by 4-fold. 
When the effect of Noggin inhibition on BMP factors was investigated by analyzing BMP-induced alkaline phosphatase expression or chondrogenic differentiation in chicken micromass culture in the presence of Noggin, GDF-5 S475N was clearly resistant to antagonizing effects by Noggin, whereas signals from wildtype GDF-5 could be efficiently blocked with Noggin (Schwaerzer *et al.*, 2011). This possibly indicates that the loss in BMP type II receptor binding affinity seen for this variant is overcompensated by the deprivation of Noggin-mediated inhibition (Schwaerzer *et al.*, 2011).

#### **3.7. Type I receptor binding as well as receptor specificity is essential for correct GDF-5 function**

A clear hotspot for disease-related mutations is found for the so-called pre-helix loop located in the wrist epitope of GDF-5 (Nickel *et al.*, 2005). This loop is the key interaction element for BMP-type I receptor interaction (Kirsch *et al.*, 2000, Keller *et al.*, 2004, Kotzsch *et* 

E491K discovered in two large Chinese families (Wang *et al.*, 2006). The skeletal malformation phenotype resembles the one seen in the aforementioned patients carrying either the mutation N445T/K (Seemann *et al.*, 2009) or R438L (Seemann *et al.*, 2005) in the *GDF5* gene. Nothing is known about receptor or modulator protein binding of this particular GDF-5 variant; however, in the GDF-5 structure Glu491 is in close proximity to Asn445. Moreover, the sidechain carboxamide group of Asn445 forms a hydrogen bond to the backbone carbonyl of Glu491, possibly suggesting a similar disease-causing molecular mechanism through the loss of inhibition by Noggin, as described above by Seemann *et al.* (2009). Modeling of a GDF-5:Noggin complex based on the structure of the BMP-7:Noggin interaction (Groppe *et al.*, 2002) does not, however, indicate that exchanging Glu491 for lysine directly interferes with the GDF-5:Noggin interaction (see Fig. 9).

The mutation S475N is another mutation in the mature part of GDF-5 that causes multiple synostosis syndrome (SYNS1); a phenotypic description of these heterozygous missense mutations was first reported by Akarsu *et al.* (1999). The phenotype again suggests a gain-of-function in GDF-5 signaling. A detailed analysis of the signaling properties of this GDF-5 variant indeed revealed that GDF-5 S475N is significantly more potent than wildtype GDF-5 in inducing chondrogenic differentiation in chicken micromass culture (Schwaerzer *et al.*, 2011). The mutation is located in the knuckle (type II receptor) epitope of GDF-5 (see Fig. 9C/D). Although no direct structural data are currently available for GDF-5 bound to type I and type II receptors, structure data available on ternary complexes of BMP-2 (Allendorph *et al.*, 2006, Weber *et al.*, 2007) indicate that this highly conserved serine residue is at the center of the BMP/GDF type II receptor interaction. Despite its location, exchange of this residue in BMP-2 affected type II receptor binding only marginally (Weber *et al.*, 2007), suggesting that other residues in the BMP type II receptor interface are more important for the ligand-receptor interaction. However, in GDF-5 Ser475 seems more important for the binding of BMPR-II, as indicated by a 7-fold decrease in binding affinity upon mutation to asparagine, which seems surprising given that this mutant shows elevated activity compared to wildtype GDF-5 (Schwaerzer *et al.*, 2011). As the BMP type II receptor epitope overlaps heavily with that of Noggin, the change in binding to Noggin was also determined, showing that the Noggin binding affinity is similarly decreased, by 4-fold.
When the effect of Noggin inhibition on BMP factors was investigated by analyzing BMP-induced alkaline phosphatase expression or chondrogenic differentiation in chicken micromass culture in the presence of Noggin, GDF-5 S475N was clearly resistant to the antagonizing effects of Noggin, whereas signals from wildtype GDF-5 could be efficiently blocked with Noggin (Schwaerzer *et al.*, 2011). This possibly indicates that the loss in BMP type II receptor binding affinity seen for this variant is overcompensated by the loss of Noggin-mediated inhibition (Schwaerzer *et al.*, 2011).
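To put the reported fold changes into perspective, a change in binding affinity can be converted into a difference in binding free energy via ΔΔG = RT ln(fold change). The following Python sketch is an illustration only and not part of the cited analysis. It assumes a temperature of 298.15 K and that the 7-fold and 4-fold factors can be read as ratios of equilibrium dissociation constants; the function name `fold_change_to_ddg` is hypothetical.

```python
import math

# Gas constant in kcal/(mol*K)
GAS_CONSTANT_KCAL = 1.987204e-3


def fold_change_to_ddg(fold_change, temperature_k=298.15):
    """Convert a fold change in K_D into a binding free energy difference.

    Returns ddG in kcal/mol; positive values mean weaker binding.
    Assumption: the fold change is a ratio of equilibrium dissociation constants.
    """
    if fold_change <= 0:
        raise ValueError("fold_change must be positive")
    return GAS_CONSTANT_KCAL * temperature_k * math.log(fold_change)


# Fold changes quoted in the text for GDF-5 S475N
for partner, fold in (("BMPR-II", 7.0), ("Noggin", 4.0)):
    print(f"{partner}: {fold:.0f}-fold weaker binding ~ "
          f"{fold_change_to_ddg(fold):.2f} kcal/mol")
```

Under these assumptions both changes come out at roughly 1 kcal/mol, so the two affinity losses are of comparable energetic size. The sketch says nothing about which effect dominates in a cellular context.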

#### **3.7. Type I receptor binding as well as receptor specificity is essential for correct GDF-5 function**

A clear hotspot for disease-related mutations is found in the so-called pre-helix loop located in the wrist epitope of GDF-5 (Nickel *et al.*, 2005). This loop is the key element for the interaction with BMP type I receptors (Kirsch *et al.*, 2000, Keller *et al.*, 2004, Kotzsch *et al.*, 2008). For BMP-2 and GDF-5 this segment contains the so-called main binding determinant, a highly conserved leucine residue whose polar main chain atoms make a pair of hydrogen bonds with a conserved glutamine residue present in the BMP type I receptors IA and IB. Mutation of either the leucine to a proline in BMP-2 or GDF-5 or the glutamine residue in BMPR-IA or BMPR-IB leads to a strongly reduced type I receptor affinity (Keller *et al.*, 2004, Kotzsch *et al.*, 2009). In the unbound state this pre-helix loop segment is also rather flexible, allowing it to adapt to different receptor surface geometries. This observation, together with the disordered and flexible ligand-binding epitope seen in the BMP type I receptors, provides a mechanism for the pronounced ligand-receptor promiscuity seen in the BMP/GDF subgroup of the TGF-β superfamily (Keller *et al.*, 2004, Allendorph *et al.*, 2007, Klages *et al.*, 2008, Kotzsch *et al.*, 2008, Saremba *et al.*, 2008). Although structural analyses showed that the pre-helix loop is flexible before receptor binding, the mutation L441P suggests that in the bound state a geometrically defined conformation is required for (high-affinity) binding of BMP type I receptors (Kotzsch *et al.*, 2009). Residue Leu441 is located at the C-terminal end of the pre-helix loop, forming a sharp turn together with Ser439 and His440 (see Fig. 9E/F). The sidechain of Leu441 is oriented into the interior of GDF-5, making it implausible that its exchange to proline affects type I receptor binding by altering direct interactions.
However, the different backbone torsion angle restraints of a proline compared to a non-proline residue suggest that the L441P mutation alters the conformation of the C-terminal end of the pre-helix loop and thereby strongly impairs important non-covalent interactions between GDF-5 and its type I receptors. Although earlier reports claim that the mutation L441P in GDF-5 affects binding to the BMP receptor IB (Faiyaz-Ul-Haque *et al.*, 2002b, Seemann *et al.*, 2005), our own data show that binding to both BMP type I receptors is strongly attenuated (Kotzsch *et al.*, 2009). A rather complex mutation discovered by Szczaluba *et al.* in patients suffering from DuPan syndrome leads to shortening of all toes as well as all fingers but the thumb (Szczaluba *et al.*, 2005). Here, residue Leu437 of the GDF-5 protein is deleted and the adjacent residues Ser439 and His440 are mutated to threonine and leucine, respectively (see Fig. 9). As these changes grossly alter the sequence as well as the conformation of the pre-helix loop, it is not surprising that this GDF-5 compound variant shows no type I receptor binding at all (Kotzsch *et al.*, 2009). Interestingly, although the mutation was found to be heterozygous in the carrier, it has a dominant-negative effect (Szczaluba *et al.*, 2005). Misfolding of the mutant protein and hence impaired secretion can be excluded as an explanation, as the protein could be recombinantly produced and exhibits wildtype-like affinity to BMP type II receptors. One possible explanation for the quite strong skeletal phenotype might be that this GDF-5 variant is not only inactive but possibly still retains its Noggin-binding capability and can therefore act as a Noggin scavenger, similar to what was described for the BMP-2 variant L51P (Keller *et al.*, 2004).

Probably the most interesting mutation in GDF-5 is the exchange of Arg438 to leucine found in patients suffering from proximal symphalangism (Seemann *et al.*, 2005). Based on a structure-function analysis to determine the GDF-5 type I receptor specificity, this amino acid position (438 if the complete pre-pro-protein is considered, 57 if numbering starts with the mature part of GDF-5) was shown before to be solely responsible for the BMPR-IB binding preference of GDF-5 (see Fig. 9E/F) (Nickel *et al.*, 2005). The equivalent residue in BMP-2, which binds both BMP type I receptors, BMPR-IA and BMPR-IB, with equally high affinity, is alanine. In contrast, in GDF-5 this position is occupied by a large, positively charged arginine, which is also the largest difference in amino acid sequence within the central type I receptor-binding epitope. Upon exchange of Arg438 in GDF-5 to alanine, GDF-5 R438A bound both type I receptors with the same affinity and with binding characteristics indistinguishable from those of BMP-2 (Nickel *et al.*, 2005). Recent structure analysis of GDF-5 bound to its type I receptor BMPR-IB revealed a molecular mechanism by which GDF-5 "discriminates" between both type I receptors (Kotzsch *et al.*, 2009). A loop between the two N-terminal β-strands of the BMP type I receptors can adopt different conformations depending on the amino acid sequence. As this loop is in contact with the "GDF-5 specificity determining" amino acid Arg438, BMP type I receptors can be selected through the presence or absence of a steric hindrance. BMPs with large, bulky sidechains at this position of the pre-helix loop, such as GDF-5, can only bind to BMPR-IB, whereas BMPs with small sidechains, such as BMP-2 or BMP-4, can bind both BMP type I receptors equally well (Kotzsch *et al.*, 2009).

Analysis of this BMP-2-like GDF-5 variant revealed that in a cell line (ATDC5) with prochondrogenic properties that does not express the BMPR-IB receptor, this variant has the same signaling properties and efficiency as BMP-2 (Nickel *et al.*, 2005). Thus under these conditions GDF-5 can signal via the BMPR-IA receptor, and signaling efficiency is only decreased by the lower affinity of wildtype GDF-5 for BMPR-IA. Most interestingly, despite having the same receptor binding properties as BMP-2, GDF-5 R438A still does not induce ALP expression in the myoblastic cell line C2C12 (Klammert *et al.*, 2011). As RT-PCR analysis did not reveal significant differences in BMP receptor expression between the two cell lines, ATDC5 and C2C12, other mechanisms must exist that determine whether GDF-5 can fully signal through a particular BMP type I receptor. This observation also indicates that GDF-5, by binding to BMPR-IA, can activate signaling in some cell types, whereas in other cell types it might compete with BMP-2 for BMPR-IA and act as an antagonist (Klammert *et al.*, 2011). The mutation found in humans affected by SYM1, R438L, does not show a complete loss of BMP type I receptor specificity: the larger leucine sidechain in comparison to alanine leads to a 6- to 9-fold higher affinity to BMPR-IB compared to BMPR-IA (Seemann *et al.*, 2005, Kotzsch *et al.*, 2009). However, the result will likely be similar to the above in that the mutation R438L turns GDF-5 into a protein that has BMP-2-like receptor binding properties. As BMP-2 is assumed to induce or at least regulate apoptosis in the interdigital mesenchyme (Yokouchi *et al.*, 1996, Merino *et al.*, 1999a), one would at first expect increased apoptosis in patients carrying the mutation R438L in GDF-5 due to the presence of an additional BMP-2-like factor (Seemann *et al.*, 2005).
However, our latest observation that increased BMPR-IA binding by GDF-5 R438A might not induce full signaling in all cell types possibly indicates that here the gain-of-function mutation in GDF-5 surprisingly leads to a loss of BMP-2 signaling in certain areas of the developing joint: by competing for binding to the same receptor, BMPR-IA, the variant might impede BMP-2-induced apoptosis, which finally results in joint fusion (Klammert *et al.*, 2011).
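The competition argument can be made concrete with a textbook single-site competitive binding model. The Python sketch below is a simplified illustration, not the authors' analysis: it assumes equilibrium binding, free ligand concentrations that are not depleted by the receptor, and purely hypothetical concentrations and dissociation constants in arbitrary units. The function name `receptor_occupancy` is likewise an assumption.

```python
def receptor_occupancy(ligand_conc, ligand_kd, competitor_conc=0.0, competitor_kd=1.0):
    """Fraction of receptor occupied by the ligand at equilibrium.

    Single-site competitive binding:
        occupancy = (L/Kd_L) / (1 + L/Kd_L + C/Kd_C)
    All concentrations and Kd values must share the same (arbitrary) unit.
    """
    ligand_term = ligand_conc / ligand_kd
    competitor_term = competitor_conc / competitor_kd
    return ligand_term / (1.0 + ligand_term + competitor_term)


# Hypothetical numbers: BMP-2 at its own Kd for BMPR-IA, and a GDF-5
# variant that binds BMPR-IA with the same affinity as BMP-2.
BMP2_CONC = 1.0
BMP2_KD = 1.0
VARIANT_KD = 1.0

for variant_conc in (0.0, 1.0, 10.0):
    occupancy = receptor_occupancy(BMP2_CONC, BMP2_KD, variant_conc, VARIANT_KD)
    print(f"variant = {variant_conc:4.1f}  ->  BMP-2 occupancy of BMPR-IA = {occupancy:.3f}")
```

In this toy model, BMP-2 occupancy of BMPR-IA falls as more of an equally affine competitor is added. Whether that translates into reduced signaling depends on how well the competitor itself signals in the given cell type, which the model does not address.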

#### **4. Conclusion**

When GDF-5 was discovered, its highly defined expression pattern during limb development, which precisely correlates with the location of all future joints throughout the limb, immediately led to the assumption that this particular TGF-β factor takes center stage in the development of all synovial joints. It thus came as a surprise when the *GDF5* knockout mice, despite being affected in joint and limb development, still showed multiple joints developing quite normally. Genetic and functional analyses of human skeletal malformation diseases such as brachydactyly or chondrodysplasia showed not only that a number of other genes can lead to loss of joints or limb deformations similar to those seen in the *GDF5* null mice, but also that different mutations in GDF-5 can result in very distinct malformation phenotypes. Further studies revealed that these different factors, many of them acting as morphogens themselves, such as Wnts and their (co-)receptors, members of the Sonic Hedgehog family or the FGFs, often do not act independently but can be upstream or downstream of the TGF-β signaling cascade or even form positive or negative feedback loops with signaling components of the TGF-β superfamily. This complex regulatory network is further complicated by the fact that components of the TGF-β superfamily (ligands, receptors as well as antagonists) are known to function via highly promiscuous protein-protein interactions.
Even if we restrict our focus to the regulatory signaling network of GDF-5, its highly overlapping receptor binding specificities with other BMPs, such as BMP-2, BMP-6 or BMP-7, all of which are expressed in the direct neighborhood of the developing joint, make it immediately clear that mutations altering binding of one particular ligand-receptor pair will ultimately affect the signaling output of other BMP members even when those are not affected by mutations themselves.

One mutation in GDF-5, R438L, best exemplifies the dilemma. This mutation enables GDF-5 to bind efficiently to a second BMP type I receptor, BMPR-IA. However, this receptor is usually utilized by BMP-2, which is also present during joint development. As it is not known whether the GDF-5 variant with the altered type I receptor specificity delivers the same signal via this receptor as BMP-2, or whether it can signal at all through this BMP receptor in the present cellular context, developing a molecular disease mechanism that explains the mode of operation of this mutant seems impossible. In addition to this fuzzy BMP ligand-receptor network, modulators like Noggin act like hub proteins, interacting with multiple BMP ligands with a distinct BMP specificity profile. These interactions are again often linked to feedback loops, leading to a precisely defined equilibrium of BMPs, BMP receptors and other modulators, which in sum deliver a defined biological outcome. Classical morphogens such as the BMPs are considered to function via a concentration gradient, which is interpreted by the different cells responding to a particular morphogen threshold. However, the discrepancy between strong *GDF5* expression in all future joint locations and the highly localized effect seen in *GDF5* knockouts suggests that responsiveness to GDF-5, or the differentiation program run by it, is encoded along the digital ray by the various other morphogens in a spatiotemporal manner. This would allow GDF-5 to run the differentiation program for joint formation only at certain times and at very defined places, whereas at other places, or at developmental stages earlier or later than those defined, other factors take over the GDF-5 function.

## **Author details**

Tina V. Hellmann and Thomas D. Mueller *Dept. Molecular Plant Physiology and Biophysics, Julius-von-Sachs Institute of the University Wuerzburg, Wuerzburg, Germany* 

Joachim Nickel *Dept. Tissue Engineering and Regenerative Medicine, University Hospital Wuerzburg, Wuerzburg, Germany*

## **Acknowledgement**

We thank Markus Peer and Juliane E. Fiebig for helpful discussions and critically reading the manuscript.

## **5. References**

Akarsu, A.N., Rezaie, T., Demirtas, M., Farhud, D.D., and Sarfarazi, M. (1999). Multiple synostosis type 2 (SYNS2) maps to 20q11.2 and caused by a missense mutation in the growth/differentiation factor 5 (GDF5). *Am J Hum Genet,* 65, 4, pp. A281-A281

Allendorph, G.P., Isaacs, M.J., Kawakami, Y., Izpisua Belmonte, J.C., and Choe, S. (2007). BMP-3 and BMP-6 structures illuminate the nature of binding specificity with receptors. *Biochemistry,* 46, 43, pp. 12238-12247

Allendorph, G.P., Vale, W.W., and Choe, S. (2006). Structure of the ternary signaling complex of a TGF-beta superfamily member. *Proc Natl Acad Sci U S A,* 103, 20, pp. 7643-7648

Andl, T., Ahn, K., Kairo, A., Chu, E.Y., Wine-Lee, L.*, et al.* (2004). Epithelial Bmpr1a regulates differentiation and proliferation in postnatal hair follicles and is essential for tooth development. *Development,* 131, 10, pp. 2257-2268

Annes, J.P., Munger, J.S., and Rifkin, D.B. (2003). Making sense of latent TGFbeta activation. *J Cell Sci,* 116, Pt 2, pp. 217-224

Babitt, J.L., Zhang, Y., Samad, T.A., Xia, Y., Tang, J.*, et al.* (2005). Repulsive guidance molecule (RGMa), a DRAGON homologue, is a bone morphogenetic protein co-receptor. *J Biol Chem,* 280, 33, pp. 29820-29827

Basit, S., Naqvi, S.K., Wasif, N., Ali, G., Ansar, M.*, et al.* (2008). A novel insertion mutation in the cartilage-derived morphogenetic protein-1 (CDMP1) gene underlies Grebe-type chondrodysplasia in a consanguineous Pakistani family. *BMC Med Genet,* 9, pp. 102

Bell, J. (1951). On brachydactyly and symphalangism. *The treasury of human inheritance,* 5, 1, pp. 1-31

Bosanac, I., Maun, H.R., Scales, S.J., Wen, X., Lingel, A.*, et al.* (2009). The structure of SHH in complex with HHIP reveals a recognition role for the Shh pseudo active site in signaling. *Nat Struct Mol Biol,* 16, 7, pp. 691-697

Brunet, L.J., McMahon, J.A., McMahon, A.P., and Harland, R.M. (1998). Noggin, cartilage morphogenesis, and joint formation in the mammalian skeleton. *Science,* 280, 5368, pp. 1455-1457

Butler, S.J., and Dodd, J. (2003). A role for BMP heterodimers in roof plate-mediated repulsion of commissural axons. *Neuron,* 38, 3, pp. 389-401

Byrnes, A.M., Racacho, L., Nikkel, S.M., Xiao, F., MacDonald, H.*, et al.* (2010). Mutations in GDF5 presenting as semidominant brachydactyly A1. *Hum Mutat,* 31, 10, pp. 1155-1162

Carcamo, J., Weis, F.M., Ventura, F., Wieser, R., Wrana, J.L.*, et al.* (1994). Type I receptors specify growth-inhibitory and transcriptional responses to transforming growth factor beta and activin. *Mol Cell Biol,* 14, 6, pp. 3810-3821

Caronia, G., Goodman, F.R., McKeown, C.M., Scambler, P.J., and Zappavigna, V. (2003). An I47L substitution in the HOXD13 homeodomain causes a novel human limb malformation by producing a selective loss of function. *Development,* 130, 8, pp. 1701-1712

Chaikuad, A., Sanvitale, C., Mahajan, P., Daga, N., Cooper, C.*, et al.* (2010a). Crystal structure of the cytoplasmic domain of the bone morphogenetic protein receptor type-1B (BMPR1B) in complex with FKBP12 and LDN-193189. http://www.rcsb.org Protein Databank (PDB). RCSB

Chaikuad, A., Alfano, I., Shrestha, B., Muniz, J.R.C., Petrie, K.*, et al.* (2010b). Crystal structure of the kinase domain of type I activin receptor (ACVR1) in complex with FKBP12 and dorsomorphin. http://www.rcsb.org Protein Databank (PDB). RCSB

Chen, H.B., Shen, J., Ip, Y.T., and Xu, L. (2006). Identification of phosphatases for Smad in the BMP/DPP pathway. *Genes Dev,* 20, 6, pp. 648-653

Craig, F.M., Bentley, G., and Archer, C.W. (1987). The spatial and temporal pattern of collagens I and II and keratan sulphate in the developing chick metatarsophalangeal joint. *Development,* 99, 3, pp. 383-391

Dathe, K., Kjaer, K.W., Brehm, A., Meinecke, P., Nurnberg, P.*, et al.* (2009). Duplications involving a conserved regulatory element downstream of BMP2 are associated with brachydactyly type A2. *Am J Hum Genet,* 84, 4, pp. 483-492

Davis, A.P., and Capecchi, M.R. (1996). A mutational analysis of the 5' HoxD genes: dissection of genetic interactions during limb development in the mouse. *Development,* 122, 4, pp. 1175-1185

Dawson, K., Seeman, P., Sebald, E., King, L., Edwards, M.*, et al.* (2006). GDF5 is a second locus for multiple-synostosis syndrome. *Am J Hum Genet,* 78, 4, pp. 708-712

Debeer, P., Fryns, J.P., Devriendt, K., Baten, E., Huysmans, C.*, et al.* (2004). A novel NOG mutation Pro37Arg in a family with tarsal and carpal synostoses. *Am J Med Genet A,* 128A, 4, pp. 439-440

Derynck, R., Gelbart, W.M., Harland, R.M., Heldin, C.H., Kern, S.E.*, et al.* (1996). Nomenclature: vertebrate mediators of TGFbeta family signals. *Cell,* 87, 2, pp. 173

Dixon, M.E., Armstrong, P., Stevens, D.B., and Bamshad, M. (2001). Identical mutations in NOG can cause either tarsal/carpal coalition syndrome or proximal symphalangism. *Genet Med,* 3, 5, pp. 349-353


Douzgou, S., Lehmann, K., Mingarelli, R., Mundlos, S., and Dallapiccola, B. (2008). Compound heterozygosity for GDF5 in Du Pan type chondrodysplasia. *Am J Med Genet A,* 146A, 16, pp. 2116-2121

Gong, Y., Krakow, D., Marcelino, J., Wilkin, D., Chitayat, D.*, et al.* (1999). Heterozygous mutations in the gene encoding noggin affect human joint morphogenesis. *Nat Genet,* 21, 3, pp. 302-304

Gray, P.C., Bilezikjian, L.M., and Vale, W. (2002). Antagonism of activin by inhibin and inhibin receptors: a functional role for betaglycan. *Mol Cell Endocrinol,* 188, 1-2, pp. 254-260

Greenwald, J., Groppe, J., Gray, P., Wiater, E., Kwiatkowski, W.*, et al.* (2003). The BMP7/ActRII extracellular domain complex provides new insights into the cooperative nature of receptor assembly. *Mol Cell,* 11, 3, pp. 605-617

Groppe, J., Greenwald, J., Wiater, E., Rodriguez-Leon, J., Economides, A.N.*, et al.* (2002). Structural basis of BMP signalling inhibition by the cystine knot protein Noggin. *Nature,* 420, 6916, pp. 636-642

Gruneberg, H., and Lee, A.J. (1973). The anatomy and development of brachypodism in the mouse. *J Embryol Exp Morphol,* 30, 1, pp. 119-141

Guo, S., Zhou, J., Gao, B., Hu, J., Wang, H.*, et al.* (2010). Missense mutations in IHH impair Indian Hedgehog signaling in C3H10T1/2 cells: Implications for brachydactyly type A1, and new targets for Hedgehog signaling. *Cell Mol Biol Lett,* 15, 1, pp. 153-176

Haines, R.W. (1947). The development of joints. *J Anat,* 81, 1, pp. 33-55

Hatta, T., Konishi, H., Katoh, E., Natsume, T., Ueno, N.*, et al.* (2000). Identification of the ligand-binding site of the BMP type IA receptor for BMP-4. *Biopolymers,* 55, 5, pp. 399-406

Hattersley, G., Hewick, R., and Rosen, V. (1995). In-Situ Localization and in-Vitro Activity of Bmp-13. *J Bone Miner Res,* 10, pp. S163-S163

Heinecke, K., Seher, A., Schmitz, W., Mueller, T.D., Sebald, W.*, et al.* (2009). Receptor oligomerization and beyond: a case study in bone morphogenetic proteins. *BMC Biol,* 7, pp. 59

Heldin, C.H., Miyazono, K., and ten Dijke, P. (1997). TGF-beta signalling from cell membrane to nucleus through SMAD proteins. *Nature,* 390, 6659, pp. 465-471

Hinchliffe, J.R., and Johnson, D.R. (1980). *The development of the vertebrate limb: an approach through experiment, genetics, and evolution*. Oxford University Press, ISBN 9780198575528

Hirshoren, N., Gross, M., Banin, E., Sosna, J., Bargal, R.*, et al.* (2008). P35S mutation in the NOG gene associated with Teunissen-Cremers syndrome and features of multiple NOG joint-fusion syndromes. *Eur J Med Genet,* 51, 4, pp. 351-357

Hogan, B.L. (1996). Bone morphogenetic proteins in development. *Curr Opin Genet Dev,* 6, 4, pp. 432-438

Holley, S.A., Neul, J.L., Attisano, L., Wrana, J.L., Sasai, Y.*, et al.* (1996). The Xenopus dorsalizing factor noggin ventralizes Drosophila embryos by preventing DPP from activating its receptor. *Cell,* 86, 4, pp. 607-617

Hoodless, P.A., Haerry, T., Abdollah, S., Stapleton, M., O'Connor, M.B.*, et al.* (1996). MADR1, a MAD-related protein that functions in BMP2 signaling pathways. *Cell,* 85, 4, pp. 489-500


Douzgou, S., Lehmann, K., Mingarelli, R., Mundlos, S., and Dallapiccola, B. (2008). Compound heterozygosity for GDF5 in Du Pan type chondrodysplasia. *Am J Med Genet A,* 146A, 16, pp. 2116-2121

Dubois, C.M., Laprise, M.H., Blanchette, F., Gentry, L.E., and Leduc, R. (1995). Processing of transforming growth factor beta 1 precursor by human furin convertase. *J Biol Chem,* 270, 18, pp. 10618-10624

Egli, R.J., Southam, L., Wilkins, J.M., Lorenzen, I., Pombo-Suarez, M.*, et al.* (2009). Functional analysis of the osteoarthritis susceptibility-associated GDF5 regulatory polymorphism. *Arthritis Rheum,* 60, 7, pp. 2055-2064

Emery, S.B., Meyer, A., Miller, L., and Lesperance, M.M. (2009). Otosclerosis or congenital stapes ankylosis? The diagnostic role of genetic analysis. *Otol Neurotol,* 30, 8, pp. 1204-1208

Everman, D.B., Bartels, C.F., Yang, Y., Yanamandra, N., Goodman, F.R.*, et al.* (2002). The mutational spectrum of brachydactyly type C. *Am J Med Genet,* 112, 3, pp. 291-296

Faiyaz-Ul-Haque, M., Ahmad, W., Wahab, A., Haque, S., Azim, A.C.*, et al.* (2002a). Frameshift mutation in the cartilage-derived morphogenetic protein 1 (CDMP1) gene and severe acromesomelic chondrodysplasia resembling Grebe-type chondrodysplasia. *Am J Med Genet,* 111, 1, pp. 31-37

Faiyaz-Ul-Haque, M., Ahmad, W., Zaidi, S.H., Haque, S., Teebi, A.S.*, et al.* (2002b). Mutation in the cartilage-derived morphogenetic protein-1 (CDMP1) gene in a kindred affected with fibular hypoplasia and complex brachydactyly (DuPan syndrome). *Clin Genet,* 61, 6, pp. 454-458

Faiyaz-Ul-Haque, M., Faqeih, E.A., Al-Zaidan, H., Al-Shammary, A., and Zaidi, S.H. (2008). Grebe-type chondrodysplasia: a novel missense mutation in a conserved cysteine of the growth differentiation factor 5. *J Bone Miner Metab,* 26, 6, pp. 648-652

Francis-West, P.H., Abdelfattah, A., Chen, P., Allen, C., Parish, J.*, et al.* (1999). Mechanisms of GDF-5 action during skeletal development. *Development,* 126, 6, pp. 1305-1315

Francois, V., Solloway, M., O'Neill, J.W., Emery, J., and Bier, E. (1994). Dorsal-ventral patterning of the Drosophila embryo depends on a putative negative growth factor encoded by the short gastrulation gene. *Genes Dev,* 8, 21, pp. 2602-2616

Galjaard, R.J., van der Ham, L.I., Posch, N.A., Dijkstra, P.F., Oostra, B.A.*, et al.* (2001). Differences in complexity of isolated brachydactyly type C cannot be attributed to locus heterogeneity alone. *Am J Med Genet,* 98, 3, pp. 256-262

Gao, B., Guo, J., She, C., Shu, A., Yang, M.*, et al.* (2001). Mutations in IHH, encoding Indian hedgehog, cause brachydactyly type A-1. *Nat Genet,* 28, 4, pp. 386-388

Gao, B., Hu, J., Stricker, S., Cheung, M., Ma, G.*, et al.* (2009). A mutation in Ihh that causes digit abnormalities alters its signalling capacity and range. *Nature,* 458, 7242, pp. 1196-1200

Garamszegi, N., Dore, J.J., Jr., Penheiter, S.G., Edens, M., Yao, D.*, et al.* (2001). Transforming growth factor beta receptor signaling and endocytosis are linked through a COOH terminal activation motif in the type I receptor. *Mol Biol Cell,* 12, 9, pp. 2881-2893



Lawrence, D.A., Pircher, R., Kryceve-Martinerie, C., and Jullien, P. (1984). Normal embryo fibroblasts release transforming growth factors in a latent form. *J Cell Physiol,* 121, 1, pp. 184-188

Lehmann, K., Seemann, P., Boergermann, J., Morin, G., Reif, S.*, et al.* (2006). A novel R486Q mutation in BMPR1B resulting in either a brachydactyly type C/symphalangism-like phenotype or brachydactyly type A2. *Eur J Hum Genet,* 14, 12, pp. 1248-1254

Lehmann, K., Seemann, P., Silan, F., Goecke, T.O., Irgang, S.*, et al.* (2007). A new subtype of brachydactyly type B caused by point mutations in the bone morphogenetic protein antagonist NOGGIN. *Am J Hum Genet,* 81, 2, pp. 388-396

Lehmann, K., Seemann, P., Stricker, S., Sammar, M., Meyer, B.*, et al.* (2003). Mutations in bone morphogenetic protein receptor 1B cause brachydactyly type A2. *Proc Natl Acad Sci U S A,* 100, 21, pp. 12277-12282

Lin, L., Valore, E.V., Nemeth, E., Goodnough, J.B., Gabayan, V.*, et al.* (2007). Iron transferrin regulates hepcidin synthesis in primary hepatocyte culture through hemojuvelin and BMP2/4. *Blood,* 110, 6, pp. 2182-2189

Liu, M., Wang, X., Cai, Z., Tang, Z., Cao, K.*, et al.* (2006). A novel heterozygous mutation in the Indian hedgehog gene (IHH) is associated with brachydactyly type A1 in a Chinese family. *J Hum Genet,* 51, 8, pp. 727-731

Lopez-Casillas, F., Wrana, J.L., and Massague, J. (1993). Betaglycan presents ligand to the TGF beta signaling receptor. *Cell,* 73, 7, pp. 1435-1444

Lyons, K.M., Pelton, R.W., and Hogan, B.L. (1989). Patterns of expression of murine Vgr-1 and BMP-2a RNA suggest that transforming growth factor-beta-like genes coordinately regulate aspects of embryonic development. *Genes Dev,* 3, 11, pp. 1657-1668

Macias, D., Ganan, Y., Sampath, T.K., Piedra, M.E., Ros, M.A.*, et al.* (1997). Role of BMP-2 and OP-1 (BMP-7) in programmed cell death and skeletogenesis during chick limb development. *Development,* 124, 6, pp. 1109-1117

Maloul, A., Rossmeier, K., Mikic, B., Pogue, V., and Battaglia, T. (2006). Geometric and material contributions to whole bone structural behavior in GDF-7-deficient mice. *Connect Tissue Res,* 47, 3, pp. 157-162

Mangino, M., Flex, E., Digilio, M.C., Giannotti, A., and Dallapiccola, B. (2002). Identification of a novel NOG gene mutation (P35S) in an Italian family with symphalangism. *Hum Mutat,* 19, 3, pp. 308

Manjon, C., Sanchez-Herrero, E., and Suzanne, M. (2007). Sharp boundaries of Dpp signalling trigger local cell death required for Drosophila leg morphogenesis. *Nat Cell Biol,* 9, 1, pp. 57-63

Marcelino, J., Sciortino, C.M., Romero, M.F., Ulatowski, L.M., Ballock, R.T.*, et al.* (2001). Human disease-causing NOG missense mutations: effects on noggin secretion, dimer formation, and bone morphogenetic protein binding. *Proc Natl Acad Sci U S A,* 98, 20, pp. 11353-11358

Massague, J. (2000). How cells read TGF-beta signals. *Nat Rev Mol Cell Biol,* 1, 3, pp. 169-178

Massague, J., Seoane, J., and Wotton, D. (2005). Smad transcription factors. *Genes Dev,* 19, 23, pp. 2783-2810



46 Mutations in Human Genetic Disease

Huse, M., Chen, Y.G., Massague, J., and Kuriyan, J. (1999). Crystal structure of the cytoplasmic domain of the type I TGF beta receptor in complex with FKBP12. *Cell,* 96, 3, pp. 425-436

Huse, M., Muir, T.W., Xu, L., Chen, Y.G., Kuriyan, J.*, et al.* (2001). The TGF beta receptor activation process: an inhibitor- to substrate-binding switch. *Mol Cell,* 8, 3, pp. 671-682

Janssens, K., ten Dijke, P., Ralston, S.H., Bergmann, C., and Van Hul, W. (2003). Transforming growth factor-beta 1 mutations in Camurati-Engelmann disease lead to increased signaling by altering either activation or secretion of the mutant protein. *J Biol Chem,* 278, 9, pp. 7718-7724

Johnson, D., Kan, S.H., Oldridge, M., Trembath, R.C., Roche, P.*, et al.* (2003). Missense mutations in the homeodomain of HOXD13 are associated with brachydactyly types D and E. *Am J Hum Genet,* 72, 4, pp. 984-997

Karp, S.J., Schipani, E., St-Jacques, B., Hunzelman, J., Kronenberg, H.*, et al.* (2000). Indian hedgehog coordinates endochondral bone growth and morphogenesis via parathyroid hormone related-protein-dependent and -independent pathways. *Development,* 127, 3, pp. 543-548

Keller, S., Nickel, J., Zhang, J.L., Sebald, W., and Mueller, T.D. (2004). Molecular recognition of BMP-2 and BMP receptor IA. *Nat Struct Mol Biol,* 11, 5, pp. 481-488

Kingsley, D.M. (1994). What do BMPs do in mammals? Clues from the mouse short-ear mutation. *Trends Genet,* 10, 1, pp. 16-21

Kirsch, T., Nickel, J., and Sebald, W. (2000). BMP-2 antagonists emerge from alterations in the low-affinity binding epitope for receptor BMPR-II. *EMBO J,* 19, 13, pp. 3314-3324

Klages, J., Kotzsch, A., Coles, M., Sebald, W., Nickel, J.*, et al.* (2008). The solution structure of BMPR-IA reveals a local disorder-to-order transition upon BMP-2 binding. *Biochemistry,* 47, 46, pp. 11930-11939

Klammert, U., Kübler, A., Wuerzler, K.K., Sebald, W., Mueller, T.D.*, et al.* (2011). Dependent on the Cellular Context GDF-5 can act as potent BMP-2 Inhibitor. *submitted*

Kosaki, K., Sato, S., Hasegawa, T., Matsuo, N., Suzuki, T.*, et al.* (2004). Premature ovarian failure in a female with proximal symphalangism and Noggin mutation. *Fertil Steril,* 81, 4, pp. 1137-1139

Kotzsch, A., Nickel, J., Seher, A., Heinecke, K., van Geersdaele, L.*, et al.* (2008). Structure analysis of bone morphogenetic protein-2 type I receptor complexes reveals a mechanism of receptor inactivation in juvenile polyposis syndrome. *J Biol Chem,* 283, 9, pp. 5876-5887

Kotzsch, A., Nickel, J., Seher, A., Sebald, W., and Muller, T.D. (2009). Crystal structure analysis reveals a spring-loaded latch as molecular mechanism for GDF-5-type I receptor specificity. *EMBO J,* 28, 7, pp. 937-947

Krause, C., Guzman, A., and Knaus, P. (2011). Noggin. *Int J Biochem Cell Biol,* 43, 4, pp. 478-481

Lanske, B., Karaplis, A.C., Lee, K., Luz, A., Vortkamp, A.*, et al.* (1996). PTH/PTHrP receptor in early development and Indian hedgehog-regulated bone growth. *Science,* 273, 5275, pp. 663-666



Newfeld, S.J., Wisotzkey, R.G., and Kumar, S. (1999). Molecular evolution of a developmental pathway: phylogenetic analyses of transforming growth factor-beta family ligands, receptors and Smad signal transducers. *Genetics,* 152, 2, pp. 783-795

Nickel, J., Kotzsch, A., Sebald, W., and Mueller, T.D. (2005). A single residue of GDF-5 defines binding specificity to BMP receptor IB. *J Mol Biol,* 349, 5, pp. 933-947

Nickel, J., Sebald, W., Groppe, J.C., and Mueller, T.D. (2009). Intricacies of BMP receptor assembly. *Cytokine Growth Factor Rev,* 20, 5-6, pp. 367-377

Nishitoh, H., Ichijo, H., Kimura, M., Matsumoto, T., Makishima, F.*, et al.* (1996). Identification of type I and type II serine/threonine kinase receptors for growth/differentiation factor-5. *J Biol Chem,* 271, 35, pp. 21345-21352

O'Connor, M.B., Umulis, D., Othmer, H.G., and Blair, S.S. (2006). Shaping BMP morphogen gradients in the Drosophila embryo and pupal wing. *Development,* 133, 2, pp. 183-193

Oldridge, M., Fortuna, A.M., Maringa, M., Propping, P., Mansour, S.*, et al.* (2000). Dominant mutations in ROR2, encoding an orphan receptor tyrosine kinase, cause brachydactyly type B. *Nat Genet,* 24, 3, pp. 275-278

Onichtchouk, D., Chen, Y.G., Dosch, R., Gawantka, V., Delius, H.*, et al.* (1999). Silencing of TGF-beta signalling by the pseudoreceptor BAMBI. *Nature,* 401, 6752, pp. 480-485

Owens, E.M., and Solursh, M. (1982). Cell-cell interaction by mouse limb cells during in vitro chondrogenesis: analysis of the brachypod mutation. *Dev Biol,* 91, 2, pp. 376-388

Oxley, C.D., Rashid, R., Goudie, D.R., Stranks, G., Baty, D.U.*, et al.* (2008). Growth and skeletal development in families with NOGGIN gene mutations. *Horm Res,* 69, 4, pp. 221-226

Pathi, S., Rutenberg, J.B., Johnson, R.L., and Vortkamp, A. (1999). Interaction of Ihh and BMP/Noggin signaling during cartilage differentiation. *Dev Biol,* 209, 2, pp. 239-253

Perez, W.D., Weller, C.R., Shou, S., and Stadler, H.S. (2010). Survival of Hoxa13 homozygous mutants reveals a novel role in digit patterning and appendicular skeletal development. *Dev Dyn,* 239, 2, pp. 446-457

Ploger, F., Seemann, P., Schmidt-von Kegler, M., Lehmann, K., Seidel, J.*, et al.* (2008). Brachydactyly type A2 associated with a defect in proGDF5 processing. *Hum Mol Genet,* 17, 9, pp. 1222-1233

Pogue, R., and Lyons, K. (2006). BMP signaling in the cartilage growth plate. *Curr Top Dev Biol,* 76, pp. 1-48

Polinkovsky, A., Robin, N.H., Thomas, J.T., Irons, M., Lynn, A.*, et al.* (1997). Mutations in CDMP1 cause autosomal dominant brachydactyly type C. *Nat Genet,* 17, 1, pp. 18-19

Potti, T.A., Petty, E.M., and Lesperance, M.M. (2011). A comprehensive review of reported heritable noggin-associated syndromes and proposed clinical utility of one broadly inclusive diagnostic term: NOG-related-symphalangism spectrum disorder (NOG-SSD). *Hum Mutat,* 32, 8, pp. 877-886

Reddi, A.H. (1998). Role of morphogenetic proteins in skeletal tissue engineering and regeneration. *Nat Biotechnol,* 16, 3, pp. 247-252

Reynard, L.N., Bui, C., Canty-Laird, E.G., Young, D.A., and Loughlin, J. (2011). Expression of the osteoarthritis-associated gene GDF5 is modulated epigenetically by DNA methylation. *Hum Mol Genet,* 20, 17, pp. 3450-3460



Masuya, H., Nishida, K., Furuichi, T., Toki, H., Nishimura, G.*, et al.* (2007). A novel dominant-negative mutation in Gdf5 generated by ENU mutagenesis impairs joint formation and causes osteoarthritis in mice. *Hum Mol Genet,* 16, 19, pp. 2366-2375

McLellan, J.S., Yao, S., Zheng, X., Geisbrecht, B.V., Ghirlando, R.*, et al.* (2006). Structure of a heparin-dependent complex of Hedgehog and Ihog. *Proc Natl Acad Sci U S A,* 103, 46, pp. 17208-17213

McMahon, J.A., Takada, S., Zimmerman, L.B., Fan, C.M., Harland, R.M.*, et al.* (1998). Noggin-mediated antagonism of BMP signaling is required for growth and patterning of the neural tube and somite. *Genes Dev,* 12, 10, pp. 1438-1452

Merino, R., Macias, D., Ganan, Y., Economides, A.N., Wang, X.*, et al.* (1999a). Expression and function of Gdf-5 during digit skeletogenesis in the embryonic chick leg bud. *Dev Biol,* 206, 1, pp. 33-45

Merino, R., Rodriguez-Leon, J., Macias, D., Ganan, Y., Economides, A.N.*, et al.* (1999b). The BMP antagonist Gremlin regulates outgrowth, chondrogenesis and programmed cell death in the developing limb. *Development,* 126, 23, pp. 5515-5522

Mikic, B., Bierwert, L., and Tsou, D. (2006). Achilles tendon characterization in GDF-7 deficient mice. *J Orthop Res,* 24, 4, pp. 831-841

Minina, E., Schneider, S., Rosowski, M., Lauster, R., and Vortkamp, A. (2005). Expression of Fgf and Tgfbeta signaling related genes during embryonic endochondral ossification. *Gene Expr Patterns,* 6, 1, pp. 102-109

Minina, E., Wenzel, H.M., Kreschel, C., Karp, S., Gaffield, W.*, et al.* (2001). BMP and Ihh/PTHrP signaling interact to coordinate chondrocyte proliferation and differentiation. *Development,* 128, 22, pp. 4523-4534

Mishina, Y., Suzuki, A., Ueno, N., and Behringer, R.R. (1995). Bmpr encodes a type I bone morphogenetic protein receptor that is essential for gastrulation during mouse embryogenesis. *Genes Dev,* 9, 24, pp. 3027-3037

Mitrovic, D. (1978). Development of the diarthrodial joints in the rat embryo. *Am J Anat,* 151, 4, pp. 475-485

Miyamoto, Y., Mabuchi, A., Shi, D., Kubo, T., Takatori, Y.*, et al.* (2007). A functional polymorphism in the 5' UTR of GDF5 is associated with susceptibility to osteoarthritis. *Nat Genet,* 39, 4, pp. 529-533

Miyazawa, K., Shinozaki, M., Hara, T., Furuya, T., and Miyazono, K. (2002). Two major Smad pathways in TGF-beta superfamily signalling. *Genes Cells,* 7, 12, pp. 1191-1204

Miyazono, K. (2000). TGF-beta signaling by Smad proteins. *Cytokine Growth Factor Rev,* 11, 1-2, pp. 15-22

Miyazono, K., Hellman, U., Wernstedt, C., and Heldin, C.H. (1988). Latent high molecular weight complex of transforming growth factor beta 1. Purification from human platelets and structural characterization. *J Biol Chem,* 263, 13, pp. 6407-6415

Mundlos, S. (2009). The brachydactylies: a molecular disease family. *Clin Genet,* 76, 2, pp. 123-136

Nakao, A., Roijer, E., Imamura, T., Souchelnytskyi, S., Stenman, G.*, et al.* (1997). Identification of Smad2, a human Mad-related protein in the transforming growth factor beta signaling pathway. *J Biol Chem,* 272, 5, pp. 2896-2900



Sengle, G., Ono, R.N., Lyons, K.M., Bachinger, H.P., and Sakai, L.Y. (2008). A new model for growth factor activation: type II receptors compete with the prodomain for BMP-7. *J Mol Biol,* 381, 4, pp. 1025-1039

Sengle, G., Ono, R.N., Sasaki, T., and Sakai, L.Y. (2011). Prodomains of transforming growth factor beta (TGFbeta) superfamily members specify different functions: extracellular matrix interactions and growth factor bioavailability. *J Biol Chem,* 286, 7, pp. 5087-5099

Settle, S., Marker, P., Gurley, K., Sinha, A., Thacker, A.*, et al.* (2001). The BMP family member Gdf7 is required for seminal vesicle growth, branching morphogenesis, and cytodifferentiation. *Dev Biol,* 234, 1, pp. 138-150

Settle, S.H., Jr., Rountree, R.B., Sinha, A., Thacker, A., Higgins, K.*, et al.* (2003). Multiple joint and skeletal patterning defects caused by single and double mutations in the mouse Gdf6 and Gdf5 genes. *Dev Biol,* 254, 1, pp. 116-130

Shi, M., Zhu, J., Wang, R., Chen, X., Mi, L.*, et al.* (2011). Latent TGF-beta structure and activation. *Nature,* 474, 7351, pp. 343-349

Shi, Y., and Massague, J. (2003). Mechanisms of TGF-beta signaling from cell membrane to the nucleus. *Cell,* 113, 6, pp. 685-700

Shimmi, O., and O'Connor, M.B. (2003). Physical properties of Tld, Sog, Tsg and Dpp protein interactions are predicted to help create a sharp boundary in Bmp signals during dorsoventral patterning of the Drosophila embryo. *Development,* 130, 19, pp. 4673-4682

Shimmi, O., Umulis, D., Othmer, H., and O'Connor, M.B. (2005). Facilitated transport of a Dpp/Scw heterodimer by Sog/Tsg leads to robust patterning of the Drosophila blastoderm embryo. *Cell,* 120, 6, pp. 873-886

Smith, W.C., and Harland, R.M. (1992). Expression cloning of noggin, a new dorsalizing factor localized to the Spemann organizer in Xenopus embryos. *Cell,* 70, 5, pp. 829-840

Song, K., Krause, C., Shi, S., Patterson, M., Suto, R.*, et al.* (2010). Identification of a key residue mediating bone morphogenetic protein (BMP)-6 resistance to noggin inhibition allows for engineered BMPs with superior agonist activity. *J Biol Chem,* 285, 16, pp. 12169-12180

Southam, L., Rodriguez-Lopez, J., Wilkins, J.M., Pombo-Suarez, M., Snelling, S.*, et al.* (2007). An SNP in the 5'-UTR of GDF5 is associated with osteoarthritis susceptibility in Europeans and with in vivo differences in allelic expression in articular cartilage. *Hum Mol Genet,* 16, 18, pp. 2226-2232

St-Jacques, B., Hammerschmidt, M., and McMahon, A.P. (1999). Indian hedgehog signaling regulates proliferation and differentiation of chondrocytes and is essential for bone formation. *Genes Dev,* 13, 16, pp. 2072-2086

Stelzer, C., Winterpacht, A., Spranger, J., and Zabel, B. (2003). Grebe dysplasia and the spectrum of CDMP1 mutations. *Pediatr Pathol Mol Med,* 22, 1, pp. 77-85

Storm, E.E., Huynh, T.V., Copeland, N.G., Jenkins, N.A., Kingsley, D.M.*, et al.* (1994). Limb alterations in brachypodism mice due to mutations in a new member of the TGF beta-superfamily. *Nature,* 368, 6472, pp. 639-643



Ricard, N., Bidart, M., Mallet, C., Lesca, G., Giraud, S.*, et al.* (2010). Functional analysis of the BMP9 response of ALK1 mutants from HHT2 patients: a diagnostic tool for novel ACVRL1 mutations. *Blood,* 116, 9, pp. 1604-1612

Rosen, V., and Thies, R.S. (1992). The BMP proteins in bone formation and repair. *Trends Genet,* 8, 3, pp. 97-102

Rouault, K., Scotet, V., Autret, S., Gaucher, F., Dubrana, F.*, et al.* (2010). Evidence of association between GDF5 polymorphisms and congenital dislocation of the hip in a Caucasian population. *Osteoarthritis Cartilage,* 18, 9, pp. 1144-1149

Rudnik-Schoneborn, S., Takahashi, T., Busse, S., Schmidt, T., Senderek, J.*, et al.* (2010). Facioaudiosymphalangism syndrome and growth acceleration associated with a heterozygous NOG mutation. *Am J Med Genet A,* 152A, 6, pp. 1540-1544

Sakou, T., Onishi, T., Yamamoto, T., Nagamine, T., Sampath, T.*, et al.* (1999). Localization of Smads, the TGF-beta family intracellular signaling components during endochondral ossification. *J Bone Miner Res,* 14, 7, pp. 1145-1152

Samad, T.A., Rebbapragada, A., Bell, E., Zhang, Y., Sidis, Y.*, et al.* (2005). DRAGON, a bone morphogenetic protein co-receptor. *J Biol Chem,* 280, 14, pp. 14122-14129

Saremba, S., Nickel, J., Seher, A., Kotzsch, A., Sebald, W.*, et al.* (2008). Type I receptor binding of bone morphogenetic protein 6 is dependent on N-glycosylation of the ligand. *Febs J,* 275, 1, pp. 172-183

Schmid, B., Furthauer, M., Connors, S.A., Trout, J., Thisse, B.*, et al.* (2000). Equivalent genetic roles for bmp7/snailhouse and bmp2b/swirl in dorsoventral pattern formation. *Development,* 127, 5, pp. 957-967

Schwabe, G.C., Tinschert, S., Buschow, C., Meinecke, P., Wolff, G.*, et al.* (2000). Distinct mutations in the receptor tyrosine kinase gene ROR2 cause brachydactyly type B. *Am J Hum Genet,* 67, 4, pp. 822-831

Schwabe, G.C., Turkmen, S., Leschik, G., Palanduz, S., Stover, B.*, et al.* (2004). Brachydactyly type C caused by a homozygous missense mutation in the prodomain of CDMP1. *Am J Med Genet A,* 124A, 4, pp. 356-363

Schwaerzer, G.K., Hiepen, C., Schrewe, H., Nickel, J., Ploeger, F.*, et al.* (2011). New insights into the molecular mechanisms of multiple synostoses syndrome: Mutation within the GDF5 knuckle epitope causes noggin-resistance. *J Bone Miner Res,* 27, 2, pp. 429-442

Sebald, W., Nickel, J., Zhang, J.L., and Mueller, T.D. (2004). Molecular recognition in bone morphogenetic protein (BMP)/receptor interaction. *Biol Chem,* 385, 8, pp. 697-710

Seemann, P., Brehm, A., Konig, J., Reissner, C., Stricker, S.*, et al.* (2009). Mutations in GDF5 reveal a key residue mediating BMP inhibition by NOGGIN. *PLoS Genet,* 5, 11, pp. e1000747

Seemann, P., Schwappacher, R., Kjaer, K.W., Krakow, D., Lehmann, K.*, et al.* (2005). Activating and deactivating mutations in the receptor interaction site of GDF5 cause symphalangism or brachydactyly type A2. *J Clin Invest,* 115, 9, pp. 2373-2381

Seki, K., and Hata, A. (2004). Indian hedgehog gene is a target of the bone morphogenetic protein signaling pathway. *J Biol Chem,* 279, 18, pp. 18544-18549



Wanek, N., Muneoka, K., Holler-Dinsmore, G., Burton, R., and Bryant, S.V. (1989). A staging system for mouse limb development. *J Exp Zool,* 249, 1, pp. 41-49

Wang, X., Xiao, F., Yang, Q., Liang, B., Tang, Z.*, et al.* (2006). A novel mutation in GDF5 causes autosomal dominant symphalangism in two Chinese families. *Am J Med Genet A,* 140A, 17, pp. 1846-1853

Weber, D., Kotzsch, A., Nickel, J., Harth, S., Seher, A.*, et al.* (2007). A silent H-bond can be mutationally activated for high-affinity interaction of BMP-2 and activin type IIB receptor. *BMC Struct Biol,* 7, pp. 6

Weekamp, H.H., Kremer, H., Hoefsloot, L.H., Kuijpers-Jagtman, A.M., Cruysberg, J.R.*, et al.* (2005). Teunissen-Cremers syndrome: a clinical, surgical, and genetic report. *Otol Neurotol,* 26, 1, pp. 38-51

Wiater, E., and Vale, W. (2003). Inhibin is an antagonist of bone morphogenetic protein signaling. *J Biol Chem,* 278, 10, pp. 7934-7941

Wieser, R., Wrana, J.L., and Massague, J. (1995). GS domain mutations that constitutively activate T beta R-I, the downstream signaling component in the TGF-beta receptor complex. *EMBO J,* 14, 10, pp. 2199-2208

Wolfman, N.M., Celeste, A.J., Cox, K., Hattersley, G., Nelson, R.*, et al.* (1995). Preliminary Characterization of the Biological-Activities Rhbmp-12. *J Bone Miner Res,* 10, pp. S148-S148

Wolfman, N.M., Hattersley, G., Cox, K., Celeste, A.J., Nelson, R.*, et al.* (1997). Ectopic induction of tendon and ligament in rats by growth and differentiation factors 5, 6, and 7, members of the TGF-beta gene family. *J Clin Invest,* 100, 2, pp. 321-330

Wotton, D., and Massague, J. (2001). Smad transcriptional corepressors in TGF beta family signaling. *Curr Top Microbiol Immunol,* 254, pp. 145-164

Wu, M.Y., and Hill, C.S. (2009). Tgf-beta superfamily signaling in embryonic development and homeostasis. *Dev Cell,* 16, 3, pp. 329-343

Yi, S.E., LaPolt, P.S., Yoon, B.S., Chen, J.Y., Lu, J.K.*, et al.* (2001). The type I BMP receptor BmprIB is essential for female reproductive function. *Proc Natl Acad Sci U S A,* 98, 14, pp. 7994-7999

Yokouchi, Y., Sakiyama, J., Kameda, T., Iba, H., Suzuki, A.*, et al.* (1996). BMP-2/-4 mediate programmed cell death in chicken limb buds. *Development,* 122, 12, pp. 3725-3734

Zhao, G.Q. (2003). Consequences of knocking out BMP signaling in the mouse. *Genesis,* 35, 1, pp. 43-56

Zhao, X., Sun, M., Zhao, J., Leyva, J.A., Zhu, H.*, et al.* (2007). Mutations in HOXD13 underlie syndactyly type V and a novel brachydactyly-syndactyly syndrome. *Am J Hum Genet,* 80, 2, pp. 361-371

Zhu, H., Kavsak, P., Abdollah, S., Wrana, J.L., and Thomsen, G.H. (1999). A SMAD ubiquitin ligase targets the BMP pathway and affects embryonic pattern formation. *Nature,* 400, 6745, pp. 687-693

Zimmerman, L.B., De Jesus-Escobar, J.M., and Harland, R.M. (1996). The Spemann organizer signal noggin binds and inactivates bone morphogenetic protein 4. *Cell,* 86, 4, pp. 599-606



Storm, E.E., and Kingsley, D.M. (1996). Joint patterning defects caused by single and double mutations in members of the bone morphogenetic protein (BMP) family. *Development,* 122, 12, pp. 3969-3979

Storm, E.E., and Kingsley, D.M. (1999). GDF5 coordinates bone and joint formation during digit development. *Dev Biol,* 209, 1, pp. 11-27

Suzuki, M., Ueno, N., and Kuroiwa, A. (2003). Hox proteins functionally cooperate with the GC box-binding protein system through distinct domains. *J Biol Chem,* 278, 32, pp. 30148-30156

Szczaluba, K., Hilbert, K., Obersztyn, E., Zabel, B., Mazurczak, T.*, et al.* (2005). Du Pan syndrome phenotype caused by heterozygous pathogenic mutations in CDMP1 gene. *Am J Med Genet A,* 138, 4, pp. 379-383

Takagi, T., Moribe, H., Kondoh, H., and Higashi, Y. (1998). DeltaEF1, a zinc finger and homeodomain transcription factor, is required for skeleton patterning in multiple lineages. *Development,* 125, 1, pp. 21-31

Takahashi, T., Takahashi, I., Komatsu, M., Sawaishi, Y., Higashi, K.*, et al.* (2001). Mutations of the NOG gene in individuals with proximal symphalangism and multiple synostosis syndrome. *Clin Genet,* 60, 6, pp. 447-451

Temtamy, S.A., and Aglan, M.S. (2008). Brachydactyly. *Orphanet J Rare Dis,* 3, pp. 15

ten Dijke, P., Miyazono, K., and Heldin, C.H. (1996). Signaling via hetero-oligomeric complexes of type I and type II serine/threonine kinase receptors. *Curr Opin Cell Biol,* 8, 2, pp. 139-145

Thomas, J.T., Kilpatrick, M.W., Lin, K., Erlacher, L., Lembessis, P.*, et al.* (1997). Disruption of human limb morphogenesis by a dominant negative mutation in CDMP1. *Nat Genet,* 17, 1, pp. 58-64

Tylzanowski, P., Mebis, L., and Luyten, F.P. (2006). The Noggin null mouse phenotype is strain dependent and haploinsufficiency leads to skeletal defects. *Dev Dyn,* 235, 6, pp. 1599-1607

Ueno, N., Ling, N., Ying, S.Y., Esch, F., Shimasaki, S.*, et al.* (1987). Isolation and partial characterization of follistatin: a single-chain Mr 35,000 monomeric protein that inhibits the release of follicle-stimulating hormone. *Proc Natl Acad Sci U S A,* 84, 23, pp. 8282-8286

van den Ende, J.J., Mattelaer, P., Declau, F., Vanhoenacker, F., Claes, J.*, et al.* (2005). The facio-audio-symphalangism syndrome in a four generation family with a nonsense mutation in the NOG-gene. *Clin Dysmorphol,* 14, 2, pp. 73-80

Villavicencio-Lorini, P., Kuss, P., Friedrich, J., Haupt, J., Farooq, M.*, et al.* (2010). Homeobox genes d11-d13 and a13 control mouse autopod cortical bone and joint formation. *J Clin Invest,* 120, 6, pp. 1994-2004

Vortkamp, A., Lee, K., Lanske, B., Segre, G.V., Kronenberg, H.M.*, et al.* (1996). Regulation of rate of cartilage differentiation by Indian hedgehog and PTH-related protein. *Science,* 273, 5275, pp. 613-622

Walton, K.L., Makanji, Y., Chen, J., Wilce, M.C., Chan, K.L.*, et al.* (2010). Two distinct regions of latency-associated peptide coordinate stability of the latent transforming growth factor-beta1 complex. *J Biol Chem,* 285, 22, pp. 17029-17037

	- Zou, H., Wieser, R., Massague, J., and Niswander, L. (1997). Distinct roles of type I bone morphogenetic protein receptors in the formation and differentiation of cartilage. *Genes Dev,* 11, 17, pp. 2191-2203

**Chapter 3**

## **Missense Mutation in the** *LDLR* **Gene: A Wide Spectrum in the Severity of Familial Hypercholesterolemia**

Mathilde Varret and Jean-Pierre Rabès

Additional information is available at the end of the chapter

http://dx.doi.org/10.5772/36432

© 2012 Varret and Rabès, licensee InTech. This is an open access chapter distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

## **1. Introduction**


Hypercholesterolemia is a major risk factor for atherosclerosis and its premature cardiovascular complications. Hypercholesterolemia can be multifactorial (diet, genetic background, etc.) or, less frequently, monogenic, leading to Autosomal Dominant Hypercholesterolemia (ADH, OMIM #143890). ADH is characterised by a selective elevation of plasma Low Density Lipoprotein (LDL) levels, tendon xanthomas and premature coronary heart disease. ADH has proven to be genetically heterogeneous and is associated with defects in at least 3 different genes: *LDLR* (LDL receptor), *APOB* (apolipoprotein B) and *PCSK9* (proprotein convertase subtilisin-kexin type 9).

Familial hypercholesterolemia (FH, OMIM #606945) is the most frequent form of ADH and is due to mutations within the gene encoding the LDL-specific receptor. FH is an autosomal co-dominant trait, with homozygotes being more severely affected than heterozygotes (Goldstein and Brown, 1989). FH is also one of the most common inherited disorders, with a frequency of heterozygotes estimated at 1:500 and a frequency of homozygotes of ≈ 1:1,000,000 in most populations. In certain communities, such as French Canadians (Moorjani et al. 1989), Finns (Koivisto et al. 1992), Afrikaners (Kotze et al. 1989; Leitersdorf et al. 1989), Druze (Landsberger et al. 1992) and Lebanese (Lehrman et al. 1987), FH frequency can be as high as 1/67 because of founder effects.

## **2. The LDL receptor**

The human low-density lipoprotein receptor mediates the transport of LDL into cells via endocytosis, and thus plays a major role in the clearance of lipoproteins from the blood. In 1973, by studying homozygous patient fibroblasts, Michael S. Brown and Joseph L. Goldstein showed that the deficient protein in Familial Hypercholesterolemia was the LDL receptor (Goldstein and Brown, 1985).


The *LDLR* gene is localised at 19p13.1-p13.3, spans 45 kb and includes 18 exons (Lindgren et al. 1985; Yamamoto et al. 1984). It is ubiquitously expressed and encodes a glycoprotein of 839 amino acids that is pivotal in cholesterol homeostasis. The correspondence between the 6 functional domains of the protein and the exons of the *LDLR* gene is now well-established (Figure 1) (See Jeon and Blacklow 2005 for a review).


1. The signal peptide (21 amino acids), encoded by exon 1, is necessary for transport to the cell membrane and is cleaved during translocation into the endoplasmic reticulum (ER).
2. The ligand binding domain, encoded by exons 2 to 6, mediates the interaction with lipoproteins. This domain is made of seven modules named LDL receptor type A repeats (LR), which are homologous to sequences of the C9 protein of the complement cascade (Südhof et al. 1985). Each LR module is about 40 residues long, has six conserved cysteine residues, and contains a conserved acidic region near the C-terminus which serves as a calcium-binding site (Yamamoto et al. 1984, Fass et al. 1997). Mutational studies of the seven LR modules of the LDL receptor indicate that modules 3-7 all contribute significantly to the binding of LDL particles (Russell et al. 1989). The LR5 and LR6 modules are essentially structurally independent of each other (North et al. 1999).
3. The EGF precursor homology domain (400 amino acids encoded by exons 7 to 14) is made of three 40 amino acid repeats homologous to the EGF precursor, and is involved in the dissociation of the receptor and the lipoprotein in the endocytosis machinery. The first two repeats are contiguous and separated from the third by a 280 amino acid sequence that contains five copies of a conserved motif (YWTD) repeated once every 40-60 amino acids. The first epidermal growth factor-like repeat (EGF-A) in the EGF homology domain interacts in a sequence-specific manner with proprotein convertase subtilisin/kexin type 9 (PCSK9) (Zhang et al. 2007, Kwon et al. 2008). PCSK9 post-translationally regulates hepatic LDL receptors by binding to them on the cell surface and leading to their degradation. Gain-of-function mutations that increase the affinity of PCSK9 toward the receptor and increase plasma LDL-cholesterol levels in humans have been reported in the *PCSK9* gene in association with Autosomal Dominant Hypercholesterolemia (Abifadel et al. 2003, 2009). Loss-of-function mutations that decrease the affinity of PCSK9 toward the receptor have also been reported in the *PCSK9* gene, in association with low plasma levels of LDL (Cohen et al. 2005).
4. Exon 15 encodes a 58 amino acid sequence that is enriched in serines and threonines, which serve as attachment sites for O-linked sugar chains. The absence of this exon has no significant functional consequence in cultured hamster fibroblasts (Davis et al. 1986).
5. The 22 amino acid membrane-anchoring domain, encoded by exon 16 and the 5' end of exon 17, is essential to the attachment of the receptor to the cell membrane.
6. The 50 amino acid cytoplasmic tail, encoded by the remainder of exon 17 and the 5' end of exon 18, is involved in the endocytosis of the protein. The NPXY motif was shown to interact with the AP-2 clathrin adaptor and thus is important in the localisation of the receptor in coated pits on the cell surface. The NPXY motif was also shown to interact with the phosphotyrosine binding (PTB) domain of a specific clathrin adaptor protein encoded by the *LDLRAP1* gene. Mutations in the *LDLRAP1* gene have been reported in Autosomal Recessive Hypercholesterolemia (Garcia et al. 2001, Soutar 2010).

The remainder of exon 18 specifies the 2.6 kb 3' untranslated region of the mRNA.

**Figure 1.** Correspondence between functional domains of the protein and exons of the *LDLR* gene.

In normal fibroblasts, the precursor protein is modified in the ER: the 21 amino acid signal peptide is cleaved and the 120 kDa precursor is O-glycosylated to give rise to the 160 kDa protein. The resultant mature protein is transported from the Golgi apparatus to the cell surface within 30 minutes. The transmembrane receptor is present at the surface of most cell types and mediates the transport of LDL into cells via receptor-mediated endocytosis, thus playing a pivotal role in cholesterol homeostasis (Goldstein and Brown, 2009). Upon endosome acidification, the lipoprotein particle dissociates from the receptor and is degraded, while the receptor is recycled back to the membrane.

## **3. Mutations in the** *LDLR* **gene**

Mutations involving a small number of nucleotides, from point mutations to small deletions or insertions, account for 90% of all mutations in the *LDLR* gene, while the remainder are major rearrangements due to unequal recombination between the 30 *Alu* sequences identified throughout the gene (Hobbs et al. 1990). To date, more than 1400 point mutations and small deletions or insertions associated with FH have been reported in the *LDLR* gene (http://www.ucl.ac.uk/fh and www.umd.be/LDLR/).


The UMD-LDLR database (www.umd.be/LDLR/) currently includes 1404 point mutations, small deletions or insertions and mutations affecting splicing (intronic mutations) in the *LDLR* gene reported in the literature. It cannot accommodate mutations from the UTR and promoter regions, or large deletions, insertions or indels. In addition, two mutations that affect the same allele are entered as two different records linked by the same sample ID. If the same mutation has been reported in apparently unrelated patients (for example, c.1A>C (p.Met1Leu), identified in Spanish (Chaves et al. 2001), British (Day et al. 1997) and Dutch patients (Fouchier et al. 2005)), separate entries were made for each patient as recurrent mutations, in the absence of haplotypes demonstrating a common ancestor.

Among these 1404 small DNA variations of the *LDLR* gene, 58.5% are missense mutations, 21.7% are small deletions or insertions, 10.4% are nonsense mutations and 9.4% are splice site mutations. A large majority of these small DNA variations are single nucleotide substitutions (76.6%, 1076/1404), of which 75.1% are missense, 13.6% nonsense and 11.3% splice site mutations.
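The shares of each mutation class can be rechecked from the raw counts. The following Python sketch is illustrative only: it uses the counts quoted in this chapter (821 missense, 305 small deletions or insertions, 146 nonsense, 1404 in total), and it assumes that the splice site count, which is not stated explicitly, can be obtained by subtraction. The function name `percentages` is hypothetical.

```python
def percentages(counts):
    """Return each category's share of the total, in percent, rounded to 0.1."""
    total = sum(counts.values())
    return {name: round(100 * n / total, 1) for name, n in counts.items()}


# Counts quoted in the text. The splice site count is an assumption:
# it is derived here by subtracting the other classes from the total of 1404.
small_variations = {
    "missense": 821,
    "small deletion/insertion": 305,
    "nonsense": 146,
    "splice site": 1404 - 821 - 305 - 146,
}

shares = percentages(small_variations)
for name, pct in shares.items():
    print(f"{name}: {pct}%")
```

Under these assumptions the computed shares agree with the percentages given above.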

#### **3.1. Missense mutations**

Missense mutations are the most numerous of the small DNA variations (58.5%, 821/1404) reported in the *LDLR* gene in association with Familial Hypercholesterolemia (FH). Like the other small DNA variations in the *LDLR* gene, missense mutations are widely distributed throughout the whole sequence of the gene (Figure 2). Therefore, no real mutation hot spot can be defined, which supports the need to scan the whole gene sequence to identify FH-causing mutations in diagnostic procedures.

The CpG dinucleotide has been shown to be a hot spot for mutations in humans because it can undergo methylation-mediated deamination of 5-methylcytosine to thymine (Krawczak et al. 1998). The *LDLR* gene sequence includes 123 CpG dinucleotides, accounting for 4.8% of the coding sequence. This ratio is similar to the mean percentage of CpG (3.7%) in the coding sequence of a large number of genes involved in human diseases and localised on autosomes (Cooper and Krawczak 1990). Missense mutations are the only substitutions in the *LDLR* gene that occur at CpG dinucleotides, and they account for 4.8% (46/954) of all the single nucleotide variations. Interestingly, in the *LDLR* gene, the percentage of substitutions occurring at CpG dinucleotides (4.8%) is significantly lower than the mean observed for disease-causing mutations in other genes (37%) (Cooper and Krawczak 1990). There is no explanation, to date, for this observation.
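As an illustration of how a CpG content figure of this kind can be obtained, the Python sketch below counts CpG dinucleotides in a sequence and expresses the count relative to the sequence length. This is a minimal sketch under stated assumptions: the function name `cpg_fraction` and the toy sequence are hypothetical (the toy sequence is not the *LDLR* coding sequence), and dividing the number of CpG dinucleotides by the number of nucleotides (860 codons, i.e. 2580 nucleotides) is an assumed reading of how the 4.8% figure was derived.

```python
def cpg_fraction(sequence):
    """Return (number of CpG dinucleotides, that count divided by sequence length)."""
    seq = sequence.upper()
    # A CpG is counted at every position where a C is immediately followed by a G.
    n_cpg = sum(1 for i in range(len(seq) - 1) if seq[i:i + 2] == "CG")
    return n_cpg, (n_cpg / len(seq) if seq else 0.0)


# Toy sequence for demonstration only, NOT the LDLR coding sequence.
toy = "ATGCGTACGGCGTTAACC"
count, fraction = cpg_fraction(toy)
print(count, round(100 * fraction, 1))

# Same ratio with the figures quoted in the text (assumed denominator: 860 codons x 3).
ldlr_cpg_percent = round(100 * 123 / (860 * 3), 1)
print(ldlr_cpg_percent)
```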

In the LDL receptor protein, the most numerous amino acids are aspartate (8.7%), serine (8.1%), leucine (7.7%), cysteine (7.3%), glycine (7.2%) and valine (6.7%). The least represented amino acids are methionine (1.3%), tyrosine (2.0%), histidine (2.2%), tryptophan (2.3%) and phenylalanine (3.0%). This distribution of amino acids is consistent with the one reported for human proteins in general, with the exception of cysteine, which is less abundant in human proteins in general (3%) (Lewin 1990). The LDL receptor is known to be a cysteine-rich protein in which disulphide bonds between two cysteines are essential for ensuring the correct folding of 10 major modules necessary for protein activity (Russell et al. 1989, Kurniawan et al. 2001).


The number of mutations affecting an amino acid is not always related to its frequency in the protein. Cysteine, tryptophan and aspartate are more frequently affected than other residues, indicating that they are essential for protein activity. Substitutions affect 57 (90%) of the 63 cysteines of the LDL receptor, 43 (57%) of the 75 aspartates and 12 (60%) of the 20 tryptophans. Cysteines are involved in the folding of the ligand binding and EGF-like domains. Aspartates are also highly conserved residues of the repeated modules of the LDL binding domain. Their negative charges are involved in bonds with positively charged residues of the apo B and apo E ligands. Apart from its hydrophobicity, tryptophan does not have a structural or functional role as evident as that of a cysteine or a charged residue. However, along with methionine, tryptophan is the only amino acid encoded by a single codon, which probably explains the "more mutable" trait observed here.

**Figure 2.** Distribution of point mutations within the LDL receptor gene (*LDLR*).

A certain proportion of the disease-causing substitutions (missense and nonsense mutations), ~25%, have been shown to alter functional splicing signals within exons, such as exonic splicing enhancers (ESE), to create an alternative splice site within exons that is used preferentially, or to induce the loss of the consensus exonic splice site (Cartegni et al. 2002, Sterne-Weiler et al. 2011). Within the *LDLR* gene, 28.4% of the reported missense mutations are predicted to alter functional splicing signals. The missense mutation c.2140G>C (p.Glu714Gln), which was predicted to be benign by four prediction tools for substitutions (Polyphen\*, SIFT\*, Pmut\* and SNPs3D\*), was predicted to cause the loss of the intron 14 donor splice site by both the NetGene2\* and NNSPLICE\* prediction tools for splice site mutations (Marduel et al. 2010). It is clear, however, that mRNA analyses are necessary to support these predictions, as performed for a small number of exonic substitutions. The conservative amino acid substitution c.2389G>T (p.V776L), which would be unlikely to affect LDL receptor function, concerns the last nucleotide of exon 16 and causes exon 16 skipping (Bourbon et al. 2009). These missense mutations would therefore be likely to exert their major pathological effects on splicing rather than through an alteration in the amino acid sequence of the LDL receptor.

This is reinforced by the observation of several silent substitutions associated with the clinical phenotype of familial hypercholesterolemia. The silent mutation p.Leu605Leu (c.1813C>T) was predicted to create a new donor splice site AGGT at position 1813 in exon 12. The use of this new donor site would lead to the substitution of leucine 605 by a threonine, the deletion of 11 amino acids (from alanine 606 to aspartate 616), a frameshift and the appearance of a premature termination 49 codons further on (Marduel et al. 2010). The variant c.621C>T (p.Gly207Gly) was found to be associated with altered splicing. The nucleotide change leading to p.Gly207Gly resulted in the generation of a new 3'-splice donor site in exon 4 of the LDL receptor gene. Splicing of this alternate splice site leads to an in-frame 75-base pair deletion in a stable mRNA of exon 4 and nonsense-mediated mRNA decay (Defesche et al. 2008). The silent mutation p.Arg406Arg, which also introduces a new splice site, causes a deletion of 31 bp in the *LDLR* mRNA sequence and introduces a premature termination 4 codons further on (Bourbon et al. 2007).


#### **3.2. Frameshift mutations**

Among the 1404 small DNA variations of the *LDLR* gene, a total of 305 (21.7%) are small deletions or insertions, including 261 (85.6%) independent mutations leading to a frameshift and 44 (14.4%) in-frame deletions or insertions. This proportion of in-frame small deletions or insertions is consistent with observations made for other disease-causing genes (Cooper, Antonarakis and Krawczak 1995). The frameshift mutations are due to either a small deletion (176/261; 12.5% of all small DNA variations) or an insertion/duplication (85/261; 6.0% of all small DNA variations) of a few nucleotides (from 1 to 49 for deletions, from 1 to 23 for insertions). The sequence context analysis provides evidence that a repeated motif flanking the frameshift event could be involved in the aetiology of the mutation in 48.0% of the deletional events and in 29.2% of the insertional events.
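The distinction between frameshift and in-frame events follows from the triplet nature of the genetic code: a small deletion or insertion preserves the reading frame only when the net change in length is a multiple of three nucleotides. The Python sketch below illustrates this rule. The function name `indel_effect` and its parameters are hypothetical, and the sketch deliberately ignores other consequences of a variant, such as the creation of a stop codon or an effect on splicing.

```python
def indel_effect(deleted_bp=0, inserted_bp=0):
    """Classify a small deletion/insertion by its net effect on the reading frame."""
    if deleted_bp < 0 or inserted_bp < 0:
        raise ValueError("lengths must be non-negative")
    if deleted_bp == 0 and inserted_bp == 0:
        raise ValueError("no nucleotides deleted or inserted")
    net_change = inserted_bp - deleted_bp
    # The reading frame is preserved only if the net length change is a multiple of 3.
    return "in-frame" if net_change % 3 == 0 else "frameshift"


# Sizes taken from the ranges quoted in the text (1 to 49 bp deletions, 1 to 23 bp insertions).
for deleted, inserted in [(1, 0), (3, 0), (49, 0), (0, 23), (0, 6)]:
    print(deleted, inserted, indel_effect(deleted_bp=deleted, inserted_bp=inserted))
```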

More than half of the frameshift mutations involve a single nucleotide: 58.5% (103/176) among deletions and 56.5% (48/85) among insertions. In half of the deletion cases and in half of the insertion cases, the single nucleotide deletion or insertion occurs within a run of 2 to 7 identical bases. Runs of identical bases are known to cause deletions and insertions according to the slipped mispairing mechanism occurring at DNA replication (Ball et al. 2005).
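Whether a single nucleotide event lies within a run of identical bases can be checked by extending left and right from the affected position. The Python sketch below shows one way to do this; the function name `homopolymer_run`, the 0-based position convention and the toy sequence are assumptions made for illustration, not part of the analysis reported here.

```python
def homopolymer_run(sequence, position):
    """Return the length of the run of identical bases that contains `position` (0-based)."""
    seq = sequence.upper()
    base = seq[position]
    # Extend to the left while the neighbouring base is identical.
    start = position
    while start > 0 and seq[start - 1] == base:
        start -= 1
    # Extend to the right while the neighbouring base is identical.
    end = position
    while end < len(seq) - 1 and seq[end + 1] == base:
        end += 1
    return end - start + 1


# Toy sequence for demonstration only: positions 2 to 6 form a run of five G.
toy = "ACGGGGGTCA"
print(homopolymer_run(toy, 4))
print(homopolymer_run(toy, 0))
```

A run length of 2 or more at the mutated position would correspond to the sequence context described above.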

Deletions involving larger sequences (from 2 to 49 bp) can be divided into three different types: (1) One of the repeated flanking sequences is included in the deletion, which is also explained by the slipped mispairing mechanism occurring at DNA replication (Ball et al. 2005); (2) The repeated sequences flanking the deletion are not included in the frameshift mutation, which is explained by homologous recombination between palindromic or symmetric repeated sequences (Cooper 1995); (3) Parts of the flanking repeated sequences are included in the deletion. To date, no molecular mechanism has been identified to explain such deletional events.

Insertions involving larger sequences (from 2 to 23 bp) can be explained by the same mechanisms as described for deletions, and can be divided into two different types: (1) The inserted sequence is a duplication; (2) The inserted sequence is new within the *LDLR* gene sequence. This latter observation raises the hypothesis that insertions very probably do not occur at random, but rather tend to create repeated sequences that were not present in the original gene sequence. A consensus sequence, GTAAGT, was frequently identified flanking small deletions or insertions (Ball et al. 2005). In the *LDLR* gene sequence, this consensus is present at the 3' end of exon 4 at position c.681-687. Among the 96 deletions (in frame and frameshift) in the *LDLR* gene, 11 (11.5%) are at this position, pointing to a discrete mutation hot spot, as observed in Figure 2 and in accordance with previous reports (Kotze et al. 1996).

\* Tools for *in silico* prediction of protein function.

- NetGene2: http://www.cbs.dtu.dk/services/NetGene2/
- NNSPLICE: http://www.fruitfly.org/seq\_tools/splice.html
- Polyphen: http://genetics.bwh.harvard.edu/pph/
- Pmut: http://mmb2.pcb.ub.es:8080/PMut/
- SIFT: http://sift.jcvi.org/
- SNPs3D: http://www.snps3d.org/

#### **3.3. Nonsense mutations**

Nonsense mutations represent 10.4% (146/1404) of the small DNA variations in the *LDLR* gene, and 13.6% (146/1076) of the FH-causing substitutions.

Among the 860 codons of the *LDLR* gene sequence, 253 potential stop codons (codons that can be turned into a stop codon with only one substitution) were identified (29.4%) and were not equally distributed throughout the whole gene. In exons 2 to 8, more than 33% of the protein codons are potential stop codons, while less than 21% of the protein codons are potential stop codons in exons 9, 10, 13, 15 and 16. Among these 253 potential stop codons, 93 of them (36.8%) are affected by a mutational event.

The number of mutations affecting potential stop codons is not always related to their frequency in each exon. Potential stop codons are more frequently affected by mutation in exons 3, 9, 10 and 14, with 57.1%, 50.0%, 46.2% and 53.3% respectively of potential stop codons in each exon carrying a mutational event. Conversely, in exons 1, 12, 13 and 17, 16.7%, 18.2%, 20.0% and 26.7% respectively of the potential stop codons are affected by a mutational event.
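The notion of a "potential stop codon" used above can be made concrete with a short enumeration. The following is a minimal sketch, assuming the standard genetic code and an upper-case DNA coding sequence; the function names are hypothetical and are not part of the UMD-LDLR software.

```python
from itertools import product

BASES = "ACGT"
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code (assumption)


def single_substitution_neighbours(codon):
    """Yield every codon that differs from `codon` at exactly one position."""
    for pos, base in product(range(3), BASES):
        if base != codon[pos]:
            yield codon[:pos] + base + codon[pos + 1:]


def is_potential_stop(codon):
    """True if a sense codon becomes a stop codon after one substitution."""
    return codon not in STOP_CODONS and any(
        neighbour in STOP_CODONS
        for neighbour in single_substitution_neighbours(codon)
    )


def potential_stop_fraction(cds):
    """Fraction of sense codons in a coding sequence that are potential stops."""
    usable = len(cds) - len(cds) % 3  # ignore an incomplete trailing codon
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    sense = [c for c in codons if c not in STOP_CODONS]
    return sum(is_potential_stop(c) for c in sense) / len(sense)


# All sense codons that are one substitution away from a stop codon.
ALL_POTENTIAL_STOPS = sorted(
    "".join(c) for c in product(BASES, repeat=3) if is_potential_stop("".join(c))
)
```

Applied exon by exon to a reference coding sequence, `potential_stop_fraction` would give the kind of per-exon proportion discussed above; the toy sequences used to check it are illustrative only.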


#### **3.4. Splice site mutations**

Among the 1404 small DNA variations of the *LDLR* gene, a total of 132 (9.4%) are splice site mutations and, among the 1076 single nucleotide FH-causing substitutions, 122 (11.4%) are intronic. From the analysis of a large number of genes, a mean proportion of 15% for splice site mutations among disease-causing DNA substitutions was evaluated (Krawczak et al. 2007). The expected frequency of splice site substitutions within the *LDLR* gene is 9% (Cooper and Krawczak 1990). The number of FH-causing splice site substitutions observed in this wide review of the literature (9.5%) is thus consistent with the expected value for the *LDLR* gene.

Among the 132 splice site mutations of the *LDLR* gene, 14 (10.6%) are mid-intronic mutations located more than 10 bp from intron/exon junctions. More than half of the intronic mutational events in the *LDLR* gene (55.3%, 73/132) affect the two canonical, highly conserved ''AG'' and ''GT'' dinucleotides of the acceptor and donor splice sites respectively. In accordance with the analysis of a large number of disease-causing mutations in different genes (Krawczak et al. 1992), intronic mutations within the *LDLR* gene affect a donor splice site more frequently (65.1%, 86/132) than an acceptor splice site (36.4%, 48/132).

## **4. Comparative analysis of mutations in the** *LDLR* **gene**

To facilitate the mutational analysis of the *LDLR* gene and promote the analysis of the relationship between genotype and phenotype, in 1997 we created a software package along with a computerised database: UMD-LDLR. For each mutation, information is provided at several levels: at the gene level (exon and codon number, wild type and mutant codon, mutational event, mutation name), at the mRNA level (size, processing), at the protein level (wild type and mutant amino acid, affected domain, activity, mutation class), and at the personal level (ethnic background, age, sex, body mass index and familial history of coronary heart disease). The software package contains routines for the analysis of the LDLR database that were developed with the 4th Dimension® (4D) package from ACI. The 4D database management system gives access to optimised multi-criteria search and sorting tools to select records from any field. Moreover, 13 routines were specifically developed (Varret et al. 1997, 1998, Villèger et al. 2002, Béroud et al. 2005, www.umd.be/LDLR/).

The aim of this study was to analyse these four mutation groups at the molecular, biological and clinical level.

#### **4.1. Analysis of** *LDLR* **mutations at the molecular level**

#### *4.1.1. Frequency of mutational events*

DNA substitutions are of two types. Transitions are interchanges of two-ring purines (A>G and G>A) or of one-ring pyrimidines (C>T and T>C) and, therefore, involve bases of similar shape. Transversions are interchanges of a purine for a pyrimidine base or vice versa, which involve the exchange of one-ring and two-ring structures. There are therefore twice as many possible transversions as there are transitions. However, among human disease-causing substitutions, transitions (63%) are observed more frequently than transversions (37%) (Cooper and Krawczak 1990).
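The two-to-one ratio of possible transversions to possible transitions follows directly from the purine/pyrimidine classification. A minimal sketch, with hypothetical names, that classifies the twelve possible single-base substitutions:

```python
from collections import Counter

PURINES = {"A", "G"}       # two-ring bases
PYRIMIDINES = {"C", "T"}   # one-ring bases
BASES = PURINES | PYRIMIDINES


def classify_substitution(ref, alt):
    """Classify a single-base substitution as 'transition' or 'transversion'."""
    if ref not in BASES or alt not in BASES or ref == alt:
        raise ValueError(f"not a valid substitution: {ref}>{alt}")
    # A transition keeps the base within its chemical class.
    same_class = {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES
    return "transition" if same_class else "transversion"


# Enumerate the 12 possible substitutions and count each type.
ALL_EVENTS = [(r, a) for r in "ACGT" for a in "ACGT" if r != a]
counts = Counter(classify_substitution(r, a) for r, a in ALL_EVENTS)
```

The count gives 4 possible transitions against 8 possible transversions, which is why the observed excess of transitions among disease-causing substitutions is notable.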


Accordingly, in the *LDLR* gene, missense mutations due to transitions (55.9%, 459/821) are more frequent than those due to transversions (42.5%, 349/821) (Figure 3). Like exonic mutational events, small DNA variations at the splice site are substitutions (92.4%, 122/132) or small deletions/insertions (9.1%, 12/132). Again, among the intronic substitutions, transitions (59.8%, 73/122) are observed more frequently than transversions (40.1%, 49/122) (Figure 3). Interestingly, in the *LDLR* gene, the transversion/transition ratio is different for nonsense mutations: transversions are the more frequent mutational event leading to a stop codon (52.7%, 77/146), compared with transitions (47.3%, 69/146) (Figure 3).

**Figure 3.** Frequency of molecular events in the different groups of mutations. Values are given as % of each event within each group of mutations.

Because of the constraints imposed by the genetic code, the transitions A>G and T>C and the transversions A>C and G>C cannot give rise to a stop codon. Thus, only two transitional events (G>A and C>T) and six transversional events (A>T, C>A, C>G, G>T, T>A and T>G) can lead to a stop codon, which means that half of the transitional events and a quarter of the transversional events are not involved in nonsense mutations. These constraints can explain the observed difference in the transversion/transition ratio between missense and nonsense mutations.
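The constraint described above can be checked by brute force. The sketch below assumes the standard genetic code, with hypothetical function names, and lists every single-base substitution able to convert a sense codon into a stop codon. Because no stop codon contains a C, no substitution ending in C can qualify (T>C included), and A>G is excluded because it only converts one stop codon into another.

```python
BASES = "ACGT"
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code (assumption)
PURINES = {"A", "G"}


def stop_creating_events():
    """Return the (ref, alt) substitutions that turn a sense codon into a stop."""
    events = set()
    for stop in STOP_CODONS:
        for pos in range(3):
            for ref in BASES:
                # Rebuild the codon that would precede the substitution.
                source = stop[:pos] + ref + stop[pos + 1:]
                if ref != stop[pos] and source not in STOP_CODONS:
                    events.add((ref, stop[pos]))
    return events


def is_transition(ref, alt):
    """A transition keeps the base in the same class (purine or pyrimidine)."""
    return (ref in PURINES) == (alt in PURINES)


events = stop_creating_events()
transitions = sorted(f"{r}>{a}" for r, a in events if is_transition(r, a))
transversions = sorted(f"{r}>{a}" for r, a in events if not is_transition(r, a))
```

The enumeration yields two transitions and six transversions, that is, half of the four possible transitions and three quarters of the eight possible transversions.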

| Exon | Expected mutations (%) | Observed missenses (%) | Significance | Observed nonsenses (%) | Significance | Observed exonic substitutions (%) | Significance |
|---|---|---|---|---|---|---|---|
| 1 | 2.6 | 1.7 | ns | 2.7 | ns | 1.9 | ns |
| 2 | 5.0 | 2.5 | < 0.01 | 11.6 | < 0.001 | 3.9 | ns |
| 3 | 4.8 | 6.4 | < 0.05 | 6.8 | < 0.05 | 6.5 | < 0.02 |
| 4 | 14.9 | 20.5 | < 0.001 | 20.5 | < 0.001 | 20.5 | < 0.001 |
| 5 | 4.8 | 4.4 | ns | 3.4 | ns | 4.3 | ns |
| 6 | 4.9 | 7.0 | < 0.01 | 5.5 | ns | 6.8 | < 0.01 |
| 7 | 4.7 | 5.2 | ns | 8.9 | < 0.02 | 5.7 | ns |
| 8 | 5.0 | 5.3 | ns | 4.8 | ns | 5.2 | ns |
| 9 | 6.6 | 11.2 | < 0.001 | 4.1 | ns | 10.1 | < 0.001 |
| 10 | 8.7 | 7.3 | ns | 6.2 | ns | 7.1 | ns |
| 11 | 4.7 | 4.9 | ns | 4.8 | ns | 4.9 | ns |
| 12 | 5.2 | 6.4 | ns | 2.1 | ns | 5.7 | ns |
| 13 | 5.6 | 5.3 | ns | 2.1 | ns | 4.8 | ns |
| 14 | 5.9 | 6.4 | ns | 9.6 | ns | 6.9 | ns |
| 15 | 6.4 | 1.6 | < 0.001 | 2.1 | < 0.05 | 1.7 | < 0.001 |
| 16 | 2.8 | 1.5 | < 0.05 | 0.0 | < 0.05 | 1.3 | < 0.01 |
| 17 | 6.1 | 2.3 | < 0.001 | 4.8 | ns | 2.7 | < 0.001 |
| 18 | 1.3 | 0.1 | < 0.01 | 0.0 | ns | 0.1 | < 0.001 |

**Table 1.** Distribution of the different exonic substitutions throughout the 18 exons of the *LDLR* gene.

However, when the three groups of mutations (missense, nonsense and splice) are taken together, the transversion/transition ratio is consistent with the one observed for human disease-causing substitutions (Cooper and Krawczak 1990). Altogether, transitions (55.9%, 601/1076) are observed more frequently than transversions (44.1%, 475/1076).

#### *4.1.2. Distribution of the substitutions in the 18 exons of the LDLR gene*

The expected number of mutations in each exon is estimated by the 'Stat exons' tool of the UMD software according to the size and the composition (mutability of each codon) of each exon (Béroud et al. 2000 and 2005). This analysis enables the detection of a statistically significant difference between observed and expected mutations.
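The algorithm behind the 'Stat exons' tool is not described here, so the following is only a stand-in showing how an observed share of mutations in one exon can be compared with its expected share. It is a minimal sketch using a two-sided z-test on a proportion (normal approximation). The function name is hypothetical, and the exon 4 count is reconstructed from the rounded percentages in Table 1 and the 821 missense mutations quoted in section 4.1.1, so it is an assumption rather than reported data.

```python
import math


def proportion_z_test(observed_count, total, expected_fraction):
    """Two-sided z-test of an observed proportion against an expected one.

    Uses the normal approximation to the binomial distribution.
    Returns the z statistic and the two-sided p-value.
    """
    observed_fraction = observed_count / total
    standard_error = math.sqrt(expected_fraction * (1 - expected_fraction) / total)
    z = (observed_fraction - expected_fraction) / standard_error
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Illustrative input (assumption): exon 4 carries an expected 14.9% of the
# missense mutations and an observed 20.5% (Table 1), out of 821 missenses.
total_missense = 821
observed_exon4 = round(0.205 * total_missense)
z, p = proportion_z_test(observed_exon4, total_missense, 0.149)
```

Under these assumptions the test reports a clear excess for exon 4, in line with the significance level given in Table 1. An exact binomial or chi-square test would be an equally reasonable choice.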

For exons 1, 5, 8 and 10 to 14, all types of substitutions are distributed as expected. There is a significant excess of all substitutions (missense and nonsense) within exons 3 and 4 (Table 1), indicating discrete mutational hot-spots and underlining the essential role played by the encoded domains in protein function. Exon 3 encodes the second LR motif of the ligand binding domain in the LDL receptor. To date, there are no data revealing a more essential function of this LR motif when compared with the six others. Exon 4 encodes the three central LR motifs (LR3, LR4 and LR5) of the ligand binding domain in the LDL receptor. The LR5 motif has been shown to be the only one of the seven LR motifs able to bind both ligands of the receptor, apo B and apo E, while the six other motifs only bind apo B (Russel et al. 1989). Thus, mutations affecting this motif are associated with a more severe alteration of lipoprotein catabolism and, therefore, are more likely to be selected by FH definition criteria.

There is a significant deficit of all substitutions (missense and nonsense) within exons 15 and 16 (Table 1), indicating discrete mutational cold-spots. Exon 15 encodes the O-linked sugar domain of the LDL receptor, which has been shown to have no significant functional activity (Davis et al. 1986). To date, there is no explanation for the observed deficit of substitutions within exon 16, which encodes the membrane-anchoring domain that is essential to the attachment of the receptor to the cell membrane.

The two types of exonic substitutions (missense and nonsense) are differently distributed in exons 2, 6, 7, 9, 17 and 18 of the *LDLR* gene (Table 1). Missense mutations are the only ones presenting a significant excess in exons 6 and 9 and a significant deficit in exons 17 and 18 (Table 1), possibly reflecting a bias in this analysis due to the different number of mutations of each type. Nonsense mutations are less numerous than missense mutations, so a significant difference is less likely to be detected for nonsense than for missense mutations. Nevertheless, these observations indicate discrete mutational hot-spots within exons 6 and 9 and discrete mutational cold-spots within exons 17 and 18. Exon 6 encodes the last LR motif of the ligand binding domain in the LDL receptor. To date, there are no data revealing a more essential function of this LR motif when compared with the six others. Exon 9 encodes the NH2-terminal part of the EGF-like domain, which is rich in YWTD repeats that are essential for the correct folding of the receptor at the cell surface. To date, there is no explanation for the observed deficit of substitutions within exons 17 and 18, which encode the COOH-terminal part of the membrane-anchoring domain and the cytoplasmic tail; these are essential for the attachment of the receptor to the cell membrane and for the endocytosis of the protein.


In exon 2, we observed a significant deficit of missense mutations and a significant excess of nonsense mutations (Table 1). Exon 2 encodes the first LR motif of the ligand binding domain in the LDL receptor. To date, there are no data revealing a more or less essential function of this LR motif when compared with the six others.

Interestingly, nonsense mutations are the only ones that present a significant excess in exon 7 of the *LDLR* gene (Table 1). This excess relies upon the high frequency of the c.1048C>T, p.Arg350X mutation, formerly called FH-Fossum. Indeed, this mutation is reported in 9 apparently unrelated patients from different geographic origins: Norway (Solberg et al. 1994), the Netherlands (Lombardi et al. 1995), the U.K. (Day et al. 1997), Poland (Gorski et al. 1998), Germany (Thiart et al. 1998), Canada (Gaudet et al. 1999), Japan (Yu et al. 2002), Denmark (Damgaard et al. 2005) and Spain (Brusgaard et al. 2006). In the absence of haplotypes demonstrating a common ancestor, these mutational events are presumed to be recurrent and to correspond to a mutational hot-spot in the *LDLR* gene.



#### **4.2. Analysis of** *LDLR* **mutations at the biological level**

#### *4.2.1. Functional classes of LDLR gene mutations*

Mutations in the *LDLR* gene have been classified into 5 functional groups based on the characteristics of the mutant protein produced and analysed in patients' fibroblasts (Hobbs et al. 1992):


- Class 1 mutations disrupt the synthesis of the LDL receptor and no precursor is produced (null alleles).
- Class 2 mutations block transport to the Golgi apparatus: mutations are reported in class 2A when a complete defect in transport to the cell membrane is observed and in class 2B when receptors are transported at a detectable, but markedly reduced, rate.
- Class 3 mutations produce proteins that reach the membrane but fail to bind LDL.
- Class 4 mutations produce a receptor that binds the lipoprotein but cannot be internalised. Mutations affecting the cytoplasmic domain alone are classed 4A, while those also affecting the membrane-spanning region are classed 4B.
- Class 5 mutations block the acid-dependent dissociation of the receptor and the ligand in the endosome, an essential event for receptor recycling.

A link between the functional class of a mutation and the severity of the disease has been established: patients carrying a class 1 mutation are more severely affected than those with a mutation from another functional group (Hobbs et al. 1992). In the UMD-LDLR database, among the 288 single nucleotide mutations with available data concerning the functional group, 42.0% (121/288) are class 2B, 31.9% (92/288) are class 1, 13.5% (39/288) are class 5, 7.6% (22/288) are class 2A, 3.8% (11/288) are class 4A and 1.0% (3/288) are class 3.

Class 1 mutations are mainly nonsense and frameshift mutations (66.3% nonsenses, 30.4% frameshifts and 3.3% missenses) and 62% of them are localised in exons 2 to 6, encoding the ligand binding domain for one half and in exons 7 to 14 encoding the EGF-like domain for the other half (Figure 4). Class 2B mutations are mainly missense mutations (92.6% missenses and 7.4% frameshifts) and 71% of them are localised in exons 2 to 6, encoding the ligand binding domain (Figure 4). Class 5 mutations are mainly missense mutations (95% missenses and 5% splice site mutations) and 95% of them are localised in exons 7 to 14, encoding the EGF-like domain (Figure 4). Class 2A, 3 and 4A mutations are mainly missense mutations (59% missenses, 22% nonsenses and 19% frameshifts) and 67% of them are localised in exons 7 to 14, encoding the EGF-like domain.

As expected, the localisation of these different classes of mutations is consistent with the functional definition of each class. The higher prevalence of mutations producing truncated proteins (nonsenses and frameshifts) within the class 1 functional group is consistent with the expected null allele effect of these kinds of mutations. Altogether, these observations are broadly in agreement with the accepted view that mutations leading to a protein of abnormal size (nonsense, frameshift and splice) cause a more severe phenotype than missense mutations.
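The functional classification and the class distribution quoted above can be captured in a small lookup structure. This is a minimal sketch with hypothetical names; the class descriptions paraphrase the definitions given in this section and the counts are the ones reported for the 288 classified mutations.

```python
from enum import Enum


class LDLRClass(Enum):
    """Functional classes of LDLR mutations (after Hobbs et al. 1992)."""
    CLASS_1 = "no receptor precursor synthesised (null allele)"
    CLASS_2A = "transport to the cell membrane completely blocked"
    CLASS_2B = "transport at a detectable but markedly reduced rate"
    CLASS_3 = "receptor reaches the membrane but fails to bind LDL"
    CLASS_4A = "no internalisation; cytoplasmic domain affected"
    CLASS_4B = "no internalisation; membrane-spanning region also affected"
    CLASS_5 = "acid-dependent receptor-ligand dissociation blocked"


# Counts reported in the text for mutations with a known functional class.
CLASS_COUNTS = {
    LDLRClass.CLASS_2B: 121,
    LDLRClass.CLASS_1: 92,
    LDLRClass.CLASS_5: 39,
    LDLRClass.CLASS_2A: 22,
    LDLRClass.CLASS_4A: 11,
    LDLRClass.CLASS_3: 3,
}


def class_percentages(counts):
    """Return each class's share of the total, in percent (one decimal)."""
    total = sum(counts.values())
    return {cls: round(100 * n / total, 1) for cls, n in counts.items()}


percentages = class_percentages(CLASS_COUNTS)
```

Recomputing the shares from the counts is a quick consistency check on the percentages quoted in the paragraph above.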


**Figure 4.** Distribution of the different mutations according to the three main functional classes.

#### *4.2.2. LDL receptor activity*


In the UMD-LDLR database, the LDL receptor activity measured in patients' fibroblasts is available for 91 single nucleotide mutations: assays were performed for 24 heterozygote carriers, 22 homozygote carriers and 45 compound heterozygotes. For homozygote carriers of a missense mutation, the mean LDL receptor activity is 8.7%, compared with 2.7% for carriers of a mutation leading to a protein of abnormal size (nonsense, frameshift and splice) (Figure 5). For heterozygote carriers of a missense mutation, the mean LDL receptor activity is 33.2%, compared with 19.8% for carriers of an abnormal-protein mutation. Moreover, a gradient can be drawn for compound heterozygotes, with a mean LDL receptor activity of 13.3%, 7.3% and 3.6% for carriers of two missense mutations, one missense and one abnormal-protein mutation, and two abnormal-protein mutations respectively (Figure 5). Once again, these observations are broadly in agreement with the accepted view of a more severe phenotype for mutations leading to a protein of abnormal size when compared with missense mutations. However, missense mutations in the *LDLR* gene are associated with a wider spectrum of LDL receptor activity in fibroblasts (from 2% to 67% for heterozygotes and from 2% to 22.5% for homozygotes) than mutations leading to a protein of abnormal size (from 2% to 47% for heterozygotes and from 2% to 11% for homozygotes).


| Carriers | Mutation type | | HDL-Cholesterol | LDL-Cholesterol | Total Cholesterol | Triglycerides |
|---|---|---|---|---|---|---|
| Heterozygotes | Missense | N | 133 | 144 | 152 | 137 |
| | | Mean (SD) | 1.31 (0.51) | 7.50 (2.38) | 9.50 (2.18) | 1.66 (0.94) |
| | Frameshift | N | 60 | 63 | 73 | 64 |
| | | Mean (SD) | 1.21 (0.34) | 7.84 (2.05) | 9.89 (2.22) | 1.39 (0.89) |
| | Splice | N | 22 | 25 | 30 | 24 |
| | | Mean (SD) | 1.28 (0.41) | 7.17 (2.08) | 9.56 (2.20) | 1.49 (0.54) |
| | Nonsense | N | 24 | 24 | 27 | 27 |
| | | Mean (SD) | 1.17 (0.40) | 7.74 (1.64) | 9.43 (1.53) | 1.46 (0.73) |
| Homozygotes | Missense | N | 13 | 15 | 14 | 12 |
| | | Mean (SD) | 1.04 (0.41) | 15.55 (4.96) | 17.39 (4.49) | 1.42 (0.72) |
| | Frameshift | N | 3 | 3 | 3 | 2 |
| | | Mean (SD) | 0.66 (0.21) | 16.01 (1.17) | 17.43 (0.93) | 1.23 (0.04) |
| | Splice | N | 3 | 3 | 5 | 4 |
| | | Mean (SD) | 0.67 (0.16) | 15.25 (1.79) | 18.06 (4.74) | 1.34 (0.17) |
| | Nonsense | N | 2 | 2 | 2 | 2 |
| | | Mean (SD) | 0.87 (0.52) | 17.54 (0.37) | 19.56 (0.76) | 2.00 (1.27) |

**Table 2.** Mean plasmatic lipid levels for heterozygotes and homozygote carriers of missense, frameshift, splice site or nonsense mutations in the *LDLR* gene. Values are in mmol/L.

**Figure 5.** LDL receptor activity in fibroblast from mutation carriers. The values are expressed as % of LDL binding compared with the values obtained for normocholesterolemic subjects. M: missense. N: null allele (frameshift, splice, nonsense).

#### **4.3. Analysis of** *LDLR* **mutations at the biochemical/clinical level**

#### *4.3.1. Plasmatic lipid levels among LDLR gene mutations carriers*

Among the 1061 unique events included in the UMD-LDLR database, lipid values are available for only 307 of them (29%), corresponding to 25 homozygote carriers and 282 heterozygote carriers of different molecular events within the *LDLR* gene (Table 2). In accordance with the biochemical definition of familial hypercholesterolemia, triglycerides and HDL-cholesterol levels were within the normal range while the total- and LDL-cholesterol levels were elevated. As expected for a co-dominant disease, the total- and LDL-cholesterol levels were higher for homozygote mutation carriers than for heterozygotes. No differences were observed between the four groups of mutations (missenses, frameshifts, splice sites and nonsenses), suggesting a similar effect of missense mutations and of mutations leading to a protein of abnormal size (nonsense, frameshift and splice) on the biochemical expression of the disease. Furthermore, no differences were observed in the distribution of total- and LDL-cholesterol levels among the four groups of mutations (Figure 6).


**Figure 6.** Distribution of total- and LDL-cholesterol plasmatic levels for heterozygotes carriers of a missense (M), a frameshift (F), a splice site (S) or a nonsense (N) mutation in the LDLR gene.



#### *4.3.2. Clinical expression of familial hypercholesterolemia among LDLR gene mutation carriers*

Of the 1061 unique events reported in the UMD-LDLR database, clinical data are available for only 230 of them (22%) including 25 homozygote carriers and 215 heterozygote carriers of different molecular events within the *LDLR* gene (Table 3). These clinical data concern tendinous cholesterol deposits, such as xanthomas, and the diagnosis of premature coronary artery disease (CAD). Tendinous xanthomas are more frequently observed in carriers of a mutation leading to a protein of abnormal size than in heterozygotes for a missense mutation (Table 3). Once more, this observation is in agreement with the accepted dogma according to which mutations leading to a protein of abnormal size (nonsense, frameshift and splice) are at the origin of a more severe phenotype than are missense mutations. However, no differences were observed in the occurrence of CAD between missense mutations and mutations leading to a protein of abnormal size (Table 3). This latter observation suggests a similar effect of missense mutations and of mutations leading to a protein of abnormal size (nonsense, frameshift and splice) on the clinical expression of the disease.

| | Missenses | Frameshifts, splice sites, nonsenses |
|---|---|---|
| Sex ratio (M/F) | 1.06 (83/78) | 1.09 (60/55) |
| Age (mean years ± SD) | 39.6 ± 17.5 | 36.8 ± 14.9 |
| Tendinous xanthomas: N | 106 | 109 |
| Tendinous xanthomas: Yes (%) | 50 | 65 |
| Tendinous xanthomas: No (%) | 50 | 35 |
| CAD: N | 100 | 99 |
| CAD: Yes (%) | 58 | 52 |
| CAD: No (%) | 42 | 48 |

**Table 3.** Clinical expression of familial hypercholesterolemia for heterozygotes carriers of different mutations in the *LDLR* gene.

## **5. Conclusion**

To date, it seems logical that mutations leading to a protein of abnormal size (nonsense, frameshift and splice) are at the origin of a more severe phenotype than missense mutations. The genotype/phenotype correlations performed with the UMD-LDLR database provide molecular, biological and clinical evidence that underlies this dogma. Moreover, missense mutations in the *LDLR* gene are the source of a wider spectrum in the severity of FH than are mutations leading to a protein of abnormal size, ranging from an almost normal phenotype to very severe forms of the disease.

Mutations in the *LDLR* gene are numerous and frequently recurrent but, conversely, rarely sporadic. These observations reveal not only the high mutability of this gene at one time, but also that these mutations were probably selected through time. It can be postulated that a hypercholesterolemic mutation could have given a selective advantage to carriers and may be a member of the pool of alleles that constitute the "thrifty genotype" (Neel et al. 1998). The thrifty genotype hypothesis suggested that, in the early years of life, the hypercholesterolemic genotype was thrifty in the sense of being exceptionally efficient in the utilisation of food. It would thereby confer a survival advantage during times of food shortage. However, in contemporary societies, as food is usually available in unlimited amounts, the thrifty genotype no longer provides a survival advantage but instead renders its owners more susceptible to hypercholesterolemia.

### **Author details**

Mathilde Varret *INSERM U698, Paris, France Université Paris Denis Diderot, France* 

Jean-Pierre Rabès *INSERM U698, Paris, France AP-HP, Hôpital A. Paré, Laboratoire de Biochimie et Génétique Moléculaire, Boulogne-Billancourt, France Université Versailles Saint-Quentin-en-Yvelines, UFR de Médecine Paris Ile-de-France Ouest, Guyancourt, France* 

#### **6. References**

Abifadel M, Varret M, Rabès JP, Allard D, Ouguerram K, Devillers M, Cruaud C, Benjannet S, Wickham L, Erlich D, Derré A, Villéger L, Farnier M, Beucler I, Bruckert E, Chambaz J, Chanu B, Lecerf JM, Luc G, Moulin P, Weissenbach J, Prat A, Krempf M, Junien C, Seidah NG, Boileau C. Mutations in PCSK9 cause autosomal dominant hypercholesterolemia. Nat Genet. 2003: 34(2): 154-6.

Abifadel M, Rabès JP, Devillers M, Munnich A, Erlich D, Junien C, Varret M, Boileau C. Mutations and polymorphisms in the proprotein convertase subtilisin kexin 9 (PCSK9) gene in cholesterol metabolism and disease. Hum Mutat. 2009: 30(4): 520-9.

Ball EV, Stenson PD, Abeysinghe SS, Krawczak M, Cooper DN, Chuzhanova NA. Microdeletions and microinsertions causing human genetic disease: common mechanisms of mutagenesis and the role of local DNA sequence complexity. Hum Mutat. 2005: 26(3): 205-13.

Béroud C, Collod-Beroud G, Boileau C et al. UMD (Universal Mutation Database): a generic software to build and analyze locus-specific databases. Human Mutation 2000: 15: 86-94.

Béroud C, Hamroun D, Collod-Beroud G et al. UMD (Universal Mutation Database): 2005 update. Hum Mutat. 2005: 26(3): 184-191.

Bourbon M, Sun XM, Soutar AK. A rare polymorphism in the low density lipoprotein (LDL) gene that affects mRNA splicing. Atherosclerosis. 2007: 195(1): e17-20.

Bourbon M, Duarte MA, Alves AC, Medeiros AM, Marques L, Soutar AK. Genetic diagnosis of familial hypercholesterolaemia: the importance of functional analysis of potential splice-site mutations. J Med Genet. 2009: 46(5): 352-7.

Brown MS, Goldstein JL. A receptor-mediated pathway for cholesterol homeostasis. Science 1986: 232(4746): 34–47.

Brusgaard K, Jordan P, Hansen H, Hansen AB, Hørder M. Molecular genetic analysis of 1053 Danish individuals with clinical signs of familial hypercholesterolemia. Clin Genet. 2006: 69(3): 277-83.

Cartegni L, Chew SL, Krainer AR. Listening to silence and understanding nonsense: exonic mutations that affect splicing. Nat Rev Genet. 2002: 3(4): 285-98.

Chaves FJ, Real JT, García-García AB, Civera M, Armengod ME, Ascaso JF, Carmena R. Genetic diagnosis of familial hypercholesterolemia in a South European outbreed population: influence of low-density lipoprotein (LDL) receptor gene mutations on treatment response to simvastatin in total, LDL, and high-density lipoprotein cholesterol. J Clin Endocrinol Metab. 2001: 86(10): 4926-32.

Cohen J, Pertsemlidis A, Kotowski IK, Graham R, Garcia CK, Hobbs HH. Low LDL cholesterol in individuals of African descent resulting from frequent nonsense mutations in PCSK9. Nat Genet. 2005: 37(2): 161-5.

Cooper DN, Krawczak M. The mutational spectrum of single base-pair substitutions causing human genetic disease: patterns and predictions. Hum Genet 1990: 85(1): 55-74.

Cooper DN, Antonarakis SE and Krawczak M. The nature and mechanisms of human gene mutation. In: Scriver CR, Beaudet AL, Sly WS and Valle D eds. The metabolic basis of inherited diseases. New York: McGraw-Hill, 1995.

Crick FHC. Codon – anticodon pairing: the wobble hypothesis. J Mol Biol 1966: 19: 548–555.

Damgaard D, Nissen PH, Jensen LG, Nielsen GG, Stenderup A, Larsen ML, Faergeman O. Detection of large deletions in the LDL receptor gene with quantitative PCR methods. BMC Med Genet. 2005: 6: 15.

Davis CG, Elhammer A, Russell DW, et al. Deletion of clustered O-linked carbohydrates does not impair function of low density lipoprotein receptor in transfected fibroblasts. J Biol Chem. 1986: 261(6): 2828-38.

Day IN, Haddad L, O'Dell SD, Day LB, Whittall RA, Humphries SE. Identification of a common low density lipoprotein receptor mutation (R329X) in the south of England: complete linkage disequilibrium with an allele of microsatellite D19S394. J Med Genet. 1997: 34(2): 111-6.

Defesche JC, Schuurman EJ, Klaaijsen LN, Khoo KL, Wiegman A, Stalenhoef AF. Silent exonic mutations in the low-density lipoprotein receptor gene that cause familial hypercholesterolemia by affecting mRNA splicing. Clin Genet. 2008: 73(6): 573-8.

Fass D, Blacklow S, Kim PS, Berger JM. Molecular basis of familial hypercholesterolaemia from structure of LDL receptor module. Nature 1997: 388(6643): 691-3.

Fouchier SW, Kastelein JJ, Defesche JC. Update of the molecular basis of familial hypercholesterolemia in The Netherlands. Hum Mutat. 2005: 26(6): 550-6.

Garcia CK, Wilund K, Arca M, et al. Autosomal recessive hypercholesterolemia caused by mutations in a putative LDL receptor adaptor protein. Science 2001: 292(5520): 1394-8.

Gaudet D, Vohl MC, Couture P, Moorjani S, Tremblay G, Perron P, Gagné C, Després JP. Contribution of receptor negative versus receptor defective mutations in the LDL-receptor gene to angiographically assessed coronary artery disease among young (25-49 years) versus middle-aged (50-64 years) men. Atherosclerosis. 1999: 143(1): 153-61.

Goldstein JL, Schrott HG, Hazzard WR et al. Hyperlipidemia in coronary heart disease. II. Genetic analysis of lipid levels in 176 families and delineation of a new inherited disorder, combined hyperlipidemia. J Clin Invest 1973: 52(7): 1544–1568.

Goldstein J, Brown M. Familial hypercholesterolemia. In: Scriver C, Beaudet A, Sly W, eds. The metabolic basis of inherited diseases. New York: McGraw-Hill, 1989: 1215–1250.

Goldstein JL, Brown MS. History of Discovery: The LDL receptor. Arterioscler Thromb Vasc Biol 2009: 29(4): 431-8.

Górski B, Kubalska J, Naruszewicz M, Lubiński J. LDL-R and Apo-B-100 gene mutations in Polish familial hypercholesterolemias. Hum Genet 1998: 102(5): 562-5.

Hobbs HH, Russell DW, Brown MS, Goldstein JL. The LDL receptor locus in familial hypercholesterolemia: mutational analysis of a membrane protein. Annu Rev Genet 1990: 24: 133-170.

Hobbs HH, Brown MS, Goldstein JL. Molecular genetics of the LDL receptor gene in familial hypercholesterolemia. Hum Mutat 1992: 1: 445-66.

Jeon H, Blacklow SC. Structure and physiologic function of the low-density lipoprotein receptor. Annu Rev Biochem. 2005: 74: 535-62.

Koivisto UM, Turtola H, Aalto-Setala K, et al. The familial hypercholesterolemia (FH)-North Karelia mutation of the low density lipoprotein receptor gene deletes seven nucleotides of exon 6 and is a common cause of FH in Finland. J Clin Invest 1992: 90: 219-28.

Kotze MJ, Langenhoven E, Warnich L, et al. The identification of two low-density lipoprotein receptor gene mutations in South African familial hypercholesterolaemia. S Afr Med J 1989: 76: 399-401.

Kotze MJ, Thiart R, Loubser O, de Villiers JN, Santos M, Vargas MA, Peeters AV. Mutation analysis reveals an insertional hotspot in exon 4 of the LDL receptor gene. Hum Genet. 1996: 98(4): 476-8.

Krawczak M, Cooper DN. Single base-pair substitutions in pathology and evolution: two sides to the same coin. Hum Mutat. 1996: 8(1): 23-31.

Krawczak M, Ball EV, Cooper DN. Neighboring-nucleotide effects on the rates of germ-line single-base-pair substitution in human genes. Am J Hum Genet 1998: 63(2): 474-88.

Krawczak M, Thomas NS, Hundrieser B, Mort M, Wittig M, Hampe J, Cooper DN. Single base-pair substitutions in exon-intron junctions of human genes: nature, distribution, and consequences for mRNA splicing. Hum Mutat. 2007: 28(2): 150-8.

Kurniawan ND, Aliabadizadeh K, Brereton IM, Kroon PA, Smith R. NMR structure and backbone dynamics of a concatemer of epidermal growth factor homology modules of the human low-density lipoprotein receptor. J Mol Biol. 2001: 311(2): 341-56.

Kwon HJ, Lagace TA, McNutt MC, Horton JD, Deisenhofer J. Molecular basis for LDL receptor recognition by PCSK9. Proc Natl Acad Sci U S A. 2008: 105(6): 1820-5.

Landsberger D, Meiner V, Reshef A, et al. A nonsense mutation in the LDL receptor gene leads to familial hypercholesterolemia in the Druze sect. Am J Hum Genet 1992: 50: 427-33.

Lehrman MA, Schneider WJ, Brown MS, et al. The Lebanese allele at the low density lipoprotein receptor locus. Nonsense mutation produces truncated receptor that is retained in endoplasmic reticulum. J Biol Chem 1987: 262: 401-10.

Leitersdorf E, Van der Westhuyzen DR, Coetzee GA, Hobbs HH. Two common low density lipoprotein receptor gene mutations cause familial hypercholesterolemia in Afrikaners. J Clin Invest 1989: 84: 954-61.

Lewin B, in "Genes IV" (Oxford Cell Press, New-York, 1990).

Lindgren V, Luskey KL, Russell DW, Francke U. Human genes involved in cholesterol metabolism: chromosomal mapping of the loci for the low density lipoprotein receptor and 3-hydroxy-3-methylglutaryl-coenzyme A reductase with cDNA probes. Proc Natl Acad Sci USA 1985: 82: 8567-71.


Lombardi P, Sijbrands EJ, van de Giessen K, Smelt AH, Kastelein JJ, Frants RR, Havekes LM. Mutations in the low density lipoprotein receptor gene of familial hypercholesterolemic patients detected by denaturing gradient gel electrophoresis and direct sequencing. J Lipid Res. 1995: 36(4): 860-7.

Marduel M, Carrie A, Sassolas A et al. Molecular spectrum of autosomal dominant hypercholesterolemia in France. Human Mutation 2010: 31: E1811-1824.

Moorjani S, Roy M, Gagne C, et al. Homozygous familial hypercholesterolemia among French Canadians in Quebec Province. Arteriosclerosis 1989: 9: 211-6.

Neel JV, Weder AB, Julius S. Type II diabetes, essential hypertension, and obesity as 'syndromes of impaired genetic homeostasis': the 'thrifty genotype' hypothesis enters the 21st century. Perspect Biol Med 1998: 42: 44–74.

North CL, Blacklow SC. Structural independence of ligand-binding modules five and six of the LDL receptor. Biochemistry 1999: 38(13): 3926-35.

Russell DW, Brown MS, Goldstein JL. Different combinations of cysteine-rich repeats mediate binding of low density lipoprotein receptor to two different proteins. J Biol Chem 1989: 264(36): 21682-8.

Solberg K, Rødningen OK, Tonstad S, Ose L, Leren TP. Familial hypercholesterolaemia caused by a non-sense mutation in codon 329 of the LDL receptor gene. Scand J Clin Lab Invest. 1994: 54(8): 605-9.

Soutar AK. Rare genetic causes of autosomal dominant or recessive hypercholesterolaemia. IUBMB Life. 2010: 62(2): 125-31.

Sterne-Weiler T, Howard J, Mort M, Cooper DN, Sanford JR. Loss of exon identity is a common mechanism of human inherited disease. Genome Res. 2011: 21(10): 1563-71.

Sudhof TC, Russell DW, Goldstein JL, et al. Cassette of eight exons shared by genes for LDL receptor and EGF precursor. Science 1985: 228: 893-5.

Thiart R, Loubser O, de Villiers JN, Marx MP, Zaire R, Raal FJ, Kotze MJ. Two novel and two known low-density lipoprotein receptor gene mutations in German patients with familial hypercholesterolemia. Hum Mutat. 1998: Suppl 1: S232-3.

Varret M, Rabes JP, Collod-Beroud G et al. Software and database for the analysis of mutations in the human LDL receptor gene. Nucleic Acids Research 1997: 25(1): 172-180.

Varret M, Rabes JP, Thiart R et al. LDLR Database (second edition): new additions to the database and the software, and results of the first molecular analysis. Nucleic Acids Research 1998: 26(1): 248-252.

Villéger L, Abifadel M, Allard D et al. The UMD-LDLR database: additions to the software and 490 new entries to the database. Human Mutation 2002: 20(2): 81-87.

Yamamoto T, Davis CG, Brown MS, et al. The human LDL receptor: a cysteine-rich protein with multiple Alu sequences in its mRNA. Cell 1984: 39: 27-38.

Yu W, Nohara A, Higashikata T, Lu H, Inazu A, Mabuchi H. Molecular genetic analysis of familial hypercholesterolemia: spectrum and regional difference of LDL receptor gene mutations in Japanese population. Atherosclerosis. 2002: 165(2): 335-42.

Zhang DW, Lagace TA, Garuti R, Zhao Z, McDonald M, Horton JD, Cohen JC, Hobbs HH. Binding of proprotein convertase subtilisin/kexin type 9 to epidermal growth factor-like repeat A of low density lipoprotein receptor decreases receptor recycling and increases degradation. J Biol Chem. 2007: 282(25): 18602-12.

**Chapter 4**

## **Missense Mutation in Cancer in Correlation to Its Phenotype – VHL as a Model**

Suad AlFadhli

Additional information is available at the end of the chapter

http://dx.doi.org/10.5772/36727

## **1. Introduction**

Cancer is a complex genetic disease caused by abnormal alterations (mutations) in DNA sequences that lead to dysregulation of normal cellular processes, thereby driving tumor growth. The study of such causal mutations is a central focus of cancer biology for two reasons: first, to reveal the molecular mechanisms of tumorigenesis; second, to provide insight into the development of novel therapeutic and diagnostic approaches. Although hundreds of genes are known to be mutated in cancers, our understanding of mutational events in cancer cells remains incomplete (Futreal PA et al, 2004). This, however, has widely opened the field of cancer genomics, which aims to provide new insights into the molecular mechanisms that lead to tumorigenesis.

In the era of evidence-based molecular diagnosis, predictive testing, genetic counseling, gene-informed cancer risk assessment, and preventative and personalized medicine, studying the Mendelian genetics of the familial forms of cancer is one approach that can set up the basis for gene-informed risk assessment and management for the patient and family. Herein we selected Mendelian forms of familial cancer, namely the hereditary endocrine neoplasia syndromes caused by highly penetrant germline mutations leading to pheochromocytoma-paraganglioma syndromes. Examples of such syndromes are the following autosomal dominant disorders: von Hippel-Lindau (VHL) disease; multiple endocrine neoplasia type 1 (MEN-1), in which loss-of-function germline mutations in the tumor suppressor gene MEN1 increase the risk of developing pituitary, parathyroid and pancreatic islet tumors, and less commonly thymic carcinoids, lipomas and benign adrenocortical tumors; multiple endocrine neoplasia type 2 (MEN 2), in which gain-of-function germline mutations clustered in specific codons of the RET proto-oncogene increase the risk of developing medullary thyroid carcinoma (MTC), phaeochromocytoma and parathyroid tumors; and Cowden syndrome (CS), in which PTEN mutations are associated with breast, thyroid, and endometrial neoplasias. Identification and characterization of germline mutations in the predisposition genes of the great majority of these syndromes has empowered clinical practice, because the genetic information retrieved guides medical management.

© 2012 AlFadhli, licensee InTech. This is an open access chapter distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


many of oncogenes and the tumor suppressor genes as part of the Mutanom project (http://www.mutanom.org).

Stehr H *et al* study describes in a quantitative way, the opposing structural effects of cancer-associated missense mutations in oncogenes and tumor suppressors. Using COSMIC database (Forbes SA, 2008). Stehr H *et al* has assessed the effects of 1992 mutations cancer-associated mutations representing two common mechanisms through which tumorigenesis is initiated: via gain-of-function of oncogenes and loss-of-function of tumor suppressors (Vogelstein B et al, 1993). Then compared them to the effects of natural variants and randomized mutations. They focused on mechanisms of cancer mutations that have a consequence at the structural level. Another significant body of work has been published on consequences of mutations in a structural context (Ng PC, 2003, 2006; Ramensky V, et al, 2002; Wang Z et al, 2001; Karchin R et al, 2009). These studies differ in that either they focus on estimating the effects of individual mutations or they use different sets of disease mutations.

Studies of structural effects of mutations have found that disease mutations primarily occur in the protein core (Ramensky V, et al, 2002; Wang Z et al, 2001). This trend was confirmed only for the set of tumor suppressors. In contrast, core residues in oncogenes are significantly less often mutated than expected by chance. This is in agreement with Stehr H *et al* results for protein stability. Mutations located in the protein core are often destabilizing and result in loss-of-function. Thus, Stehr H *et al* data suggests that the loss-of-function of tumor suppressors is often caused by destabilization of the protein. They also suggested that specific mutations of functional sites that can either disable enzymatic activity and regulatory mechanisms or increase protein activity are often responsible for oncogene activation. Stehr H *et al* results show that the most frequently mutated types of functional sites in oncogenes are ATP and GTP binding sites and that the frequency of mutation is significantly higher than expected. This suggests that mutations of ATP and GTP binding sites are specific and common mechanisms of oncogene activation. Examples for such activating mutations near ATP binding sites have been described in the literature (Davies H et al, 2002; Shu HK et al, 1990, Jeffers M, et al, 1997).

Liu H *et al* investigated >120,000 mutation samples in 66 well-known tumor suppressor genes and oncogenes of the COSMIC database, and found a set of significant differences in mutation patterns (e.g., non-3n-indel, non-sense SNP and mutation hotspot) between them. They also developed indices to readily distinguish one from another and predict clearly the unknown oncogenesis genes as tumor suppressors (e.g., ASXL1, HNF1A and KDM6A) or oncogenes (e.g., FOXL2, MYD88 and TSHR). Based on their results, a third gene group was classified, which has a mutational pattern between tumor suppressors and oncogenes. The concept of the third gene group was thought to help in understanding gene function in different cancers or individual patients and to know the exact function of genes in oncogenesis.

**4. The clinical of VHL disease** 

von Hippel-Lindau (VHL) disease (MIM 193300) is a dominantly inherited familial cancer syndrome. It is caused by mutations in the VHL tumor suppressor gene with an incidence of

This review focuses specifically on the analysis of missense mutations in oncogenes and tumor suppressor genes, although these genes can also be altered through a variety of other mechanisms such as DNA amplification, translocation, and deletion. Unlike synonymous (silent) mutations, which do not change the encoded amino acid, missense mutations are nonsynonymous amino acid substitutions that are typically caused by single-nucleotide point mutations. However, many random missense mutations are not expected to alter protein function, owing to the plasticity built into many amino acid residues.

## **2. Cancer and the "two hits" of Knudson's hypothesis**

Before turning to missense mutations in tumor suppressor genes, we should introduce the "two hits" of Knudson's hypothesis. In 1971 Alfred Knudson Jr published his influential statistical analysis of the childhood cancer retinoblastoma, in which he found that retinoblastoma tends to be multifocal in familial cases and unifocal in sporadic cases (Knudson A. G. Jr, 1971). Knudson postulated that patients with the familial form of the cancer are born with one mutant allele and that all cells in the affected organ or tissue are therefore at risk, which accounts for the early onset and multifocal nature of the disease. In contrast, sporadic tumors develop only if a mutation occurs in both alleles within the same cell and, as each event is expected to occur with low frequency, most such tumors develop late in life and in a unifocal manner. These observations led him to propose a two-hit theory of carcinogenesis. The "two hits" hypothesis, which has proved true for many tumors, recognized that familial forms of cancer might hold the key to the identification of important regulatory elements known as tumor-suppressor genes (Ayerbes et al, 2008).
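The reasoning behind the two-hit hypothesis can be illustrated numerically. The following is a minimal sketch, not Knudson's original analysis: it assumes that tumor counts follow a Poisson distribution, and the number of cells at risk and the per-allele hit rate are hypothetical values chosen only for illustration. The function names are likewise invented for this example.

```python
import math

def expected_tumors(cells_at_risk, hit_rate, inherited_hits):
    """Expected number of cells that end up with both alleles mutated.

    inherited_hits is 1 for the familial form (one mutant allele at birth)
    and 0 for the sporadic form (both hits must occur somatically).
    """
    if inherited_hits not in (0, 1):
        raise ValueError("inherited_hits must be 0 or 1")
    somatic_hits_needed = 2 - inherited_hits
    return cells_at_risk * hit_rate ** somatic_hits_needed

def prob_at_least(k, mean):
    """P(X >= k) for X ~ Poisson(mean)."""
    below = sum(math.exp(-mean) * mean ** i / math.factorial(i) for i in range(k))
    return 1.0 - below

# Hypothetical illustrative values, not measured quantities.
CELLS_AT_RISK = 1_000_000
HIT_RATE = 3e-6  # chance that a given allele in a given cell is hit

familial = expected_tumors(CELLS_AT_RISK, HIT_RATE, inherited_hits=1)
sporadic = expected_tumors(CELLS_AT_RISK, HIT_RATE, inherited_hits=0)

print(f"familial: mean tumors {familial:.2g}, "
      f"P(multifocal) {prob_at_least(2, familial):.2f}")
print(f"sporadic: mean tumors {sporadic:.2g}, "
      f"P(any tumor) {prob_at_least(1, sporadic):.2g}")
```

Under these assumed values a carrier of one inherited hit is expected to develop several independent tumors, whereas a non-carrier is very unlikely to develop even one, which mirrors the multifocal versus unifocal contrast described above.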

## **3. Missense mutations in oncogenes and the tumor suppressor genes**

Second-generation sequencing approaches have provided detailed information on the frequency and position of single point mutations, as well as on structural aberrations of cancer genomes such as small insertions and deletions, focal copy number alterations, and genomic rearrangements (Wood LD et al, 2007; Jones S et al, 2008; Greenman C et al, 2007; Sjoblom T et al, 2006; Pleasance ED et al 2010a,b; Cancer Genome Atlas Research Network, 2008). The findings show that the complexity of each cancer genome is far greater than expected and that extensive variation exists between different cancer types as well as between different tumor samples of the same cancer type. Several recent studies have used the Catalogue Of Somatic Mutations In Cancer (COSMIC) database to discriminate oncogenes from tumor suppressor genes on the basis of differences in their mutation patterns, in order to understand oncogenesis and diagnose cancers (Forbes SA et al, 2008; Stehr H et al, 2011; Liu H, 2011). Such investigations at the systems level are currently being performed for many oncogenes and tumor suppressor genes as part of the Mutanom project (http://www.mutanom.org).

The study by Stehr H *et al* describes quantitatively the opposing structural effects of cancer-associated missense mutations in oncogenes and tumor suppressors. Using the COSMIC database (Forbes SA, 2008), Stehr H *et al* assessed the effects of 1992 cancer-associated mutations representing two common mechanisms through which tumorigenesis is initiated: gain-of-function of oncogenes and loss-of-function of tumor suppressors (Vogelstein B et al, 1993). They then compared these effects to those of natural variants and randomized mutations, focusing on mechanisms of cancer mutations that have a consequence at the structural level. A significant body of other work has been published on the consequences of mutations in a structural context (Ng PC, 2003, 2006; Ramensky V, et al, 2002; Wang Z et al, 2001; Karchin R et al, 2009). These studies differ in that they either focus on estimating the effects of individual mutations or use different sets of disease mutations.

Studies of the structural effects of mutations have found that disease mutations occur primarily in the protein core (Ramensky V, et al, 2002; Wang Z et al, 2001). In the analysis by Stehr H *et al*, this trend was confirmed only for the set of tumor suppressors. In contrast, core residues in oncogenes are mutated significantly less often than expected by chance. This is in agreement with the results of Stehr H *et al* for protein stability. Mutations located in the protein core are often destabilizing and result in loss-of-function. Thus, the data of Stehr H *et al* suggest that the loss-of-function of tumor suppressors is often caused by destabilization of the protein. They also suggested that specific mutations of functional sites, which can either disable enzymatic activity and regulatory mechanisms or increase protein activity, are often responsible for oncogene activation. Their results show that the most frequently mutated types of functional sites in oncogenes are ATP and GTP binding sites, and that the frequency of mutation at these sites is significantly higher than expected. This suggests that mutations of ATP and GTP binding sites are specific and common mechanisms of oncogene activation. Examples of such activating mutations near ATP binding sites have been described in the literature (Davies H et al, 2002; Shu HK et al, 1990; Jeffers M et al, 1997).

Liu H *et al* investigated more than 120,000 mutation samples in 66 well-known tumor suppressor genes and oncogenes from the COSMIC database and found a set of significant differences in mutation patterns between the two groups (e.g., non-3n indels, nonsense SNPs and mutation hotspots). They also developed indices that readily distinguish one group from the other and used them to predict whether cancer-related genes of unknown role act as tumor suppressors (e.g., ASXL1, HNF1A and KDM6A) or as oncogenes (e.g., FOXL2, MYD88 and TSHR). Based on their results, they defined a third gene group whose mutational pattern is intermediate between those of tumor suppressors and oncogenes. The concept of this third gene group was thought to help in understanding gene function in different cancers or individual patients and in determining the exact function of genes in oncogenesis.

## **4. The clinical features of VHL disease**

von Hippel-Lindau (VHL) disease (MIM 193300) is a dominantly inherited familial cancer syndrome. It is caused by mutations in the *VHL* tumor suppressor gene and has an incidence of 1 in 31,000–36,000 live births worldwide across all ethnic backgrounds, with similar prevalence in both sexes (Maher *et al.*, 1991; Maher *et al.*, 2004). The reported prevalence nevertheless differs between populations of the same ethnicity, for example 1:39 000 in South-West Germany and 1:53 000 in Eastern England (Maher ER et al, 1991; Neumann H et al, 1991). VHL disease is characterized by marked age-dependent penetrance and phenotypic variability. The factors that affect the actual clinical expression and tumor formation, including age of onset, tissue- and organ-specific lesions, severity of lesions, and recurrence, are unknown. The main clinical manifestations of VHL disease are:

#### **4.1. Hemangioblastomas**

Hemangioblastomas of the central nervous system (CNS) are typically located in the cerebellum, but can also occur in the brainstem, the spinal cord and, rarely, the lumbosacral nerve roots and the supratentorial region (Neumann et al., 1995). Retinal or CNS hemangioblastomas are often the earliest manifestations of VHL disease and the most common, occurring in up to 80% of patients (Maher et al., 1990b; Melmon and Rosen, 1964; Weil et al., 2003). VHL-associated cerebellar hemangioblastomas are diagnosed at a mean age of 29–33 years, much earlier than sporadic cerebellar hemangioblastomas (Hes et al., 2000a, 2000b; Wanebo et al., 2003). These lesions are rarely malignant, but enlargement or bleeding within the CNS can result in neurological damage and death (Pavesi et al., 2008). A lower incidence of CNS hemangioblastomas has been documented in specific populations: 12% in Finland (Niemela M et al., 1999) and 5% in Germany (Zbar B et al., 1999). Patients with cerebellar hemangioblastomas typically present with symptoms of increased intracranial pressure and limb or truncal ataxia, depending on the precise location of the tumor. Wanebo et al. (2003) showed that most CNS hemangioblastomas were associated with cysts, which were often larger than the hemangioblastomas themselves.

#### **4.2. Pheochromocytoma**

Pheochromocytomas are endocrine neoplasias with intra- or extra-adrenal lesions that appear histologically as an expansion of large chromaffin-positive cells derived from neural crest cells (Lee et al., 2005). Between 7% and 18% of VHL patients are affected by pheochromocytomas (Crossey et al., 1994a; Garcia et al., 1997). The absence or presence of this phenotype classifies VHL disease as type 1 or type 2 (A, B, C), respectively (Woodward ER et al., 1997; Hofstra RMW et al., 1996). Untreated pheochromocytomas can result in hypertension and subsequent acute heart disease, brain edema, and stroke.

#### **4.3. Clear cell renal cell carcinoma (RCC)**

Clear cell renal cell carcinoma (RCC) occurs in up to 70% of patients with VHL disease and is a frequent cause of death. VHL patients have a 70% risk of developing RCC by 60 years of age (Maher et al., 1990b, 1991; Whaley et al., 1994), and RCC arises at an average age of 44 years in these patients, compared with an average age of 62 years for sporadic RCC in the general population (http://www.umd.be/VHL/W\_VHL/clinic.shtml). Renal cysts are common in VHL patients as well; however, unlike the completely benign cysts seen in the general population, renal cysts in VHL patients might degenerate into RCC (Kaelin et al., 2004). It is nevertheless unlikely that RCC in all VHL patients originates from cysts, or that all cysts will eventually become malignant. RCC often overproduces VEGF and can therefore be highly vascular (Berse et al., 1992; Sato et al., 1994; Takahashi et al., 1994).

#### **4.4. Other clinical manifestations**

VHL patients can also have low-grade adenocarcinomas of the temporal bone, also known as endolymphatic sac tumors (ELSTs), pancreatic tumors, and epididymal or broad ligament cystadenomas (Gruber et al., 1980; Neumann and Wiestler, 1991; Maher et al., 2004; Kaelin et al., 2007). ELSTs can be detected by MRI or CT imaging in up to 11% of VHL patients (Manski TJ, et al., 1997). Although ELSTs are often asymptomatic, the most frequent clinical presentation is hearing loss (mean age 22 years), and tinnitus and vertigo also occur in many cases. In addition to the inherited risk of developing cancer, VHL patients develop cystic disease in various organs including the kidney, pancreas, and liver (Hough et al., 1994; Lubensky et al., 1998; Maher et al., 1990b; Maher, 2004).

Tumor growth commonly cycles between growth and quiescent phases. In patients with numerous tumors, growth and quiescent phases occur simultaneously, suggesting that a combination of acquired genetic lesions and hormonal activity influences tumor growth.

## **5. VHL clinical classification**

Clustering of molecular genetic mutations and phenotypes has allowed the development of a clinical classification, although intra-familial variability is well recognized.

As mentioned previously, VHL disease can be classified as Type 1 or Type 2 depending on the phenotype. Type 1 describes kindreds with typical VHL manifestations such as hemangioblastomas and RCC, but without pheochromocytomas. Once a pheochromocytoma occurs, the classification becomes Type 2. Type 2, which accounts for 7–20% of VHL kindreds, is further subdivided into Type 2A (pheochromocytomas and other typical VHL manifestations except RCC), Type 2B (the full spectrum of VHL disease, including pheochromocytomas, RCC, and other typical VHL manifestations), and Type 2C (familial risk of isolated pheochromocytoma) (Gross D et al, 1996; Martin R, et al., 1998). Some Type 2C kindreds have no identified VHL mutation, which raises the possibility of another genetic locus (Woodward ER et al, 1997; Crossey et al., 1994b; Garcia et al., 1997; Mulvihill et al., 1997).
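The classification rules above can be written as a small decision function. This is only a sketch of the logic as stated in the text, not a clinical tool: the function name and its three boolean inputs are hypothetical, the classification is applied here to a single set of findings although in practice it describes kindreds, and the handling of a kindred with pheochromocytoma and RCC but no other lesions (treated as Type 2B) is an assumption.

```python
def classify_vhl(pheochromocytoma, rcc, other_manifestations):
    """Return the VHL clinical type for a set of observed manifestations.

    other_manifestations stands for typical VHL lesions other than RCC and
    pheochromocytoma (for example hemangioblastomas).
    Returns None when no manifestation is present.
    """
    if not (pheochromocytoma or rcc or other_manifestations):
        return None  # nothing to classify
    if not pheochromocytoma:
        return "Type 1"   # typical manifestations, no pheochromocytoma
    if rcc:
        return "Type 2B"  # pheochromocytoma together with RCC
    if other_manifestations:
        return "Type 2A"  # pheochromocytoma, other lesions, no RCC
    return "Type 2C"      # isolated pheochromocytoma


print(classify_vhl(pheochromocytoma=False, rcc=True, other_manifestations=True))
print(classify_vhl(pheochromocytoma=True, rcc=False, other_manifestations=False))
```

The first call prints `Type 1` and the second prints `Type 2C`, matching the definitions given in the paragraph above.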

## **6. Morbidity and Mortality of VHL**

The morbidity of VHL disease depends on the organ system involved. For example, retinal hemangioblastomas can result in retinal detachment and/or blindness (Webster et al., 1999). Mortality is often due to either metastasis of RCC or complications of CNS hemangioblastomas (Filling-Katz et al., 1991; Maher et al., 1990b; Neumann et al., 1992); however, due to improved screening guidelines, life expectancy of VHL patients has improved.

## **7. VHL gene and pVHL function**

The human *VHL* gene spans a 10-kb region on the short arm of chromosome 3 (3p25.3) (Richards et al., 1993) and consists of 3 exons (Kuzmin et al., 1995; Latif et al., 1993a, 1993b): exon 1 spans codons 1–113, exon 2 spans codons 114–154, and exon 3 spans codons 155–213. Two protein products are encoded by *VHL*: a 30-kDa full-length protein (p30, 213 amino acids, NM\_000551.2 [variant 1 mRNA]) and a shorter 19-kDa protein (p19, 160 amino acids, NM\_198156.1 [variant 2 mRNA]), which is generated by alternative translation initiation at an internal methionine at position 54 (Blankenship et al., 1999). Evolutionary conservation is very strong over most of the pVHL19 sequence, whereas the first 53 amino acids, which are present only in pVHL30, are less well conserved; functional studies suggest that the two pVHL isoforms have equivalent effects (Woodward ER et al, 2000; Iliopoulos O et al, 1998). The VHL mRNA and protein are widely expressed in both fetal and adult tissues (Richards FM et al., 1996; Corless CL et al., 1997), and VHL can be found in all multicellular organisms examined to date, without known similarity to other proteins (van M et al., 2001). Remarkable progress has been made in elaborating the function of pVHL and the role its inactivation plays in the pathophysiology of this disorder, including dysregulation of angiogenesis and tumor formation.

Given the lack of primary sequence homology to other proteins, the function of pVHL has been derived from studying pVHL interactors and associated proteins. Roles in oxygen-dependent angiogenesis, tumorigenesis, fibronectin matrix assembly and cytoskeleton organization, cell cycle control and cellular differentiation have been proposed. The N-terminal acidic domain of pVHL30 contains eight repetitions of a five-residue acidic repeat, which are absent in pVHL19. Phosphorylation of this acidic domain participates in tumor suppression, and the domain binds the Kinesin-2 adaptor KAP3, thus mediating microtubule binding (Lolkema et al., 2005, 2007). This domain is also responsible for binding the metastasis suppressor Nm23H2, a protein known to regulate dynamin-dependent endocytosis (Hsu et al., 2006). Further downstream, the β-sheet domain (residues 63–154) binds HIF-α subunits at residues 65–117, and the α-helical domain (residues 155–192) binds the Elongin B and Elongin C (Elongin BC) complex at residues 158–184 (Feldman et al., 1999). Binding of pVHL to Elongin BC is mediated by, and requires, the chaperonin TRiC/CCT, and VHL mutations causing defects in binding to Elongin BC are associated with VHL disease (Feldman et al., 1999). pVHL inactivation leads to overexpression of hypoxia-inducible factor (HIF) and upregulation of its targets (vascular endothelial growth factor (VEGF), erythropoietin, and transforming growth factor (TGF)-beta and -alpha). Whether this is the sole etiologic factor causing characteristic VHL hemangioblastoma formation remains to be clarified. Evidence also suggests that pVHL inactivation alters fibronectin extracellular matrix formation, and that pVHL may participate in cellular differentiation and cell cycle control. Ongoing studies are directed at elaborating the biologic consequences of these pathways for the angiogenesis and tumor formation central to VHL disease. Additionally, the VHL protein has functions that are independent of HIF-1alpha and HIF-2alpha and are thought to be important for its tumor-suppressor action: assembly of the extracellular matrix, control of microtubule dynamics, regulation of apoptosis, and possibly stabilization of TP53 proteins (Frew IJ and Krek W. 2007).
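The exon, domain and binding-site coordinates given in this section can be collected into a simple lookup, which is convenient when placing a missense mutation on the protein. The sketch below uses only the residue ranges stated in the text; the dictionary and function names are hypothetical, and the N-terminal acidic domain is not given a range because the text does not state one.

```python
# Coordinates copied from the text above; all names are illustrative only.
EXONS = {"exon 1": (1, 113), "exon 2": (114, 154), "exon 3": (155, 213)}
DOMAINS = {"beta-sheet domain": (63, 154), "alpha-helical domain": (155, 192)}
BINDING_SITES = {"HIF-alpha binding": (65, 117), "Elongin BC binding": (158, 184)}
PVHL19_START = 54     # internal methionine that starts the p19 isoform
PVHL30_LENGTH = 213   # length of the full-length p30 isoform


def _containing(regions, codon):
    """Names of all regions whose inclusive range contains the codon."""
    return [name for name, (start, end) in regions.items() if start <= codon <= end]


def annotate_codon(codon):
    """Describe where a pVHL30 codon falls in the gene and the protein."""
    if not 1 <= codon <= PVHL30_LENGTH:
        raise ValueError(f"codon must be between 1 and {PVHL30_LENGTH}")
    return {
        "exon": _containing(EXONS, codon)[0],  # the exons tile codons 1-213
        "domains": _containing(DOMAINS, codon),
        "binding_sites": _containing(BINDING_SITES, codon),
        "in_pVHL19": codon >= PVHL19_START,
    }


print(annotate_codon(98))
print(annotate_codon(167))
```

For codon 98 the lookup reports exon 1, the β-sheet domain and the HIF-α binding region; for codon 167 it reports exon 3, the α-helical domain and the Elongin BC binding region. Both codons lie within the p19 isoform.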

## **8. Molecular genetics of VHL disease**

Given the lack of primary sequence homology to other proteins, the function of pVHL has been derived from studying pVHL interactors and associated proteins. Roles in oxygen-dependent angiogenesis, tumorigenesis, fibronectin matrix assembly and cytoskeleton organization, cell cycle control, and cellular differentiation have been proposed. The N-terminal acidic domain of VHLp30 contains eight copies of a five-residue acidic repeat, which are absent in VHLp19. Phosphorylation of this acidic domain participates in tumor suppression, and the domain binds the kinesin-2 adaptor KAP3, thus mediating microtubule binding (Lolkema et al., 2005, 2007). This domain is also responsible for binding the metastasis suppressor Nm23H2, a protein known to regulate dynamin-dependent endocytosis (Hsu et al., 2006). Further downstream, the β-sheet domain (residues 63–154) binds HIF-α subunits at residues 65–117, and the α-helical domain (residues 155–192) binds the Elongin B and Elongin C (Elongin BC) complex at residues 158–184 (Feldman et al., 1999). Binding of pVHL to Elongin BC is mediated by, and requires, the chaperonin TRiC/CCT, and VHL mutations causing defects in binding to Elongin BC are associated with VHL disease (Feldman et al., 1999). pVHL inactivation leads to overexpression of hypoxia-inducible factor (HIF) and upregulation of its targets (vascular endothelial growth factor (VEGF), erythropoietin, and transforming growth factor (TGF)-alpha and -beta). Whether this is the sole etiologic factor causing characteristic VHL hemangioblastoma formation remains to be clarified. Evidence also suggests that pVHL inactivation alters fibronectin extracellular matrix formation, and that pVHL may participate in cellular differentiation and cell cycle control. Ongoing studies are directed at elaborating these roles.

Germline mutations in the *VHL* gene (linked to 3p25-p26), including large deletions and rearrangements, are etiologic for virtually all VHL disease (Latif F et al., 1993; Stolle C et al., 1998; Zbar B et al., 1996). These VHL germline mutations may also be detected in patients with autosomal dominant familial non-syndromic phaeochromocytoma (Woodward ER et al., 1997; Neumann HP et al., 2002). Specific VHL missense mutations can cause an autosomal recessive form of polycythaemia without any evidence of VHL disease (Ang SO et al., 2002; Gordeuk VR et al., 2004). A germline mutation confers genetic risk of tumor formation in concert with somatic loss of the second VHL allele or its inactivation by DNA methylation. However, somatic loss or inactivation of the wild-type VHL allele has also been demonstrated in sporadic central nervous system (CNS) hemangioblastomas (Gnarra JR et al., 1994; Kanno H et al., 2000; Foster K et al., 1994; Herman JG et al., 1994; Oberstrass J et al., 1996; Tse J et al., 1997; Lee J-Y et al., 1998), in sporadic and VHL-associated renal cell carcinomas (RCCs) (Latif F et al., 1993; Shuin T et al., 1994; Phillips JL et al., 2001), in pheochromocytoma (Bender BU et al., 2000; Linehan WM et al., 2001), and in endolymphatic sac tumors (ELSTs) (Vortmeyer AO et al., 2000).
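
The residue ranges given above for the pVHL domains and binding sites lend themselves to a simple positional lookup. The sketch below is illustrative only: the names are hypothetical, the ranges are copied from the text, and residue numbering is assumed to follow the 213-residue p30 isoform.

```python
# (label, first residue, last residue), using the ranges quoted in the text.
PVHL_REGIONS = [
    ("beta-sheet domain", 63, 154),
    ("HIF-alpha binding site", 65, 117),
    ("alpha-helical domain", 155, 192),
    ("Elongin BC binding site", 158, 184),
]
P30_LENGTH = 213  # assumption: positions are numbered on full-length p30


def regions_for_residue(position: int) -> list:
    """List every annotated region that contains the given residue."""
    if not 1 <= position <= P30_LENGTH:
        raise ValueError(f"residue {position} is outside 1-{P30_LENGTH}")
    return [name for name, first, last in PVHL_REGIONS if first <= position <= last]


if __name__ == "__main__":
    for position in (98, 167, 200):
        print(position, regions_for_residue(position))
```

A residue can fall in a domain without lying in the narrower binding site, so the function returns a list rather than a single label. An empty list means only that the position lies outside the four ranges named here.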

More than 300 germline mutations have been identified in familial VHL. These occur throughout the coding region with only a few mutations appearing in multiple families (Zbar B et al., 1996; Beroud C et al., 1996). The new mutation rate has been estimated at between 3 and 20% (Latif F et al., 1993; Richard S et al., 1994; Schimke RN et al., 2000). Although decreased penetrance has been described (Maddock IF et al., 1994), comprehensive familial molecular data have not yet been reported to clarify this rate.

There has been limited correlation between specific mutations and phenotype, although some data on genotype-phenotype correlations have been reported (Neumann H et al., 1998; Hes F et al., 2000). Such correlations have revealed that certain missense mutations confer a high risk of pheochromocytoma (VHL type 2), whereas loss of pVHL through large deletions or nonsense-mediated decay appears to be incompatible with pheochromocytoma development (VHL type 1) [Chen et al., 1995; Cybulski et al., 2002; Glavac et al., 1996; Hes et al., 2000a, 2000b; Maher et al., 1996; Neumann and Bender, 1998; Ong et al., 2007; Zbar et al., 1996].

Interestingly, missense mutations causing amino acid changes on the surface of pVHL appear to confer a higher risk for pheochromocytomas than missense mutations occurring deep within the protein; surface missense mutations also appear to confer a higher risk for pheochromocytomas than deletions, nonsense, and frameshift mutations [Ong et al., 2007]. Thus, pheochromocytoma development appears to be related to an intact but altered pVHL, which has seeded the hypothesis that these mutations may induce a gain of function, possibly through a dominant negative effect [Hoffman et al., 2001; Lee et al., 2005; Maher and Kaelin, 1997; Stebbins et al., 1999].

Nordstrom-O'Brien et al. (2010) analyzed 1548 VHL families and provided a wealth of data for genotype–phenotype correlations. They found that 52% had missense mutations, which occurred most frequently at codons 65, 76, 78, and 98, with splice mutations at codons 155, 158, 161, 162, and 167; 13% had frameshift mutations, 11% nonsense mutations, 6% in-frame deletions/insertions, 11% large/complete deletions, and 7% splice mutations. Mutations that predict absence of functional protein (deletion, frameshift, nonsense, and splice) are associated with the type 1 phenotype in 96–97% of cases and show an increased risk of RCC (including type 2B cases). This suggests that expressed dysfunctional protein may be required for pheochromocytoma formation. Missense mutations are associated with the type 2 phenotype (hemangioblastoma and pheochromocytoma +/- RCC) in 69–98% of cases (Stolle C et al., 1998; Chen F et al., 1995; Zbar B et al., 1996). Nordstrom-O'Brien et al. found that 83.5% of VHL type 2 families had missense mutations; this is lower than in some studies, which report that up to 96% of those with pheochromocytomas have missense mutations (Zbar et al., 1996). They also found that a low percentage of VHL type 2 families (0.5–7%) had other types of mutation, such as nonsense, frameshift, splice, in-frame deletion/insertion, and partial deletion. The small percentage of nonsense mutations and partial deletions, along with the absence of complete deletions, supports theories that an intact though altered pVHL is associated with pheochromocytomas. Stratifying missense mutations into those that resulted in substitution of a surface amino acid and those that disrupted structural integrity demonstrated that surface amino acid substitutions conferred a higher pheochromocytoma risk (Ong KR et al., 2007). Although loss of heterozygosity has been reported in endolymphatic sac tumors (ELSTs) (Kawahara N et al., 1999; Vortmeyer AO et al., 1997), no predominant mutation has been identified.
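
The mutation-class frequencies and the broad genotype–phenotype tendency described above can be restated as a short tabulation. This is a hedged sketch, not a clinical predictor: the dictionary keys and function names are hypothetical, the percentages are those quoted from Nordstrom-O'Brien et al. (2010) in the text, and the class-to-type mapping encodes only the majority association the text reports.

```python
# Percentage of families per germline mutation class, as quoted in the text.
MUTATION_CLASS_PERCENT = {
    "missense": 52,
    "frameshift": 13,
    "nonsense": 11,
    "in_frame_indel": 6,
    "large_deletion": 11,
    "splice": 7,
}
# Classes the text describes as predicting absence of functional protein.
PROTEIN_ABSENT_CLASSES = ("large_deletion", "frameshift", "nonsense", "splice")


def expected_vhl_type(mutation_class: str) -> str:
    """Return the phenotype type most often associated with a mutation class."""
    if mutation_class not in MUTATION_CLASS_PERCENT:
        raise ValueError(f"unknown mutation class: {mutation_class}")
    if mutation_class in PROTEIN_ABSENT_CLASSES:
        return "type 1"  # association reported in 96-97% of cases
    if mutation_class == "missense":
        return "type 2"  # association reported in 69-98% of cases
    return "unassigned"  # the text gives no majority association here


def protein_absent_share() -> int:
    """Summed percentage of families with a protein-absent mutation class."""
    return sum(MUTATION_CLASS_PERCENT[c] for c in PROTEIN_ABSENT_CLASSES)


if __name__ == "__main__":
    print(sum(MUTATION_CLASS_PERCENT.values()), protein_absent_share())
```

The six classes sum to 100%, and the four protein-absent classes together account for 42% of families. Because the associations are frequencies rather than rules, any single family can deviate from the expected type.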

It may be difficult, however, to predict functional biologic consequences from specific point mutations without direct functional assays, as reported in recent RCC *in vitro* mutation panel studies.

The recent characterization of the VHL protein crystal structure suggests possible functional consequences of specific mutations. By focusing on the structure of pVHL, we can predict the effect of a mutation on pVHL function and therefore on the resulting phenotype. Mutation-specific dysfunction may depend on protein destabilization, altered interactor binding at the various pVHL binding domains, or potential alteration in binding to other factors involved in tumor suppressor/activator activity. pVHL has two domains: an amino-terminal domain rich in β-sheet (the β-domain) and a smaller carboxy-terminal α-helical domain (the α-domain). A large portion of the α-domain surface interacts with Elongin C, which binds to other members (e.g., Elongin B, Cul2, and Rbx1) of an SCF-like E3 ubiquitin-protein ligase complex, as mentioned earlier. Loss-of-function VHL mutations prevent Elongin C binding and target ubiquitylation (Clifford et al., 2001). The β-domain, on the other hand, has a macromolecular binding site that targets the HIF-1α and HIF-2α regulatory subunits for proteasomal degradation. Whereas Type 1 and Type 2B mutations impair pVHL binding to Elongin C, Type 2A mutations map to the β-domain HIF-binding site and do not affect the ability of pVHL to bind Elongin C (Clifford et al., 2001). Therefore, classifying missense substitutions according to their predicted effect on pVHL structure enhances the ability to predict pheochromocytoma risk (Ong KR et al., 2007).

Nordstrom-O'Brien et al. (2010) suggested that increased identification of new mutations, and of new patients with previously described mutations, gives momentum to the search for the exact role of pVHL in its normal and mutated forms. Understanding these functions and their association with specific mutations allows for identification of disease risks in individual patients. Such insight will offer improved diagnostics, surveillance, and treatment of VHL patients (Nordstrom-O'Brien et al., 2010).

Ongoing delineation of clinical subtypes may allow for better genotype-phenotype correlations, prediction of clinical progression, and molecular mutation-directed clinical management. There is significant intra-familial difference in clinical expressivity and, as yet, limited knowledge about modifiers of this phenotypic variation (Webster AR et al., 1998). Prediction of the clinical course in any one patient based on molecular data is therefore difficult.

#### **Author details**

Suad AlFadhli

*Molecular Genetics, Kuwait University, Kuwait*

#### **9. References**

Ang SO, Chen H, Hirota K, et al. Disruption of oxygen homeostasis underlies congenital Chuvash polycythemia. Nat Genet 2002; 32:614–621.

Ayerbes VM, Gallego AG, Prado DS, Fonseca JP, Campelo GR, Aparicio ALM. Origin of renal cell carcinomas. Clin Transl Oncol 2008; 10(11):697-712.

Bender BU, Gutsche M, Glasker S, et al. Differential genetic alterations in von Hippel-Lindau syndrome-associated and sporadic pheochromocytomas. J Clin Endocrinol Metab 2000; 85:4568-4574.

Beroud C, Joly D, Gallou C, et al. Software and database for the analysis of mutation in VHL gene [www.umd.necker.fr:2005]. Nucleic Acids Res 1998; 26:256-258.

Berse B, Brown LF, Van de Water L, Dvorak HF, Senger DR. Vascular permeability factor (vascular endothelial growth factor) gene is expressed differentially in normal tissues, macrophages, and tumors. Mol Biol Cell 1992; 3:211–220.

Blankenship C, Naglich JG, Whaley JM, Seizinger B, Kley N. Alternate choice of initiation codon produces a biologically active product of the von Hippel Lindau gene with tumor suppressor activity. Oncogene 1999; 18:1529–1535.

Cancer Genome Atlas Research Network. Comprehensive genomic characterization defines human glioblastoma genes and core pathways. Nature 2008; 455:1061-1068.

Chen F, Kishida T, Yao M, et al. Germline mutations in the von Hippel-Lindau disease tumor suppressor gene: correlations with phenotype. Hum Mutat 1995; 5:66-75.


Clifford SC, Cockman ME, Smallwood AC, Mole DR, Woodward ER, Maxwell PH, Ratcliffe PJ, Maher ER. Contrasting effects on HIF-1alpha regulation by disease-causing pVHL mutations correlate with patterns of tumourigenesis in von Hippel-Lindau disease. Hum Mol Genet 2001; 10:1029–1038.

Corless CL, Kibel AS, Iliopoulos O, et al. Immunostaining of the von Hippel-Lindau gene product in normal and neoplastic human tissues. Hum Path 1997; 28:459–464.

Crossey PA, Foster K, Richards FM, Phipps ME, Latif F, Tory K, Jones MH, Bentley E, Kumar R, Lerman MI, Zbar B, Affara NA, Ferguson-Smith MA, Maher ER. Molecular genetic investigations of the mechanism of tumourigenesis in von Hippel-Lindau disease: analysis of allele loss in VHL tumours. Hum Genet 1994a; 93:53–58.

Crossey PA, Richards FM, Foster K, Green JS, Prowse A, Latif F, Lerman MI, Zbar B, Affara NA, Ferguson-Smith MA, Maher ER. Identification of intragenic mutations in the von Hippel-Lindau disease tumour suppressor gene and correlation with disease phenotype. Hum Mol Genet 1994b; 3:1303–1308.

Cybulski C, Krzystolik K, Murgia A, Gorski B, et al. Germline mutations in the von Hippel-Lindau (VHL) gene in patients from Poland: disease presentation in patients with deletions of the entire VHL gene. J Med Genet 2002; 39:E38.

Davies H, Bignell GR, Cox C, Stephens P, Edkins S, Clegg S, Teague J, Woffendin H, Garnett MJ, Bottomley W, et al. Mutations of the BRAF gene in human cancer. Nature 2002; 417:949-954.

Feldman DE, Thulasiraman V, Ferreyra RG, Frydman J. Formation of the VHL-elongin BC tumor suppressor complex is mediated by the chaperonin TRiC. Mol Cell 1999; 4:1051–1061.

Filling-Katz MR, Choyke PL, Oldfield E, Charnas L, Patronas NJ, Glenn GM, Gorin MB, Morgan JK, Linehan WM, Seizinger BR, Zbar B. Central nervous system involvement in Von Hippel-Lindau disease. Neurology 1991; 41:41–46.

Forbes SA, Bhamra G, Bamford S, Dawson E, Kok C, Clements J, Menzies A, Teague JW, Futreal PA, Stratton MR. The Catalogue of Somatic Mutations in Cancer (COSMIC). Curr Protoc Hum Genet 2008; Chapter 10, Unit 10.11.

Foster K, Prowse A, van den Berg A, et al. Somatic mutations of the von Hippel-Lindau disease tumour suppressor gene in non-familial clear cell renal carcinoma. Hum Mol Genet 1994; 3:2169-2173.

Frew IJ, Krek W. Multitasking by pVHL in tumour suppression. Curr Opin Cell Biol 2007; 19:685-690.

Futreal PA, Coin L, Marshall M, Down T, Hubbard T, Wooster R, Rahman N, Stratton MR. A census of human cancer genes. Nat Rev Cancer 2004; 4(3):177-83.

Greenman C, Stephens P, Smith R, Dalgliesh GL, Hunter C, Bignell G, Davies H, Teague J, Butler A, Stevens C, et al. Patterns of somatic mutation in human cancer genomes. Nature 2007; 446:153-158.

Garcia A, Matias-Guiu X, Cabezas R, Chico A, Prat J, Baiget M, De Leiva A. Molecular diagnosis of von Hippel-Lindau disease in a kindred with a predominance of familial phaeochromocytoma. Clin Endocrinol (Oxf) 1997; 46:359–363.

Glavac D, Neumann HP, Wittke C, Jaenig H, Masek O, Streicher T, Pausch F, Engelhardt D, Plate KH, Hofler H, Chen F, Zbar B, Brauch H. Mutations in the VHL tumor suppressor gene and associated lesions in families with von Hippel-Lindau disease from central Europe. Hum Genet 1996; 98:271–280.

Gnarra JR, Tory K, Weng Y, et al. Mutations of the VHL tumour suppressor gene in renal carcinoma. Nat Genet 1994; 7:85-90.

Gordeuk VR, Sergueeva AI, Miasnikova GY, et al. Congenital disorder of oxygen-sensing: association of the homozygous Chuvash polycythemia VHL mutation with thrombosis and vascular abnormalities but not tumors. Blood 2004; 103:3924–3932.

Gross D, Avishai N, Meiner V, et al. Familial pheochromocytoma associated with a novel mutation in the von Hippel-Lindau gene. J Clin Endocrinol Metab 1996; 81:147-149.

Gruber MB, Healey GB, Toguri AG, Warren MM. Papillary cystadenoma of epididymis: component of von Hippel-Lindau syndrome. Urology 1980; 16:305–306.

Herman JG, Latif F, Weng Y, et al. Silencing of the VHL tumor-suppressor gene by DNA methylation in renal carcinoma. Proc Natl Acad Sci USA 1994; 91:9700-9704.

Hes F, Zewald R, Peeters T, Sijmons R, Links T, Verheij J, Matthijs G, Leguis E, Mortier G, van der Torren K, Rosman M, Lips C, Pearson P, van der Luijt R. Genotype–phenotype correlations in families with deletions in the von Hippel-Lindau (VHL) gene. Hum Genet 2000a; 106:425–431.

Hes FJ, McKee S, Taphoorn MJ, Rehal P, van Der Luijt RB, McMahon R, van Der Smagt JJ, Dow D, Zewald RA, Whittaker J, Lips CJ, MacDonald F, Pearson PL, Maher ER. Cryptic von Hippel-Lindau disease: germline mutations in patients with haemangioblastoma only. J Med Genet 2000b; 37:939–943.

Hoffman MA, Ohh M, Yang H, Klco JM, Ivan M, Kaelin Jr WG. von Hippel-Lindau protein mutants linked to type 2C VHL disease preserve the ability to downregulate HIF. Hum Mol Genet 2001; 10:1019–1027.

Hofstra RMW, Stelwagen T, Stulp RP, et al. Extensive mutation screening of RET in sporadic medullary thyroid carcinoma and of RET and VHL in sporadic pheochromocytoma reveals involvement of these genes in only a minority of cases. J Clin Endocrinol Metab 1996; 81:2881

Hough DM, Stephens DH, Johnson CD, Binkovitz LA. Pancreatic lesions in von Hippel-Lindau disease: prevalence, clinical significance, and CT findings. AJR Am J Roentgenol 1994; 162:1091–1094.

Hsu T, Adereth Y, Kose N, Dammai V. Endocytic function of von Hippel-Lindau tumor suppressor protein regulates surface localization of fibroblast growth factor receptor 1 and cell motility. J Biol Chem 2006; 281:12069–12080.

Iliopoulos O, Ohh M, Kaelin Jr WG. pVHL19 is a biologically active product of the von Hippel-Lindau gene arising from internal translation initiation. Proc Natl Acad Sci USA 1998; 95:11661–11666.

Ivan M, Kaelin WG. The von Hippel-Lindau tumor suppressor protein. Curr Opin Genet Dev 2001; 11:27-34.


Jeffers M, Schmidt L, Nakaigawa N, Webb CP, Weirich G, Kishida T, Zbar B, VandeWoude GF. Activating mutations for the met tyrosine kinase receptor in human cancer. Proc Natl Acad Sci USA 1997; 94:11445-11450.

Jones S, Zhang X, Parsons DW, Lin JC, Leary RJ, Angenendt P, Mankoo P, Carter H, Kamiyama H, Jimeno A, et al. Core signaling pathways in human pancreatic cancers revealed by global genomic analyses. Science (New York, NY) 2008; 321:1801-1806.

Kaelin Jr WG. The von Hippel-Lindau tumor suppressor gene and kidney cancer. Clin Cancer Res 2004; 10(18 Pt 2):6290S–6295S.

Kaelin WG. Von Hippel-Lindau disease. Annu Rev Pathol 2007; 2:145–173.

Kanno H, Saljooque F, Yamamoto I, et al. Role of the von Hippel-Lindau tumor suppressor protein during neuronal differentiation. Cancer Res 2000; 60:2820-2824.

Karchin R. Next generation tools for the annotation of human SNPs. Brief Bioinform 2009; 10:35-52.

Kawahara N, Kume H, Ueki K, et al. VHL gene inactivation in an endolymphatic sac tumor associated with von Hippel-Lindau disease. Neurology 1999; 53:208-210.

Knudson AG Jr. Mutation and cancer: statistical study of retinoblastoma. Proc Natl Acad Sci USA 1971; 68:820–823.

Kuzmin I, Duh FM, Latif F, Geil L, Zbar B, Lerman MI. Identification of the promoter of the human von Hippel-Lindau disease tumor suppressor gene. Oncogene 1995; 10(11):2185-94.

Latif F, Duh FM, Gnarra J, Tory K, Kuzmin I, Yao M, Stackhouse T, Modi W, Geil L, Schmidt L, Li H, Orcutt ML, Maher E, Richards F, Phipps M, Ferguson-Smith M, Le Paslier D, Linehan WM, Zbar B, Lerman MI. von Hippel-Lindau syndrome: cloning and identification of the plasma membrane Ca(++)-transporting ATPase isoform 2 gene that resides in the von Hippel-Lindau gene region. Cancer Res 1993a; 53:861–867.

Latif F, Tory K, Gnarra J, Yao M, Duh FM, Orcutt ML, Stackhouse T, Kuzmin I, Modi W, Geil L, et al. Identification of the von Hippel-Lindau disease tumor suppressor gene. Science 1993b; 260:1317–1320.

Lee J-Y, Dong S-M, Park W-S, et al. Loss of heterozygosity and somatic mutations of the VHL tumor suppressor gene in sporadic cerebellar hemangioblastomas. Cancer Res 1998; 58:504-508.

Lee S, Nakamura E, Yang H, Wei W, Linggi MS, Sajan MP, Farese RV, Freeman RS, Carter BD, Kaelin Jr WG, Schlisio S. Neuronal apoptosis linked to EglN3 prolyl hydroxylase and familial pheochromocytoma genes: developmental culling and cancer. Cancer Cell 2005; 8:155–167.

Linehan WM, Eisenhofer G, Walther MM, Goldstein DS. Recent advances in genetics, diagnosis, localization and treatment of pheochromocytoma. Ann Intern Med 2001; 134:315-329.

Liu H, Xing Y, Yang S, Tian D. Remarkable difference of somatic mutation patterns between oncogenes and tumor suppressor genes. Oncol Rep 2011; 26(6):1539-46.

Lolkema MP, Gervais ML, Snijckers CM, Hill RP, Giles RH, Voest EE, Ohh M. Tumor suppression by the von Hippel-Lindau protein requires phosphorylation of the acidic domain. J Biol Chem 2005; 280:22205–22211.

Lolkema MP, Mans DA, Snijckers CM, van Noort M, van Beest M, Voest EE, Giles RH. The von Hippel-Lindau tumour suppressor interacts with microtubules through kinesin-2. FEBS Lett 2007; 581:4571–4576.

Lubensky IA, Pack S, Ault D, Vortmeyer AO, Libutti SK, Choyke PL, Walther MM, Linehan WM, Zhuang Z. Multiple neuroendocrine tumors of the pancreas in von Hippel-Lindau disease patients: histopathological and molecular genetic analysis. Am J Pathol 1998; 153:223–231.

Maddock IF, Moran A, Maher ER, et al. A genetic register for von Hippel-Lindau disease. J Med Genet 1996; 33:120-127.

Maher ER, Iselius L, Yates JR, Littler M, Benjamin C, Harris R, Sampson J, Williams A, Ferguson-Smith MA, Morton N. Von Hippel-Lindau disease: a genetic study. J Med Genet 1991; 28:443–447.

Maher ER, Kaelin Jr WG. von Hippel-Lindau disease. Medicine (Baltimore) 1997; 76:381–391.

Maher ER, Webster AR, Richards FM, Green JS, Crossey PA, Payne SJ, Moore AT. Phenotypic expression in von Hippel-Lindau disease: correlations with germline VHL gene mutations. J Med Genet 1996; 33:328–332.

Maher ER, Yates JR, Harries R, Benjamin C, Harris R, Moore AT, Ferguson-Smith MA. Clinical features and natural history of von Hippel-Lindau disease. Q J Med 1990b; 77:1151–1163.

Maher ER. Von Hippel-Lindau disease. Curr Mol Med 2004; 4:833–842.

Manski TJ, Heffner DK, Glenn GM, et al. Endolymphatic sac tumors: a source of morbid hearing loss in von Hippel-Lindau disease. JAMA 1997; 277:1461–1466.

Martin R, Hockey A, Walpole I, et al. Variable penetrance of familial pheochromocytoma associated with the von Hippel-Lindau gene mutation, S68W. Mutations in brief no 150. Online. Hum Mutat 1998; 12:71.

Melmon KL, Rosen SW. Lindau's disease. Review of the literature and study of a large kindred. Am J Med 1964; 36:595–617.

Mulvihill JJ, Ferrell RE, Carty SE, Tisherman SE, Zbar B. Familial pheochromocytoma due to mutant von Hippel-Lindau disease gene. Arch Intern Med 1997; 157:1390–1391.

Neumann H, Bender B. Genotype-phenotype correlations in von Hippel-Lindau disease. J Intern Med 1998; 243:541–545.

Neumann HP, Wiestler OD. Clustering of features of von Hippel-Lindau syndrome: evidence for a complex genetic locus. Lancet 1991; 337:1052–1054.

Neumann HP, Bausch B, McWhinney SR, et al. Germ-line mutations in nonsyndromic phaeochromocytoma. N Engl J Med 2002; 346:1459–1466.

Neumann HP, Eng C, Mulligan LM, Glavac D, Zauner I, Ponder BA, Crossey PA, Maher ER, Brauch H. Consequences of direct genetic testing for germline mutations in the clinical management of families with multiple endocrine neoplasia, type II. JAMA 1995; 274:1149–1151.

Neumann HP, Eggert HR, Scheremet R, Schumacher M, Mohadjer M, Wakhloo AK, Volk B, Hettmannsperger U, Riegler P, Schollmeyer P. Central nervous system lesions in von Hippel-Lindau syndrome. J Neurol Neurosurg Psychiatry 1992; 55:898–901.


Neumann HP, Wiestler OD. Clustering of features and genetics of von Hippel-Lindau syndrome. Lancet 1991; 338:258.

Ng PC, Henikoff S. Predicting the effects of amino acid substitutions on protein function. Annu Rev Genomics Hum Genet 2006; 7:61-80.

Ng PC, Henikoff S. SIFT: Predicting amino acid changes that affect protein function. Nucleic Acids Res 2003; 31:3812-3814.

Niemela M, Lemeta S, Summanen P, et al. Long-term prognosis of haemangioblastoma of the CNS: impact of von Hippel-Lindau disease. Acta Neurochir 1999; 141:1147–1156.

Nordstrom-O'Brien M, van der Luijt RB, van Rooijen E, van den Ouweland AM, Majoor-Krakauer DF, Lolkema MP, van Brussel A, Voest EE, Giles RH. Genetic analysis of von Hippel-Lindau disease. Hum Mutat 2010; 31(5):521-37.

Oberstrass J, Reifenberger G, Reifenberger J, et al. Mutations of the von Hippel-Lindau tumour suppressor gene in capillary haemangioblastomas of the central nervous system. J Pathol 1996; 179:151-156.

Ong KR, Woodward ER, Killick P, Lim C, Macdonald F, Maher ER. Genotype–phenotype correlations in von Hippel-Lindau disease. Hum Mutat 2007; 28:143–149.

Pavesi G, Feletti A, Berlucchi S, Opocher G, Martella M, Murgia A, Scienza R. Neurosurgical treatment of von Hippel-Lindau-associated hemangioblastomas: benefits, risks and outcome. J Neurosurg Sci 2008; 52:29–36.

Phillips JL, Ghadimi BM, Wangsa D, et al. Molecular cytogenetic characterization of early and late renal cell carcinomas in von Hippel-Lindau disease. Genes Chromosomes Cancer 2001; 31:1-9.

Pleasance ED, Cheetham RK, Stephens PJ, McBride DJ, Humphray SJ, Greenman CD, Varela I, Lin ML, Ordonez GR, Bignell GR, et al. A comprehensive catalogue of somatic mutations from a human cancer genome. Nature 2010a; 463:191-196.

Pleasance ED, Stephens PJ, O'Meara S, McBride DJ, Meynert A, Jones D, Lin ML, Beare D, Lau KW, Greenman C, et al. A small-cell lung cancer genome with complex signatures of tobacco exposure. Nature 2010b; 463:184-190.

Ramensky V, Bork P, Sunyaev S. Human non-synonymous SNPs: server and survey. Nucleic Acids Res 2002; 30:3894-3900.

Richards FM, Maher ER, Latif F, Phipps ME, Tory K, Lush M, Crossey PA, Oostra B, Enblad P, Gustavson KH, Green J, Turner G, Yates JRW, Linehan WM, Affara NA, Lerman M, Zbar B, Ferguson-Smith MA. Detailed genetic mapping of the von Hippel-Lindau disease tumour suppressor gene. J Med Genet 1993; 30:104–107.

Richards FM, Payne SJ, Zbar B, et al. Molecular Analysis of De-Novo Germline Mutations in the von Hippel-Lindau Disease Gene. Hum Mol Gen 1995; 4:2139–2143.

Richards FM, Schofield PN, Fleming S. Expression of the von Hippel-Lindau disease tumour suppressor gene during human embryogenesis. Hum Mol Gen 1996; 5:639–644.

Richard S, Chauveau D, Chretien Y, et al. Renal lesions and pheochromocytoma in von Hippel-Lindau disease. Adv Nephrol 1994; 23:1-27.

Sato K, Terada K, Sugiyama T, Takahashi S, Saito M, Moriyama M, Kakinuma H, Suzuki Y, Kato M, Kato T. Frequent overexpression of vascular endothelial growth factor gene in human renal cell carcinoma. Tohoku J Exp Med 1994; 173:355–360.

Schimke RN, Collins D, Stolle CA. Von Hippel-Lindau syndrome. In: GeneClinics: clinical genetic information resource; www.geneclinics.org/profiles/vhl.

Shu HK, Pelley RJ, Kung HJ. Tissue-specific transformation by epidermal growth factor receptor: a single point mutation within the ATP-binding pocket of the erbB product increases its intrinsic kinase activity and activates its sarcomagenic potential. Proc Natl Acad Sci USA 1990; 87:9103-9107.

Shuin T, Kondo K, Torigoe S, et al. Frequent somatic mutations and loss of heterozygosity of the von Hippel-Lindau tumor suppressor gene in primary human renal cell carcinomas. Cancer Res 1994; 54:2852-2855.

Sjoblom T, Jones S, Wood LD, Parsons DW, Lin J, Barber TD, Mandelker D, Leary RJ, Ptak J, Silliman N, et al. The consensus coding sequences of human breast and colorectal cancers. Science 2006; 314:268-274.

Stebbins CE, Kaelin Jr WG, Pavletich NP. Structure of the VHL–ElonginC–ElonginB complex: implications for VHL tumor suppressor function. Science 1999; 284:455–461.

Stehr H, Jang SH, Duarte JM, Wierling C, Lehrach H, Lappe M, Lange BM. The structural impact of cancer-associated missense mutations in oncogenes and tumor suppressors. Mol Cancer 2011; 10:54.

Stolle C, Glenn G, Zbar B, et al. Improved detection of germline mutations in the von Hippel-Lindau disease tumor suppressor gene. Hum Mutat 1998; 12:417-423.

Takahashi A, Sasaki H, Kim SJ, Tobisu K, Kakizoe T, Tsukamoto T, Kumamoto Y, Sugimura T, Terada M. Markedly increased amounts of messenger RNAs for vascular endothelial growth factor and placenta growth factor in renal cell carcinoma associated with angiogenesis. Cancer Res 1994; 54:4233–4237.

Tse J, Wong J, Lo K-W, et al. Molecular genetic analysis of the von Hippel-Lindau disease tumor suppressor gene in familial and sporadic cerebellar hemangioblastomas. Am J Clin Pathol 1997; 107:459-466.

Vogelstein B, Kinzler KW. The multistep nature of cancer. Trends Genet 1993; 9:138-141.

Vortmeyer AO, Huang SC, Koch CA, et al. Somatic von Hippel-Lindau gene mutations detected in sporadic endolymphatic sac tumors. Cancer Res 2000; 60:5963-5965.

Vortmeyer AO, Choo D, Pack SD, et al. Von Hippel-Lindau disease gene alterations associated with endolymphatic sac tumor. J Natl Cancer Inst 1997; 89:970-972.

Wanebo JE, Lonser RR, Glenn GM, Oldfield EH. The natural history of hemangioblastomas of the central nervous system in patients with von Hippel-Lindau disease. J Neurosurg 2003; 98:82–94.

Wang Z, Moult J. SNPs, protein structure, and disease. Human Mutation 2001; 17:263-270.

Webster AR, Maher ER, Moore AT. Clinical characteristics of ocular angiomatosis in von Hippel-Lindau disease and correlation with germline mutation. Arch Ophthalmol 1999; 117:371–378.

Webster AR, Richards FM, MacRonald FE, et al. An analysis of phenotypic variation in the familial cancer syndrome von Hippel-Lindau disease: evidence for modifier effects. Am J Hum Genet 1998; 63:1025-1035.


Weil RJ, Lonser RR, DeVroom HL, Wanebo JE, Oldfield EH. Surgical management of brainstem hemangioblastomas in patients with von Hippel-Lindau disease. J Neurosurg 2003; 98:95–105

**Chapter 5** 

## **Genotype-Phenotype Disturbances of Some Biomarkers in Colorectal Cancer**

Mihaela Tica, Valeria Tica, Alexandru Naumescu, Mihaela Uta, Ovidiu Vlaicu and Elena Ionica

Additional information is available at the end of the chapter

http://dx.doi.org/10.5772/48366

## **1. Introduction**

90 Mutations in Human Genetic Disease

2003; 98:95–105

318:1108-1113.

284:455461.

Weil RJ, Lonser RR, DeVroom HL, Wanebo JE, Oldfield EH. Surgical management of brainstem hemangioblastomas in patients with von Hippel-Lindau disease. J Neurosurg

Whaley JM, Naglich J, Gelbert L, Hsia YE, Lamiell JM, Green JS, Collins D, Neumann HP, Laidlaw J, Li FP, Klein-Szanto AJP, Seizinger BR, Kley N. Germ-line mutations in the von Hippel-Lindau tumor-suppressor gene are similar to somatic von Hippel-Lindau aberrations in sporadic renal cell carcinoma. Am J Hum Genet 1994; 55:1092–1102. Wood LD, Parsons DW, Jones S, Lin J, Sjoblom T, Leary RJ, Shen D, Boca SM, Barber T, Ptak J, et al: The genomic landscapes of human breast and colorectal cancers. Science 2007;

Woodward ER, Buchberger A, Clifford SC et al: Comparative sequence analysis of the VHL

Woodward ER, Eng C, McMaon R, et al. Genetic predisposition to phaeochromocytoma: analysis of candidate genes GDNF, RET and VHL. Hum Mol Genet 1997; 6:1051-1056. Woodward ER, Eng C, McMaon R, et al. Genetic predisposition to phaeochromocytoma: analysis of candidate genes GDNF, RET and VHL Hum Mol Genet 1997; 6:1051-1056. Woodward ER, Eng C, McMahon R et al: Genetic predisposition to phaeochromocytoma: Analysis of candidate genes GDNF, RET and VHL. Hum Mol Genet 1997; 6: 1051–1056. Zbar B, Kaelin W, Maher E, et al. Third International Meeting on von Hippel-Lindau

Zbar, B., Kishida, F. Chen, *et al*. Germlinemutations in the von Hippel-Lindau disease (VHL) gene in families from North American, Europe and Japan. Hum. Mutat. 1996; 8:348–357. Zbar B, Kishida T, Chen F, et al. Germline mutations in the von Hippel-Lindau disease (VHL) gene in families from North America, Europe, and Japan. Hum Mutat 1996;

Zbar B, Kishida T, Chen F, Schmidt L, Maher ER, Richards FM, Crossey PA,Webster AR,

Neumann HP, Tisherman S, Mulvihill JJ, Gross DJ, Shuin T, Whaley J,Seizinger B, Kley N, Olschwang S, Boisson C, Richard S, Lips CH,Lerman M, Linehan WM. Germline mutations in the Von Hippel-Lindaudisease (VHL) gene in families from North

tumor suppressor gene. Genomics 2000; 65: 253–265.

Affara NA, Ferguson-Smith MA, Brauch H, Glavac D,

America, Europe, and Japan. Hum Mutat 1996; 8:348–357.

disease. Cancer Res 1999; 59:2251-2253.

Colorectal carcinoma (CRC) is one of the most common human cancers. In 2008, 1,233,000 new CRC cases were diagnosed worldwide and about 608,000 deaths from colorectal cancer were estimated, making it the fourth most common cause of cancer death in the world. Five-year survival for CRC patients is 54.0% in Europe. By stage, five-year survival was 74.0% for patients with stage I, 66.5% for stage IIA, 73.1% for stage IIIA and only 5.7% for stage IV disease (Stanczak, 2011). The success of colorectal cancer screening programs has resulted in an increasing number of biopsies of early neoplastic lesions with subtle histological features, making the development of ancillary diagnostic testing for CRC essential. The incorporation of ancillary techniques, such as immunohistochemistry, cytochemical staining, electron microscopy, cytogenetics and, more recently, molecular testing, has had a significant impact on the diagnosis and management of solid tumors. Interpretation of hematoxylin-eosin stained slides by light microscopy remains the basis of anatomic pathology. However, an expanding menu of molecular assays continues to be implemented owing to their clinical utility in diagnosis, prognosis and risk assessment, therapy selection, cancer screening and minimal residual disease detection. Carcinomas tend to carry multiple, complex, non-recurrent chromosomal and molecular aberrations, and were not traditionally considered ideal candidates for molecular testing. This is changing with the discovery and implementation of new diagnostic, prognostic, and therapeutic molecular markers. Although single molecular biomarkers have proved useful, technical advances now allow global genomic, epigenomic, or proteomic profiling of solid tumor malignancies.
The search continues for more definitive molecular indicators that correlate with histological features and patient response to therapy and/or survival.

© 2012 Tica et al., licensee InTech. This is an open access chapter distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Increasing understanding of cancer biology is beginning to explain the reasons for therapeutic failures. Signal transduction research has revealed that the receptors, enzymes and transcription factors that regulate cell fate are virtually all connected into a complex network of cross-regulatory interactions. The cell fate control system is not only interconnected but also highly redundant, such that if a gene or protein is disabled, another can perform a similar function (Rizzo, P, 2008). Key molecular mechanisms implicated in the genesis of CRC include chromosomal instability, DNA repair defects, and aberrant methylation. Chromosomal instability causes structural chromosomal anomalies, usually during DNA replication, with subsequent loss of tumor suppressor genes. DNA repair defects are caused by mutations in genes responsible for the repair of base-base DNA mismatches. These can be found as germline mutations or as somatic methylation anomalies in acquired cases of CRC. A significant proportion of CRC cases associated with mismatch repair anomalies occur on the right side of the colon and have a characteristic histological appearance. DNA repair defects can be detected indirectly by the associated epiphenomenon of microsatellite instability, that is, unrepaired strand slippage within microsatellite regions.

Taking all this into account, we can conclude that the study of colorectal carcinogenesis provides fundamental insights into the general mechanisms of cancer evolution. It is now believed that there are two pathogenetically distinct pathways for the development of colon cancer, both involving stepwise accumulation of multiple mutations. However, the genes involved and the mechanisms by which the mutations arise are different.

The first pathway, sometimes called the APC/β-catenin pathway, is characterized by chromosomal instability that results in stepwise accumulation of mutations in a series of oncogenes and tumor suppressor genes. The molecular evolution of colon cancer along this pathway occurs through a series of morphologically identifiable stages. Initially, there is localized colon epithelial proliferation. This is followed by the formation of small adenomas that progressively enlarge, become more dysplastic, and ultimately develop into invasive cancers. This is referred to as the adenoma-carcinoma sequence. The genes that are correlated with this pathway are as follows:

**Adenomatous Polyposis Coli (APC)** - The *APC* gene is located on chromosome 5 at the 5q21 locus, and mutations in it are responsible for the progression of CRC. Reported mutations in the *APC* gene include nonsense mutations and deletions, resulting in synthesis of truncated APC proteins. Whereas inherited (germline) mutations are not clustered in a particular region but tend to appear at or near the 5'-end of the gene, somatic mutations are clustered in the central region. *APC* gene mutation is the genetic basis of the familial adenomatous polyposis (FAP) syndrome and fulfills the "first hit" concept advanced by Knudson in the 1970s. FAP patients have hundreds to thousands of colorectal adenomas and early-onset carcinoma; mutation of one *APC* allele followed by loss of heterozygosity (LOH) is a common feature. Loss of this gene is believed to be the earliest event in the formation of adenomas. APC is involved in cell migration and adhesion and regulates levels of β-catenin (Senda T, 2005), an important mediator of the Wnt/β-catenin signaling pathway. More than 80% of CRC have inactivated *APC*, and 50% of cancers without *APC* mutations have β-catenin mutations (Muhammad WS, 2010). The *APC* gene product is a 310 kDa protein located in both the cytoplasm and the nucleus that interacts with β-catenin in the Wnt-1 signaling pathway. At the N-terminus, the APC protein contains armadillo-repeat binding domains and an oligomerization domain; at the C-terminus there are binding domains for EB1 and the tumor suppressor protein DLG. The APC protein also contains three 15-amino-acid and seven 20-amino-acid repeat regions, of which the latter were shown to be involved in the negative regulation of β-catenin protein expression in cells. At the 5'-end of the *APC* gene lies the mutation cluster region (MCR), which accounts for most of the *APC* mutations that create truncated proteins. The truncated proteins retain the ASEF (APC-stimulated guanine nucleotide exchange factor) and β-catenin binding sites in the armadillo-repeat domain but lose the β-catenin regulatory activity, which is located in the 20-amino-acid repeat domain (Narayan S, 2003). The diverse effects of mutations in the *APC* gene indicate that this molecule plays a key role in the regulation of cell growth in a number of colonic and extracolonic tissues.
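The two percentages quoted above for *APC* and β-catenin can be combined into a rough estimate of how many colorectal cancers carry a lesion in either gene. The snippet below is only an illustrative calculation: it treats "more than 80%" as exactly 80% and assumes that the 50% figure applies to all cancers lacking an *APC* mutation. The function name `wnt_lesion_fraction` is hypothetical and not taken from the cited work.

```python
from fractions import Fraction


def wnt_lesion_fraction(apc_inactivated: Fraction,
                        ctnnb1_given_no_apc: Fraction) -> Fraction:
    """Fraction of tumors with APC inactivated or, failing that, a beta-catenin mutation."""
    return apc_inactivated + (1 - apc_inactivated) * ctnnb1_given_no_apc


# Figures quoted in the text (Muhammad WS, 2010); "more than 80%" is taken as 80%.
combined = wnt_lesion_fraction(Fraction(80, 100), Fraction(50, 100))
print(f"{float(combined):.0%}")  # prints 90%
```

Under these assumptions about nine in ten tumors would carry a lesion in one of the two genes; because the *APC* figure is a lower bound, the combined figure is a lower bound as well.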

**β-Catenin** is a member of the cadherin-based cell adhesion complex, which also acts as a transcription factor when the protein is translocated to the nucleus. When it is not bound to E-cadherin and participating in cell-to-cell adhesion, a cytoplasmic degradation complex (consisting of APC, Axin, GSK-3β, and β-catenin) leads to β-catenin phosphorylation and degradation. When the *APC* gene loses its normal function, β-catenin is not efficiently degraded; it accumulates in the cytoplasm and is translocated to the nucleus, where it binds to a family of transcription factors called T-cell factor (TCF) or lymphoid enhancer factor (LEF) proteins, leading to transcriptional activation of target genes such as c-Myc and Cyclin D. The β-catenin gene (*CTNNB1*) is located on chromosome 3p, and changes in its expression are associated with both early and late genetic events (Stanczak A, 2011). Most human cancers that involve *CTNNB1* mutations carry changes in exon 3 (amino acid residues in the N-terminal region), which cause loss of binding affinity for GSK-3β, the kinase that phosphorylates β-catenin and thereby marks it for degradation in normal cells (Samowitz W.S., 1999). *APC* mutations are present in 80% of sporadic carcinomas (Knudson AG, 2001). Mutations in the *CTNNB1* gene at various key phosphorylation sites have been identified in CRC and several other solid tumors, and they seem to prevent destruction of β-catenin by the proteasome pathway, which then leads to constitutive activation of Wnt signaling.
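The regulatory logic described in this paragraph can be summarized as a toy Boolean model. It is a deliberately simplified sketch, not a quantitative model of Wnt signaling: the names `CellState`, `beta_catenin_degraded` and `wnt_targets_active` are hypothetical, each component is reduced to a functional/non-functional flag, and upstream Wnt ligand stimulation and the E-cadherin-bound pool of β-catenin are ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CellState:
    apc_functional: bool        # APC available for the degradation complex
    ctnnb1_exon3_mutated: bool  # beta-catenin phosphorylation sites altered


def beta_catenin_degraded(cell: CellState) -> bool:
    """Degradation needs working APC and a beta-catenin that GSK-3beta can phosphorylate."""
    return cell.apc_functional and not cell.ctnnb1_exon3_mutated


def wnt_targets_active(cell: CellState) -> bool:
    """Undegraded beta-catenin accumulates, enters the nucleus and, with TCF/LEF,
    activates target genes such as c-Myc and Cyclin D."""
    return not beta_catenin_degraded(cell)


# Normal cell, APC-deficient cell, and cell with an exon 3 CTNNB1 mutation.
for cell in (CellState(True, False), CellState(False, False), CellState(True, True)):
    print(cell, "-> Wnt targets active:", wnt_targets_active(cell))
```

In this sketch either lesion alone is enough to switch the target genes on, which mirrors the observation that *APC* loss and *CTNNB1* mutation both lead to constitutive Wnt signaling.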

The **E-cadherin** gene (*CDH1*) is located on chromosome 16q22.1 and contains 2.6 kb of coding sequence in 16 exons. There are overwhelming genetic data to support the role of E-cadherin as a tumor/invasion suppressor in epithelial cells, and loss of expression, as well as mutations, has been described in a number of epithelial cancers. *CDH1* was first implicated in carcinogenesis in gastric cancer, where somatic mutations of the gene were observed in association with different types of diffuse gastric cancer (Becker KF, 1994). Subsequent research identified germline *CDH1* mutations in families with autosomal dominant susceptibility to hereditary diffuse gastric cancer (Suriano G, 2005). Genetic studies to date support this tumor/invasion suppressor role of E-cadherin in epithelial cells, and loss of expression together with mutations has been described in several types of epithelial cancer (breast, colorectal, thyroid, endometrial and ovarian). Allelic imbalances of the LOH type were frequently observed in metastasizing malignancies derived from liver, prostate and breast. It is presumed that the loss of function contributes to cancer progression by increasing proliferation, invasion and/or metastasis. The phenotypic expression of E-cadherin in carcinomas is well characterized, but studies of allelic imbalance at the *CDH1* locus are rare. Changes in E-cadherin expression are frequently associated with high tumor grade in cancers of the prostate, breast, bladder, pancreas, stomach and colon. The mature protein product belongs to the family of cell–cell adhesion molecules and plays a fundamental role in the maintenance of cell differentiation and the normal architecture of epithelial tissues (Stanczak A., 2011, Handschuh G, 1999). As an epithelial cell adhesion molecule, E-cadherin mediates the contact between neighboring epithelial cells, including colorectal epithelial cells, and helps to establish defined membrane domains and cell polarity (Goodwin and Yap 2004). The extracellular domain of E-cadherin is responsible for homotypic binding of adjacent cells, and the cytoplasmic domain facilitates adhesion through interaction with catenin proteins (Bryant and Stow 2004). The ectodomain of this protein also mediates bacterial adhesion to mammalian cells, and the cytoplasmic domain is required for internalization. Identified transcript variants arise from mutation at consensus splice sites. E-cadherin expression in epithelial cells is crucial for the establishment and maintenance of epithelial cell polarity.

**Group IIA PLA2** is a 14-kDa enzyme found in a number of tissues and secretory products (Nevalainen TJ, 1993). The plasma concentration of the enzyme increases dramatically in severe infections and in other diseases involving generalized inflammation and cancer (Ogawa M, 1991). In the gastrointestinal tract, expression of group IIA PLA2 has been localized to Paneth cells of the small intestine (Nevalainen TJ, 1995), metaplastic Paneth cells of gastric (Nevalainen TJ, 1995) and colonic mucosa (Haapamaki MM, 1999), as well as columnar epithelial cells of inflamed colonic mucosa. Functional defects in PLA2 in tumor cells may interfere with the regulatory mechanisms of tumor growth. The function of the *PLA2G2A* gene is relevant in tumorigenesis, and the gene is a good candidate modifier of the *Apc* gene in Min (multiple intestinal neoplasia) mice. It has been suggested that a mutation resulting in splice variants of the *Pla2g2a* gene and in different truncated forms of its protein accounts for the increased number of polyps in mice carrying the Min mutation. Numerous studies have suggested that *Pla2g2a* is a candidate gene for *Mom-1*. The analysis of a mouse/human hybrid panel showed that the *PLA2G2A* gene, located on human chromosome 1p, is a candidate gene for the MOM-1 locus (Spirio LN, 1996; Ishiguro Y, 1999; Mounier CM, 2008). It was also observed that, although the *PLA2G2A* gene itself is intact, an allelic imbalance (AI) or allelic loss was found at one of the alleles, and loss of heterozygosity (LOH) was identified in *PLA2G2A* regions (Mihalcea, A, 2009).

The **EGFR** is a member of the HER (human epidermal growth factor receptor) family, which includes HER1 (EGFR, ErbB-1), HER2 (ErbB-2), HER3 (ErbB-3), and HER4 (ErbB-4) (Boss JL, 1989). The natural ligands for EGFR include EGF, transforming growth factor-α (TGF-α), amphiregulin, heregulin, heparin-binding EGF, and betacellulin. Ligand binding induces receptor dimerisation and subsequent auto-phosphorylation, which activates pathways critical for cellular survival and proliferation, such as PI3K/Akt, Stat, Src and MAPK; EGFR signaling is mediated in particular through the MAPK and PI3K cascades (Jhawer M, 2008). EGFR alterations have been described in many cancers as a consequence of mutations or gene amplifications that induce protein over-expression, structural rearrangements and autocrine loops. EGFR abnormalities may have a relevant role in both carcinogenesis and clinical progression of CRC. EGFR is differentially expressed in normal, premalignant, and malignant tissues, and over-expression of EGFR has been documented in up to nearly 90% of cases of metastatic CRC (Boss JL, 1989; Arteaga CL, 2001). In addition, EGFR is overexpressed in a wide range of solid tumors and is involved in their growth and proliferation through various mechanisms. Given the documented role of EGFR in the development and progression of cancers, this receptor signaling pathway represents a rational target for drug development (Vokes EE, 2006; Lee JJ, 2007). Recent clinical data have shown that advanced

The *BRCA1* gene, mapped to the long arm of chromosome 17 (17q12-21), was identified by positional cloning. Mutations in this gene are responsible in part for inherited predisposition to ovarian, breast, prostate and colon cancers. However, whether these mutations are a factor in sporadic forms of these tumours remains unclear. Loss of *BRCA1* heterozygosity is a molecular alteration present in colorectal cancer that has unfavorable consequences for survival and can be considered an independent prognostic factor in stages I and II of colorectal cancer (Roukos D., 2010). *BRCA1* is a large gene with many functional domains, each with different biological features. The C-terminal region contains the transactivation domain of the protein, and residues 758–1064 form the Rad51-binding domain, through which the two proteins work as a complex to repair double-stranded DNA breaks. In relation to its repair role, *BRCA1* is also involved in co-activation of p53. The relationship between truncating germline mutations in the *BRCA1* gene and breast and ovarian cancers is established. Mutations in this gene are responsible in part for the inherited predisposition to breast and ovarian cancers, and probably for one third of all site-specific inherited breast cancer. In previous studies, researchers found a high percentage of LOH in the 17q21 region in sporadic CRC cases. BRCA proteins have a significant role in multiple pathways, signaling cell cycle delays in response to DNA lesions or leading to apoptosis after severe damage. BRCA proteins function in transcriptional regulation and chromatin remodeling, and they are required to repair double-strand breaks. Double-strand breaks in mammalian chromosomes stimulate the activity of recombination repair enzymes by more than 100-fold. In transformed colon cells of *BRCA1* mutation carriers, BRCA1 functions are probably lost.
In almost all colorectal cancers, the mutated *APC* gene leads to MYC over-expression and, as a consequence, to BRCA1 over-expression. BRCA1 directly links MYC to double-strand break repair and participates in preserving genome integrity. When *BRCA1* is mutated and only one normal allele remains, MYC-associated loss of homology-directed recombination repair should occur earlier than in individuals with two normal *BRCA1* alleles. BRCA1 expression is reduced in at least some sporadic colon adenocarcinomas, and somatic loss of one normal *BRCA1* allele is common not only in hereditary but also in sporadic CRC tumors (Friedenson B, 2004).


of epithelial cancers (breast, colorectal, thyroid, endometrium, ovary cancer). Allelic imbalances of the LOH type were frequently observed in metastasizing malignancies derived from liver, prostate and breast. It is presumed that the loss of function contributes to cancer progression by increasing the level of proliferation, invasion and/or metastasis. E-cadherin phenotypic expression in carcinomas is well known, but studies on the appearance of allelic imbalances at the *CDH1* level are rare. Modifications of E-cadherin expression are frequently associated with high tumor grade, as in cancers of the prostate, breast, bladder, pancreas, stomach and colon. The mature protein product belongs to the family of cell–cell adhesion molecules and plays a fundamental role in the maintenance of cell differentiation and the normal architecture of epithelial tissues (Stanczak A., 2011, Handschuh G, 1999). As an epithelial cell adhesion molecule, E-cadherin mediates the contact between neighboring epithelial cells, including colorectal epithelial cells, and helps to establish the defined membrane domains and cell polarity (Goodwin and Yap 2004). The extracellular domain of E-cadherin is responsible for homotypic binding of adjacent cells, and the cytoplasmic domain facilitates adhesion through interaction with catenin proteins (Bryant and Stow 2004). The ectodomain of this protein mediates bacterial adhesion to mammalian cells, and the cytoplasmic domain is required for internalization. Identified transcript variants arise from mutation at consensus splice sites. E-cadherin expression in epithelial cells is crucial for the establishment and maintenance of epithelial cell polarity.

**Group IIA PLA2** is a 14-kDa enzyme found in a number of tissues and secretory products (Nevaleine TJ, 1993). The plasma concentration of the enzyme increases dramatically in severe infections and in other diseases involving generalized inflammation and cancer (Ogawa M, 1991). In the gastrointestinal tract, expression of group IIA PLA2 has been localized in Paneth cells of the small intestine (Nevaleine TJ, 1995), in metaplastic Paneth cells of gastric (Nevaleine TJ, 1995) and colonic mucosa (Haapamaki MM, 1999), and in columnar epithelial cells of inflamed colonic mucosa. Functional defects in PLA2 in tumor cells may interfere with the regulatory mechanisms of tumor growth. The function of the *PLA2G2A* gene is relevant in tumorigenesis, and the gene is a good candidate modifier of the *Apc* gene in Min (multiple intestinal neoplasia) mice. It has been suggested that a mutation resulting in splice variants of the *Pla2g2a* gene and in different truncated forms of its protein accounts for the increased number of polyps in mice carrying the Min mutation. Numerous studies have suggested that *Pla2g2a* is a candidate gene for *Mom-1*. The analysis of a mouse/human hybrid panel showed that the *PLA2G2A* gene, located on human chromosome 1p, is a candidate gene for the MOM-1 locus (Spirio LN, 1996; Ishiguro Y, 1999; Mounier CM, 2008). It was also observed that the *PLA2G2A* gene is intact, but an allelic imbalance (AI), or an allelic loss, was found at one of the alleles, and a loss of heterozygosity (LOH) was identified in *PLA2G2A* regions (Mihalcea, A, 2009).

The **EGFR** is a member of the HER (human epidermal growth factor receptor) family, which includes HER1 (EGFR, ErbB-1), HER2 (ErbB-2), HER3 (ErbB-3), and HER4 (ErbB-4) (Boss JL, 1989). The natural ligands for EGFR include EGF, transforming growth factor α (TGF-α), amphiregulin, heregulin, heparin-binding EGF, and betacellulin. Ligand binding induces receptor dimerisation and subsequent auto-phosphorylation, which activates pathways critical for cellular survival and proliferation such as PI3K/Akt, Stat, Src and MAPK. EGFR mediates signaling by activating the MAPK and PI3K signaling cascades (Jhawer M, 2008). EGFR modifications have been described in many cancers as a consequence of mutations or gene amplifications that induce protein over-expression, structural rearrangements and autocrine loops. EGFR abnormalities may have a relevant role in both carcinogenesis and clinical progression of CRC. EGFR is differentially expressed in normal, premalignant, and malignant tissues, and over-expression of EGFR has been documented in up to nearly 90% of cases of metastatic CRC (Boss JL, 1989; Arteaga CL, 2001). In addition, EGFR is over-expressed in a wide range of solid tumors and is involved in their growth and proliferation through various mechanisms. Given the documented role of EGFR in the development and progression of cancers, this receptor signaling pathway represents a rational target for drug development (Vokes EE, 2006; Lee JJ, 2007). Recent clinical data have shown that advanced colorectal cancers with tumor-promoting mutations of these pathways, including activating mutations in KRAS, BRAF, and the p110 subunit of PI3K, do not respond to anti-EGFR therapy.


The variability in clinical presentation, aggressiveness, and patterns of treatment failure suggests that identifying distinct genotypes and phenotypes can help future treatment strategies. A new concept called "personalized medicine" may mark the beginning of a new era; it is designed to offer every patient a suitable therapy. In this approach, personalized medicine can be defined as the tailoring of medical treatment to a specific subset of patients who are usually identified by genetic markers or other molecular profiling strategies. There is increasing interest in this therapeutic strategy on the part of pharmaceutical and bio-pharmaceutical companies, consumers, and third-party payers. Consequently, the level of clinical trial activity surrounding personalized medicines is intensifying as sponsors seek ways to target their therapies to the patient populations that would most benefit from them. The aim of the present chapter is to elaborate an experimental model for improving the "personalized" therapeutic strategy by evaluating the expression of some key genes involved in crosstalk signaling in colorectal cancer.

In our study design we evaluated the comparative expression, at protein and gene level, of several key proteins (*APC*, *PLA2G2A*, *CDH1*, *BRCA1*, and *EGFR*). Our *in vivo* experiment involved diagnostic testing of CRC patients and molecular biology testing of biological samples in order to clarify the cross-talk of the genes of interest and to better understand the CRC typology among Romanian patients.

The idea of applying such a model to our studies arose during the research conducted in our projects. We noticed a very close relationship between different proteins and genes, which depends on tumor type, cell grade and staging. After reviewing a large number of articles published in international databases, we observed that other researchers have drawn the same conclusion.

### **2. Results and discussion**

#### **2.1. Tissue samples and blood**

Samples were obtained with the consent of 93 patients and consisted of histopathologically confirmed colorectal adenomas. Samples were obtained during colonoscopy with biopsy forceps, by harvesting at least four fragments from all quadrants of the pathological tissue. The surgical intervention for CRC treatment included radical and palliative techniques (right or left hemicolectomy, segmental colectomy, low anterior rectal resection (Dixon), Miles operation, Hartmann operation). All tumors were examined histologically (HP) by a pathologist in order to: (a) confirm the diagnosis of adenocarcinoma, (b) confirm the presence of tumor and evaluate the percentage of tumor cells in the samples, and (c) carry out pathological staging. The complete HP diagnosis included: degree of differentiation (well/moderate/poor), vascular, neural and lymphatic invasion, status of the resection margins (invaded/non-invaded) and TNM staging. After surgical resection, tumor tissues were cut into small pieces, frozen immediately in liquid nitrogen and stored at −80 °C until they were analyzed.

From the initial patient group, only the 75 patients whose samples contained at least 75% tumor cells were considered for molecular biology analyses. For immunohistochemistry by immunofluorescence (IHF), 5-µm-thick serial tissue sections were incubated with primary antibodies diluted in BSA (bovine serum albumin) in PBS (phosphate buffered saline). After washing with PBS, FITC-conjugated secondary antibodies (Invitrogen) were applied and the samples were washed again. Protein expression was evaluated by fluorescence microscopy. To analyze the mutational status, DNA was extracted from the patients' venous blood (as control) and from tumours. DNA preparation was performed using the *Wizard® Genomic DNA Purification kit* (Promega) according to the manufacturer's recommendations. The extracted DNA was stored at −80 °C until molecular biology analyses.

#### **2.2. Clinicopathological characteristics**


The medical records of all 93 patients provided their birth date and sex, and the following parameters: tumor location, tumor size, lymph node metastases, pathological stage, vascular and neural invasion, and tumor differentiation grade.

Of the 93 cases, 40 were women and 53 were men. The mean age was 50 years. The most frequent were T3 tumors (31.18%) and T2 tumors (25.80%), according to the tumor stage of the TNM classification of colon and rectum neoplasms, and 53 patients (57%) had lymph node involvement (N+). In the study lot, 17 cases (18.27%) presented metastasis at the time of CRC diagnosis. These were predominantly localized in the liver (12 cases, 70.58%) and rarely in the lungs (4 cases, 23.52%).

Regarding the histopathological type of the colorectal tumors, the vast majority were adenocarcinomas (ADK) with different grades of differentiation. Most of the tumors (42 cases, 45.16%) were well differentiated (G1), while 33 cases (35.48%) were moderately differentiated (G2) and 18 cases (19.35%) were poorly differentiated (G3). Besides typical adenocarcinoma, other histopathological types were rare and were localized: i) to the right colon, especially mucinous ADK (5 of the 9 cases in the whole study lot) and 1 adenosquamous carcinoma; ii) to the left colon, 2 cases of mucinous ADK and 1 case of "signet-ring" cell carcinoma; iii) to the rectum, 2 mucinous ADK, 1 squamocellular carcinoma and 1 case of anaplastic carcinoma. Patient characteristics are summarized in Table 1.

Our study did not take diet into consideration, because most of the patients do not know the properties of foods or they consume food with pro-carcinogenic potential. Regarding diet, we consider that patient instruction is extremely useful and has to be done by the surgeon after the surgical treatment and then by the family doctor. This approach allows both secondary prophylaxis and control of possible relapses or recurrences. Monitoring of the patients included in the study will show the efficiency of medical control and the awareness of this deadly disease. In the studied lot of patients we have not registered cases with relapse, and we cannot predict their future behavior.




### **2.3. Immunohistochemical expression by immunofluorescence of the studied proteins**

Because the interpretation of immunohistochemistry analyses remains the basis of anatomic pathology, we first evaluated the protein expression of the key proteins included in our study. Unlike routine histopathological analyses, our evaluation was based on the fluorescent signal of the protein, which, from our point of view, is more specific than classical immunohistochemistry.

The expression of **α-SM** (smooth muscle) actin was included in our study as a positive control to prove the accuracy of the method; it is used as a typical marker for myofibroblasts. It is one of the four muscle actin isoforms, a protein involved in supporting the basic contractile apparatus in muscle cells. Its expression can be found in vascular cells, intestinal muscularis mucosae and muscularis propria, and in the stromal tissue. In normal tissue, the immunofluorescence signal is strong (3+) around tumor crypts, in the vessel walls and in stromal smooth muscle fibers. In the crypt epithelial cells the signal is absent (-). In CRC patients the α-SM expression decreases with increasing disease grade and disappears in most advanced CRC, when the tissue is disorganized and many tumor cells are present (Figure 1).

Labeling of the **APC** C-terminus revealed changes of protein expression in tumor tissue compared with APC expression in normal tissues. In normal tissues, analysis of the muscular tunic of polyps confirmed the expression of the target protein in SM of blood vessels and in fibers of the smooth muscle layer, where it is stored.

| Clinico-pathological characteristics of CRC tumors | No. of cases, n (%) |
|---|---|
| **Age** | |
| < 50 | 12 |
| > 50 | 81 |
| **Gender** | |
| Male | 53 |
| Female | 40 |
| **Tumor localisation** | |
| Right colon (RC) | 13 |
| Left colon (LC) | 42 |
| Rectum | 38 |
| **Stage** | |
| I | 23 (24.73%) |
| II | 24 (25.80%) |
| III | 29 (31.18%) |
| IV | 17 (18.27%) |
| **Lymph nodes status** | |
| N− (N0) | 40 (43%) |
| N+ (N1, 2, 3) | 53 (57%) |
| **Histopathological grading** | |
| Well differentiated G1 | 42 |
| Moderately differentiated G2 | 33 |
| Poorly differentiated G3 | 18 |
| Total | 93 |

**Table 1.** Clinico-pathological characteristics of CRC tumors in the study lot
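The counts in Table 1 can be rechecked with a short script. The following Python sketch is illustrative only and was not part of the study: the dictionary layout and the helper names `percent_truncated` and `check_totals` are assumptions introduced here, and truncation (rather than rounding) to two decimals is assumed because it reproduces the stage percentages printed in the table.

```python
# Illustrative consistency check for Table 1 (not part of the original analysis).
# Counts are copied from Table 1; all names and the data layout are assumptions.

TOTAL_PATIENTS = 93

table1_counts = {
    "Age": {"< 50": 12, "> 50": 81},
    "Gender": {"Male": 53, "Female": 40},
    "Tumor localisation": {"RC": 13, "LC": 42, "Rectum": 38},
    "Stage": {"I": 23, "II": 24, "III": 29, "IV": 17},
    "Lymph nodes status": {"N0": 40, "N+": 53},
    "Histopathological grading": {"G1": 42, "G2": 33, "G3": 18},
}


def percent_truncated(count: int, total: int) -> float:
    """Percentage of `count` in `total`, truncated (not rounded) to two decimals."""
    return (count * 10000 // total) / 100


def check_totals(counts: dict, total: int) -> dict:
    """Report, per characteristic, whether its categories sum to `total`."""
    return {name: sum(groups.values()) == total for name, groups in counts.items()}


totals_ok = check_totals(table1_counts, TOTAL_PATIENTS)
stage_percent = {
    stage: percent_truncated(n, TOTAL_PATIENTS)
    for stage, n in table1_counts["Stage"].items()
}

if __name__ == "__main__":
    print(totals_ok)
    print(stage_percent)
```

Under these assumptions every characteristic sums to the 93 patients of the study lot, and the recomputed stage percentages match the values given in the table.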


**Figure 1. α-SM expression.** Smooth muscle, used as a positive marker for the immunofluorescence signal, shows immunofluorescent signal in blood vessels, intestinal muscularis mucosae and muscularis propria, and in the stromal tissue.

With few exceptions, the fluorescent signal given by APC expression is strong (3+) and overlaps the fluorescent signal of α-actin expression given by smooth muscle cells (Figure 2). Analysis of adenocarcinomas of the colorectal mucosa revealed changes in APC expression. During tumorigenesis, the mucosa is invaded by stromal tissue, the crypts become large and elongated, their architecture is destroyed, and the fluorescent signal intensity of epithelial cells (CE) decreases, becoming weak (1+).

**Figure 2. APC expression.** A normal expression with immunofluorescent signal on the border of the crypts and in SM cells, as in normal tissue, can be observed on the section from patient 8. On the section obtained from patient 3 we can observe a weak intensity on the apical part of epithelial cells, as well as loss of signal.

At the same time we observed an increase of signal intensity in neoplastic infiltrated cells (CI). In the apical half of the crypt, the fluorescent signal of epithelial cells and infiltrated cells disappeared (-). The IHF expression pattern overlaps the sequential histopathological changes occurring in colorectal carcinogenesis, in which β-catenin and APC play the role of the so-called "Second Hit".

In normal colorectal tissue, **β-catenin** expression appears on the membrane of epithelial cells. In tumor tissue, there can be either over-expression of β-catenin in the nucleus, to which it is translocated from the cytoplasm as a result of *APC* mutation, or absence of signal when β-catenin changes. In our study, 33.33% (25/75) of patients showed a β-catenin expression similar to that of normal tissue, with fluorescent signal on the membrane of epithelial cells. In 33 CRC patients, the β-catenin expression was changed compared with normal tissue (Figure 3).

**Figure 3. β-Catenin expression on patient 8.** A normal expression with immunofluorescent signal in the cytoplasm and on the border of the crypts can be observed on the section from patient 8. On the section from patient 3 we can observe an over-expression in the cytoplasm/nucleus of epithelial cells and loss of expression in the membrane.

The fluorescent signal on the membrane of epithelial cells gradually decreases in intensity during tumor progression, along with an increased fluorescent signal due to over-expression in the cytoplasm (in 28 patients) and in the nucleus (in 5 patients).

Regarding **E-cadherin** expression, colorectal tumors showed a heterogeneous type of expression compared to the normal colorectal epithelium, in which E-cadherin is present on the basolateral membrane along the whole length of the glandular crypts and on the intercellular membranes. An abnormal pattern of expression was observed on CRC tumor sections: i) a reduced expression (2+, 1+) at the membrane level was observed in 20% (15/75) of patients; ii) cytoplasmic expression, similar to that observed for β-catenin, was observed in 37.33% (28/75) of patients; iii) loss of expression (-) was observed in 12% (9/75) of patients. In 30.66% (23/75) of patients, the E-cadherin expression was similar to that observed in normal colon epithelium: in the cell membrane, with strong immunofluorescent signal (3+), and co-localized with membrane β-catenin (Figure 4).

**Figure 4. E-cadherin expression.** A normal expression with immunofluorescent signal on the membrane of epithelial cells can be observed on the section from patient 70. In the case of patient 74 we can observe a reduced expression or loss of expression in the epithelial cell membranes. In patient 73 an over-expression in the cytoplasm of epithelial cells and in some infiltrating cells was noticed.

Comparative analyses of E-cadherin protein expression in CRC tumors of various histological differentiation grades (G1, G2, G3) showed an almost similar expression pattern for all tumor grades, although the majority of the well differentiated G1 tumors showed a strong membranous signal, the moderately differentiated tumors (G2) showed a heterogeneous membranous signal, and some of the poorly differentiated tumors (G3) had no membranous expression of E-cadherin. In the case of the lymph node analyses, there is a strong correlation between lymph node invasion status and protein expression of E-cadherin. Of a total of 75 CRC cases, we observed that patients with lymph node invasion N+ (N1, N2, N3) have low or no expression of E-cadherin. Thus E-cadherin could be considered a biomarker that can help to determine the risk in patients with CRC, and a strong indicator of lymph node status. In the group of N0 CRC tumors (27 cases), 77.77% (21/27) of patients presented E-cadherin membrane expression in different staining grades, scored as 0, 1+, 2+, while in the group of lymph node invasion N+ tumors (48 cases) only 35.41% (31/48) of patients were positive for membranous staining (0, 1+, 2+).

In normal colon mucosa the **sPLA2 type IIA** enzyme was detected by a strong staining in the muscularis mucosae, in a large fraction of SM cells (recognized by the α-SM actin antibody) and in vascular SM cells (Figure 5). In the lamina propria, the PLA2 type IIA enzyme was detected with a weaker staining (2+) surrounding the crypts (as determined by morphological and histological evaluation) and in vascular smooth muscle. These results show that the PLA2 type IIA enzyme is expressed only in smooth muscle cells in normal colon mucosa. An abnormal pattern of PLA2 type IIA expression was observed in 27 of the 75 CRC cases examined (36.00%). In the muscularis externa and submucosa, the SM cells express PLA2 type IIA with a strong intensity (3+). The presence of PLA2 type IIA was not observed (-) in other types of cells.

Beginning with the mucosa, the PLA2 type IIA expression started to be modified. Near the submucosa, the immunofluorescence signal for PLA2 type IIA was observed in SM cells of the lamina propria, but only around the crypts and with a weak signal compared with the normal pattern (1+). As the crypts get longer and more ramified, the number of SM cells that express PLA2 type IIA decreases, although a positive signal for α-SM actin was obtained from all the SM cells. In this area, PLA2 type IIA expression was found in epithelial cells, on the border of the Lieberkühn crypts. The number of epithelial cells that express PLA2 type IIA increases as the crypts grow, and the immunofluorescence signal is stronger (3+) than the fluorescent signal observed in SM cells. No immunoreaction for PLA2 type IIA was found in the sections of 11 patients (14.66%). This may suggest that the malignant cells lose their ability to express PLA2 type IIA when invasive carcinoma develops in the adenoma.

of so-called "Second Hit".

expression in the membrane.

changed compared with normal tissue (Figure 3).

changes occurring in the colorectal carcinogenesis, in which β-catenin and APC play the role

In normal colorectal tissue, **β-catenin** expression appears on the membrane of epithelial cells. In tumor tissue, can occur either over-expression of β-catenin in the nucleus where it is translocated from the cytoplasm as a result of *APC* mutation, or signal absence when βcatenin changes. In our study, 33.33% (25/ 75) of patients show a similar β-catenin expression to that of normal tissue because the fluorescent signals were obtained on the membrane of epithelial cells. In 33 CRC patients, the β-catenin target protein expression was

**Figure 3.** β**-Catenin expression on patient 8.** A normal expression with immunofluorescent signal on cytoplasm and on the border of crypts can be observed on the section from patient 8. On section from patient 3 we can observe an over-expression in the cytoplasm/ nucleus of epithelial cells and loss of

We can observe how the fluorescent signal on the membrane of epithelial cells gradually decreases in intensity during the tumor progression, along with increased fluorescent signal

Regarding **E-cadherin** expression, colorectal tumors showed a heterogeneous type of expression compared to the normal colorectal epithelium in which E-cadherin expression is present on the basolateral membrane to the whole length of the glandular crypts and on the intercellular membranes. An abnormal pattern of expression is observed on CRC tumor sections: i) a reduced expression (2+, 1+) at the membrane level was observed in 20% (15/ 75) of patients; ii) cytoplasmatic expression was observed in 37.33% (28/ 75) of patients and the expression is similar to that observed for β-catenin; iii) loss of expression (-) was observed in 12% (9/ 75) of patients. In 30.66% (23/ 75) of patients, the E-cadherin expression was similar with that observed in normal colon epithelium, in the cell membrane, with strong immunofluorescent signal (3+) and is co-localized with membrane β-catenin (Figure 4).

Comparative analyses of E-cadherin protein expression for CRC tumors with various histological differentiation grades (G1, G2, G3), showed an almost similar expression pattern for all G1, G2 and G3 tumor grades, although the majority of the well differentiated G1 tumors indicated strong membranous signal; the moderately differentiated tumors (G2) showed a heterogeneous membranous signal and some of the poorly differentiated tumors (G3) had no membranous expression for E-cadherin. In the case of lymph nodes analyses,

by over-expression in cytoplasm (in 28 patients) and in the nucleus (in 5 patients).

**Figure 4. E-cadherin expression.** A normal expression with immunofluorescent signal on the membrane of epithelial cells can be observed on section from patient 70. In the case of patient 74 we can observe a reduced/ loss of expression in the epithelial cell membranes. On patient 73 an over-expression in the cytoplasm of epithelial cells and in some infiltrating cells was noticed.

there is a strong correlation between the presence of the lymph node invasion status and protein expression of E-cadherin. From a total number of 75 cases of CRC, we observed that patients with lymph node invasion N + (N1, N2, N3) have low or no expression of Ecadherin. Thus E-cadherin could be considered a biomarker that can help to determine the risk in patients with CRC, and a strong indicator of the lymph node status. In the group of N0 CRC tumors from 27 cases, only 77.77% (21/ 27) of patients presented E-cadherin membrane expression in different staining grades, scored as 0, 1+, 2+ , while in the group of lymph node invasion N+ tumors (48 cases) only 35.41% (31/ 48) of patients were positive for membranous staining (0,1+, 2+).

In normal colon mucosa the **sPLA2 type IIA** enzyme was detected by strong staining in the muscularis mucosae, in a large fraction of SM cells (recognized by the α-SM actin antibody) and in vascular SM cells (Figure 5). In the lamina propria, the PLA2 type IIA enzyme was detected with weaker staining (2+) surrounding the crypts (as determined by morphological and histological evaluation) and in vascular smooth muscle. These results show that in normal colon mucosa the PLA2 type IIA enzyme is expressed only in smooth muscle cells. An abnormal pattern of PLA2 type IIA expression was observed in 27 of the 75 CRC cases examined (36.00%). In the muscularis externa and submucosa, the SM cells express PLA2 type IIA with strong intensity (3+). PLA2 type IIA was not observed (-) in other cell types.

Starting from the mucosa, PLA2 type IIA expression was modified. Near the submucosa, the immunofluorescence signal for PLA2 type IIA was observed in SM cells of the lamina propria, but only around the crypts and with a weak signal (1+) compared with the normal pattern. As the crypts become longer and more ramified, the number of SM cells that express PLA2 type IIA decreases, although all SM cells gave a positive signal for α-SM actin. In this area, PLA2 type IIA expression was found in epithelial cells on the border of the Lieberkühn crypts. The number of epithelial cells that express PLA2 type IIA increases as the crypts grow, and the immunofluorescence signal is stronger (3+) than the signal observed in SM cells. No immunoreaction for PLA2 (type II) was found in the sections from 11 patients (14.66%). This may suggest that the malignant cells lose their ability to express PLA2 type IIA when invasive carcinoma develops in the adenoma.


**Figure 5. PLA2 type IIA expression.** Normal expression, with immunofluorescent signal in SM cells, can be observed on the section from patient 12. On the section from patient 18 we observe over-expression in infiltrated cells. Patient 62 shows a weak signal in SM cells around the crypts and in vascular smooth muscle. In patient 60 the signal is lost.

We characterized the expression of **BRCA1** in 75 sporadic colorectal carcinomas. In 10 cases out of 75 (13.33%), we found increased BRCA1 expression in the apical cell pole of malignant epithelial cells and a significant increase in BRCA1 nuclear foci in colorectal tumor specimens compared with the corresponding normal tissues. These increases in BRCA1 expression may be explained by the fact that colorectal tissue is subject to very active proliferation and differentiation. In 14 cases out of 75 (18.66%) we observed loss of BRCA1 expression (Figure 6).

**Figure 6. BRCA1 expression.** Patient 43 showed loss of expression in nucleus of epithelial cells. On patient 60 we can observe an over-expression on the epithelial cells from the crypt foci. On other sections from patient 60 over-expression was observed only on the apical pole of epithelial cells.

**The epidermal growth factor receptor (EGFR)** expression had an abnormal pattern in 41.33% (31/75) of patients. Of these, the signal intensity was weak (1+) in 22.58% (7/31) of patients and moderate (2+) in 32.25% (10/31) of patients; in both cases EGFR expression was observed in the cytoplasm of tumoral cells (Figure 7). Complete, strong circumferential expression (3+) was found in 45.16% (14/31) of patients. Normal expression, i.e. absence of signal, was observed in 58.67% (44/75) of patients. In our study, (2+) and/or (3+) were assigned to cases with EGFR expression in 50% or more of the tumoral cells on the section. We observed that EGFR expression was significantly associated with higher rates of cell proliferation. EGFR activation and intracellular signaling can result from its roles in transcription, up-regulation, degradation and gene amplification. Our results indicate that EGFR over-expression is correlated with higher tumor stage (III and IV) compared with weaker EGFR expression. Knowing the EGFR expression in CRC makes it possible to include targeted therapy with cetuximab, an anti-EGFR monoclonal antibody, in the treatment algorithm for EGFR-positive CRC patients identified by IHC examination. Also, the observed association between EGFR expression, lymph node status (N) and tumor differentiation degree (G) could assign to EGFR the role of a prognostic marker for disease recurrence. Determination of EGFR status may be used to identify cases of CRC that could benefit from anti-EGFR therapies and could also serve as a rigorous means of monitoring the efficacy of anti-EGFR therapy in CRC (Mendelsohn, 2003). Although EGFR remains a controversial prognostic factor, the association between EGFR over-expression and tumor stage may have an important role in the anti-EGFR therapy of patients with CRC.


**Figure 7. EGFR expression.** In patient 43 we observe over-expression on the membrane of epithelial cells from the crypt foci. In patient 32 we noted loss of expression. Patient 73 presented expression in the cytoplasm of tumoral cells.

#### **2.4. Deletion/duplication evaluation for the genes of interest (MLPA)**

MLPA analysis detects large deletions or duplications in a gene. It is a semi-quantitative, PCR-based reaction that identifies copy number variations and contributes to the assessment of predictive genetic markers by giving an intra-individual variation spectrum of the genes included in this study. It is also a useful tool for the diagnosis of genetic diseases characterized by large genomic rearrangements. To perform the test on blood and tissue samples, we first optimized the procedure for the specific genes. For each gene we optimized the range of DNA concentration to obtain a good signal and selected the most suitable mix of primers. After protocol optimization, each run used three DNA samples from blood and tissue for each patient.

In line with the microsatellite alteration assay, we performed MLPA analysis of the *APC* and *BRCA1* genes; two other genes (*EGFR* and *CDH1*) were also included.

**Figure 8.** MLPA chromatograms for patient with FAP (patient 15).

**Figure 9.** MLPA chromatograms for the patient 31.

**Figure 10.** Mutational profile of *APC* by MLPA

The results were interpreted with dedicated software that assesses the reaction products according to their molecular weight and quantitative expression. The GeneMapper results were exported to the Coffalyser software for normalization, and the relative probe signals were calculated by dividing each measured peak area by the sum of all peak areas of the sample. A value of 1.0 indicated the presence of two alleles, and values of 0.5 and 1.5 represented a heterozygous deletion or duplication at that locus, respectively.
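The normalization described above can be illustrated with a minimal Python sketch. It is not the Coffalyser implementation: the function names, the probe names and the peak areas are hypothetical, the comparison against a reference (normal) sample is an assumption needed to bring a two-copy probe to a value near 1.0, and assigning each value to the nearest of 0.5, 1.0 and 1.5 is a simplification of the cut-offs a laboratory would actually apply.

```python
# Illustrative sketch only; names, data and the nearest-value rule are assumptions.

REFERENCE_VALUES = {
    0.5: "heterozygous deletion",
    1.0: "two copies",
    1.5: "heterozygous duplication",
}


def relative_signals(peak_areas):
    """Divide each probe's peak area by the sum of all peak areas of the sample."""
    total = sum(peak_areas.values())
    if total <= 0:
        raise ValueError("sum of peak areas must be positive")
    return {probe: area / total for probe, area in peak_areas.items()}


def dosage_quotients(sample_areas, reference_areas):
    """Relative signal of each probe in the sample over that in a reference sample."""
    sample = relative_signals(sample_areas)
    reference = relative_signals(reference_areas)
    return {probe: sample[probe] / reference[probe] for probe in sample}


def classify(quotient):
    """Assign the quotient to the nearest of the values 0.5, 1.0 and 1.5."""
    nearest = min(REFERENCE_VALUES, key=lambda value: abs(quotient - value))
    return REFERENCE_VALUES[nearest]


# Hypothetical peak areas: probe_B gives half the signal in the patient sample.
reference = {"probe_A": 1000.0, "probe_B": 1000.0, "probe_C": 1000.0, "probe_D": 1000.0}
patient = {"probe_A": 1000.0, "probe_B": 500.0, "probe_C": 1000.0, "probe_D": 1000.0}

calls = {probe: classify(q) for probe, q in dosage_quotients(patient, reference).items()}
```

Because every peak is divided by the sample total, a deletion in one probe slightly raises the relative signal of the others, so real analyses normalize against dedicated reference probes and several normal samples.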


**Figure 11.** Mutational profile of *CDH1* by MLPA

**Figure 12.** Mutational profile of *BRCA1* by MLPA

**Figure 13.** Mutational profile of *EGFR* by MLPA

Mutational analysis of the *APC* gene indicated that patient 15, diagnosed with FAP (familial adenomatous polyposis), had a deletion in the promoter region and also the constitutional mutation 1309 (Figure 8); no positive cases were found in the blood DNA samples.

This patient showed two deletions, in blood and in the tumour, in the promoter 2 and mutation 1309 regions, although the individual did not show alteration of the microsatellite loci. Another example is patient 31, who presents a large deletion between exon 12 and exon 15 (Figure 9) and in whom immunohistochemistry showed loss of *APC* expression in epithelial cells. Across all studied cases, 12% (9/75) of patients had a mutational profile. Deletions appeared frequently at the E12 - E15 level (11.9%), in the promoter region 2 in 33% (3/9) of cases and in E15 in 44% (4/9) of cases. Insertions in the promoter region were observed in 13% (10/75) of cases, and 13% (10/75) of patients showed the presence of wild type mutation 1309 (Figure 10).

Regarding the *CDH1* mutational status, a mutational profile appeared in 30% (20/75) of patients. Insertion was observed at exon 4 in 30% (6/20) of patients and at exon 10 in 20% (4/20) of patients. Loss of heterozygosity was observed at exons 08 and 13, in 20% (4/20) of patients for each exon (Figure 11). As no microsatellite instability analysis was performed at the *CDH1* gene locus, the loss of heterozygosity found by MLPA analysis did not necessarily overlap with the results of E-cadherin protein expression studied by IHF in the tumor samples.

Mutational analysis of the *BRCA1* gene indicated that 20% (15/75) of patients have mutations such as duplication or loss of heterozygosity. Duplication was observed at exon E13B in 40% (6/15) of patients and at exon 20 in 20% (3/15) of patients. Like duplication, loss of heterozygosity was observed mainly at exon 13B, in 40% (6/15) of patients (Figure 12).

Analysis of the *EGFR* mutational status indicated that the mutational profile appears as insertion, in 18.66% (20/75) of patients. Of these, we observed insertion at exon 3 in 50% (10/20) of patients, at exon 08 in 20% (4/20), at exon 17 in 40% (8/20), at exon 25 in 40% (8/20) and at exon 28 in 30% (6/20) of patients (Figure 13). For each of the exons 02, 09 – 16, 18 – 24, 26 and 27 we found insertions in 10% (2/20) of patients.


#### **2.5. Microsatellite instability correlation on** *APC***,** *BRCA1* **and** *PLA2G2A*

During tumorigenesis, loss of wild-type alleles (inherited from the non-mutation-carrying parent) is frequently observed. Loss of heterozygosity (LOH) in tumor suppressor genes plays a key role in colorectal cancer transformation, and LOH analysis of sporadic colorectal cancers could help discover unknown tumor suppressor genes (Ahmed B, 2011). For those patients who presented a deletion/duplication in the genes of interest, we decided to analyze microsatellite instability in order to obtain a more accurate mutational analysis. A panel of microsatellite markers, labeled with FAM, HEX and TET, was used to amplify DNA from normal and tumour tissues for LOH and MSI analyses of chromosomal loci specific for *APC*, *PLA2G2A*, and *BRCA1*.


To analyze the polymorphic microsatellite markers, a PCR reaction was carried out on 10 ng of DNA from normal and tumour tissue. The fluorescent, marker-specific PCR amplification products were separated on an ABI PRISM™ 310 Genetic Analyzer (Applied Biosystems). The resulting electropherograms were analyzed with GeneMapper ID v3.1 software for molecular size and peak heights. Data analysis was done with Sequencing DNA Analysis Software. Allelic imbalance can appear as loss of heterozygosity (LOH) or as microsatellite instability (MSI). LOH was determined using the ratio (T1:T2)/(N1:N2), where T and N denote the tumour and normal (blood) DNA samples from patients with colorectal cancer and 1 and 2 are the first and second allele peaks. A ratio lower than 0.67 or higher than 1.5 reveals the loss of one of the alleles (LOH). The presence of a novel allele in the tumour sample was interpreted as microsatellite instability (MSI).

In the case of homozygosity, the two alleles are identical in size and the corresponding peaks overlap. Thus we cannot distinguish between the two alleles and their heights.
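The LOH criterion can be written as a short Python sketch. The function names and the tuple layout of the peak heights are hypothetical; the thresholds 0.67 and 1.5 are those stated above, and the example values are the peak heights reported for marker D5S656 in the caption of Figure 17. The sketch assumes that two allele peaks are present in the tumour sample.

```python
# Illustrative sketch only; function names and input layout are assumptions.

def allelic_ratio(t1, t2, n1, n2):
    """Return (T1:T2)/(N1:N2) from the peak heights of alleles 1 and 2."""
    if min(t1, t2, n1, n2) <= 0:
        raise ValueError("peak heights must be positive")
    return (t1 / t2) / (n1 / n2)


def classify_marker(tumour, normal, low=0.67, high=1.5):
    """Score one marker from tumour and normal (blood) peak heights."""
    # A homozygous normal sample shows a single overlapping peak,
    # so the two alleles cannot be told apart and the marker is not scored.
    if len(normal) != 2:
        return "non-informative"
    t1, t2 = tumour  # assumes two allele peaks in the tumour sample
    n1, n2 = normal
    ratio = allelic_ratio(t1, t2, n1, n2)
    return "LOH" if ratio < low or ratio > high else "retained"


# Peak heights from the caption of Figure 17 (tumour 1202:207, blood 1299:1094).
ratio_fig17 = allelic_ratio(1202, 207, 1299, 1094)
status_fig17 = classify_marker((1202, 207), (1299, 1094))
```

With these values the ratio is well above 1.5, so the marker is scored as LOH. Detecting MSI, that is, a novel allele peak in the tumour sample, requires comparing allele sizes and is not covered by this sketch.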

Highly polymorphic markers were chosen for the allelic imbalance (AI) analysis. The microsatellite markers for *PLA2G2A*, located on chromosome 1, were D1S199, D1S2843 and D1S2644, which are located around the gene, and D1S234, from the coding region of the gene. For the *APC* gene we selected the microsatellite markers D5S82 and D5S489, which surround the gene, D5S656, which partially overlaps the gene, and D5S421, which is located in the coding region. Another panel of microsatellite loci was used for the *BRCA1* gene: D17S855, D17S1322 and D17S1323, which are located in introns 20, 19 and 12 of the gene, respectively, and D17S250, D17S800, D17S856 and D17S1327, on chromosome 17q, surrounding the gene.

At the microsatellite loci on chromosome 1, LOH/MSI was observed in 28% (21/75) of patients, and 68% (17/21) of these had allelic imbalance at the D1S234 locus, which covers the *PLA2G2A* locus (Figure 14, Figure 15). MSI was observed in only 6.66% (5/75) of patients (Figure 16), which leads us to suggest that MSI is very rare in sporadic adenocarcinomas and that routine screening of such lesions for MSI may not be a high priority. Previous studies showed that the 1p36 region frequently presents allelic loss in various cancers, such as colon cancer, neuroblastoma, hepatocellular carcinoma, lung cancer and breast cancer. However, only the NB (neuroblastoma) gene was confirmed to be a tumor suppressor gene, in neuroblastomas. In 1993, Tanaka *et al.* proposed that a normal chromosome 1p36 might contain a tumor suppressor gene for colon carcinogenesis. Because many genes are located in the region 1p36.33-36.31, additional analyses are necessary to confirm our hypothesis.


**Figure 14. Microsatellite alteration for** *PLA2G2A* **gene in patient 1.** D1S234, D1S 264 and D1S2843

**Figure 15. Microsatellite alteration for** *PLA2G2A* **genes in patient 14.** D1S2843 - S14 – Blood (considered as normal); D1S2843 – M14 –MSI with low amplitude signal;


**Figure 16. Microsatellite alteration for** *PLA2G2A* **genes in patient 14.** D1S234 - S14 – Blood (considered as normal); D1S234 – Vf14 – with MSI; D1S234 – Mj14 –MSI with low amplitude signal; D1S234 – B14 – the signal could not be detected and was considered not measurable.

On chromosome 5, LOH/MSI was observed in 38.66% (29/75) of patients (Figure 17), and 51.72% (15/29) of these had allelic imbalance at the D5S421 locus, which overlaps the *APC* locus. MSI was observed in only 6.66% (5/75) of patients (Figure 17), similar to the results obtained for *PLA2G2A*. Allelic imbalance/loss of heterozygosity appears to be a more frequent alteration than microsatellite instability in adenocarcinomas.

Alterations of the microsatellite loci corresponding to the *BRCA1* gene were found in 29.33% (22/75) of patients, with D17S855 the most affected (11 AI). Allelic imbalance analyses at the microsatellite loci D17S1323, D17S1322, and D17S855, which localize to introns 12, 19, and 20, respectively, indicate that 86.36% (19/22) of patients have LOH/MSI in these regions (Figure 18). Another observation is that for microsatellite marker D17S1327, all individuals have a homozygous profile.


**Figure 17. Microsatellite alteration for** *APC* **gene in patient 23.** D5S656 – S23: blood (considered normal); the ratio for D5S656 – T23\_Mj is (1202:207)/(1299:1094) ≈ 5, which is interpreted as LOH.

**Figure 18.** Microsatellite alteration for *APC* and *BRCA1* genes in patient 1.
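The LOH call in the Figure 17 caption rests on a ratio of allele peak ratios. The sketch below reproduces that arithmetic under two assumptions that the caption does not state: that 1202:207 are the two allele peaks of the tumour sample and 1299:1094 those of the matched blood sample, and that a ratio outside an illustrative window of 0.67 to 1.5 is called LOH. The function names and the cut-offs are hypothetical and are not the authors' scoring criteria.

```python
def allelic_imbalance_ratio(normal_peaks, tumour_peaks):
    """Return (T1/T2) / (N1/N2) from two allele peak heights per sample."""
    n1, n2 = normal_peaks
    t1, t2 = tumour_peaks
    if min(n1, n2, t1, t2) <= 0:
        raise ValueError("all peak heights must be positive")
    return (t1 / t2) / (n1 / n2)


def classify(ratio, low=0.67, high=1.5):
    """Label a ratio 'LOH' outside [low, high], otherwise 'retained'.

    The cut-offs are illustrative assumptions, not values from the chapter.
    """
    return "LOH" if ratio < low or ratio > high else "retained"


# Peak values quoted in the Figure 17 caption; which pair belongs to the
# tumour and which to blood is an assumption made for this sketch.
ratio = allelic_imbalance_ratio(normal_peaks=(1299, 1094),
                                tumour_peaks=(1202, 207))
print(f"ratio = {ratio:.2f}, call = {classify(ratio)}")
```

With these inputs the ratio comes out at roughly 4.9, which the caption rounds to 5.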

By examining the allelic imbalance analyses for the three genes included in this study across all patients, we conclude that the instability variation was: a) 29.63% on the short arm of chromosome 1; b) 55.56% on the long arm of chromosome 5; c) 37.10% on the long arm of chromosome 17 (Figure 19, Table 2). Because MSI was observed in only 13 patients (14.81%), we suppose that this type of instability is not specific to sporadic colorectal cancer and is instead a relatively specific indicator of HNPCC. As MSI is very rare in sporadic adenomas, routine screening of such lesions for MSI is not a high priority (Xue-Rong C, 2006). However, MSI analysis in adenomas is likely to be useful in cases where clinical features or family history suggest a hereditary predisposition (Valle J., 2011). Consequently, these results can be associated with sporadic colon cancer rather than with hereditary cancer such as HNPCC.

**Figure 19.** Comparative analysis of the fifteen microsatellite markers

**Table 2.** The instability variation at the fifteen microsatellite loci

By comparative analysis of all 15 microsatellite markers, we found that: a) 7/93 patients (7.53%) have instability at all three genes; b) 20/93 patients (21.51%) at both *PLA2G2A* and *APC*; c) 23/93 patients (24.73%) at both *APC* and *BRCA1*; d) 7/93 patients (7.53%) at both *PLA2G2A* and *BRCA1* (Table 3).

**Table 3.** Comparative analyses of protein and genetic expression of PLA2 type IIA, APC and BRCA1

The frequency of instability observed at the *PLA2G2A* locus (89.33%) does not allow us to exclude the possibility that the *PLA2G2A* gene plays a key role in colorectal tumorigenesis. As in other studies, we observed that the region where *PLA2G2A* is located is frequently altered in colorectal cancer, so we cannot exclude the possibility that it represents a tumour suppressor gene.

On chromosome 5q, in the region where the *APC* gene is located, the informative percent was 72.00%. Despite the design of the D5S421 microsatellite marker, in our analyses we observed that the informative percent of the larger D5S82 (5q15–5q21) marker is higher (80.00%), which leads us to suppose that other genes around *APC* may also be mutated in colorectal cancer. As expected, the other two markers located below the D5S82 marker also have a good informative percent: 58.67% for D5S489 (5q21) and 57.33% for D5S656 (5q21.3). On the other hand, the higher percentage of alterations found at the microsatellites in the *PLA2G2A* gene region indicates that alterations there are much more frequent than those of the *APC* gene.


Compared with *APC* and *PLA2G2A*, the informative percent for the allelic imbalance observed at the *BRCA1* locus was 69.33%. Of the 7 microsatellites designed for the *BRCA1* gene, only one marker, D17S1327, is non-informative, because it consistently appears homozygous, meaning that it shows no variation in repeat number. The most frequently altered microsatellite marker was D17S855 (17q21), designed for intron 20 of the *BRCA1* gene, for which the informative percent was 88.00%. For the other two intragenic *BRCA1* markers, D17S1322 (intron 19) and D17S1323 (intron 12), the informative percent was 56.00% and 64.00%, respectively.

## **3. Conclusions**

To improve the "personalized" therapeutic strategy in CRC, our study comparatively evaluated protein and gene expression for several key biomarkers (APC, PLA2G2A, CDH1, BRCA1, and EGFR). Our *in vivo* study involved diagnostic testing of CRC patients and molecular biology testing of biological samples in order to clarify the cross-talk between the genes of interest and to better understand the CRC typology among Romanian patients.

We observed a close relationship between different proteins and genes, which depends on tumor type, cell grade and staging. For LOH/MSI evaluation, our investigations were undertaken at the chromosomal regions where the *APC*, *PLA2G2A* and *BRCA1* genes are located. We used microsatellite markers in a series of sporadic CRCs of unknown germline mutation status for *PLA2G2A*, *APC* and *BRCA1*. The mutational status of the 1p35-36.1, 5q and 17q21 chromosomal regions was evaluated and correlated with immunohistochemical and MLPA results.

Regarding the *APC* MLPA analyses, our results are in accordance with those of Sieber and Lamlum (2000), according to whom, occasionally, in certain tumors of patients with germline mutations at codon 1309, in the MCR (mutation cluster region), or in the 3' and 5' regions of the *APC* gene, these mutations are not associated with allelic loss in adenomas. The same is observed in patient 19, whose deletion, detected by MLPA at exons E12–E15, a region that also includes the MCR, is not supported by allelic loss at any of the microsatellite markers assayed. Although no germline mutations were identified in this case, we could extrapolate the same argument as Lamlum, starting from the premise that *APC* is often cited as the first tumor suppressor gene affected in both familial and sporadic tumours.

Regarding PLA2 type IIA expression, our results suggest that malignant cells lose their ability to express PLA2 type IIA when invasive carcinoma develops in the adenoma. Our results are in line with the findings of Avoranta et al., who reported elevated gene and protein expression of PLA2 type IIA in colorectal adenomas from FAP patients. The lack of PLA2 type IIA expression is very common among colorectal cancer patients and, in accordance with other studies, it seems that during tumor progression malignant cells lose their ability to express PLA2 type IIA. These patients have a better prognosis than the patients with positive tumours (Buhmeida A., 2009) in contrast to normal mucosa. Most of the cell types that over-express PLA2 type IIA are apoptotic and necrotic, and this expression can be associated with the role of PLA2 type IIA in promoting the death of cancer cells.

Regarding BRCA1 expression, previous studies indicate higher rates of CRC in families linked to the *BRCA1* gene than in other families (Porter D.E., 1994), and mutations of this gene in stomach and colon cancers are associated with the microsatellite mutator phenotype. Although the importance of BRCA1 expression and of the mutator phenotype is still debated, we observed a correlation between IHF and AI analyses in 13.33% (10/75) of patients. Considering that 3 of the 7 microsatellites are intragenic to *BRCA1*, hypermethylation of *BRCA1*, an event that has been described in breast and ovarian tumours, can be considered. Because LOH was not observed at the microsatellites surrounding the *BRCA1* locus, loss of a large part of chromosome 17q need not be considered. Somatic mutation can be taken into account because MLPA analyses showed deletions at different exons, especially exon 13B, in 13.33% (10/75) of patients. Our results suggest that BRCA1 can be an independent prognostic factor in patients with CRC, and that it may be used to identify high-risk patient subgroups that might benefit from adjuvant chemotherapy.

In conclusion, the comparative analyses between immunohistochemical expression and mutational status of the *APC*, *PLA2G2A* and *BRCA1* genes suggest that, at the *APC* level, 10% (7/75) of samples have loss of heterozygosity without any deletion detected by MLPA. A complete loss is correlated with reduced APC protein expression. The mutational status of the studied genes, correlated with protein expression and MLPA results, provides useful data about the most common types of alteration that can appear in individuals with colorectal cancer and about how these patients can be grouped in order to receive appropriate therapy.


Although microsatellite instability was not analysed at the *CDH1* gene locus, the loss of heterozygosity found by MLPA analysis did not necessarily coincide with the results for E-cadherin protein expression studied by IHF in the tumor samples. We suppose that abnormal E-cadherin protein expression could result from some type of mutation in *CDH1* or in other genes associated with its regulatory functions (some members of the ECCU complex, such as α-catenin or β-catenin), probably as a consequence of tumor progression status. At the *EGFR* gene locus, the mutational profile indicates only the presence of insertions, which can be interpreted as frame-shift mutations. The insertions found in exons E18, E19 and E21 relate to the catalytic domain of the *EGFR* gene. Future analyses are needed to reveal specific somatic mutations that are generally associated with targeted therapy in CRCs.

## **Author details**

Mihaela Tica
*University of Medicine and Pharmacie "Carol Davila", Bucharest, Romania*

Valeria Tica, Mihaela Uta, Ovidiu Vlaicu and Elena Ionica
*University of Bucharest, Department of Biochemistry and Molecular Biology, Bucharest, Romania*

Alexandru Naumescu
*Emergency University Hospital, Bucharest, Romania*

## **Acknowledgement**

This work has been supported by the Government of Romania, through National Plan of Research II, grant no. 137/ and 42-158/ 2008. We are grateful to all our partners from the Bucharest Emergency Clinical Hospital, Bucharest, Romania, and the Department of Biochemistry and Molecular Biology of the University of Bucharest, for their excellent technical support.

## **4. References**

Aleksandra S, Rafal S., Lubomir B., Wojciech O., Marzena C., Wojciech K., Cezary S., Tadeusz P., Maciej W. & Monika LP. (2011). Prognostic Significance of Wnt-1, β-catenin and E-cadherin Expression in Advanced Colorectal Carcinoma, *Pathol. Oncol. Res*., Vol. 17, Issue 4, pp. 955–963, ISSN 1219-4956

Arteaga CL. (2001). The epidermal growth factor receptor: from mutant oncogene in nonhuman cancers to therapeutic target in human neoplasia. *J Clin Oncol*., Vol. 19 (18 suppl), Issue 18, pp. 32S–40S, ISSN 0732-183X

Avoranta T., Sundström J., Korkeila E., Syrjänen K., Pyrhönen S. and Laine J. (2010). The expression and distribution of group IIA phospholipase A2 in human colorectal tumours, *Virchows Arch*., Vol. 457, Issue 6, pp. 659–667, ISSN 0945-6317

Bos JL. (1989). Ras oncogene in human cancer: a review. *Cancer Res.,* Vol. 49, Issue 17, pp. 4682–4689, ISSN 0008-5472

Bryant D.M., Stow J.L. (2004). The ins and outs of E-cadherin trafficking. *Trends Cell Biol*. Vol. 14, Issue 8, pp. 427–434, ISSN 0962-8924

Buhmeida A, Bendardaf R, Hilska M, Laine J, Collan Y, Laato M, Syrjänen K. and Pyrhönen S. (2009). PLA2 (group IIA phospholipase A2) as a prognostic determinant in stage II colorectal carcinoma. *Ann Oncol*., Vol. 20, Issue 7, pp. 1230–1235, ISSN 0923-7534

Friedenson Bernard, (2004). BRCA1 and BRCA2 Founder Mutations and the Risk of Colorectal Cancer. *Journal of the National Cancer Institute*, Vol. 96, Issue 15, pp. 1185–1186, ISSN 0027-8874

Goodwin M, Yap A.S. (2004) Classical cadherin adhesion molecules: coordinating cell adhesion, signaling and the cytoskeleton. *J Mol Histol*. Vol. 35, Issue 8, pp. 839–844, ISSN 1567-2379

Haapamaki M.M., Gronroos, J M; Nurmi, H; Alanen, K; Kallajoki, M; Nevalainen, T J. (1997). Gene expression of group II phospholipase A2 in intestine in ulcerative colitis. *Gut,* Vol. 40, Issue 1, pp. 95–101, ISSN 0017-5749

Hardy R.G., Meltzer S.J. and Jankowski J.A. (2000). ABC of colorectal cancer: Molecular basis for risk factors. *BMJ*, Vol. 321, Issue 7265, pp. 886-889, ISSN 0959-8146

Ishiguro Y., Ochiai M. and Sugimura T (1999). Strain differences of rats in the susceptibility to aberrant crypt foci formation by 2-amino-1-methyl-6-phenylimidazo-[4,5-b]pyridine: no implication of Apc and Pla2g2a genetic polymorphisms in differential susceptibility. *Carcinogenesis,* Vol. 20, pp. 1063–1068.

Jhawer M, Goel S, Wilson AJ, Montagna C, Ling YH, Byun DS, Nasser S, Arango D, Shin J, Klampfer L, Augenlicht L.H, Soler R.P, Mariadason J.M. (2008). PIK3CA mutation/PTEN expression status predicts response of colon cancer cells to the epidermal growth factor receptor inhibitor cetuximab. *Cancer Res*., Vol. 68, Issue 6, pp. 1953-1961, ISSN 0008-5472

Knudson A.G., (2001). Two genetic hits (more or less) to cancer, *Nat Rev Cancer*., Vol. 1, Issue 2, pp. 157–162

Sieber O.M., Tomlinson I.P. and Lamlum H. (2000). The adenomatous polyposis coli (APC) tumour suppressor – genetics, function and disease. *Molecular Medicine Today*, Vol. 6, Issue 12, pp. 462-469, ISSN: 1357-4310

Lee J.J. and Chu E. (2007). First-line use of anti-EGFR monoclonal antibodies in the treatment of metastatic colorectal cancer. *Clin Colorectal Cancer,* Vol. 6 (suppl 2), pp. S42–S46, ISSN: 1533-0028

Mendelsohn J. and Baselga J. (2003) Status of epidermal growth factor receptor antagonist in the biology and treatment of cancer. *J Clinical Oncology*, Vol. 21, pp. 2787-2799, ISSN 0732-183X

Mihalcea A., Tica V., Georgescu S.E., Tesio C., Dinischiotu A, Condac E., Costache M. and Ionica E. (2005). Allelic imbalance on chromosomes 1 and 5 in colorectal carcinoma. *Plovdiv University Press*, pp. 568-575.

Mihalcea (Chitu) A., Stefan (Berlin) I., Tica V., Costache M and Ionica E. (2009). The detection of mutations in the APC gene of Romanian patients with colorectal cancer through two independent techniques. *Rom. Biotechnol. Lett*., Vol. 14, Issue 5, pp. 4747-4755, ISSN 1224-5984

Mounier C.M., Wendum D., Greenspan E. Flejou J., Rosenberg D.W. and Lambeau, G. (2008). Distinct expression pattern of the full set of secreted phospholipases A2 in human colorectal adenocarcinomas: sPLA2-III as a biomarker candidate. *Br J Cancer*., Vol. 98, Issue 3, pp. 587–595, ISSN 0007-0920

Muhammad W.S., and Edward C. (2010). Biology of Colorectal Cancer, *Cancer J*., Vol. 16, Issue 3, pp. 196–201, ISSN 1528-9117

Narayan S. and Roy D. (2003). Role of APC and DNA mismatch repair genes in the development of colorectal cancers. *Molecular cancer*, Vol. 2, Issue 1, pp. 1–15, ISSN 1476-4598

Nevalainen T.J., Gronroos J.M. and Kallajoki M. (1995). Expression of group II phospholipase A2 in the human gastrointestinal tract. *Lab Invest*., Vol. 72, Issue 2, pp. 201–208, ISSN 0023-6837

Nevalainen T.J. and Haapanen T.J. (1993). Distribution of pancreatic (group I) and synovial type (group II) phospholipases A2 in human tissues. *Inflammation*, Vol. 17, Issue 4, pp. 453–464, ISSN 0360-3997

Ogawa M., Yamashita S., Sakamoto K. and Ikei S. (1991). Elevation of serum group II phospholipase A2 in patients with cancers of digestive organs. *Res Commun Chem*. *Pathol Pharmacol*., Vol. 74, Issue 2, pp. 241–244, ISSN 0034-5164

**Chapter 6** 

© 2012 Fahed and Nemer, licensee InTech. This is an open access chapter distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

© 2012 The Author(s). Licensee InTech. This chapter is distributed under the terms of the Creative Commons Attribution License http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution,

**Genetic Causes of Syndromic and** 

Akl C. Fahed and Georges M. Nemer

http://dx.doi.org/10.5772/48477

**1. Introduction** 

Additional information is available at the end of the chapter

to survive to adulthood and have their own children.

**Non-Syndromic Congenital Heart Disease** 

Congenital heart disease (CHD) is the most common human congenital defect, and a leading cause of death in infants. With an incidence that varies between 0.8 to 2% in neonates, congenital heart disease contributes to a much larger fraction of stillbirths.(Goldmuntz 2001; Loffredo 2000) Additionally, undiagnosed mild malformations of the heart often appear later in adulthood or remain undiagnosed for life. If these are included, some expect a prevalence of CHD that is up to 4% among all newborns.(Loffredo 2000) An additional contributor to the rising prevalence of CHD among adults is the advance in diagnostics and medical and surgical treatments of children with CHD, which is allowing them, in the majority of cases, to get their heart defect, fixed and sustain a normal life into adulthood.(van der Bom and others 2011) Management of the increasing number of adult patients living with CHD is becoming more and more complicated due to the fact that many patients with mild cardiac lesions are missed during childhood and later appear with complications due to these defects such as heart failure, but even more due to the improvements in diagnosis and surgical care of pediatric patients which are allowing them

The majority of CHD is thought to result from gene mutations. This was suggested by early observations of Mendelian inheritance of CHD in families. Another evidence came from congenital syndromes due to micro and macro deletions of chromosomal regions that would result in CHD together with several other manifestations. Over the past few decades, and with the advent of gene sequencing and other techniques it became possible to identify the genetic causes of CHD.(Goldmuntz 2001) In syndromic cases, although it was possible to identify the chromosomal deletions causing the disease, in many cases the gene responsible for the heart phenotype remains undefined. Other syndromes were found to be due to single gene defects; however, for the majority, the downstream pathophysiology linking the

and reproduction in any medium, provided the original work is properly cited.


#### **Chapter 6**

## **Genetic Causes of Syndromic and Non-Syndromic Congenital Heart Disease**

Akl C. Fahed and Georges M. Nemer

Additional information is available at the end of the chapter

http://dx.doi.org/10.5772/48477

## **1. Introduction**

118 Mutations in Human Genetic Disease

Issue 1, pp. 33–48, ISSN 1473-7159

56, Issue 5, pp. 955–958, ISSN 0008-5472

*Oncol Res*., Vol.17, Issue 4, pp. 955-63, ISSN 1219-4956

131, ISSN 0022-7722

1522-8002

pp. 341–344, ISSN 0167-6806

8-20, ISSN: 1040-8428

Ogawa M., Yamashita S., Sakamoto K. and Ikei S. (1991). Elevation of serum group II phospholipase A2 in patients with cancers of digestive organs. *Res Commun Chem*.

Porter D.E., Cohen B.B., Wallace M.R., Smyth E., Chetty U., M. Dixon J., Steel C.M., Carter D.C. (1994). Breast cancer incidence, penetrance and survival in probable carriers of BRCA1 gene mutations in families linked to BRCA1 on chromosome 17q12–21. *Br J* 

Rizzo P., Osipo C., Foreman K., Golde T., Osborne B. and Miele, L., (2008). Rational targeting of Notch signaling in cancer. *Oncogene*, Vol. 27, Issue 38, pp. 5124-5131, ISSN 0950-9232 Roukos D., (2010). Novel clinico–genome network modeling for 27 revolutionizing genotype–phenotype-based personalized cancer care. *Expert Rev. Mol. Diagn*., Vol. 10,

Samowitz W.S., Powers M.D., Spirio L.N., Nollet F., Frans van Roy, Slattery M.L. (1999). β-Catenin mutations are more frequent in small colorectal adenomas than in larger adenomas and invasive carcinomas. *Cancer Res*., Vol. 59, pp. 1442 - 1444, ISSN 0008-5472 Senda T., Shimomura A., and Iizuka-Kogo A. (2005). Adenomatous polyposis coli (APC) tumor suppressor gene as a multifunctional gene. *Anat Sci Int.,* Vol. 80, Issue 3, pp. 121-

Spano J.P., Lagorce C., Atlan D., Milano G., Domont J., Benamouzig R., Attar A., Benichou J., Martin A., Morere J.F., Raphael M., Penault-Llorca F., Breau, J.L., Fagard R., Khayat D., and Wind P. (2005). Impact of EGFR expression on colorectal cancer patient prognosis

Stanczak A., Stec R., Bodnar L., Olszewski W., Cichowicz M., Kozlowski W., Szczylik C., Pietrucha T., Wieczorek M. and Lamparska-Przybysz M. (2011). Prognostic Significance of Wnt-1, β-catenin and E-cadherin Expression in advanced colorectal carcinoma. *Pathol* 

Thorstensen L., Ovist H., Heim S., Jan-Liefers G., Nesland J.M., Giercksky K.E. and Löthe, R. (2000). Evaluation of 1p losses in primary carcinomas, local recurrences and peripheral metastases from colorectal cancer patients. *Neoplasia*, Vol. 2, Issue 6, pp. 514-522, ISSN

Valle J., Menendez M., Izquierdo A., Campos O., Velasco A., Feliubadalo L., Brunet J., Tornero E., Capella G., Darder E., Blanco I.and Lazaro C., (2011). Identication of a new complex rearrangement affecting exon 20 of BRCA1, *Breast Cancer Res Treat*., Vol. 130,

Ng K. and Zhu A.X. (2006). Anti Targeting the epidermal growth factor receptor in metastatic colorectal cancer. *Critical Reviews in Oncology/Hematology,* Vol. 65, Issue 1, pp.

Xue-Rong C., Wei-Zhong Z., Xing-Qiu L. and Jin-Wei W., (2006). Genetic instability of BRCA1 gene at locus D17S855 is related to clinicopathological behaviors of gastric cancer from Chinese population, *World J Gastroenterol*., Vol. 12, Issue 26, ISSN 1007-9327

and survival. *Annals of Oncology*, Vol. 16, Issue 1, pp. 102-108, ISSN 0923-7534 Spirio L.N., Kutchera W., Winstead M.V., Pearson B., Kaplan C., Robertson M., Lawrence E., Burt R.W., Tischfield J.A., Leppert M.F., Prescott S.M. and White R. (1996). Three secretory phospholipase A(2) genes that map to human chromosome 1P35-36 are not mutated in individuals with attenuated adenomatous polyposis coli. *Cancer Res*., Vol.

*Pathol Pharmacol*., Vol. 74, Issue 2, pp. 241–244, ISSN 0034-5164

*Surg*, Vol. 81, Issue 10, pp 1512–1515, Online ISSN: 1365-2168

Congenital heart disease (CHD) is the most common human congenital defect and a leading cause of death in infants. Its incidence in neonates varies between 0.8 and 2%, and it contributes to a much larger fraction of stillbirths.(Goldmuntz 2001; Loffredo 2000) Additionally, mild malformations of the heart often appear only later in adulthood or remain undiagnosed for life. If these are included, some estimate a prevalence of CHD of up to 4% among all newborns.(Loffredo 2000) A further contributor to the rising prevalence of CHD among adults is the advance in diagnostics and in the medical and surgical treatment of children with CHD, which in the majority of cases allows them to have their heart defect repaired and to lead a normal life into adulthood.(van der Bom and others 2011) Management of the increasing number of adults living with CHD is becoming more complicated, partly because many patients with mild cardiac lesions are missed during childhood and present later with complications such as heart failure, but even more because improvements in the diagnosis and surgical care of pediatric patients allow them to survive to adulthood and have children of their own.

The majority of CHD is thought to result from gene mutations. This was first suggested by early observations of Mendelian inheritance of CHD in families. Further evidence came from congenital syndromes caused by micro- and macro-deletions of chromosomal regions, which result in CHD together with several other manifestations. Over the past few decades, with the advent of gene sequencing and other techniques, it became possible to identify the genetic causes of CHD.(Goldmuntz 2001) In syndromic cases, although it was possible to identify the chromosomal deletions causing the disease, in many cases the gene responsible for the heart phenotype remains undefined. Other syndromes were found to be due to single gene defects; however, for the majority, the downstream pathophysiology linking the gene defect to the development of disease remains obscure. In parallel, extensive *in vitro* and *in vivo* studies widened our understanding of the molecular basis of heart development. It is thought that perturbations during embryonic heart development are at the origin of CHD. These studies produced large sets of candidate genes and molecular pathways involved in heart development, and it was hypothesized that mutations in these genes cause CHD. This was confirmed by sequencing genes encoding cardiac-enriched transcription factors such as *GATA4*, *NKX2-5*, and *TBX5* in non-syndromic cases of CHD and finding mutations that segregate with the disease. This prompted excitement in the field; however, screening of large cohorts of isolated CHD cases brought some disappointment, as these genes explained only a minority of the cases.

© 2012 Fahed and Nemer, licensee InTech. This is an open access chapter distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Genetic Causes of Syndromic and Non-Syndromic Congenital Heart Disease 121


Understanding how defects in these genes cause CHD turned out to be more complicated than initially expected. It became evident that not all CHD shows true Mendelian inheritance. It is possible that combinations of mutations in different genes result in a particular phenotype, or that a gene mutation combined with a particular environmental exposure results in a CHD phenotype. Mutations might have low penetrance and only serve to increase the risk of CHD. Other mutations might yield totally defective proteins, yet be compensated for by other proteins in interlinked pathways. Copy Number Variations (CNVs), altered transcription, somatic mutations, and microRNAs (miRNAs) are additional mechanisms through which the molecular basis of CHD can be explained. Current research explores all of these mechanisms with a wide array of technologies that are better than ever, and hence the coming decade promises a near-complete understanding of heart development and the genetic basis of congenital heart disease.

This chapter covers the genetics of syndromic and non-syndromic congenital heart disease. It discusses all genes that have been associated with congenital heart disease in humans and describes the spectrum of mutations and the genotype-phenotype correlations for each. The chapter also covers the roles of CNVs, epigenetics, somatic mutations, and miRNAs in CHD. Current technologies and strategies used to understand the genetics of congenital heart disease are also discussed. The chapter ends with an explanation of how these technologies can unravel the genetics of CHD and allow the application of research findings for the benefit of patients.

#### **2. Classifications, anatomy, and clinical significance**

Congenital heart disease encompasses a broad category of anatomic malformations, ranging from a small septal defect or leaky valve to a severe malformation, such as a single ventricle, that requires extensive surgical repair or leads to death. Several classification systems exist for describing congenital heart disease. The most common one is purely clinical: CHD is cyanotic if the malformation results in deoxygenated blood bypassing the lungs and causes cyanosis (blue patient), and non-cyanotic if it does not. The most common cyanotic heart defects are Tetralogy of Fallot (TOF), Hypoplastic Left Heart Syndrome (HLHS), Transposition of the Great Arteries (TGA), Truncus Arteriosus (TA), and Total Anomalous Pulmonary Venous Connection (TAPVC). Congenital heart defects can also be simple or complex. A complex malformation includes several simple malformations occurring together. The most typical example is Tetralogy of Fallot, which, as its name implies, includes four malformations: Pulmonary Stenosis (PS), an overriding aorta, Ventricular Septal Defect (VSD), and right ventricular hypertrophy. Because of the wide diversity in the anatomy of cardiac malformations, several detailed morphological classifications were also developed. The most widely recognized one is the International Pediatric and Congenital Cardiac Code (IPCCC), which was developed by the International Society for Nomenclature of Paediatric and Congenital Heart Disease (ISNPCHD). Table 1 shows the IPCCC classification categories with the most common diagnoses within each category. The detailed version can be downloaded from the IPCCC website (www.ipccc.net). Other classification systems are radiologic (based on echocardiography or magnetic resonance imaging), hemodynamic (based on shunts and circulations in the heart), or embryological (based on the presumed origin during heart development). CHD can occur as part of a syndrome or on its own and is accordingly labeled syndromic or non-syndromic; both forms are discussed in this chapter. In both syndromic and non-syndromic cases, CHD can be isolated, that is, occurring in a single patient, or familial, affecting several members of the same family. The recurrence rate of CHD after an isolated case is 2.7%.(Gill and others 2003)
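As a rough illustration of the figures quoted in this chapter, the short Python sketch below converts the neonatal incidence range (0.8 to 2%) and the recurrence rate after an isolated case (2.7%) into expected cases per 10,000 births and compares the two. The function names and the cohort size are hypothetical and chosen only for illustration, and the comparison assumes that the quoted rates, which come from different studies, can be compared directly.

```python
# Rates are taken from the text and stored as cases per 1,000 births
# so that the arithmetic below stays simple.
INCIDENCE_LOW_PER_1000 = 8     # 0.8% neonatal incidence (lower bound)
INCIDENCE_HIGH_PER_1000 = 20   # 2% neonatal incidence (upper bound)
RECURRENCE_PER_1000 = 27       # 2.7% recurrence after an isolated case


def expected_cases(births: int, rate_per_1000: int) -> float:
    """Expected number of affected newborns in a hypothetical cohort."""
    return births * rate_per_1000 / 1000


def fold_increase(recurrence_per_1000: int, baseline_per_1000: int) -> float:
    """Ratio of the recurrence rate to a baseline incidence."""
    return recurrence_per_1000 / baseline_per_1000


if __name__ == "__main__":
    births = 10_000  # hypothetical cohort size, for illustration only
    for label, rate in [
        ("baseline incidence (low)", INCIDENCE_LOW_PER_1000),
        ("baseline incidence (high)", INCIDENCE_HIGH_PER_1000),
        ("recurrence after an isolated case", RECURRENCE_PER_1000),
    ]:
        print(f"{label}: {expected_cases(births, rate):.0f} per {births} births")
    print("fold increase vs. low baseline:",
          fold_increase(RECURRENCE_PER_1000, INCIDENCE_LOW_PER_1000))
    print("fold increase vs. high baseline:",
          fold_increase(RECURRENCE_PER_1000, INCIDENCE_HIGH_PER_1000))
```

Under these assumptions, the recurrence rate corresponds to 270 expected cases per 10,000 births in families with one affected child, against 80 to 200 per 10,000 at the baseline incidence, which is a roughly 1.35- to 3.4-fold increase.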

120 Mutations in Human Genetic Disease


This anatomical heterogeneity of CHD has been one major reason why we know little about its genetics. Beyond the anatomical classification described in the IPCCC, different combinations of malformations, and variations on the described malformations, can occur. Because of this complexity, pediatric cardiologists often end up using different terminologies to describe similar defects. Extremely rare complex malformations that run in families are also sometimes described, while their cause remains unknown.(Herrera and others 2008; Jaeggi and others 2008) Genotype-phenotype correlations are hard to establish because of this heterogeneity. In the majority of familial cases of CHD, different types of structural malformations occur within the same family. The same single gene mutation has been shown to cause a variety of cardiac defects, even within the same family.(Goldmuntz 2001) When mouse knockout models were developed to recapitulate a human CHD phenotype, the mouse phenotype was not always similar to that seen in humans.(Bruneau 2008) All these issues raised the hypothesis of a multifactorial and perhaps polygenic origin of CHD. The genetic background of the individual, the in-utero environment, epigenetic changes, and embryological hemodynamics and physiology are all possible causes of this phenotypic heterogeneity.

Because CHD is a leading cause of death in the first year of life, it has prompted a large wave of development in surgical and interventional procedures. CHD is mostly corrected with such procedures when the malformation causes symptoms or can cause heart failure, as with a large septal defect or a cyanotic heart disease. Small malformations, such as tiny septal defects that are expected to close on their own or not to cause any complication, are simply observed. With the recent advances in treatment, mortality from CHD has decreased tremendously, and most CHD patients live a normal life throughout adulthood.(van der Bom and others 2011) This prompted a whole new subspecialty in adult cardiology to take care of adult patients with CHD.(Moodie 1994) As these adults with CHD plan to have children of their own, the recurrence risk became a problem, and this was yet another force driving the identification of the genetic causes behind the disease, given that genetic counseling and pre-implantation genetic diagnosis (PGD) can be useful tools for these parents.



| Classification Category | Most Common Diagnoses |
|---|---|
| Abnormalities of position and connection of the heart | Dextrocardia; Atrial Situs Inversus; Double Inlet Left Ventricle (DILV); Double Inlet Right Ventricle (DIRV); Transposition of the Great Arteries (TGA); Double Outlet Left Ventricle (DOLV); Double Outlet Right Ventricle (DORV); Common Arterial Trunk (CAT), aka Truncus Arteriosus (TA) |
| Tetralogy of Fallot and variants | Tetralogy of Fallot (TOF); Pulmonary Atresia (PA) and Ventricular Septal Defect (VSD) |
| Abnormalities of great veins | Superior Vena Cava (SVC) Abnormality; Inferior Vena Cava (IVC) Abnormality; Coronary Sinus Abnormality; Total Anomalous Pulmonary Venous Connection (TAPVC); Partially Anomalous Pulmonary Venous Connection (PAPVC) |
| Abnormalities of atriums and atrial septum | Atrial Septal Defect (ASD); Patent Foramen Ovale (PFO) |
| Abnormalities of AV valves and AV septal defect | Tricuspid Regurgitation (TR); Tricuspid Stenosis (TS); Ebstein's Anomaly; Mitral Regurgitation (MR); Mitral Stenosis (MS); Mitral Valve Prolapse (MVP); Atrioventricular Septal Defect (AVSD) |
| Abnormalities of ventricles and ventricular septum | Single Ventricle; Ventricular imbalance: dominant LV + hypoplastic RV, or dominant RV + hypoplastic LV; Aneurysm (RV, LV, or septal); Hypoplastic Left Heart Syndrome (HLHS); Double Chambered Right Ventricle (DCRV); Ventricular Septal Defect (VSD) |
| Abnormalities of VA valves and great arteries | Aortopulmonary Window (AP Window); Pulmonary Stenosis (PS), valvar or subvalvar; Pulmonary Artery Stenosis (PAS); Aortic Stenosis (AS), valvar or subvalvar; Aortic Insufficiency (AI); Bicuspid Aortic Valve (BAV); Supravalvar Aortic Stenosis (SVS); Coarctation of the Aorta (COA); Interrupted Aortic Arch (IAA) |
| Abnormalities of coronary arteries, arterial duct and pericardium; AV fistulae | Anomalous Origin of Coronary Artery from Pulmonary Artery (ALCAPA); Patent Ductus Arteriosus (PDA) |

**Table 1.** IPCCC Classification of Congenital Heart Disease and Most Common Diagnoses

#### **3. Developmental genetics of congenital heart disease**

Heart development is crucial to understand because its molecular basis is evolutionarily conserved, as shown by studies in several model organisms. It is a complex process regulated by combinatorial interactions of transcription factors and their regulators, ligands and receptors, signaling pathways, and contractile protein genes, among others. The differential expression of each of these genes at specific stages of development and in different areas of the heart is responsible for the normal development of the heart. Disruption of these genes can result in congenital malformations of the heart. This molecular program for heart development has been an intensive field of research, yet our knowledge is far from complete.

The heart is the first organ to develop in the embryo. Development begins in the second week of gestation, when pre-cardiac lateral plate mesoderm cells migrate towards the midline of the embryo and form two crescent-shaped primordia, which fuse to form a beating heart tube at week 3. Within only a few days, the heart tube folds on itself in a process known as looping. This is the first event in the organogenesis of the embryo that manifests left-right asymmetry and is believed to be at the origin of the laterality program of the embryo. Subsequently, the four chambers of the heart are formed. This requires the differentiation of myocytes into two different subtypes, atrial and ventricular. Finally, valves and septa form through divisions within the heart to produce the mature four-chambered heart. Valvulogenesis and septogenesis both require interaction between endocardial and myocardial cells, and valvoseptal malformations are the most common CHDs. In addition, the conduction system develops into pacemaker and Purkinje cells, while neural crest cells contribute to vascularization and epicardial precursor cells give rise to the coronary arteries. As such, heart development requires a complex interplay of cell commitment, migration, proliferation, differentiation, and apoptosis. Any perturbation in this program can result in congenital heart disease.

Transcription factors regulate this tight program of gene expression, which is chamber- and stage-specific. Protein interactions and the formation of complexes that regulate downstream cardiac targets through convergent and divergent pathways have made the molecular basis of CHD complicated to understand. *In vitro* and *in vivo* studies have been crucial in widening our understanding of the molecular program for heart development. Major transcription factor families involved in heart development include the GATA, T-box, homeobox, and basic Helix-Loop-Helix (bHLH) families, among others. Screening of human CHD patients for gene mutations within these transcription factor families, as well as in other cardiac-enriched genes implicated in heart development, has not been as rewarding. Mutations in *TBX5*, *GATA4*, and *NKX2-5* have been implicated in many CHD families, and genetic tests became clinically available. Several other genes, such as *JAG1* and *ELN*, have been clearly established to cause syndromic cases of CHD. Deletions of chromosomal regions have also been established to cause several CHD syndromes, the most famous of which is DiGeorge Syndrome, caused by the 22q11.2 deletion. Despite all this progress, the majority of gene mutations discovered in a family with CHD have not been confirmed in other families, or have been confirmed in only a few. Also, screening of large cohorts of isolated CHD cases for mutations in a large set of cardiac-enriched candidate genes consistently results in a low yield of genetic causality.

Genetic Causes of Syndromic and Non-Syndromic Congenital Heart Disease 125

*TBX1* gene

*ELN* gene

*JAG1* or *Notch1* mutations; Microdeletion or rearrangement at 20p12 resulting in absent *JAG1* gene

*KRAS, BRAF, MEK1, MEK2*, and *HRAS*

Microdeletion at 22q11.2

*MEK2*; Microdeletion at 12q21.2-q22

Mutations in *HRAS* (overlap with Noonan and Cardiofaciocutaneous Syndrome)

**Syndrome with CHD Genetic Cause for CHD**  *Disorders of Chromosome Dosage* 

> Turner Unknown *Chromosomal Microdeletions*

Di Georges Syndrome 22q11.2 deletion resulting in absent

Williams-Beuren Syndrome Microdeletion of *ELN* gene; Mutations in

Noonan Syndrome Mutations in *PTPN11, SOS1, RAF1,* 

CHARGE Association Mutations in *CHD7* and *SEMA3E*;

Char Syndrome Mutations in *TFAP2B* Ellis-can Creveld Syndrome Mutations in *EVC* or *EVC2* Cardiofaciocutaneous Syndrome Mutations in *KRAS, BRAK, MEK1*, or

Marfan Syndrome Mutations in *Fibrillin-1*

Congenital Heart Disease occurs in 40 to 50% of Down Syndrome patients. The most common abnormality is Atrioventricular Septal Defect (AVSD).(Marino 1993) Other malformations include VSD and TOF among others. Some CHD phenotypes are not seen in Down Syndrome patients such as Transposition of the Great Arteries (TGA) and Situs Inversus.(Marino 1993) Adult patients with Down Syndrome are also predisposed to Mitral Valve Prolapse (MVP) and fenestrations in the cusps of the aortic and pulmonary valves.

Given the complexity of the phenotype in Down Syndrome, there has been tremendous effort to build a phenotype map and identify the genetic cause behind each phenotype.(Delabar and others 1993; Korenberg and others 1994) Although successful for other features of Down Syndrome, the cause of the cardiac malformations in Down Syndrome are still unclear. Knowing that *CRELD1* gene mutations have been associated with AVSD, one screening of 39 Down Syndrome patients identified two missense *CRELD1*

**Table 2.** Syndromes Manifesting Congenital Heart Disease and their Genetic Cause

*Single Gene Defects* Holt-Oram Syndrome *TBX5* mutations

Alagille Syndome

Costello Syndrome

(Hamada and others 1998)

Trisomy 21 (Down Syndrome) Unknown

This gap has prompted novel directions in understanding the genetics of CHD. One of the hypotheses is the multifactorial and polygenetic nature of CHD, with gene mutations acting on a certain genetic background or acting within a particular susceptible environment within a developmental window. There have been efforts towards a new systems biology approach to understanding CHD. In addition to germline DNA sequencing which comprises the majority of the literature, somatic DNA sequencing, RNA sequencing, study of microRNAs (miRNAs), and Copy Number Variations (CNVs) analysis are becoming more popular tools to study CHD. Also with the advent of next-generation sequencing and the decreased cost of both sequencing and array comparative genomic hybridization (array-CGH), more data are becoming available, and the molecular biology approach of the past few decades is shifting into a bioinformatics approach to help decipher the genetics of this complex disease. The subsequent sections of the chapter will dwell into the genetics of CHD from the oldest and most known to the most recent and least known. The below section discusses syndromic CHD, which comprises entities where the genetic causes is the most well established. Then the genes implicated in non-syndromic CHD in humans will be discussed with the degree of evidence for each. The most recent but least developed technologies to understand CHD mentioned above will be discussed at the end of the chapter.

#### **4. Syndromic congenital heart disease**

Cardiac malformations are among the most prevalent malformations in congenital syndromes. A large list of syndromes with congenital heart disease as a common manifestation has known genetic defects. CHD syndromes can be either due chromosome dosage disorders, large chromosomal deletions, small micro-deletions, or single gene defects. Table 2 shows a list of CHD syndromes within each of these categories with the corresponding genetic defect. This section will discuss the most common syndromes that include congenital heart disease as a primary manifestation. Within each syndrome, the phenotypic diversity as well as the spectrum of mutations and chromosomal defects that have been reported will be discussed.

#### **4.1. Down Syndrome (trisomy 21)**

Down Syndrome is the most common disorder of chromosome dosage with an incidence of 1 in 700 to 1 in 800 live births. The incidence is known to increase tremendously with increased maternal age, particularly above the age of 35. The main clinical manifestations of Down Syndrome are characteristic dysmorphic facies, mental retardation, premature ageing, congenital heart disease, hearing loss, and increased risk of hematologic malignancies.(Pueschel 1990)



**Table 2.** Syndromes Manifesting Congenital Heart Disease and their Genetic Cause

Congenital Heart Disease occurs in 40 to 50% of Down Syndrome patients. The most common abnormality is Atrioventricular Septal Defect (AVSD).(Marino 1993) Other malformations include VSD and TOF among others. Some CHD phenotypes are not seen in Down Syndrome patients such as Transposition of the Great Arteries (TGA) and Situs Inversus.(Marino 1993) Adult patients with Down Syndrome are also predisposed to Mitral Valve Prolapse (MVP) and fenestrations in the cusps of the aortic and pulmonary valves. (Hamada and others 1998)

Given the complexity of the phenotype in Down Syndrome, there has been tremendous effort to build a phenotype map and identify the genetic cause behind each phenotype.(Delabar and others 1993; Korenberg and others 1994) Although this has been successful for other features of Down Syndrome, the cause of the cardiac malformations in Down Syndrome is still unclear. Knowing that *CRELD1* gene mutations have been associated with AVSD, one screening of 39 Down Syndrome patients identified two missense *CRELD1* mutations and suggested that *CRELD1* mutations might cause AVSD in Down Syndrome.(Maslen and others 2006) However, other complex hypotheses have been suggested, such as epigenetic mechanisms. Although considerable progress in the molecular genetic analysis of Down Syndrome has been achieved using mouse models, to date no clear cause for CHD is known.


#### **4.2. Turner Syndrome**

Turner Syndrome is a condition in females where all or part of one sex chromosome is absent. It is estimated to occur in 1 in 2500 females.(Bondy 2009) It manifests most commonly with characteristic physical features such as short stature, webbed neck, broad chest, low hairline, and low-set ears, as well as gonadal dysfunction and cognitive deficits.(Bondy 2009) Clinical features are highly variable and can sometimes be very mild. Congenital heart disease is found in 20% to 50% of Turner Syndrome patients. The most common malformation is a Coarctation of the Aorta (COA) of the postductal type, which comprises 50% to 70% of CHD in Turner Syndrome.(Doswell and others 2006) Other cardiac malformations seen in Turner Syndrome include Bicuspid Aortic Valve (BAV), Partial Anomalous Pulmonary Venous Connection (PAPVC), and Hypoplastic Left Heart (HLH). In addition, a higher frequency of cardiac conduction abnormalities, hypertension, and aortic dilation has been reported in Turner Syndrome patients.(Doswell and others 2006; Lopez and others 2008) The molecular mechanisms leading to the cardiac malformations in Turner Syndrome are not clear.

#### **4.3. Di George Syndrome**

Di George Syndrome (DGS) is also known as Velocardiofacial Syndrome (VCFS) or Chromosome 22q11.2 Deletion Syndrome. It is caused by a 1.5 to 3.0-Mb hemizygous deletion on chromosome 22q11, which can be inherited in an autosomal dominant fashion, but most commonly arises *de novo*.(Emanuel 2008) The clinical manifestations are highly variable owing to incomplete penetrance. When the disease is fully penetrant, clinical manifestations include cardiac outflow tract defects, parathyroid gland hypoplasia resulting in hypocalcaemia, thymus gland aplasia resulting in immunodeficiency, and neurologic and facial abnormalities.(Emanuel 2008) Cardiac outflow tract defects in DGS include TOF, type B Interrupted Aortic Arch (IAA), Truncus Arteriosus, Right Aortic Arch, and aberrant right subclavian artery.(Momma 2010; Yagi and others 2003) The molecular mechanisms leading to the phenotype in DGS are better understood than those for Down and Turner Syndromes. The microdeletion results in haploinsufficiency of the *TBX1* gene, which is responsible for neural crest migration into the derivatives of the pharyngeal arches and pouches in the developing embryo.(Emanuel 2008) Target genes downstream of *TBX1* are not yet elucidated; however, they are most likely to explain the different phenotypes in DGS.

#### **4.4. Williams-Beuren Syndrome**

Williams-Beuren Syndrome (WBS) results from a hemizygous deletion of 1.5 to 1.8 Mb on chromosome 7q11.23, an area that encompasses 28 genes. Its prevalence is estimated to be 1 in 7500.(Stromme and others 2002) Clinically, patients have Supravalvular Aortic Stenosis (SVAS), mental retardation, characteristic facial features, distinctive dental anomalies, infantile hypercalcemia, and peripheral pulmonary artery stenosis.(Beuren and others 1962; Grimm and Wesselhoeft 1980; Williams and others 1961) The cardiac phenotype of vascular stenosis is caused by haploinsufficiency of the Elastin (*ELN*) gene and is found in at least 70% of the patients.(Pober 2010) Mutations of the *ELN* gene also result in familial cases of SVAS without the syndromic features of Williams-Beuren.(Curran and others 1993; Metcalfe and others 2000) Although SVAS is the most common lesion in WBS patients, vascular stenoses can occur in any medium or large artery due to the thick media layer. Lesions have been described in the aortic arch, descending aorta, and pulmonary, coronary, renal, mesenteric, and intracranial arteries.(Pober 2010) Half of Williams-Beuren patients also suffer from hypertension, and cardiovascular disease is the most common cause of death in these patients.(Pober 2010; Pober and others 2008)

#### **4.5. Holt-Oram Syndrome**


Holt-Oram Syndrome (HOS) is also known as Heart-Hand Syndrome, and it manifests as congenital heart disease and upper limb dysplasia. The heart manifestations are mostly septal malformations and include secundum ASD, VSD, patent ductus arteriosus, and conduction system abnormalities. The upper limb malformations are widely variable but are typically bilateral and asymmetric in severity. They can range from a small abnormality such as a distally-placed thumb to phocomelia or hypoplasia of the shoulders and clavicles. Sometimes the upper limb dysplasia can go unnoticed and will be seen only after radiological imaging. Congenital heart malformations occur in 85% of HOS patients.(Basson and others 1994; Boehme and Shotar 1989)

Genetically, HOS is an autosomal dominant disease caused by mutations in the *TBX5* gene, a member of the T-box family of transcription factors.(Basson and others 1997; Li and others 1997b) Haploinsufficiency of *TBX5* was shown to be at the origin of HOS. *TBX5* interacts with the other cardiac-specific transcription factors *GATA4* and *NKX2-5* to regulate the expression of downstream genes such as *ID2*, which are essential in septation of the cardiac chambers as well as development of the conduction system. The functional mechanisms through which the three transcription factors *TBX5*, *GATA4*, and *NKX2-5* interact to mediate processes in heart development have been heavily studied, and the very complex network of interactions among these and other transcription factors and downstream genes is still only partially understood (Figure 1).

Genotype-phenotype correlations were also performed in HOS, and it has been shown that *TBX5* mutations that create null alleles result in more severe abnormalities in both upper limbs and the heart as compared to missense mutations.(Basson and others 1999) Some mutations caused very severe cardiac malformations but only subtle upper limb deformities. From a clinical perspective, it is important to look for subtle upper limb malformations in patients with septal deformities, because a diagnosis of HOS can increase the recurrence risk in a sibling from 3% to 50% given that this is an autosomal dominant disease. Clinical genetic testing for *TBX5* has also become available in some laboratories across the world.
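The 50% figure follows from Mendelian transmission of a single dominant allele. As a minimal sketch, assuming one heterozygous affected parent, one unaffected parent, and complete penetrance, the risk can be enumerated in Python. The genotype strings and the function name `sibling_recurrence_risk` are illustrative and not part of the chapter, and the roughly 3% baseline quoted above is an empirical figure that is not derived here.

```python
from fractions import Fraction
from itertools import product


def sibling_recurrence_risk(parent1, parent2, penetrance=Fraction(1)):
    """Chance that a child carries the dominant allele 'A' and is affected.

    Each parent is a two-letter genotype string: 'A' marks the pathogenic
    allele (hypothetically, a TBX5 null allele) and 'a' the normal allele.
    Penetrance defaults to 1, which is an assumption made for illustration.
    """
    # Every child receives one allele from each parent, all pairs equally likely.
    offspring = list(product(parent1, parent2))
    carriers = sum(1 for genotype in offspring if "A" in genotype)
    return Fraction(carriers, len(offspring)) * penetrance


# Affected heterozygous parent ("Aa") and unaffected parent ("aa").
inherited_risk = sibling_recurrence_risk("Aa", "aa")
# Neither parent carries the allele (a de novo mutation in the index case).
no_carrier_risk = sibling_recurrence_risk("aa", "aa")

print(f"Heterozygous parent: {float(inherited_risk):.0%}")
print(f"No carrier parent:   {float(no_carrier_risk):.0%}")
```

Passing a `penetrance` below 1 scales the result down, which is one simple way to account for carriers who show no recognizable phenotype.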


**Figure 1.** Complex Genetic Interactions of *TBX5*, *GATA4*, and *Nkx2-5* (Network created using www.genemani.org )

#### **4.6. Alagille Syndrome**

Alagille Syndrome is inherited in an autosomal dominant fashion and is defined by the presence of intrahepatic bile duct paucity, which usually manifests as cholestasis, together with congenital heart disease, distinctive facies, and skeletal, ocular, renal, and neurological abnormalities.(Kamath and others 2011; Li and others 1997a) CHD is found in more than 90% of patients with Alagille Syndrome and the most common lesion is Pulmonary Artery Stenosis (PAS) or hypoplasia. Other common lesions include TOF, pulmonary valve stenosis (PS), and ASD.(McElhinney and others 2002) The prevalence of the disease is estimated at around one in 700,000 neonates when presence of jaundice is used to ascertain cases (Danks and others 1977), but in fact the disease has a tremendous variability in the phenotype and variable penetrance in families so that the actual prevalence is expected to be much higher.

Alagille Syndrome is caused by mutations in the *JAG1* gene.(Li and others 1997a; Oda and others 1997) The gene encodes a ligand to the Notch1 receptor. Jagged-Notch cell-cell interactions are crucial in determining cell fates during early developmental processes. The mutation spectrum of *JAG1* in Alagille Syndrome encompasses frameshift mutations, nonsense mutations, splice site mutations, and deletion of the whole gene.(Yuan and others 1998) Mutations have also been identified in patients with a predominantly cardiac phenotype.(Li and others 1997a) Some families show variable penetrance of the mutation as well as variable expressivity of the disease within the same family, such as facial dysmorphism only, or subtle liver disease only, among family members carrying the same mutation.(El-Rassy and others 2008) *JAG1* mutations are present in 94% of patients who are clinically diagnosed with Alagille Syndrome. A small number of cases are also explained by mutations in the *NOTCH1* gene, the JAG1 receptor.(McDaniell and others 2006)

Clinical testing for *JAG1* mutations is available. If patients are clinically diagnosed, a *JAG1* mutation could confirm the diagnosis and indicate the need for multisystem assessment to look for other subclinical abnormalities and possibly prevent them. It would also allow for similar assessment of family members. Due to the high variability of the disease, patients with suspicious right-sided heart lesions such as PAS, TOF, and PS who do not necessarily fulfill the criteria for Alagille Syndrome could also be tested for *JAG1* mutations.

#### **4.7. Noonan Syndrome**


Noonan Syndrome (NS) is a dysmorphic cardiofacial syndrome inherited mostly in an autosomal dominant fashion, with some cases occurring sporadically. Its incidence ranges between 1 in 1000 and 1 in 2500 live births.(Tartaglia and others 2010) The characteristic physical features are downward slanting of the eyes, hypertelorism, low-set ears, short stature, short and webbed neck, and epicanthic folds.(Tartaglia and others 2010) Congenital Heart Disease is found in 80 to 90% of patients with Noonan Syndrome, and valvar pulmonary stenosis (PS) and Hypertrophic Cardiomyopathy (HCM) are the two most common cardiac manifestations. A large set of cardiac malformations can also occur including secundum ASD, AVSD, TOF, COA, VSD, PDA, and mitral valve disease.(Marino and others 1999; Noonan 1994) Patients might also have deafness, cryptorchidism, motor delay, and bleeding diathesis.(Tartaglia and others 2010)

NS is a genetically heterogeneous syndrome with at least 8 genes that have been associated with the disease so far: *PTPN11, SOS1, RAF1, KRAS, BRAF, MEK1, MEK2*, and *HRAS*.(Tidyman and Rauen 2009) Mutations in *PTPN11* are the most common and explain 50% of Noonan Syndrome cases, the other 7 genes explain roughly 25% of the cases, and in about 25% of the cases no mutation is found.(Tartaglia and others 2010) All the genes implicated in NS encode proteins that are part of the Ras/Raf/MEK/ERK signaling pathway, an important regulator of cell proliferation, differentiation, and survival. *PTPN11* encodes SHP-2, a protein tyrosine phosphatase that plays an important role in the signal transduction that mediates the biological processes described above.

Disease penetrance is almost complete with *PTPN11* mutations, but there is a wide variability in the phenotype. Clinical testing for some of the genes involved in NS such as *PTPN11*, *SOS1*, and *KRAS* is available. Clinical diagnosis might be helpful in borderline cases given the variability in the phenotype.

#### **5. Nonsyndromic congenital heart disease**

Isolated congenital heart disease is the most prevalent form of CHD. Evidence for the genetic basis of isolated CHD comes from familial clustering of cases as well as a higher recurrence rate of CHD. Mutations in many genes have been associated with several CHD phenotypes, yet the evidence is variable for each gene. Gene mutations can best be classified as highly penetrant mutations in disease-causing genes, low-penetrance mutations in susceptibility genes, and common variants in CHD risk genes. Transcription factor genes are the most common group of genes implicated in CHD. Other genes are part of signal transduction pathways or encode structural components of the heart. Evidence for each gene comes from family studies and segregation analyses using direct sequencing. As mentioned earlier, one of the biggest challenges in the genetics of nonsyndromic CHD is that sequencing for all genes implicated in CHD explains the genetic cause in only a small percentage of patients. Most gene mutations have been described in one or a few cases, while findings for only a small number of genes have been replicated in many cohorts and families.


20437614, 19886994, 17924340, 16470721, 18538293, 20581743, 19553149, 18538293, 18538293, 20819618, 14517948, 14607454

20071345, 18273862, 19064609, 14681828

18159245, 1480002, 15689439, 12845333, 17072672, 19853937, 19666519, 16287139, 17668378, 9651244,

15689439,14607454

12845333, 20670841, 19064609, 19506109, 12632326, 15857420, 18538293, 10053005

18593716, 14607454, 20456451, 11470490,

21544582, 12845333, 17253934, 18055909, 19853937, 14681828, 19853938, 16287139, 17668378, 12074273, 9651244, 10587520

21080980, 9916847, 12845333, 19666519,

10942104, 20437614

15810002,

14681828

14681828

20656787

11175284

*ELN, JAG1* 16944981, 11175284,

*ELN* 9215670, 16944981,

Tetralogy of Fallot (TOF) *Nkx2-5, NODAL, CFC1,* 

ASD *NKX2-5, GATA4, GATA6,* 

Ebstein's Anomaly *MYH7* 21127202

*ZIC3* 

VSD *NKX2-5, GATA4, CFC1,* 

Total Anomalous Pulmonary Venous Connection (TAPVC)

Partial Anomalous Pulmonary Venous Connection (PAPVC)

Atrioventricular Septal Defect

Hypoplastic Left Heart Syndrome (HLH)

Pulmonary Valve Stenosis

Pulmonary Artery Stenosis

Supravalvar Aortic Stenosis

(AVSD)

(PS)

(PAS)

(SVAS)

*FOXH1, GATA4, FOG2, GDF1, HAND2, ALDH1A2, GATA6, TDGF1, JAG1* 

*NODAL, PDGFRA, ANKRD1, ZIC3* 

*TBX20, CFC1, CITED2* 

*NODAL, GATA4, ACVR1, CRELD1, CFC1, LEFTY2* 

*NOTCH1, NKX2-5, GJA1,* 

*IRX4, ZIC3, TDGF1, CITED2, TBX20* 

*ELN, GATA4, ACVR2B,* 

*ZIC3, GATA6* 

Aortic Valve Stenosis (AS) *NOTCH1, ELN, MYH6* 21080980, 16025100,

Bicuspid Aortic Valve (BAV) *NOTCH1* 16729972, 160251100

Coarctation of the Aorta *VEGF, NOTCH1, NKX2-5,* 20420808, 10053005,

*GATA4* 18076106

Table 3 lists all genes in which mutations have been found in different nonsyndromic CHD phenotypes. Most of these associations are based on only a few cases and hence remain to be confirmed; however, some have been replicated in several families, such as the phenotypes associated with *NKX2-5* or *GATA4* mutations. The table lists all the genes in which mutations have ever been described for each phenotype. The corresponding PubMed IDs are provided for the published studies where these gene mutations are reported so that readers can make their own assessment regarding the strength of the association.
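For readers who want to follow the PubMed IDs back to the original reports, a small helper can turn a comma-separated cell of IDs into links. This is only a sketch: the function names `parse_pmids` and `pubmed_url` are hypothetical, the address pattern is the public PubMed one (`https://pubmed.ncbi.nlm.nih.gov/<PMID>/`), and the two IDs used are those listed for Tricuspid Atresia.

```python
def parse_pmids(cell: str) -> list[str]:
    """Split a comma-separated table cell into individual PubMed IDs."""
    return [part.strip() for part in cell.split(",") if part.strip()]


def pubmed_url(pmid: str) -> str:
    """Build the PubMed address for one ID, rejecting non-numeric input."""
    if not pmid.isdigit():
        raise ValueError(f"not a numeric PubMed ID: {pmid!r}")
    return f"https://pubmed.ncbi.nlm.nih.gov/{pmid}/"


# IDs as listed for Tricuspid Atresia (MYH6).
urls = [pubmed_url(pmid) for pmid in parse_pmids("15643620, 15389319")]
for url in urls:
    print(url)
```

Rejecting non-numeric input is a deliberate choice here, since fused or truncated IDs are easier to spot as errors than as broken links.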



**5. Nonsyndromic congenital heart disease** 

own assessment regarding the strength of the association.

Transposition of the Great

Double Outlet Right Ventricle

Common Arterial Trunk

Arteries (TGA)

(DORV)

(CAT)

Isolated congenital heart disease is the most prevalent form of CHD. Evidence for the genetic basis of isolated CHD comes from familial clustering of cases as well as a higher recurrence rate of CHD. Mutations in many genes have been associated with several CHD phenotypes, yet the strength of the evidence varies from gene to gene. Gene mutations can best be classified as highly penetrant mutations in disease-causing genes, low-penetrance mutations in susceptibility genes, and common variants in CHD risk genes. Transcription factor genes are the most common group of genes implicated in CHD. Other genes encode components of signal transduction pathways and structural components of the heart. Evidence for each gene comes from family studies and segregation analyses using direct sequencing. As mentioned earlier, one of the biggest challenges in the genetics of nonsyndromic CHD is that sequencing of all genes implicated in CHD explains the genetic cause in only a small percentage of patients. Most gene mutations have been described in one or a few cases, while only a small number of genes have been replicated in many cohorts and families.

**Phenotype Implicated Genes PubMed ID**

Mitral Atresia *FLNA* 20730588

Dextrocardia *ACVR2B, NODAL, ZIC3* 9916846, 19064609,

Tricuspid Atresia *MYH6* 15643620, 15389319

*NODAL, FOXH1, CFC1, THRAP2, GDF1, ACVR2B, ZIC3, NKX2-5, MYH6* 

*NODAL, FOG2, GDF1, CFC1, ACVR2B, NKX2-5*  14682828

9916847, 14638541, 17924340, 11799476, 18538293, 19553149, 19933292, 19064609, 17295247, 19933292, 14681828, 18538293, 14607454, 20656787

9916847, 17924340, 11799476, 19553149, 14681828, 20807224,

14607454

15649947

*GATA6, NKX2-5, Nkx2-6* 19666519, 14607454,


**Table 3.** Implicated Genes in Different Nonsyndromic CHD Phenotypes

In the remainder of this section, the most common genes implicated in nonsyndromic CHD are discussed in detail. For each gene, the mutational spectrum, function, associated CHD phenotypes, and mechanism of disease (if known) are provided. The three large groups of cardiac-specific transcription factors, the GATA (*GATA4, GATA5*, and *GATA6*), Homeobox (*NKX2-5* and *NKX2-6*), and T-box (*TBX1, TBX5*, and *TBX20*) families, are discussed first, each in a separate subsection. These three categories of genes comprise the majority of the known genetic causes of CHD. Genes from all three categories interact to regulate downstream gene expression in the developing heart. Other transcription factor genes are discussed in a separate subsection. Signaling pathway genes, such as the NODAL signaling genes and the Notch signaling genes, are discussed separately. Contractile protein genes, in addition to their well-established role in cardiomyopathy, have been associated with CHD and are covered in one subsection. All remaining genes with minimal evidence for causing CHD are grouped under the final subsection of this section of the chapter.
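The grouping described above can be written down as a small lookup table. The following Python sketch is illustrative only: the group names and gene lists are copied from the headings of subsections 5.1 to 5.8, while the identifiers `CHD_GENE_GROUPS` and `group_of` are assumptions introduced for this example and are not part of any published tool.

```python
# Gene groups as named in the headings of subsections 5.1-5.8 of this chapter.
# The dictionary name and the helper function are hypothetical.
CHD_GENE_GROUPS = {
    "GATA transcription factors": ["GATA4", "GATA5", "GATA6"],
    "Homeobox transcription factors": ["NKX2-5", "NKX2-6"],
    "T-Box transcription factors": ["TBX1", "TBX5", "TBX20"],
    "Other transcription factors": ["CITED2", "ANKRD1", "FOG2", "ZIC3"],
    "NODAL signaling genes": ["NODAL", "GDF1", "FOXH1", "CFC1", "ACVR2B", "LEFTY2"],
    "Notch signaling genes": ["NOTCH1", "JAG1", "NOTCH2"],
    "Contractile protein genes": ["MYH6", "MYH7", "MYH11", "MYBPC3", "ACTC1"],
    "Miscellaneous genes": ["ELN", "GJA1", "FLNA", "THRAP2"],
}


def group_of(gene):
    """Return the subsection group of a gene symbol, or None if it is not listed."""
    symbol = gene.upper()  # accept mouse-style spelling such as "Nkx2-5"
    for group, genes in CHD_GENE_GROUPS.items():
        if symbol in genes:
            return group
    return None


if __name__ == "__main__":
    for gene in ("GATA4", "Nkx2-5", "LEFTY2"):
        print(gene, "->", group_of(gene))
```

Such a table only mirrors the organization of this section; it says nothing about the strength of evidence for any individual gene.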

#### **5.1. GATA transcription factors (***GATA4, GATA5, GATA6***)**

GATA-binding proteins are a family of transcription factors that regulate gene expression and are involved in cell differentiation, survival, and proliferation in many tissues. GATA proteins are evolutionarily conserved proteins containing two zinc-finger motifs. They recognize and bind to a "GATA" consensus sequence, which is an important *cis*-element of the promoters of many genes.

*GATA4, GATA5*, and *GATA6* are involved in the developing heart, and knockout studies in mice have shown that all three are essential for normal cardiac development. Silencing of GATA genes can result in cardiac malformations ranging from valvoseptal defects to acardia. However, mutations in humans with CHD have been described only in *GATA4* and *GATA6* but not *GATA5*.

*GATA* genes are also among the earliest transcription factors to be expressed in the developing heart. They are expressed in different but overlapping time and tissue patterns in the embryonic heart and manifest complex combinatorial interactions. These characteristics seem to be essential for proper embryonic and postnatal cardiac development.

*GATA4* mutations are a well-established cause of CHD in humans. They are inherited in an autosomal dominant fashion in familial cases and are also seen in isolated cases. Haploinsufficiency of the *GATA4* gene causes CHD, which is highly penetrant as observed in familial studies. The most common phenotypes in which causative *GATA4* mutations are found are ASD, VSD, TOF, and AVSD.(Garg and others 2003; Nemer and others 2006) Findings of *GATA4* mutations have been replicated in many familial studies.(Chen and others 2010; Garg and others 2003) Multiple phenotypes are often seen within the same family segregating the same mutation. In studies of isolated CHD cohorts with phenotypes within the spectrum of phenotypes obtained from *Gata4* knockout mice, the frequency of *GATA4* mutations ranges between 0.8% and 3.7%.(Peng and others 2010; Rajagopal and others 2007; Tomita-Mitchell and others 2007; Zhang and others 2006) The spectrum of mutations in *GATA4* includes missense mutations as well as mutations that truncate the protein, such as nonsense, frameshift, or splice site variants. Disease-causing missense mutations often disturb the cooperative binding of *GATA4* to other transcription factors such as *NKX2-5* and *TBX5* (Figure 1), a process which is essential for modulating downstream gene expression during cardiac development.

Patent Ductus Arteriosus (PDA)

(COA) *LEFTY2* 18593716, 14607454 Interrupted Aortic Arch (IAA) *CFC1, LEFTY2, NKX2-5* 18538293, 10053005,


14607454

18752453

*MYH11, TFAP2B* 16444274, 17956658,

Animal studies have shown that while *Gata4*+/- and *Gata6*+/- mice survive normally, compound heterozygous *Gata4+/- Gata6+/-* mice die at embryonic day 13.5 due to severe cardiac malformations.(Xin and others 2006) Moreover, when both genes are knocked out completely, mice fail to develop a heart at all.(Zhao and others 2008) These studies have shown that both *Gata4* and *Gata6* are essential for cardiac development and that they interact to regulate downstream targets during heart development. Inactivating *Gata6* in specific vascular cells using transgenic mice has also shown that *Gata6* is involved in the migration of neural crest cells and differentiation of terminal smooth muscle cells, both late processes in cardiac development.(Lepore and others 2006) Sequencing of patients with CHD corroborated the animal findings by identifying heterozygous *GATA6* mutations in outflow tract defects, mainly Common Arterial Trunk (CAT).(Kodo and others 2009) Subsequent studies showed that *GATA6* mutations also cause ASD and TOF.(Lin and others 2010) As with *GATA4*, the mutational spectrum of *GATA6* includes missense as well as truncating variants, and genotype-phenotype correlations are not established because the same mutation can cause different phenotypes. In many laboratories around the world, clinical genetic testing is commonly available for *GATA4*, but not for *GATA6*.

#### **5.2. Homeobox transcription factors (***NKX2-5, NKX2-6***)**

Homeobox-containing genes encode transcription factors that play crucial roles in cardiac development by regulating essential processes such as the spatio-temporal specificity of gene expression required for normal cardiac tissue differentiation. These transcription factors are evolutionarily conserved and essential for cardiac development. The "*Tinman*" gene in *Drosophila* is a homeobox-containing gene that is essential for development of the dorsal vessel, a structure analogous to the human heart. *NKX2-5* is the "*Tinman*" homologue in mouse and is highly expressed in the mouse embryonic heart and essential for its development.(Reamon-Buettner and Borlak 2010) The *NKX2-5* gene was cloned in 1996 (Turbay and others 1996), and it has since been shown to be one of the most common known genetic causes of human CHD.

*NKX2-5* plays critical roles in later stages of cardiac development, namely septation and development of the conduction system. It physically interacts with *TBX5* to form a complex that cooperatively regulates downstream gene expression that is essential for proper septation and formation of the conduction system.(Habets and others 2002; Moskowitz and others 2007) Mutations in the *NKX2-5* gene cause congenital heart disease in an autosomal dominant fashion and with high penetrance.(Kasahara and others 2000) Many families have been described. The most common phenotype is ASD with atrioventricular (AV) block. However, *NKX2-5* mutations have also been associated with many other CHD phenotypes such as VSD, TOF, subvalvar AS, Ebstein's Anomaly, cardiomyopathy, ventricular hypertrophy or non-compaction, and arrhythmias other than the common AV block.(Reamon-Buettner and Borlak 2010) Within families, different CHD phenotypes can also be observed with the same *NKX2-5* mutation, making genotype-phenotype correlations difficult. In cohorts of isolated CHD, *NKX2-5* mutations are found in around 2% of patients.(Reamon-Buettner and Borlak 2010) The mutational spectrum is wide, with missense and truncating mutations being frequently described. Sequencing of *NKX2-5* is clinically available for genetic testing. Identifying family members through cascade screening might allow the diagnosis of fatal arrhythmias or silent ASDs that can otherwise lead to heart failure.


*NKX2-6* is another homeobox transcription factor that shares great homology with *NKX2-5* but whose downstream targets are unknown. Mice in which *Nkx2-6* was knocked out did not have any cardiac phenotype, but one mutation has been associated with CAT in one family.(Heathcote and others 2005) More mutations in *NKX2-6* remain to be detected in CHD patients with high-throughput screening before its causal role in CHD can be established.

#### **5.3. T-Box transcription factors (***TBX1, TBX5, TBX20***)**

The T-box family of binding proteins also consists of important transcription factors in cardiac development. T-box genes are evolutionarily conserved and share a T-box DNA-binding domain. All family members are involved in regulating developmental processes such as the initiation and potentiation of cardiac development.(Hariri and others 2011)

The crucial role of *TBX5* in heart development and its interactions with *GATA4* and *NKX2-5* have been discussed earlier in this chapter. Apart from Holt-Oram Syndrome, *TBX5* has not been implicated in nonsyndromic CHD, although some *TBX5* mutations can cause a heart-predominant phenotype with very subtle upper limb disease. *TBX1* was also discussed earlier as the cause of cardiac malformations in DiGeorge Syndrome. A deletion of 57 bp in the *TBX1* gene was found in one non-syndromic patient with TOF.(Griffin and others 2010) Apart from this single report, findings of *TBX1* mutations have not been replicated in non-syndromic CHD patients.

Another member of the family that has been implicated in non-syndromic CHD is *TBX20*. *Tbx20*+/- mice have dilated cardiomyopathy, and *Tbx20*-/- mice die at midgestation due to a grossly abnormal heart.(Stennard and others 2005) Mutations in *TBX20* are found in less than 1% of patients with CHD phenotypes such as septal defects, left ventricular outflow tract abnormalities, and HLH syndrome.(Kirk and others 2007; Posch and others 2010) Both missense and nonsense heterozygous mutations have been described. Functional studies suggest that both loss-of-function and gain-of-function mutations in the *TBX20* gene can cause CHD.(Posch and others 2010)

#### **5.4. Other transcription factors (***CITED2, ANKRD1, FOG2, ZIC3***)**


The above three families of transcription factors are the most heavily studied in heart development. However, a large set of other transcription factors have also been implicated in CHD, although with lower degrees of evidence or, for some, lower penetrance. This section will briefly discuss each of these transcription factors.

*CITED2* codes for CBP/p300-Interacting Transactivator with E/D-rich C-terminal Domain Type 2, a transcriptional co-activator of several transcription factors such as *TFAP2*, the known cause of Char Syndrome. *Cited2* null mouse embryos die during embryonic development and manifest septal, outflow tract, and aortic arch defects.(Bamforth and others 2004) *CITED2* mutations were detected in about 1% of sporadic cases of CHD. Phenotypes include ASD, VSD, and TAPVC.(Sperling and others 2005)

Ankyrin Repeat Domain 1 (*ANKRD1*) is a transcription factor that interacts with cardiac sarcomere proteins. One balanced translocation and one missense mutation in *ANKRD1* gene were detected in two separate cases of TAPVC.(Cinquetti and others 2008)

Friend of GATA 2 (*FOG2*) is, as its name implies, a cofactor of *GATA4*. *Fog2* knockout mice have a TOF-like phenotype,(Tevosian and others 2000) and *FOG2* mutations have been described in TOF patients, although with reduced penetrance.(Pizzuti and others 2003)

*ZIC3* encodes a zinc finger transcription factor that is implicated in left-right axis development. It is a known gene in human *situs* abnormalities and is inherited in an X-linked fashion. Mutations in *ZIC3* have been identified in families and cohorts with heterotaxy.(Gebbia and others 1997) Additionally, there has been one reported family with TGA carrying a transversion in the *ZIC3* gene, yet with incomplete penetrance.(Megarbane and others 2000)

#### **5.5. NODAL signaling genes (***NODAL, GDF1, FOXH1, CFC1, ACVR2B, LEFTY2***)**

The NODAL family of proteins is a member of the TGF-beta superfamily of secreted signaling molecules. NODAL signaling is responsible for dorso-ventral patterning in vertebrate development as well as mesoderm and endoderm generation. Mutations in different genes in the NODAL signaling cascade are believed to occur together and cumulatively decrease NODAL signaling, leading to CHD phenotypes.(Roessler and others 2009) *NODAL* mutations have been reported in patients with heterotaxy, TGA, and conotruncal defects,(Gebbia and others 1997; Mohapatra and others 2009) but, as mentioned earlier, simple heterozygosity is not enough to cause the phenotype in the majority of cases. Mutations in other pathway genes such as *GDF1, FOXH1, CFC1*, and *LEFTY2* are often necessary to cause disease.


Copy Number Variations (CNVs) are structural alterations to the genomic DNA that result in the cell having abnormal copies of large sections of its DNA. They can be inherited or occur *de novo*. Over the past decade, the role of CNVs in disease has been heavily studied, mostly in different types of cancers. In the heart, CNV analysis has explained an additional small fraction of the genetics of syndromic CHD (3.6%), but more of the non-syndromic CHD (19%).(Breckpot and others 2011) Submicroscopic deletions have been discovered using array-CGH in large CHD cohorts. CNVs occured in regions harboring known CHD candidate genes but were also capable of identifying new CHD loci in TOF, HLH, heterotaxy, and other CHD phenotypes.(Fakhro and others 2011; Greenway and others 2009; Payne and others 2012) One of the most commonly used strategies in CNV analysis is trio analysis, which allows the determination of de novo CNVs in CHD patients. Comparison with control groups is also helpful in assessing the likelihood of causality of CNVs using statistical methods. Despite several successful examples, the use of CNVs in understanding CHD remains challenging, particularly in proving the causality of the CNVs and assessing the magnitude that these CNVs have on

Micro RNAs (miRNAs) are small (around 22 nucleotides long) single stranded noncoding RNAs and are encoded by miRNA genes. miRNAs serve as regulators of gene expression. Since cardiac development involves tremendous spatio-temportal specificity of gene expression, it is believed that miRNAs are involved in cardiac development and they can potentially cause CHD. miRNAs are important players in cellular proliferation, differentiation, and migration all of which are essential processes for proper cardiac


the phenotype.

**6.2. Micro RNA** 


*CFC1* (Cryptic) is a cofactor of NODAL signaling, and it acts through activin receptors. *CFC1* mutations were initially reported in laterality defects.(Bamford and others 2000) However, outflow tract defects such as TGA and DORV have also been associated with *CFC1* mutations.(Goldmuntz and others 2002) Similar associations with CHD phenotypes apart from *situs* abnormalities have been observed for *GDF1*, another member of the TGF-beta superfamily involved in NODAL signaling.(Karkera and others 2007) *FOXH1* mutations have been associated with CHD, but only within the context of reduced NODAL signaling due to mutations in more than one gene in the cascade.(Roessler and others 2008) Therefore, sequencing all NODAL signaling genes together would give a better picture of the genetic cause of a particular CHD phenotype than identifying a variant in one of the genes alone.
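The idea of assessing the pathway as a whole rather than one gene at a time can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a clinical interpretation method: the gene set is taken from the heading of this subsection, the function names and the example patient are hypothetical, and the threshold of two affected genes is an assumption chosen only to mirror the "more than one gene in the cascade" wording above.

```python
# NODAL pathway genes as listed in the heading of subsection 5.5.
NODAL_PATHWAY_GENES = {"NODAL", "GDF1", "FOXH1", "CFC1", "ACVR2B", "LEFTY2"}


def nodal_pathway_hits(variant_genes):
    """Return the sorted NODAL pathway genes in which a variant was found."""
    observed = {gene.upper() for gene in variant_genes}
    return sorted(NODAL_PATHWAY_GENES & observed)


def has_multiple_pathway_hits(variant_genes, minimum=2):
    """True if variants were found in at least `minimum` distinct pathway genes.

    The default of 2 is an illustrative assumption, not an established cutoff.
    """
    return len(nodal_pathway_hits(variant_genes)) >= minimum


if __name__ == "__main__":
    # Hypothetical patient: genes in which rare variants were reported.
    patient_variant_genes = ["NODAL", "FOXH1", "TTN"]
    print(nodal_pathway_hits(patient_variant_genes))
    print(has_multiple_pathway_hits(patient_variant_genes))
```

A tally of this kind only flags patients who carry variants in several pathway genes; whether those variants actually reduce NODAL signaling still requires functional and segregation evidence.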

#### **5.6. Notch signaling genes (***NOTCH1, JAG1, NOTCH2***)**

The Notch-Jagged signaling pathway is an important regulatory mechanism of cell differentiation processes during embryonic and adult life. In the heart, it is particularly important in cardiac valve development. *JAG1* and *NOTCH2* mutations are known causes of Alagille Syndrome. However, mutations in both can also cause non-syndromic CHD.(Bauer and others 2010; McDaniell and others 2006) *NOTCH1* has also been implicated in non-syndromic CHD. Mutations can cause BAV, AS, COA, and HLH.(Garg and others 2005; McBride and others 2008; Mohamed and others 2006)

#### **5.7. Contractile protein genes (***MYH6, MYH7, MYH11, MYBPC3, ACTC1***)**

Mutations in contractile protein genes are common causes of Hypertrophic Cardiomyopathy (HCM) and other cardiomyopathies. However, some of these genes have also been implicated in a minority of CHD cases. One *MYH6* (Alpha Myosin Heavy Chain) mutation has been described in a family with ASD.(Ching and others 2005) Mutations in *MYH7* (Beta Myosin Heavy Chain) can cause Ebstein's Anomaly and septal defects.(Budde and others 2007) Heterozygous *MYBPC3* mutations are a very frequent cause of HCM; however, there have been reports of ASD and PDA in addition to severe HCM in patients with homozygous truncating mutations in the Myosin Binding Protein C gene *MYBPC3*.(Xin and others 2007; Zahka and others 2008) Similarly, mutations in Alpha-Cardiac Actin (*ACTC1*), another sarcomere protein gene, cause ASD together with HCM.(Monserrat and others 2007) Finally, Myosin Heavy Chain 11 (*MYH11*) has a role in smooth muscle, and mutations in *MYH11* have been implicated in familial thoracic aortic aneurysm with PDA due to decreased elasticity of the aortic wall and the ductus arteriosus.(Zhu and others 2006)

#### **5.8. Miscellaneous genes (***ELN, GJA1, FLNA, THRAP2***)**

Elastin (*ELN*) deletions or mutations are implicated in Williams-Beuren syndrome, but have also been reported in many cases of isolated SVAS and PS.(Arrington and others 2006; Metcalfe and others 2000) *GJA1* encodes Connexin-43, a gap junction protein that maintains cell-cell adhesion and communication. Mutations in *GJA1* were reported in a case of HLH and, in a separate report, in patients with heterotaxy.(Britz-Cunningham and others 1995; Dasgupta and others 2001) Filamin A (FLNA) cross-links actin filaments in the cytoplasm and anchors them to the rest of the cytoskeleton. *FLNA* is an X-linked gene in which mutations are associated with valvular dystrophy.(Kyndt and others 2007) Finally, mutations in the *THRAP2* gene, which encodes a TRAP-complex protein, have been associated with TGA in one study.(Muncke and others 2003)

## **6. Other genetic mechanisms of CHD**

Despite the large number of genes implicated in non-syndromic CHD, the genetic cause of the majority of isolated cases of CHD is still poorly understood. This has led researchers to investigate genetic mechanisms other than gene mutations that can contribute to inherited or isolated CHD. Copy Number Variations (CNVs), micro RNA (miRNA), somatic mutations, and epigenetics are all active areas of research into the genetics of CHD.

#### **6.1. Copy Number Variations**

Copy Number Variations (CNVs) are structural alterations of the genomic DNA that leave the cell with an abnormal number of copies of large sections of its DNA. They can be inherited or occur *de novo*. Over the past decade, the role of CNVs in disease has been heavily studied, mostly in different types of cancer. In the heart, CNV analysis has explained an additional small fraction of the genetics of syndromic CHD (3.6%), but more of the non-syndromic CHD (19%).(Breckpot and others 2011) Submicroscopic deletions have been discovered using array-CGH in large CHD cohorts. CNVs occurred in regions harboring known CHD candidate genes, and also identified new CHD loci in TOF, HLH, heterotaxy, and other CHD phenotypes.(Fakhro and others 2011; Greenway and others 2009; Payne and others 2012) One of the most commonly used strategies in CNV analysis is trio analysis, which allows the identification of *de novo* CNVs in CHD patients. Statistical comparison with control groups is also helpful in assessing the likelihood that a CNV is causal. Despite several successful examples, the use of CNVs in understanding CHD remains challenging, particularly in proving the causality of the CNVs and assessing the magnitude of their effect on the phenotype.
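The trio strategy can be illustrated with a minimal sketch. It assumes that CNV calls are already available as simple intervals for the child and both parents, and it flags a child CNV as a *de novo* candidate when no parental CNV of the same type shares at least 50% reciprocal overlap with it. The `CNV` record, the function names, and the 50% threshold are illustrative assumptions and are not taken from any of the cited studies or from a specific analysis tool.

```python
from typing import List, NamedTuple


class CNV(NamedTuple):
    """Hypothetical minimal CNV call: a typed genomic interval."""
    chrom: str
    start: int  # inclusive
    end: int    # exclusive
    kind: str   # "DEL" (deletion) or "DUP" (duplication)


def reciprocal_overlap(a: CNV, b: CNV) -> float:
    """Return the smaller of (shared length / length of a) and
    (shared length / length of b); 0.0 if the calls cannot match."""
    if a.chrom != b.chrom or a.kind != b.kind:
        return 0.0
    shared = min(a.end, b.end) - max(a.start, b.start)
    if shared <= 0:
        return 0.0
    return min(shared / (a.end - a.start), shared / (b.end - b.start))


def de_novo_candidates(child: List[CNV], mother: List[CNV],
                       father: List[CNV],
                       min_overlap: float = 0.5) -> List[CNV]:
    """Keep child CNVs that no parental CNV matches at min_overlap or more."""
    parental = list(mother) + list(father)
    return [c for c in child
            if all(reciprocal_overlap(c, p) < min_overlap for p in parental)]


# Toy trio: the chr1 deletion is shared with the mother,
# the chr22 duplication is seen only in the child.
child = [CNV("chr1", 1000, 5000, "DEL"), CNV("chr22", 200, 900, "DUP")]
mother = [CNV("chr1", 1200, 5200, "DEL")]
father: List[CNV] = []

candidates = de_novo_candidates(child, mother, father)
print(candidates)
```

In this toy trio only the chr22 duplication is reported. A real analysis would also need to account for calling errors in the parents, which can make an inherited CNV look *de novo*, so candidates are normally validated by an independent method.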

#### **6.2. Micro RNA**

Micro RNAs (miRNAs) are small (around 22 nucleotides long), single-stranded, noncoding RNAs encoded by miRNA genes. miRNAs serve as regulators of gene expression. Since cardiac development involves tremendous spatio-temporal specificity of gene expression, it is believed that miRNAs are involved in cardiac development and that their disruption can potentially cause CHD. miRNAs are important players in cellular proliferation, differentiation, and migration, all of which are essential processes for proper cardiac development. In fact, cardiac-specific miRNAs have been discovered, such as miR-133 and miR-1-2, both of which when knocked out in mice cause cardiac defects, specifically VSD and dilated cardiomyopathy.(Ikeda and others 2007) miR-208a and miR-208b are also cardiac-enriched, and they are encoded within the introns of *MYH6* and *MYH7*.(Callis and others 2009; van Rooij and others 2007) Current research focuses on sequencing miRNA genes to identify potential mutations that can cause CHD. Definitive evidence in humans is still lacking, although studies may be underway.

#### **6.3. Somatic mutations**

Another direction of research in CHD is the study of somatic mutations using surgically discarded tissues from CHD patients who undergo surgical repair. Both DNA and RNA can be extracted and sequenced. Previous studies have focused on sequencing *GATA4* and *NKX2-5* in somatic DNA of patients with septal defects, and yielded controversial findings as to whether somatic mutations in these genes contribute significantly to CHD.(Draus and others 2009; Esposito and others 2011; Reamon-Buettner and Borlak 2004) In the current era of high throughput DNA sequencing and new analytical frameworks for RNA sequencing, the contribution of somatic mutations to CHD should become clearer soon; however, no significant data in this field have been published yet.

#### **6.4. Epigenetics**

The multifactorial causality of CHD has long been hypothesized to explain the complexity of the genetics of cardiac malformations. Epigenetics is one model where gene-environment interaction can affect gene expression and disturb developmental processes in the embryonic heart. Histone modifications and chromatin remodeling both play important roles in cardiac development and physiology (Han and others 2011; Lange and others 2008; Ohtani and Dimmeler 2011), and recent studies showed that they can directly interact with some classes of transcription factors like the T-box family.(Miller and Weinmann 2009) It is possible that epigenetic mechanisms contribute to the etiology of CHD; however, more evidence is needed.

## **7. Current tools for the genetic evaluation of CHD**

Different techniques are currently available to interrogate the genetic causes of CHD. Karyotyping and Fluorescence In Situ Hybridization (FISH) analysis remain the best tools to assess chromosomal deletions or rearrangements. They are often the starting point for the genetic assessment of a CHD patient. Whenever candidate genes are suspected, for instance in the setting of a clinically diagnosed syndrome, Sanger sequencing is performed on the candidate gene to look for disease-causing mutations. For many years, together with positional mapping through linkage analysis, these were the only tools that drove genetic discovery in CHD in humans. Current technology makes use of array-comparative genomic hybridization (array-CGH) for linkage analysis, Genome Wide Association Studies (GWAS), CNV analysis, homozygosity mapping, and transcriptome analysis. More important was the introduction of next-generation sequencing in 2005 and the tremendous decrease in the cost of sequencing over the past several years, which is allowing massive sequencing of the exome and even the genome of huge numbers of patients. Next-generation RNA sequencing is also beginning to be used to sequence cardiac transcripts from CHD patients who have undergone surgery. The rapid pooling of high throughput data is expected to massively increase our understanding of CHD within the coming two years. To deal with these large amounts of data, bioinformatics and modeling of genetic variants to determine function are becoming the standard, and many molecular biology labs are forced to become genetics and bioinformatics labs to make use of current technology. A systems biology approach is now needed to integrate high throughput data from the many possible sources.

## **8. From the bench to the bedside**

With the advances in sequencing and bioinformatics, gene discovery in CHD is accelerating. This advance in research is directly translated to clinical testing to provide genetic counseling for adult patients with CHD who plan to have children. From a technical aspect, our capability to identify genetic variants in CHD genes has grown considerably. Nonetheless, establishing the functional significance, and even making clinical sense, of the large number of gene mutations remains a big challenge. Given the complexity of CHD, definitive causal gene mutations remain uncommon. At a time when the inflow of genetic information is very fast, physician-scientists must be very careful in communicating unvalidated genetic information to patients, in order to avoid psychological and emotional harm. From a different angle, with sequencing of the exome or genome, the chances of detecting incidental findings that indicate disease risk or prognosis become very high. All such unintentionally detected serious genetic findings are termed the incidentalome.(Kohane and others 2006) Since CHD is mostly surgically treated and people who undergo genetic testing are often already cured, caregivers need to be cautious before rushing next-generation sequencing into the CHD clinic.

## **9. Future prospects**

Current trends in CHD genetics research make use of rapidly developing technology, particularly high throughput sequencing. This trend will continue over the coming few years. The challenge lies in integrating the increasing amounts of data to answer the outstanding questions. Systems biology and innovative bioinformatics tools are crucial to integrate data from different sources and build a pipeline that can unravel the mysteries that molecular biologists have been trying to solve for many years.

Eventually, more validated genetic information will be available in the clinic to allow accurate genetic counseling and prenatal screening. Understanding heart development will also allow for possible therapeutic applications given the many shared molecular pathways between embryologic heart development and adult heart disease, particularly tissue death and regeneration in the setting of ischemic heart disease.

## **Author details**

Akl C. Fahed
*Department of Genetics, Harvard Medical School, Boston, Massachusetts, USA*

Georges M. Nemer
*Department of Biochemistry and Molecular Genetics, American University of Beirut, Beirut, Lebanon*

## **10. References**

Arrington CB, Nightengale D, Lowichik A, Rosenthal ET, Christian-Ritter K, Viskochil DH. (2006). Pathologic and molecular analysis in a family with rare mixed supravalvar aortic and pulmonic stenosis. Pediatr Dev Pathol 9,297-306.

Bamford RN, Roessler E, Burdine RD, Saplakoglu U, dela Cruz J, Splitt M, Goodship JA, Towbin J, Bowers P, Ferrero GB and others. (2000). Loss-of-function mutations in the EGF-CFC gene CFC1 are associated with human left-right laterality defects. Nat Genet 26,365-9.

Bamforth SD, Braganca J, Farthing CR, Schneider JE, Broadbent C, Michell AC, Clarke K, Neubauer S, Norris D, Brown NA and others. (2004). Cited2 controls left-right patterning and heart development through a Nodal-Pitx2c pathway. Nat Genet 36,1189-96.

Basson CT, Bachinsky DR, Lin RC, Levi T, Elkins JA, Soults J, Grayzel D, Kroumpouzou E, Traill TA, Leblanc-Straceski J and others. (1997). Mutations in human TBX5 [corrected] cause limb and cardiac malformation in Holt-Oram syndrome. Nat Genet 15,30-5.

Basson CT, Cowley GS, Solomon SD, Weissman B, Poznanski AK, Traill TA, Seidman JG, Seidman CE. (1994). The clinical and genetic spectrum of the Holt-Oram syndrome (heart-hand syndrome). N Engl J Med 330,885-91.

Basson CT, Huang T, Lin RC, Bachinsky DR, Weremowicz S, Vaglio A, Bruzzone R, Quadrelli R, Lerone M, Romeo G and others. (1999). Different TBX5 interactions in heart and limb defined by Holt-Oram syndrome mutations. Proc Natl Acad Sci U S A 96,2919-24.

Bauer RC, Laney AO, Smith R, Gerfen J, Morrissette JJ, Woyciechowski S, Garbarini J, Loomes KM, Krantz ID, Urban Z and others. (2010). Jagged1 (JAG1) mutations in patients with tetralogy of Fallot or pulmonic stenosis. Hum Mutat 31,594-601.

Beuren AJ, Apitz J, Harmjanz D. (1962). Supravalvular aortic stenosis in association with mental retardation and a certain facial appearance. Circulation 26,1235-40.

Boehme DH, Shotar AO. (1989). A complex deformity of appendicular skeleton and shoulder with congenital heart disease in three generations of a Jordanian family. Clin Genet 36,442-50.

Bondy CA. (2009). Turner syndrome 2008. Horm Res 71 Suppl 1,52-6.

Breckpot J, Thienpont B, Arens Y, Tranchevent LC, Vermeesch JR, Moreau Y, Gewillig M, Devriendt K. (2011). Challenges of interpreting copy number variation in syndromic and non-syndromic congenital heart defects. Cytogenet Genome Res 135,251-9.

Britz-Cunningham SH, Shah MM, Zuppan CW, Fletcher WH. (1995). Mutations of the Connexin43 gap-junction gene in patients with heart malformations and defects of laterality. N Engl J Med 332,1323-9.

Bruneau BG. (2008). The developmental genetics of congenital heart disease. Nature 451,943-8.

Budde BS, Binner P, Waldmuller S, Hohne W, Blankenfeldt W, Hassfeld S, Bromsen J, Dermintzoglou A, Wieczorek M, May E and others. (2007). Noncompaction of the ventricular myocardium is associated with a de novo mutation in the beta-myosin heavy chain gene. PLoS One 2,e1362.

Callis TE, Pandya K, Seok HY, Tang RH, Tatsuguchi M, Huang ZP, Chen JF, Deng Z, Gunn B, Shumate J and others. (2009). MicroRNA-208a is a regulator of cardiac hypertrophy and conduction in mice. J Clin Invest 119,2772-86.

Chen Y, Han ZQ, Yan WD, Tang CZ, Xie JY, Chen H, Hu DY. (2010). A novel mutation in GATA4 gene associated with dominant inherited familial atrial septal defect. J Thorac Cardiovasc Surg 140,684-7.

Ching YH, Ghosh TK, Cross SJ, Packham EA, Honeyman L, Loughna S, Robinson TE, Dearlove AM, Ribas G, Bonser AJ and others. (2005). Mutation in myosin heavy chain 6 causes atrial septal defect. Nat Genet 37,423-8.

Cinquetti R, Badi I, Campione M, Bortoletto E, Chiesa G, Parolini C, Camesasca C, Russo A, Taramelli R, Acquati F. (2008). Transcriptional deregulation and a missense mutation define ANKRD1 as a candidate gene for total anomalous pulmonary venous return. Hum Mutat 29,468-74.

Curran ME, Atkinson DL, Ewart AK, Morris CA, Leppert MF, Keating MT. (1993). The elastin gene is disrupted by a translocation associated with supravalvular aortic stenosis. Cell 73,159-68.

Danks DM, Campbell PE, Jack I, Rogers J, Smith AL. (1977). Studies of the aetiology of neonatal hepatitis and biliary atresia. Arch Dis Child 52,360-7.

Dasgupta C, Martinez AM, Zuppan CW, Shah MM, Bailey LL, Fletcher WH. (2001). Identification of connexin43 (alpha1) gap junction gene mutations in patients with hypoplastic left heart syndrome by denaturing gradient gel electrophoresis (DGGE). Mutat Res 479,173-86.

Delabar JM, Theophile D, Rahmani Z, Chettouh Z, Blouin JL, Prieur M, Noel B, Sinet PM. (1993). Molecular mapping of twenty-four features of Down syndrome on chromosome 21. Eur J Hum Genet 1,114-24.

Doswell BH, Visootsak J, Brady AN, Graham JM, Jr. (2006). Turner syndrome: an update and review for the primary pediatrician. Clin Pediatr (Phila) 45,301-13.

Draus JM, Jr., Hauck MA, Goetsch M, Austin EH, 3rd, Tomita-Mitchell A, Mitchell ME. (2009). Investigation of somatic NKX2-5 mutations in congenital heart disease. J Med Genet 46,115-22.


El-Rassy I, Bou-Abdallah J, Al-Ghadban S, Bitar F, Nemer G. (2008). Absence of NOTCH2 and Hey2 mutations in a familial Alagille syndrome case with a novel frameshift mutation in JAG1. Am J Med Genet A 146,937-9.

Emanuel BS. (2008). Molecular mechanisms and diagnosis of chromosome 22q11.2 rearrangements. Dev Disabil Res Rev 14,11-8.

Esposito G, Butler TL, Blue GM, Cole AD, Sholler GF, Kirk EP, Grossfeld P, Perryman BM, Harvey RP, Winlaw DS. (2011). Somatic mutations in NKX2-5, GATA4, and HAND1 are not a common cause of tetralogy of Fallot or hypoplastic left heart. Am J Med Genet A 155A,2416-21.

Fakhro KA, Choi M, Ware SM, Belmont JW, Towbin JA, Lifton RP, Khokha MK, Brueckner M. (2011). Rare copy number variations in congenital heart disease patients identify unique genes in left-right patterning. Proc Natl Acad Sci U S A 108,2915-20.

Garg V, Kathiriya IS, Barnes R, Schluterman MK, King IN, Butler CA, Rothrock CR, Eapen RS, Hirayama-Yamada K, Joo K and others. (2003). GATA4 mutations cause human congenital heart defects and reveal an interaction with TBX5. Nature 424,443-7.

Garg V, Muth AN, Ransom JF, Schluterman MK, Barnes R, King IN, Grossfeld PD, Srivastava D. (2005). Mutations in NOTCH1 cause aortic valve disease. Nature 437,270-4.

Gebbia M, Ferrero GB, Pilia G, Bassi MT, Aylsworth A, Penman-Splitt M, Bird LM, Bamforth JS, Burn J, Schlessinger D and others. (1997). X-linked situs abnormalities result from mutations in ZIC3. Nat Genet 17,305-8.

Gill HK, Splitt M, Sharland GK, Simpson JM. (2003). Patterns of recurrence of congenital heart disease: an analysis of 6,640 consecutive pregnancies evaluated by detailed fetal echocardiography. J Am Coll Cardiol 42,923-9.

Goldmuntz E. (2001). The epidemiology and genetics of congenital heart disease. Clin Perinatol 28,1-10.

Goldmuntz E, Bamford R, Karkera JD, dela Cruz J, Roessler E, Muenke M. (2002). CFC1 mutations in patients with transposition of the great arteries and double-outlet right ventricle. Am J Hum Genet 70,776-80.

Greenway SC, Pereira AC, Lin JC, DePalma SR, Israel SJ, Mesquita SM, Ergul E, Conta JH, Korn JM, McCarroll SA and others. (2009). De novo copy number variants identify new genes and loci in isolated sporadic tetralogy of Fallot. Nat Genet 41,931-5.

Griffin HR, Topf A, Glen E, Zweier C, Stuart AG, Parsons J, Peart I, Deanfield J, O'Sullivan J, Rauch A and others. (2010). Systematic survey of variants in TBX1 in non-syndromic tetralogy of Fallot identifies a novel 57 base pair deletion that reduces transcriptional activity but finds no evidence for association with common variants. Heart 96,1651-5.

Grimm T, Wesselhoeft H. (1980). [The genetic aspects of Williams-Beuren syndrome and the isolated form of the supravalvular aortic stenosis. Investigation of 128 families (author's transl)]. Z Kardiol 69,168-72.

Habets PE, Moorman AF, Clout DE, van Roon MA, Lingbeek M, van Lohuizen M, Campione M, Christoffels VM. (2002). Cooperative action of Tbx2 and Nkx2.5 inhibits ANF expression in the atrioventricular canal: implications for cardiac chamber formation. Genes Dev 16,1234-46.

Hamada T, Gejyo F, Koshino Y, Murata T, Omori M, Nishio M, Misawa T, Isaki K. (1998). Echocardiographic evaluation of cardiac valvular abnormalities in adults with Down's syndrome. Tohoku J Exp Med 185,31-5.

Han P, Hang CT, Yang J, Chang CP. (2011). Chromatin remodeling in cardiovascular development and physiology. Circ Res 108,378-96.

Hariri F, Nemer M, Nemer G. (2011). T-box factors: Insights into the evolutionary emergence of the complex heart. Ann Med.

Heathcote K, Braybrook C, Abushaban L, Guy M, Khetyar ME, Patton MA, Carter ND, Scambler PJ, Syrris P. (2005). Common arterial trunk associated with a homeodomain mutation of NKX2.6. Hum Mol Genet 14,585-93.

Herrera P, Caldarone CA, Forte V, Holtby H, Cox P, Chiu P, Kim PC. (2008). Topsy-turvy heart with associated congenital tracheobronchial stenosis and airway compression requiring surgical reconstruction. Ann Thorac Surg 86,282-3.

Ikeda S, Kong SW, Lu J, Bisping E, Zhang H, Allen PD, Golub TR, Pieske B, Pu WT. (2007). Altered microRNA expression in human heart disease. Physiol Genomics 31,367-73.

Jaeggi E, Chitayat D, Golding F, Kim P, Yoo SJ. (2008). Prenatal diagnosis of topsy-turvy heart. Cardiol Young 18,337-42.

Kamath BM, Podkameni G, Hutchinson AL, Leonard LD, Gerfen J, Krantz ID, Piccoli DA, Spinner NB, Loomes KM, Meyers K. (2011). Renal anomalies in Alagille syndrome: A disease-defining feature. Am J Med Genet A.

Karkera JD, Lee JS, Roessler E, Banerjee-Basu S, Ouspenskaia MV, Mez J, Goldmuntz E, Bowers P, Towbin J, Belmont JW and others. (2007). Loss-of-function mutations in growth differentiation factor-1 (GDF1) are associated with congenital heart defects in humans. Am J Hum Genet 81,987-94.

Kasahara H, Lee B, Schott JJ, Benson DW, Seidman JG, Seidman CE, Izumo S. (2000). Loss of function and inhibitory effects of human CSX/NKX2.5 homeoprotein mutations associated with congenital heart disease. J Clin Invest 106,299-308.

Kirk EP, Sunde M, Costa MW, Rankin SA, Wolstein O, Castro ML, Butler TL, Hyun C, Guo G, Otway R and others. (2007). Mutations in cardiac T-box factor gene TBX20 are associated with diverse cardiac pathologies, including defects of septation and valvulogenesis and cardiomyopathy. Am J Hum Genet 81,280-91.

Kodo K, Nishizawa T, Furutani M, Arai S, Yamamura E, Joo K, Takahashi T, Matsuoka R, Yamagishi H. (2009). GATA6 mutations cause human cardiac outflow tract defects by disrupting semaphorin-plexin signaling. Proc Natl Acad Sci U S A 106,13933-8.

Kohane IS, Masys DR, Altman RB. (2006). The incidentalome: a threat to genomic medicine. JAMA 296,212-5.

Korenberg JR, Chen XN, Schipper R, Sun Z, Gonsky R, Gerwehr S, Carpenter N, Daumer C, Dignan P, Disteche C and others. (1994). Down syndrome phenotypes: the consequences of chromosomal imbalance. Proc Natl Acad Sci U S A 91,4997-5001.

individuals with a JAG1 mutation and/or Alagille syndrome. Circulation 106,2567-

Megarbane A, Salem N, Stephan E, Ashoush R, Lenoir D, Delague V, Kassab R, Loiselet J, Bouvagnet P. (2000). X-linked transposition of the great arteries and incomplete penetrance among males with a nonsense mutation in ZIC3. Eur J Hum Genet 8,

Metcalfe K, Rucka AK, Smoot L, Hofstadler G, Tuzler G, McKeown P, Siu V, Rauch A, Dean J, Dennis N and others. (2000). Elastin: mutational spectrum in supravalvular aortic

Miller SA, Weinmann AS. (2009). An essential interaction between T-box proteins and

Mohamed SA, Aherrahrou Z, Liptau H, Erasmi AW, Hagemann C, Wrobel S, Borzym K, Schunkert H, Sievers HH, Erdmann J. (2006). Novel missense mutations (p.T596M and p.P1797H) in NOTCH1 in patients with bicuspid aortic valve. Biochem Biophys Res

Mohapatra B, Casey B, Li H, Ho-Dawson T, Smith L, Fernbach SD, Molinari L, Niesh SR, Jefferies JL, Craigen WJ and others. (2009). Identification and functional characterization of NODAL rare variants in heterotaxy and isolated cardiovascular malformations. Hum

Momma K. (2010). Cardiovascular anomalies associated with chromosome 22q11.2 deletion

Monserrat L, Hermida-Prieto M, Fernandez X, Rodriguez I, Dumont C, Cazon L, Cuesta MG, Gonzalez-Juanatey C, Peteiro J, Alvarez N and others. (2007). Mutation in the alpha-cardiac actin gene associated with apical hypertrophic cardiomyopathy, left

Moskowitz IP, Kim JB, Moore ML, Wolf CM, Peterson MA, Shendure J, Nobrega MA, Yokota Y, Berul C, Izumo S and others. (2007). A molecular pathway including Id2, Tbx5, and Nkx2-5 required for cardiac conduction system development. Cell 129,1365-

Muncke N, Jung C, Rudiger H, Ulmer H, Roeth R, Hubert A, Goldmuntz E, Driscoll D, Goodship J, Schon K and others. (2003). Missense mutations and gene interruption in PROSIT240, a novel TRAP240-like gene, in patients with congenital heart defect

Nemer G, Fadlalah F, Usta J, Nemer M, Dbaibo G, Obeid M, Bitar F. (2006). A novel mutation in the GATA4 gene in patients with Tetralogy of Fallot. Hum Mutat 27,293-4. Noonan JA. (1994). Noonan syndrome. An update and review for the primary pediatrician.

Oda T, Elkahloun AG, Pike BL, Okajima K, Krantz ID, Genin A, Piccoli DA, Meltzer PS, Spinner NB, Collins FS and others. (1997). Mutations in the human Jagged1 gene are

Ohtani K, Dimmeler S. (2011). Epigenetic regulation of cardiovascular differentiation.

ventricular non-compaction, and septal defects. Eur Heart J 28,1953-61. Moodie DS. (1994). Adult congenital heart disease. Curr Opin Cardiol 9,137-42.

(transposition of the great arteries). Circulation 108,2843-50.

responsible for Alagille syndrome. Nat Genet 16,235-42.

74.

704-8.

stenosis. Eur J Hum Genet 8,955-63.

syndrome. Am J Cardiol 105,1617-24.

Clin Pediatr (Phila) 33,548-55.

Cardiovasc Res 90,404-12.

Commun 345,1460-5.

Mol Genet 18,861-71.

76.

histone-modifying enzymes. Epigenetics 4,85-8.


individuals with a JAG1 mutation and/or Alagille syndrome. Circulation 106,2567- 74.

Megarbane A, Salem N, Stephan E, Ashoush R, Lenoir D, Delague V, Kassab R, Loiselet J, Bouvagnet P. (2000). X-linked transposition of the great arteries and incomplete penetrance among males with a nonsense mutation in ZIC3. Eur J Hum Genet 8, 704-8.

144 Mutations in Human Genetic Disease

Genes Dev 22,2370-84.

morphogenesis. J Clin Invest 116,929-39.

factors. Am J Med Genet 97,319-25.

in the young. Pediatrics 121,e1622-7.

genetic aspects. Biomed Pharmacother 47,197-200.

prevalence of atrioventricular canal. J Pediatr 135,703-6.

defects in Down syndrome. Am J Med Genet A 140,2501-5.

notch signaling pathway. Am J Hum Genet 79,169-73.

which encodes a ligand for Notch1. Nat Genet 16,243-51.

5001.

21-9.

55,662-7.

the consequences of chromosomal imbalance. Proc Natl Acad Sci U S A 91,4997-

Kyndt F, Gueffet JP, Probst V, Jaafar P, Legendre A, Le Bouffant F, Toquet C, Roy E, McGregor L, Lynch SA and others. (2007). Mutations in the gene encoding filamin A as

Lange M, Kaynak B, Forster UB, Tonjes M, Fischer JJ, Grimm C, Schlesinger J, Just S, Dunkel I, Krueger T and others. (2008). Regulation of muscle development by DPF3, a novel histone acetylation and methylation reader of the BAF chromatin remodeling complex.

Lepore JJ, Mericko PA, Cheng L, Lu MM, Morrisey EE, Parmacek MS. (2006). GATA-6 regulates semaphorin 3C and is required in cardiac neural crest for cardiovascular

Li L, Krantz ID, Deng Y, Genin A, Banta AB, Collins CC, Qi M, Trask BJ, Kuo WL, Cochran J and others. (1997a). Alagille syndrome is caused by mutations in human Jagged1,

Li QY, Newbury-Ecob RA, Terrett JA, Wilson DI, Curtis AR, Yi CH, Gebuhr T, Bullen PJ, Robson SC, Strachan T and others. (1997b). Holt-Oram syndrome is caused by mutations in TBX5, a member of the Brachyury (T) gene family. Nat Genet 15,

Lin X, Huo Z, Liu X, Zhang Y, Li L, Zhao H, Yan B, Liu Y, Yang Y, Chen YH. (2010). A novel GATA6 mutation in patients with tetralogy of Fallot or atrial septal defect. J Hum Genet

Loffredo CA. (2000). Epidemiology of cardiovascular malformations: prevalence and risk

Lopez L, Arheart KL, Colan SD, Stein NS, Lopez-Mitnik G, Lin AE, Reller MD, Ventura R, Silberbach M. (2008). Turner syndrome is an independent risk factor for aortic dilation

Marino B. (1993). Congenital heart disease in patients with Down's syndrome: anatomic and

Marino B, Digilio MC, Toscano A, Giannotti A, Dallapiccola B. (1999). Congenital heart diseases in children with Noonan syndrome: An expanded cardiac spectrum with high

Maslen CL, Babcock D, Robinson SW, Bean LJ, Dooley KJ, Willour VL, Sherman SL. (2006). CRELD1 mutations contribute to the occurrence of cardiac atrioventricular septal

McBride KL, Riley MF, Zender GA, Fitzgerald-Butt SM, Towbin JA, Belmont JW, Cole SE. (2008). NOTCH1 mutations in individuals with left ventricular outflow tract

McElhinney DB, Krantz ID, Bason L, Piccoli DA, Emerick KM, Spinner NB, Goldmuntz E. (2002). Analysis of cardiovascular phenotype and genotype-phenotype correlation in

malformations reduce ligand-induced signaling. Hum Mol Genet 17,2886-93. McDaniell R, Warthen DM, Sanchez-Lara PA, Pai A, Krantz ID, Piccoli DA, Spinner NB. (2006). NOTCH2 mutations cause Alagille syndrome, a heterogeneous disorder of the

a cause for familial cardiac valvular dystrophy. Circulation 115,40-9.


Payne AR, Chang SW, Koenig SN, Zinn AR, Garg V. (2012). Submicroscopic Chromosomal Copy Number Variations Identified in Children With Hypoplastic Left Heart Syndrome. Pediatr Cardiol.

Genetic Causes of Syndromic and Non-Syndromic Congenital Heart Disease 147

Tartaglia M, Zampino G, Gelb BD. (2010). Noonan syndrome: clinical aspects and molecular

Tevosian SG, Deconinck AE, Tanaka M, Schinke M, Litovsky SH, Izumo S, Fujiwara Y, Orkin SH. (2000). FOG-2, a cofactor for GATA transcription factors, is essential for heart morphogenesis and development of coronary vessels from epicardium. Cell 101,

Tidyman WE, Rauen KA. (2009). The RASopathies: developmental syndromes of Ras/MAPK

Tomita-Mitchell A, Maslen CL, Morris CD, Garg V, Goldmuntz E. (2007). GATA4 sequence

Turbay D, Wechsler SB, Blanchard KM, Izumo S. (1996). Molecular cloning, chromosomal mapping, and characterization of the human cardiac-specific homeobox gene hCsx. Mol

van der Bom T, Zomer AC, Zwinderman AH, Meijboom FJ, Bouma BJ, Mulder BJ. (2011). The changing epidemiology of congenital heart disease. Nat Rev Cardiol 8,

van Rooij E, Sutherland LB, Qi X, Richardson JA, Hill J, Olson EN. (2007). Control of stressdependent cardiac growth and gene expression by a microRNA. Science 316,575-9. Williams JC, Barratt-Boyes BG, Lowe JB. (1961). Supravalvular aortic stenosis. Circulation

Xin B, Puffenberger E, Tumbush J, Bockoven JR, Wang H. (2007). Homozygosity for a novel splice site mutation in the cardiac myosin-binding protein C gene causes severe

Xin M, Davis CA, Molkentin JD, Lien CL, Duncan SA, Richardson JA, Olson EN. (2006). A threshold of GATA4 and GATA6 expression is required for cardiovascular

Yagi H, Furutani Y, Hamada H, Sasaki T, Asakawa S, Minoshima S, Ichida F, Joo K, Kimura M, Imamura S and others. (2003). Role of TBX1 in human del22q11.2 syndrome. Lancet

Yuan ZR, Kohsaka T, Ikegaya T, Suzuki T, Okano S, Abe J, Kobayashi N, Yamada M. (1998). Mutational analysis of the Jagged 1 gene in Alagille syndrome families. Hum Mol Genet

Zahka K, Kalidas K, Simpson MA, Cross H, Keller BB, Galambos C, Gurtz K, Patton MA, Crosby AH. (2008). Homozygous mutation of MYBPC3 associated with severe infantile hypertrophic cardiomyopathy at high frequency among the Amish. Heart 94,1326-

Zhang L, Tumer Z, Jacobsen JR, Andersen PS, Tommerup N, Larsen LA. (2006). Screening of 99 Danish patients with congenital heart disease for GATA4 mutations. Genet Test

Zhao R, Watt AJ, Battle MA, Li J, Bondow BJ, Duncan SA. (2008). Loss of both GATA4 and GATA6 blocks cardiac myocyte differentiation and results in acardia in mice. Dev Biol

neonatal hypertrophic cardiomyopathy. Am J Med Genet A 143A,2662-7.

development. Proc Natl Acad Sci U S A 103,11189-94.

variants in patients with congenital heart disease. J Med Genet 44,779-83.

pathogenesis. Mol Syndromol 1,2-26.

pathway dysregulation. Curr Opin Genet Dev 19,230-6.

729-39.

Med 2,86-96.

50-60.

24,1311-8.

362,1366-73.

7,1363-9.

10,277-80.

317,614-9.

30.


Tartaglia M, Zampino G, Gelb BD. (2010). Noonan syndrome: clinical aspects and molecular pathogenesis. Mol Syndromol 1,2-26.

146 Mutations in Human Genetic Disease

138,1231-40.

Genet 47,230-5.

Mutat 31,1185-94.

Genet Metab 98,225-34.

Child Neurol 17,269-71.

Med Genet Suppl 7,52-6.

Syndrome. Pediatr Cardiol.

tetralogy of Fallot. Hum Mutat 22,372-7.

Pober BR. (2010). Williams-Beuren syndrome. N Engl J Med 362,239-52.

in Williams-Beuren syndrome. J Clin Invest 118,1606-15.

Payne AR, Chang SW, Koenig SN, Zinn AR, Garg V. (2012). Submicroscopic Chromosomal Copy Number Variations Identified in Children With Hypoplastic Left Heart

Peng T, Wang L, Zhou SF, Li X. (2010). Mutations of the GATA4 and NKX2.5 genes in Chinese pediatric patients with non-familial congenital heart disease. Genetica

Pizzuti A, Sarkozy A, Newton AL, Conti E, Flex E, Digilio MC, Amati F, Gianni D, Tandoi C, Marino B and others. (2003). Mutations of ZFPM2/FOG2 gene in sporadic cases of

Pober BR, Johnson M, Urban Z. (2008). Mechanisms and treatment of cardiovascular disease

Posch MG, Gramlich M, Sunde M, Schmitt KR, Lee SH, Richter S, Kersten A, Perrot A, Panek AN, Al Khatib IH and others. (2010). A gain-of-function TBX20 mutation causes congenital atrial septal defects, patent foramen ovale and cardiac valve defects. J Med

Pueschel SM. (1990). Clinical aspects of Down syndrome from infancy to adulthood. Am J

Rajagopal SK, Ma Q, Obler D, Shen J, Manichaikul A, Tomita-Mitchell A, Boardman K, Briggs C, Garg V, Srivastava D and others. (2007). Spectrum of heart disease associated

Reamon-Buettner SM, Borlak J. (2004). Somatic NKX2-5 mutations as a novel mechanism of

Reamon-Buettner SM, Borlak J. (2010). NKX2-5: an update on this hypermutable homeodomain protein and its role in human congenital heart disease (CHD). Hum

Roessler E, Ouspenskaia MV, Karkera JD, Velez JI, Kantipong A, Lacbawan F, Bowers P, Belmont JW, Towbin JA, Goldmuntz E and others. (2008). Reduced NODAL signaling strength via mutation of several pathway members including FOXH1 is linked to

Roessler E, Pei W, Ouspenskaia MV, Karkera JD, Velez JI, Banerjee-Basu S, Gibney G, Lupo PJ, Mitchell LE, Towbin JA and others. (2009). Cumulative ligand activity of NODAL mutations and modifiers are linked to human heart defects and holoprosencephaly. Mol

Sperling S, Grimm CH, Dunkel I, Mebus S, Sperling HP, Ebner A, Galli R, Lehrach H, Fusch C, Berger F and others. (2005). Identification and functional analysis of CITED2

Stennard FA, Costa MW, Lai D, Biben C, Furtado MB, Solloway MJ, McCulley DJ, Leimena C, Preis JI, Dunwoodie SL and others. (2005). Murine T-box transcription factor Tbx20 acts as a repressor during heart development, and is essential for adult heart integrity,

Stromme P, Bjornstad PG, Ramstad K. (2002). Prevalence estimation of Williams syndrome. J

with murine and human GATA4 mutation. J Mol Cell Cardiol 43,677-85.

human heart defects and holoprosencephaly. Am J Hum Genet 83,18-29.

mutations in patients with congenital heart defects. Hum Mutat 26,575-82.

function and adaptation. Development 132,2451-62.

disease in complex congenital heart disease. J Med Genet 41,684-90.


Zhu L, Vranckx R, Khau Van Kien P, Lalande A, Boisset N, Mathieu F, Wegman M, Glancy L, Gasc JM, Brunotte F and others. (2006). Mutations in myosin heavy chain 11 cause a syndrome associating thoracic aortic aneurysm/aortic dissection and patent ductus arteriosus. Nat Genet 38,343-9.

**Chapter 7** 

© 2012 Berdeli and Nalbantoglu, licensee InTech. This is an open access chapter distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


## **The Prototype of Hereditary Periodic Fevers: Familial Mediterranean Fever**

Afig Berdeli and Sinem Nalbantoglu

Additional information is available at the end of the chapter

http://dx.doi.org/10.5772/51378

## **1. Introduction**


Autoinflammatory disorders are multisystem periodic fever syndromes characterized by recurrent, unprovoked inflammation of the serosal membranes. Unlike autoimmune disorders, autoinflammatory disorders lack high-titer autoantibodies and antigen-specific T cells. These diseases primarily include the hereditary syndromes (**Table 1**): familial Mediterranean fever (FMF), TNF receptor-associated periodic fever syndrome (TRAPS), hyperimmunoglobulinaemia D and periodic fever syndrome (HIDS), and the cryopyrin-associated periodic syndrome (CAPS), which comprises familial cold autoinflammatory syndrome (FCAS), Muckle–Wells syndrome (MWS) and neonatal onset multi-system inflammatory disease (NOMID)/chronic infantile neurological cutaneous and articular syndrome (CINCA). FMF is considered the most prevalent of the innate immune system disorders, involving a systemic autoinflammatory reaction affecting the joints, skin, bones and kidneys. Systemic amyloidosis is the most severe manifestation of the disease, commonly affecting the kidneys (11% of cases), and sometimes the adrenals, intestine, spleen, lung, and testis (1). As an innate immune system disorder, FMF is characterized by recurrent episodes of seemingly unprovoked inflammation and fever, with attacks lasting 1 to 3 days and accompanied by sterile peritonitis, pleurisy, rash, arthritis, and in some cases amyloidosis leading to renal failure (Sohar et al., 1967). Apart from the typical manifestations of the disease, there is increasing evidence of an expanding clinical spectrum of FMF that includes unusual clinical features (2-4). These are rare presentations of the disease and therefore underscore the role of molecular analysis, in particular for suspected and probable cases.

FMF is classically transmitted by autosomal recessive inheritance and is common among Mediterranean populations; however, previous reports have confirmed its presence worldwide. It has been described in Mediterranean populations, including Italian, Spanish (5), Portuguese, French, and Greek, as well as in patients from Northern Europe and Japan. Nevertheless, only rare occurrences have been reported in the general population because of the low frequency of the causative alleles (6). Among susceptible ethnic groups, FMF prevalence is between 1/500 and 1/2000, and the carrier rate is between 16% and 22%. Contrary to the traditionally accepted monogenic recessive inheritance of the disease, a number of patients have been reported with the typical FMF phenotype or FMF-related symptoms who carry only one heterozygous MEFV mutation, or even no MEFV mutation at all (5-7), indicating that the clinical phenotype is present not only in homozygous patients but also in heterozygous patients with mild disease.


The MEFV (Mediterranean Fever) gene, located on chromosome 16p13.3, has been found to be responsible for FMF, and its protein product, pyrin, is a 781-amino-acid protein (8-11). The evolutionarily conserved domains of pyrin comprise an N-terminal pyrin domain (PYD), a B-box zinc finger, a coiled coil and a C-terminal B30.2 (PRY/SPRY) domain. Pyrin has been reported to be a component of the inflammasome complex, with both pro-inflammatory and anti-inflammatory roles in cytokine regulation (10-13). Whether pyrin plays a proapoptotic or an antiapoptotic role in NF-κB activation and apoptosis has not yet been precisely established (11-16). Through its PYD and B30.2 interaction domains, pyrin has been shown to bind different proteins encoded by autoinflammatory disease genes. The proteins that bind through the pyrin domains include PSTPIP1 (17), 14-3-3 (18), caspase-1 (19), ASC (20), and Siva (21).

In 1997, the International FMF Consortium and the French FMF Consortium reported four disease-associated missense mutations in the MEFV gene: M694V, M680I, V726A, and M694I. Major and minor mutations of the MEFV gene are well documented in INFEVERS, the database of hereditary autoinflammatory disorder mutations, and exons 2 and 10 constitute the hot spots (22). To date, mutations have mostly been identified in exons 2, 3, 5, and 10 of the MEFV gene. According to previous reports by Touitou (2001) and by the Turkish FMF Study Group (2005), the most common MEFV mutation in Turkey is M694V (57.0% and 51.4%, respectively), followed by M680I (16.5% and 14.4%, respectively) and V726A (13.9% and 8.6%, respectively). Moreover, no correlation has been reported between the various MEFV gene mutations and the severity of the phenotype in various populations, supporting the genotypic and phenotypic heterogeneity present in FMF (5, 23-25).


| **Syndrome (MIM)** | **State of inheritance** | **Gene (GenBank no)** | **Protein** | **Age at disease onset** |
|---|---|---|---|---|
| FMF (249100) | Autosomal recessive (dominant forms are rarely presented) | MEFV (NM\_000243), Ch-16 | Pyrin (marenostrin) | Childhood |
| HIDS (260920; 251170) | Autosomal recessive | MVK (M88468), Ch-12 | Mevalonate kinase | Infancy |
| TRAPS (142680; 191190) | Autosomal dominant | TNFRSF1A (NM\_001065), Ch-12 | TNF-receptor type I (p55) | Childhood |
| CAPS (606416) | Autosomal dominant | NLRP3, Ch-1 | Cryopyrin | Childhood |
| PAPA syndrome | Autosomal dominant | PSTPIP1, Ch-15 | PSTPIP1 | Childhood |
| Blau syndrome | Autosomal dominant | NOD2/CARD15, Ch-16 | NOD2/CARD15 | Childhood |

\*CINCA, chronic infantile neurological, cutaneous, and articular syndrome; FCAS, familial cold autoinflammatory syndrome; FMF, familial Mediterranean fever; MWS, Muckle-Wells syndrome; NOMID, neonatal onset multisystem inflammatory disease; PAPA, pyogenic sterile arthritis, pyoderma gangrenosum, and acne; CAPS, cryopyrin-associated periodic syndrome; TRAPS, tumour necrosis factor receptor-associated periodic syndrome.

**Table 1.** Hereditary autoinflammatory syndromes with identified gene loci (adapted from Lachmann and Hawkins, 2009: 36).

According to INFEVERS (22), the database of hereditary autoinflammatory disorder mutations, approximately 222 sequence variants, including missense mutations, a single nonsense mutation (Y688X) and polymorphisms, have been defined in the FMF gene (MEFV) to date. Of these, 100 were clinically associated with the phenotype, 33 were not associated with the disease, and the remainder were of uncertain pathogenicity. The remarkably wide clinical variability of the disease, as indicated by previous reports, has been linked to the MEFV allelic heterogeneity that underlies genotypic and phenotypic heterogeneity (23, 26, 27), and this has made detailed mutation screening critically important. In particular, Turkish FMF patients are characterized by increased genetic heterogeneity, with mutation frequencies that vary between regions, which is explained by intrapopulation differentiation.

With respect to our mutation screening, a previous comprehensive study was performed on 3430 Turkish individuals from all regions of Turkey (age range 2 months to 67 years; 2101 females and 1329 males), including first- and second-degree relatives of individuals with a clinical diagnosis of FMF (suspected, possible, and definitive cases), who were referred to the Molecular Medicine Laboratory for genetic diagnosis between May 2005 and December 2010. The Tel-Hashomer and Livneh criteria were used for the clinical diagnosis of FMF, based on the model of major, minor, and supportive criteria, which stipulates the presence of either 1 major criterion, or 2 minor criteria, or 1 minor and 5 supportive criteria for a diagnosis. A simplified set of criteria for the diagnosis of FMF requires 1 or more major and/or 2 or more minor criteria (28). None of the patients with FMF had an immunological disorder or another rheumatic disease. Active clinical presentations (fever, abdominal pain, arthritis, and myalgia) and laboratory parameters (high levels of serum amyloid A [SAA], C-reactive protein [CRP], fibrinogen, white blood cell [WBC] count and erythrocyte sedimentation rate [ESR]) were determined for each patient. To detect all coding and non-coding sequence variations along the MEFV gene, we performed bidirectional DNA sequencing of all 10 coding exons and the exon-intron boundaries, and reported the frequencies of common and rare nucleotide substitutions and of synonymous and non-synonymous single nucleotide polymorphisms found in the Turkish population (7).
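The counting rule behind the clinical criteria described above can be written down compactly. The following is only a minimal sketch of the rule as worded in this chapter, not a clinical tool: the function names are hypothetical, the thresholds are read as "at least" (an assumption), and deciding which findings count as major, minor, or supportive is left to the published criteria (28).

```python
def meets_full_criteria(major: int, minor: int, supportive: int) -> bool:
    """Rule as worded in the text: 1 major, or 2 minor, or 1 minor plus 5 supportive.

    Assumption: each threshold is interpreted as "at least".
    """
    return major >= 1 or minor >= 2 or (minor >= 1 and supportive >= 5)


def meets_simplified_criteria(major: int, minor: int) -> bool:
    """Simplified rule from the text: 1 or more major and/or 2 or more minor criteria."""
    return major >= 1 or minor >= 2


# Hypothetical patient: no major criteria, 1 minor, 5 supportive.
print(meets_full_criteria(major=0, minor=1, supportive=5))   # True
print(meets_simplified_criteria(major=0, minor=1))           # False
```

The hypothetical patient satisfies the full rule only through the supportive criteria, which the simplified set does not take into account.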


oligonucleotide sequences are available upon request). Prior to sequencing, PCR products were purified using an ExoSAP-IT PCR Product Clean-Up kit. BigDye Terminatorv3.1 Cycle Sequencing Kit (Applied Biosystems, San Diego, CA, USA) was used in cycle sequencing reactions. Cycle sequencing PCR products followed purification with the BigDyeXT kit(Applied Biosystems,) and the data were analyzed using an ABI3130xl Genetic Analyzer (Applied Biosystems). DNA sequencing was performed in both directions, initiated from the forward and reverse primers that were used in the initial PCR reaction. SeqScape 2.0 sequence analysis software (Applied Biosystems, San Diego, CA, USA) was employed for

A RFLP was identified in the mutation site and was utilized for mutation detection. Amplicons encompassing exon 5 were digested with the restriction enzyme Tsp509I, and

We found that M694V accounted for the majority of FMF chromosomes (44%), followed by E148Q (19%), V726A (10%), M680I (10%), P369S (4%), R408Q (3%), K695R (2%), M694I and R761H (1.6%), A744S (1.4%), and F479L (0.09%) (**Tables 2, 3**). Missense disease-causing mutations and synonymous polymorphisms accounted for 38% and 54% of MEFV chromosomes, respectively. Among the Turkish general population, the most frequent healthy heterozygous carrier mutation was found E148Q (6.9%), and the carrier rate was found 16%, with a mutation frequency of 8% (Berdeli et al., 2011). Except for the known major FMF mutations, by DNA sequencing, we frequently detect additional rare and novel mutations and critical SNPs about which we have only limited information in Turkish FMF patients. Remarkable consequences of sequencing analysis have been found relative to mutation-SNP combination underlying the combined existence of nucleotide variations in

For patients whose MEFV gene does not contain mutations of exons 2, 3, 5, and 10, we performed bidirectional DNA sequencing also in exons 1, 4, 6, 7, 8, and 9. However, we could not find any disease related mutation except for an exon 9 homozygous SNP, P588P, which is thought to be symptomatic with disease relation. This SNP was always in homozygous state and was not seen in combination with any of the major and minor mutations or any of the SNPs in the entire coding and non-coding regions of the gene. Relative to our experiences, this SNP has a disease relation to a minor degree, however possible validation of other autoinflammatory disease gene mutations should need to be considered. Single P588P SNP was associated with continuously high SAA levels and musculoskeletal complications which has a good response to colchicine in a three-member family who did not have any sequence variations along other coding and non-coding

**2.3. Restriction fragment length polymorphism analysis (RFLP)** 

sequence evaluation.

**3. Results** 

the same haplotype.

regions of the MEFV gene.

electrophoresed on a 1% agarose gel.

## **2. Methods**

Peripheral blood (2 ml) was collected into ethylenediaminetetraacetic acid (EDTA) anticoagulated tubes by the standard venipuncture method, and DNA was extracted using the QIAamp DNA Blood Isolation kit (Qiagen GmbH, Hilden, Germany) following the manufacturer's instructions. The concentration of the extracted DNA was determined using a Thermo Scientific NanoDrop spectrophotometer (Wilmington, USA). The quality of the extracted DNA was assessed by 2% agarose gel electrophoresis.

#### **2.1. FMF strip assay - Reverse hybridization multiplex PCR**

A reverse hybridization assay (FMF StripAssay, Viennalab Labordiagnostika GmbH) was used to investigate the mutations. According to the manufacturer's instructions, multiplex PCR was first performed using biotinylated primers to amplify exons 2, 3, 5, and 10. PCR products were then selectively hybridized to a test strip presenting a parallel array of allele-specific oligonucleotide probes covering 12 MEFV mutations [E148Q, P369S, F479L, M680I (G/C), M680I (G/A), I692del, M694V, M694I, K695R, V726A, A744S, R761H]. Hybridization signals were visualized by the reaction of streptavidin-alkaline phosphatase and color substrate.

#### **2.2. DNA sequencing strategy**

The mutational hot-spot exons 10 and 2, together with exons 3 and 5 and, when necessary, exons 1, 4, 6, 7, 8, and 9 of the MEFV gene, were analyzed for MEFV mutations by PCR amplification followed by automated DNA sequence analysis. One microliter (100 ng) of genomic DNA was added to polymerase chain reaction (PCR) amplification buffer containing 20 mM Tris (pH 8.3); 50 mM KCl; 1.5 mM MgCl2; 0.2 mM each of dATP, dCTP, dGTP, and dTTP; 10 pmol each of forward and reverse primers (Invitrogen); and 1.0 U of Platinum Taq DNA Polymerase (Invitrogen, Carlsbad, CA) in a total volume of 25 µl. The cycling conditions comprised a hot-start denaturation step at 95 °C for 10 min, followed by 35 amplification cycles of denaturation at 95 °C for 30 s, annealing for 40 s at 61 °C for exon 10, 58 °C for exons 2 and 3, or 57 °C for exon 5, and elongation at 72 °C for 45 s; a final extension was performed at 72 °C for 7 min (the oligonucleotide sequences are available upon request). Prior to sequencing, PCR products were purified using an ExoSAP-IT PCR Product Clean-Up kit. The BigDye Terminator v3.1 Cycle Sequencing Kit (Applied Biosystems, San Diego, CA, USA) was used for the cycle sequencing reactions. Cycle sequencing products were purified with the BigDyeXT kit (Applied Biosystems), and the data were collected on an ABI 3130xl Genetic Analyzer (Applied Biosystems). DNA sequencing was performed in both directions, initiated from the forward and reverse primers used in the initial PCR. SeqScape 2.0 sequence analysis software (Applied Biosystems, San Diego, CA, USA) was employed for sequence evaluation.
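The cycling conditions above can be written down as a small data structure, which makes the per-exon annealing temperatures explicit and gives a rough lower bound on run time. The following is a minimal sketch only: the `Step` class and the function names are hypothetical, and the total ignores ramping between temperatures, which depends on the thermocycler.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    name: str
    temp_c: int
    seconds: int


# Annealing temperatures by amplicon, as stated in the text.
ANNEALING_C = {"exon 10": 61, "exons 2 and 3": 58, "exon 5": 57}


def build_program(annealing_c: int, cycles: int = 35) -> list[Step]:
    """Expand the cycling conditions into a flat list of hold steps."""
    program = [Step("hot-start denaturation", 95, 10 * 60)]
    for _ in range(cycles):
        program.append(Step("denaturation", 95, 30))
        program.append(Step("annealing", annealing_c, 40))
        program.append(Step("elongation", 72, 45))
    program.append(Step("final extension", 72, 7 * 60))
    return program


def total_hold_seconds(program: list[Step]) -> int:
    """Sum of programmed hold times; ramping between temperatures is ignored."""
    return sum(step.seconds for step in program)


if __name__ == "__main__":
    for amplicon, temp in ANNEALING_C.items():
        seconds = total_hold_seconds(build_program(temp))
        print(f"{amplicon}: anneal at {temp} C, {seconds / 60:.1f} min of hold time")
```

Because only the annealing temperature differs between amplicons, the programmed hold time is the same for all three programs.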

#### **2.3. Restriction fragment length polymorphism analysis (RFLP)**

An RFLP was identified at the mutation site and was used for mutation detection. Amplicons encompassing exon 5 were digested with the restriction enzyme Tsp509I and electrophoresed on a 1% agarose gel.
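The logic of an RFLP assay of this kind can be sketched in a few lines: a substitution that creates a restriction site changes the fragment pattern after digestion. The example assumes the Tsp509I recognition sequence AATT with the cut immediately 5' of the first A, which should be checked against the enzyme supplier's documentation. The two 12-base sequences are hypothetical toy strings chosen only to show a C>A change creating a site; they are not the MEFV exon 5 amplicon, and the function names are illustrative.

```python
SITE = "AATT"  # assumed Tsp509I recognition sequence, cut before the first A


def find_sites(seq: str, site: str = SITE) -> list[int]:
    """Return 0-based start positions of every occurrence of the site."""
    seq = seq.upper()
    return [i for i in range(len(seq) - len(site) + 1) if seq[i:i + len(site)] == site]


def fragment_lengths(seq: str, site: str = SITE) -> list[int]:
    """Fragment sizes after complete digestion of a linear molecule."""
    cuts = find_sites(seq, site)
    bounds = [0] + cuts + [len(seq)]
    # Skip zero-length pieces that arise when a site starts at position 0.
    return [end - start for start, end in zip(bounds, bounds[1:]) if end > start]


if __name__ == "__main__":
    wild_type = "GGCTACTTGGCC"  # hypothetical: no AATT site
    mutant = "GGCTAATTGGCC"     # hypothetical: C>A creates an AATT site
    print("wild type fragments:", fragment_lengths(wild_type))
    print("mutant fragments:   ", fragment_lengths(mutant))
```

On a gel, a heterozygous sample would be expected to show both the uncut band and the two digestion products, which is why the assay can distinguish carriers from non-carriers.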

## **3. Results**

M694V was the most frequent mutation on FMF chromosomes (44%), followed by E148Q (19%), V726A (10%), M680I (10%), P369S (4%), R408Q (3%), K695R (2%), M694I and R761H (1.6%), A744S (1.4%), and F479L (0.09%) (**Tables 2, 3**). Missense disease-causing mutations and synonymous polymorphisms accounted for 38% and 54% of MEFV chromosomes, respectively. In the Turkish general population, the most frequent mutation among healthy heterozygous carriers was E148Q (6.9%); the carrier rate was 16%, with a mutation frequency of 8% (Berdeli et al., 2011). Beyond the known major FMF mutations, DNA sequencing frequently detects additional rare and novel mutations and critical SNPs in Turkish FMF patients about which only limited information is available. Sequencing also revealed notable mutation-SNP combinations, that is, nucleotide variations occurring together on the same haplotype.

For patients whose MEFV gene carried no mutations in exons 2, 3, 5, and 10, we also performed bidirectional DNA sequencing of exons 1, 4, 6, 7, 8, and 9. We found no disease-related mutation except for a homozygous exon 9 SNP, P588P, which is thought to be associated with symptomatic disease. This SNP was always homozygous and was never seen in combination with any major or minor mutation or any other SNP in the coding and non-coding regions of the gene. In our experience, this SNP has a minor relation to disease; however, mutations in other autoinflammatory disease genes still need to be considered. In a three-member family with no sequence variation in any other coding or non-coding region of the MEFV gene, the P588P SNP alone was associated with persistently high SAA levels and musculoskeletal complications that responded well to colchicine.


| Exon 2 | Exon 3 | Exon 5 | Exon 10 | Number of patients (No) | Genotype frequency (%) |
|---|---|---|---|---|---|
| | | | M694V/Wt | 322 | 24.4 |
| | | | M694V/M694V | 91 | 6.91 |
| | | | M694V/V726A | 42 | 3.19 |
| | | | M694V/R761H | 5 | 0.37 |
| | | | M694V/K695R | 2 | 0.15 |
| | | | M694V/A744S | 1 | 0.12 |
| | | | M694V/V722M\* | 1 | 0.12 |
| | | | M680IG-C/Wt | 69 | 5.24 |
| | | | M680IG-A/Wt | 5 | 0.37 |
| | | | M680IG-C/M680I | 12 | 0.91 |
| | | | M680IG-C/M694V | 42 | 3.19 |
| | | | M680IG-C/V726A | 23 | 1.74 |
| | | | M680IG-C/R761H | 2 | 0.15 |
| | | | M680IG-C/A744S | 1 | 0.12 |
| | | | V726A/Wt | 111 | 8.43 |
| | | | V726A/V726A | 4 | 0.3 |
| | | | V726A/R761H | 3 | 0.22 |
| | | | V726A/R761H/M680IG-C | 1 | 0.12 |
| | | | V726A/K695R | 1 | 0.12 |
| | | | V726A/M694I | 2 | 0.15 |
| | | | M694I/Wt | 10 | 0.75 |
| | | | M694I/A744S | 1 | 0.12 |
| | | | K695R/Wt | 38 | 2.88 |
| | | | A744S/Wt | 19 | 1.44 |
| | | | R761H/Wt | 31 | 2.35 |
| | | | R761H/A744S | 1 | 0.12 |
| | | | R653H/Wt | 1 | 0.12 |
| | | | E685K/E685K | 1 | 0.12 |
| E148Q/Wt | | | | 237 | 18 |
| E148Q/E148Q | | | | 19 | 1.44 |
| E148Q/L110P | | | | 6 | 0.45 |
| E148Q/R151S | | | | 1 | 0.12 |
| E148Q/T267M | | | | 1 | 0.12 |
| E148Q/E230K | | | | 4 | 0.3 |
| E148Q/T267I | | | | 1 | 0.12 |
| E148V/Wt | | | | 5 | 0.37 |
| E148L/Wt | | | | 2 | 0.15 |
| E167D/Wt | | | | 2 | 0.15 |
| L110P/L110P | | | | 1 | 0.12 |
| E230K/E230K | | | | 1 | 0.12 |
| E230K/Wt | | | | 1 | 0.12 |
| T267M/Wt | | | | 3 | 0.22 |
| R241K/R241K | | | | 1 | 0.12 |
| S141I/Wt | | | | 3 | 0.22 |
| S166L/Wt | | | | 2 | 0.15 |
| E148Q | | | M694V | 42 | 3.19 |
| E148Q | | | M680IG-C | 4 | 0.3 |
| E148Q | | | V726A | 7 | 0.53 |
| E148Q | | | M694I | 7 | 0.53 |
| E148Q | | | R761H | 4 | 0.3 |
| E148Q | | | K695R | 2 | 0.15 |
| E148Q | | | I720M | 2 | 0.15 |
| E148Q | | | A744S | 1 | 0.12 |
| E148Q | | | L709R | 1 | 0.12 |
| E148Q | | | K695N | 1 | 0.12 |
| E148Q/S179N\* | | | M694V | 1 | 0.12 |
| E148Q/L110P | | | M694I | 1 | 0.12 |
| E230K | | | M694V/ | 3 | 0.22 |
| E230K | | | M680IG-C/M694V | 2 | 0.15 |
| R241K | | | M694V/ | 1 | 0.12 |
| E167D | | | V726A | 3 | 0.22 |
| | P369S/Wt | | | 7 | 0.53 |
| | P369S/R408Q | | | 40 | 3.03 |
| | P369S/R408Q | | M694V | 4 | 0.3 |
| E148Q | P369S | | | 4 | 0.3 |
| E148Q | P369S | | M694V | 1 | 0.12 |
| E148Q | P369S/R408Q | | | 18 | 1.36 |
| E148Q | P369S/R408Q | | M680I | 1 | 0.12 |
| | P350R/Wt | | | 1 | 0.12 |
| | P350R | | A744S | 2 | 0.15 |
| | G340R/Wt | | | 1 | 0.12 |
| | R354W/Wt | | | 1 | 0.12 |
| | S339F/Wt | | | 4 | 0.3 |
| | R329H/Wt | | | 3 | 0.22 |
| | R329H/ | | M694V | 1 | 0.12 |
| E148Q | R329H/ | | | 1 | 0.12 |
| | | F479L/Wt | | 1 | 0.12 |
| E167D | | F479L | | 2 | 0.15 |
| E167D | | F479L | M680IG-C | 1 | 0.07 |
| | | F479L | V726A | 3 | 0.22 |
| | | G456A/Wt | | 1 | 0.12 |
| | | Y471X/Wt | | 1 | 0.12 |
| | | S503C/Wt | | 2 | 0.15 |
| | | I506V/Wt | | 1 | 0.12 |
| | | A511V/Wt | | 1 | 0.12 |

| Genotype class | Number of patients (No) | Frequency (%) |
|---|---|---|
| Heterozygotes | 885 | 67.2 |
| Compound heterozygotes | 271 | 20.5 |
| Homozygotes | 130 | 9.87 |
| Complex genotypes | 30 | 2.27 |
| Total number of patients with mutations | 1316 | 38.36 |
| Total number of patients with only SNPs (**+R202Q**) | 1883 | 54.8 |
| No mutation or SNPs identified | 231 | 6.7 |
| Total number of patients | 3430 | 100 |

\*, novel mutations

**Table 2.** DNA sequencing results of MEFV genotyping among 3430 Turkish patients.

| Mutation | Number of Alleles (No) | Allelic Frequency (%) |
|---|---|---|
| M694V | 908 | 44.7 |
| E148Q | 386 | 19 |
| V726A | 204 | 10 |
| M680IG-C | 170 | 8.3 |
| P369S | 75 | 3.69 |
| R408Q | 63 | 3.1 |
| K695R | 43 | 2.11 |
| M694I | 21 | 1 |
| R761H | 47 | 2.31 |
| A744S | 26 | 1.28 |
| E148V | 5 | 0.24 |
| E167D | 8 | 0.39 |
| T267M | 4 | 0.19 |
| L110P | 8 | 0.39 |
| R241K | 3 | 0.14 |
| I720M | 2 | 0.09 |
| E230K | 12 | 0.59 |
| M680IG-A | 5 | 0.24 |
| E148L | 2 | 0.09 |
| F479L | 7 | 0.34 |
| E685K | 2 | 0.09 |
| R653H | 1 | 0.04 |
| T267I | 1 | 0.04 |
| V722M | 1 | 0.04 |
| S141I | 3 | 0.14 |
| S339F | 4 | 0.19 |
| R151S | 1 | 0.04 |
| I506V | 1 | 0.04 |
| S503C | 2 | 0.09 |
| L709R | 1 | 0.04 |
| K695N | 1 | 0.04 |
| P350R | 3 | 0.14 |
| G340R | 1 | 0.04 |
| G456A | 1 | 0.04 |
| Y471X | 1 | 0.04 |
| R329H | 5 | 0.24 |
| S166L | 2 | 0.09 |
| S179N | 1 | 0.04 |
| A511V | 1 | 0.04 |
| R354W | 1 | 0.04 |
| **Total** | **2033** | **100** |

**Table 3.** Allelic frequencies of a total of 40 MEFV mutations, including major, rare, and novel sequence changes, detected in the group of 1316 mutation-positive patients (mutation frequency for the studied mutations; complex alleles excluded).
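The percentages in Tables 2 and 3 appear consistent with a simple count-over-total calculation, and the relationship can be checked directly. The sketch below uses only figures quoted in the tables (a few allele counts, the 2033 allele total, and the genotype class counts); the variable and function names are hypothetical, and small differences from the printed values may reflect rounding or truncation in the original tables.

```python
TOTAL_ALLELES = 2033   # "Total" row of Table 3
TOTAL_PATIENTS = 3430  # cohort size in Table 2

# A few rows of Table 3: mutation -> number of alleles.
ALLELE_COUNTS = {"M694V": 908, "E148Q": 386, "V726A": 204, "M680IG-C": 170}

# Genotype classes from the summary rows of Table 2.
GENOTYPE_CLASSES = {
    "heterozygotes": 885,
    "compound heterozygotes": 271,
    "homozygotes": 130,
    "complex genotypes": 30,
}


def percent(count: int, total: int) -> float:
    """Share of `count` in `total`, expressed as a percentage."""
    return 100.0 * count / total


if __name__ == "__main__":
    for mutation, count in ALLELE_COUNTS.items():
        print(f"{mutation}: {percent(count, TOTAL_ALLELES):.2f}% of alleles")

    mutation_positive = sum(GENOTYPE_CLASSES.values())
    for name, count in GENOTYPE_CLASSES.items():
        print(f"{name}: {percent(count, mutation_positive):.2f}% of mutation-positive patients")
    print(f"mutation-positive: {percent(mutation_positive, TOTAL_PATIENTS):.2f}% of all patients")
```

Note that the genotype class percentages are taken relative to the mutation-positive patients, whereas the three cohort-level rows are taken relative to all 3430 patients.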

Additionally, sequence analysis revealed a single FMF-associated mutation in the MEFV coding region in 76% of the Turkish individuals studied, and 80% of these individuals started colchicine treatment following molecular diagnosis. The prevalence of a single mutation among symptomatic patients in Turkey (76%) runs contrary to the expected pattern of autosomal recessive inheritance and does not support the "heterozygous advantage" selection theory. However, expression of the FMF phenotype may be influenced by other candidate modifier gene loci, autoinflammatory pathway genes, or FMF-like diseases (29-31). For this reason, genome-wide association studies involving more patients should be performed, and future investigations of Turkish FMF patients should cover critical coding and noncoding SNPs.

As an ancestral population of FMF, Turkey is one of the regions harboring most of the rare and novel mutations. As referenced in INFEVERS, most of the rare mutations, viewed by ethnic origin, were found to be symptomatic. The novel Y471X mutation found in the present study is the second nonsense mutation reported in FMF. Among the newly identified mutations (R151S, S166N, S179N, G340R, P350R, G456A, Y471X, S503C, I506V, L709R, and K695N), Y471X, R151S, L709R, and K695N were observed to be pathogenic, with a typical FMF presentation. The main clinical characteristics of the patients were as follows: abdominal pain (92.1%), fever (93.9%), thoracic pain (59%), myalgia (67.8%), arthritis (55.1%), and erysipelas-like erythema (ELE) (21.8%). None of the patients developed amyloidosis. This finding underscores the importance of molecular diagnosis and detailed sequencing, which is recommended in particular for the ancestral populations of FMF.

In this report, from a large, heterogeneous group of patients, we describe a 44-year-old Turkish woman from western Turkey with a clinical diagnosis of periodic fever. Her course included short, infrequent episodes of fever, ongoing abdominal pain, and transient myalgia and arthralgia since childhood. Physical examination revealed no pathology except arthritis of the right knee. Her weight, height, and blood pressure were normal. She had initially been diagnosed as having conditions secondary to FMF. Although screening of family members and relatives is of great importance, her family history (her parents had died in an accident) and past history were noncontributory. She had undergone antibiotic therapy, steroid treatment, and appendectomy. Laboratory tests showed the following acute phase reactants: ESR 81 mm/h, SAA 76 mg/dl, CRP 3.46 mg/dl, and fibrinogen 526 mg/dl. Renal function tests and other biochemical parameters were normal. No molecular genetic testing other than the strip assay had been performed at other centers. Because she did not fulfill most of the clinical criteria, her clinical picture did not by itself support starting colchicine; in our laboratory, the FMF strip assay covering 12 common mutations was therefore used as the first stage of mutation detection. No mutation was identified. DNA sequence analysis then revealed the responsible nonsense mutation, p.Y471X, in the MEFV gene (**Figure 1**). On the basis of the molecular diagnosis, colchicine therapy (1.5 mg/day) was started. She responded well, has had no symptoms since, and her acute phase reactants have been completely normal for the last 3 years. The other autoinflammatory genes MVK, TNFRSF1A, and CIAS1 were therefore not considered suspect in this case and were not evaluated.

**Figure 1.** Electropherogram of the p.Y471X nonsense mutation in the MEFV gene, revealed by DNA sequencing analysis in the Turkish patient.

The patient presented here had been misdiagnosed, particularly during childhood, losing time to unnecessary procedures and treatments. A definitive diagnosis by detailed DNA sequence analysis is therefore essential for suspicious and undefined cases and for cases left unresolved by other, limited screening methods. Molecular analysis of the Mediterranean fever gene identified a c.1413C>A nucleotide change in exon 5, resulting in the p.Tyr471X nonsense mutation (Figure 1). Because p.Y471X creates a novel recognition site for the Tsp509I restriction enzyme, we also developed a PCR-RFLP assay to screen the affected families and healthy controls for the mutation.
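The correspondence between c.1413C>A and p.Tyr471X follows from codon arithmetic: position 1413 is the third base of codon 471, and a C>A change there turns a tyrosine codon into a stop codon. The sketch below works through that step. The reference codon TAC is an inference rather than a quoted sequence (of the two tyrosine codons, TAT and TAC, only TAC has C in the third position), and the function names are hypothetical.

```python
# Only the codons needed for this example (standard genetic code).
CODON_TABLE = {"TAT": "Tyr", "TAC": "Tyr", "TAA": "Stop", "TAG": "Stop"}


def codon_position(cdna_pos: int) -> tuple[int, int]:
    """Map a 1-based coding-sequence position to (codon number, position in codon)."""
    index = cdna_pos - 1
    return index // 3 + 1, index % 3 + 1


def substitute(codon: str, pos_in_codon: int, ref: str, alt: str) -> str:
    """Apply a single-base substitution after checking the reference base."""
    if codon[pos_in_codon - 1] != ref:
        raise ValueError("reference base does not match the codon")
    return codon[: pos_in_codon - 1] + alt + codon[pos_in_codon:]


if __name__ == "__main__":
    codon_number, pos_in_codon = codon_position(1413)
    reference_codon = "TAC"  # inferred: the only Tyr codon ending in C
    mutant_codon = substitute(reference_codon, pos_in_codon, "C", "A")
    print(f"c.1413 is base {pos_in_codon} of codon {codon_number}")
    print(f"{reference_codon} ({CODON_TABLE[reference_codon]}) -> "
          f"{mutant_codon} ({CODON_TABLE[mutant_codon]})")
```

The same arithmetic applies to other nonsense changes written in this notation, since any coding position that is a multiple of three is the third base of its codon.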

The Y471X nonsense mutation in the MEFV gene is the first noted in Turkish FMF patients (7) and the second nonsense mutation in the FMF mutation database worldwide. Inherited missense mutations in exon 5 of the MEFV gene are very rare in FMF patients. Although exon 5 cannot be called a critical region carrying mutational hotspots, this result shows that much about FMF remains to be uncovered. The novel Y471X mutation lies in exon 5 of the MEFV gene, in the coiled-coil domain of the pyrin protein, which is implicated in actin binding and interacts selectively with monomeric or multimeric forms of actin. Since nonsense mutations are known to be damaging and pathogenic, we did not use the PolyPhen software (32), which we routinely apply in our laboratory to evaluate the potential pathogenicity of newly found amino acid substitutions. Nevertheless, expression studies will be required.

The Prototype of Hereditary Periodic Fevers: Familial Mediterranean Fever 161

TNFRSF1A, V377I in MVK, and V198M in CIAS1 (29, 34, 35). For the purpose of screening mutations in other known autoinflammatory genes for typical FMF patients carrying 1 single heterozygous MEFV mutation, Booty et al screened 6 candidate genes that encode proteins known to interact with pyrin or genes functioning in IL-1B pathway involving ASC/PYCARD, SIVA, CASP1, PSTPIP1, POP1, and POP2 (6). A novel PSTPIP1 nucleotide mutation, two novel substitutions in ASC/PYCARD and SIVA genes were identified while Casp1, POP1, and POP2 were mutation negative. In a Jewish patient with FMF, novel W171X (513G>A) mutation was identified which is presumed as a stop codon, to remove the last 2 of the 6 helices in the CARD domain of ASC/PYCARD. In FMF patients with only 1 MEFV mutation, including milder FMF-associated mutations, 1 Turkish patient was identified as a carrier of W171X (6). To date, SNPs in ASC/PYCARD gene were identified in 5'/3' region, exon 1, intron 1, exon 3 coding region involving rs79351176, rs8056505, rs11648861, rs79464842, rs73532217, rs75471387, rs11867108, rs61086377, rs76878620, and rs75216100. In the ASC/PYCARD protein, the conserved PyD domain is 91 aa in lenght (1- 91) and CARD domain is 89 aa in lenght (107-195). The previously reported W171X (513G>A) mutation (31) corresponds to the exon 3 coding region of the ASC/PYCARD gene and results with a stop codon. Thus, in our sequencing analysis, we also searched the presence of mutations in the ASC/PYCARD gene in our entire patients group. However, this sequence was not mutated, and we have neither identified the above substitutions along the entire coding regions and flanking segments of ASC/PYCARD gene (unpublished data).

For investigating of mutations in other periodic fever disease genes, in a study of our group, a total of 75 Turkish patients and 25 ethnically matched healthy control individuals diagnosed with periodic fever was molecularly diagnosed for having mutations in causative disease genes (apart from the present patients group; unpublished data). Mutation screening of coding and noncoding regions of MVK, TNFRSF1A, and NLRP3/CIAS1 genes were

MVK gene transcript variant 1 (12q24; NM\_000431.2→NP\_000422.1) was fully sequenced in 25 periodic fever patients. Molecular diagnosis revealed the following results: p.Ser52Asn missense mutation was identified in 6 patients. In addition, p.Asp170Asp and p.Ser135Ser synonimous aminoacid mutations and IVS6-18 A>G, homozygous IVS9+24 G>A, and IVS 4+8 C/T intronic nucleotide substitutions were observed in the remaining patients group.

NLRP3 gene (CIAS1; 1q44; NM\_004895.4→NP\_004886.3) NACHT, LRR and PYD domainscontaining protein 3 isoform a was fully sequenced in 25 periodic fever patients. Molecular diagnosis revealed the following nucleotide substitutions in the screened gene region: K608fsX611 frameshift mutation, p.Ser726Gly and p.Gln703Lys missense mutations, together with Ser34Ser, Ala242Ala, Arg260Arg, Thr219Thr ve Leu411Leu synonimous

TNFRSF1A gene (12p13.2; NM\_001065.3→NP\_001056.1) tumor necrosis factor receptor superfamily member 1A precursor form was fully sequenced in 25 periodic fever patients. Molecular diagnosis revealed the following nucleotide substitutions in the screened gene

carried out for different group of patients according to their clinical implications.

aminoacid mutations.

Due to the abundance of mutations in exon 10 and clinical heterogeneity of the disease, different screening methods have been developed. As long been known the majority of FMF patients in classically affected populations were screened by routine methods for only common mutations, which primarily targets only the most prevalent MEFV mutations in a specific population; thus, rare or novel mutations can be overlooked. The first nonsense mutation in FMF era, Y688X, was evaluated by Touitou I. (5), and was suggested to have a location between two well-known hotspots for FMF mutations (codons 680 and 694) in exon 10. This finding contributed to the critical role of exon 10 for the MEFV function as an hotspot. Here, it is discussed that, the newly found Y471X nonsense mutation has a great significance in screening asymptomatic individuals since it was not found in one of the hotspots of MEFV gene.

Autoinflammatory diseases are heterogeneous group of disorders, thus FMF like phenotypes and related genes most likely exists (33-36). In some cases, the causal genes may not only be the unique causes of the diseases. It is well known that Mendelian disorders caused by the dysfunction of a single gene have a wide heterogeneity of disease phenotypes (37). FMF has both genetic and phenotypic heterogeneity and mutations within a single gene are known to cause different clinical phenotypes in Turkey. Thus, all MEFV gene sequence variations found in symptomatic cases should not be considered as causative pathogenic disease mutations. In particular, FMF related Turkish patients with no MEFV mutation or with only single MEFV mutations may not actually reflect the phenotype seen in FMF.

Another point is subclinical inflammation concerning asymptomatic heterozygous patients without a second mutation mostly continues with the typical disease characteristics possibly due to the presence of other modifier genes and/or environmental factors. Therefore, factors other than casual MEFV gene and other pyrin-dependent effects should be contributing to the sustainable systemic inflammation that is sufficient for the occurrence of the symptomatic FMF related phenotype. Previously, MICA, TLR2 and SAA loci were shown as modifying alleles in FMF (5, 38). Synonimous or non-synonimous sequence variations of MEFV relevant genes involving SAA and TLR2 were previously considered as critical factors for the course of the disease. Both SAA1 locus and Arg753Gln TLR2 polymorphism were implied as genetic susceptible loci for a risk factor of developing secondary amyloidosis in different ethnic populations of FMF patients (26, 27, 30). Against the traditionally considered monogenic inheritance pattern, compound heterozygotes of 2 autoinflammatory disease genes were also reported describing patients who were found to have 2 or more reduced penetrance mutations, involving E148Q in MEFV, R92Q or P46L in TNFRSF1A, V377I in MVK, and V198M in CIAS1 (29, 34, 35). For the purpose of screening mutations in other known autoinflammatory genes for typical FMF patients carrying 1 single heterozygous MEFV mutation, Booty et al screened 6 candidate genes that encode proteins known to interact with pyrin or genes functioning in IL-1B pathway involving ASC/PYCARD, SIVA, CASP1, PSTPIP1, POP1, and POP2 (6). A novel PSTPIP1 nucleotide mutation, two novel substitutions in ASC/PYCARD and SIVA genes were identified while Casp1, POP1, and POP2 were mutation negative. In a Jewish patient with FMF, novel W171X (513G>A) mutation was identified which is presumed as a stop codon, to remove the last 2 of the 6 helices in the CARD domain of ASC/PYCARD. 
In FMF patients with only 1 MEFV mutation, including milder FMF-associated mutations, 1 Turkish patient was identified as a carrier of W171X (6). To date, SNPs in ASC/PYCARD gene were identified in 5'/3' region, exon 1, intron 1, exon 3 coding region involving rs79351176, rs8056505, rs11648861, rs79464842, rs73532217, rs75471387, rs11867108, rs61086377, rs76878620, and rs75216100. In the ASC/PYCARD protein, the conserved PyD domain is 91 aa in lenght (1- 91) and CARD domain is 89 aa in lenght (107-195). The previously reported W171X (513G>A) mutation (31) corresponds to the exon 3 coding region of the ASC/PYCARD gene and results with a stop codon. Thus, in our sequencing analysis, we also searched the presence of mutations in the ASC/PYCARD gene in our entire patients group. However, this sequence was not mutated, and we have neither identified the above substitutions along the entire coding regions and flanking segments of ASC/PYCARD gene (unpublished data).

of pyrin protein is implicated in actin binding, interacting selectively with monomeric or multimeric forms of actin. Because nonsense mutations are known to be damaging and pathogenic, we did not use the PolyPhen software (32), which we routinely apply in our laboratory, to evaluate the potential pathogenicity of this newly found substitution. Nevertheless, expression studies will be required.

Owing to the abundance of mutations in exon 10 and the clinical heterogeneity of the disease, different screening methods have been developed. The majority of FMF patients in classically affected populations have long been screened by routine methods that target only the most prevalent MEFV mutations in a specific population; thus, rare or novel mutations can be overlooked. The first nonsense mutation reported in FMF, Y688X, was evaluated by Touitou (5) and lies between two well-known hotspots for FMF mutations (codons 680 and 694) in exon 10. This finding supported the critical role of exon 10 as a hotspot for MEFV function. The newly found Y471X nonsense mutation is of great significance for screening asymptomatic individuals because it was not found in one of the hotspots of the MEFV gene.

Autoinflammatory diseases are a heterogeneous group of disorders; thus, FMF-like phenotypes and related genes most likely exist (33-36). In some cases, the causal genes may not be the only causes of the disease. It is well known that Mendelian disorders caused by the dysfunction of a single gene show wide heterogeneity of disease phenotypes (37). FMF has both genetic and phenotypic heterogeneity, and mutations within a single gene are known to cause different clinical phenotypes in Turkey. Thus, not all MEFV sequence variations found in symptomatic cases should be considered causative pathogenic mutations. In particular, Turkish patients with FMF-related symptoms and no MEFV mutation, or only a single MEFV mutation, may not actually reflect the phenotype seen in FMF.

Another point is that subclinical inflammation in asymptomatic heterozygous patients without a second mutation mostly continues with the typical disease characteristics, possibly because of other modifier genes and/or environmental factors. Therefore, factors other than the causal MEFV gene and other pyrin-dependent effects should contribute to the sustained systemic inflammation that is sufficient for the symptomatic FMF-related phenotype to occur. Previously, the MICA, TLR2, and SAA loci were shown to be modifying alleles in FMF (5, 38). Synonymous or non-synonymous sequence variations in MEFV-relevant genes, including SAA and TLR2, were previously considered critical factors for the course of the disease. Both the SAA1 locus and the Arg753Gln TLR2 polymorphism were implicated as genetic susceptibility loci for the risk of developing secondary amyloidosis in different ethnic populations of FMF patients (26, 27, 30). Contrary to the traditionally assumed monogenic inheritance pattern, compound heterozygotes for 2 autoinflammatory disease genes have also been reported; these patients carried 2 or more reduced-penetrance mutations, involving E148Q in MEFV, R92Q or P46L in TNFRSF1A.

To investigate mutations in other periodic fever disease genes, our group studied a total of 75 Turkish patients diagnosed with periodic fever and 25 ethnically matched healthy control individuals for mutations in causative disease genes (a group separate from the present patient group; unpublished data). Mutation screening of the coding and noncoding regions of the MVK, TNFRSF1A, and NLRP3/CIAS1 genes was carried out in different groups of patients according to their clinical presentation.

MVK gene transcript variant 1 (12q24; NM\_000431.2→NP\_000422.1) was fully sequenced in 25 periodic fever patients. Molecular diagnosis revealed the following results: the p.Ser52Asn missense mutation was identified in 6 patients. In addition, the synonymous substitutions p.Asp170Asp and p.Ser135Ser and the intronic nucleotide substitutions IVS6-18 A>G, homozygous IVS9+24 G>A, and IVS4+8 C/T were observed in the remaining patients.

The NLRP3 gene (CIAS1; 1q44; NM\_004895.4→NP\_004886.3), encoding NACHT, LRR and PYD domains-containing protein 3 isoform a, was fully sequenced in 25 periodic fever patients. Molecular diagnosis revealed the following variants in the screened gene region: the K608fsX611 frameshift mutation, the p.Ser726Gly and p.Gln703Lys missense mutations, and the synonymous substitutions Ser34Ser, Ala242Ala, Arg260Arg, Thr219Thr, and Leu411Leu.

The TNFRSF1A gene (12p13.2; NM\_001065.3→NP\_001056.1), encoding the tumor necrosis factor receptor superfamily member 1A precursor, was fully sequenced in 25 periodic fever patients. Molecular diagnosis revealed the following nucleotide substitutions in the screened gene region: the p.Arg92Gln and p.Ala301Thr missense mutations, with the IVS6+10 A>G and IVS8-23 T>C intronic nucleotide substitutions.

Intronic nucleotide substitutions and synonymous substitutions in all the screened gene regions were also observed in the 25 ethnically matched healthy control individuals. The mutation frequency was 4% (1/25), 32% (8/25), and 40% (10/25) in TRAPS, HIDS, and CAPS patients, respectively.

Nonetheless, the finding of symptomatic rare MEFV mutations, particularly in at-risk populations and in individuals who have been asymptomatic and negative for common mutations, makes detailed mutation screening critically important in FMF. It has previously been shown that a number of patients have a typical FMF phenotype or FMF-related symptoms with only one heterozygous MEFV mutation or even without any MEFV mutation (6, 7).

The majority of FMF patients in classically affected populations are screened by routine methods that are limited to the detection of common mutations. These tests primarily target the most prevalent MEFV mutations to rule out asymptomatic cases in at-risk populations. Therefore, while searching for the common mutations that underlie typical FMF symptoms, we should primarily consider the entire coding sequence of the MEFV gene before analyzing other recurrent fever genes. Patients with no mutation or with only a single pyrin mutation may not actually reflect the phenotype seen in FMF. Compound heterozygotes for 2 autoinflammatory disease genes involving MEFV, TNFRSF1A, CIAS1, and MVK have been reported (29, 34, 35). Thus, screening of other autoinflammatory disease genes, e.g. CIAS1, was considered for FMF patients negative for MEFV mutations/SNPs. In conclusion, by using sequencing analysis, we can prevent less common, population-restricted, novel sequence variants from being overlooked. This has implications for the characterization of typical and atypical FMF; screening for the most common mutations by routine methods is sufficient for the initial laboratory diagnosis of FMF in Turkish patients; however, the results should be confirmed by specific DNA sequencing of all coding exons and exon-intron flanking regions.

## **4. Conclusions**

Among the newly identified mutations in this comprehensive study, Y471X, R151S, L709R, and K695N were found to be pathogenic, reflecting the typical FMF features of abdominal pain, fever, thoracic pain, myalgia, arthritis, and erysipelas-like erythema. Rare mutations and SNPs are of great importance for FMF pathogenesis. In this periodic fever disorder, heterogeneity is present in the alleles, in the frequency and critical locations of mutant alleles, and in clinical appearance. Therefore, particularly in suspected cases, the possible presence of mutations in other autoinflammatory disease genes, as outlined above, and rare mutations and SNP variations in the MEFV gene, molecular techniques, sample sizes, ethnic origins, and regions in the ancestral countries should be regarded as critical and determinative keys in the clinical and molecular diagnosis of FMF.

With sequencing analysis, not only the common major mutations but also rare mutations can be detected, which is of great importance, particularly for at-risk populations. Sequencing prevents less common variants that may be restricted to particular populations from being missed by routine techniques. The majority of FMF patients in classically affected populations are screened by routine methods that are limited to the detection of common mutations. These tests primarily target the most prevalent MEFV mutations to rule out asymptomatic cases in at-risk populations. Therefore, while searching for the common mutations that underlie typical FMF symptoms, we should primarily consider the entire coding sequence of the MEFV gene before analyzing other recurrent fever genes. In conclusion, by using sequencing analysis, we can prevent less common, population-restricted, novel sequence variants from being overlooked. This has implications for the characterization of typical and atypical FMF; screening for the most common mutations by routine methods is sufficient for the initial laboratory diagnosis of FMF in Turkish patients; however, the results should be confirmed by specific DNA sequencing of all coding exons and exon-intron flanking regions. We should consider gene mutation screening in early diagnosis and in the follow-up of the clinical course, particularly for asymptomatic cases. Early determination of the disease-causing mutation will help to prevent unnecessary treatments in newly diagnosed patients.

## **Author details**

Afig Berdeli and Sinem Nalbantoglu *Ege University, School of Medicine, Children's Hospital, Molecular Medicine Laboratory, Bornova, Izmir, Turkey*

## **Acknowledgement**

We would like to thank the patients and clinicians for their participation in and contribution to our study.

## **5. References**

[1] Touitou I. The spectrum of Familial Mediterranean Fever (FMF) mutations. Eur J Hum Genet, 2001; 9(7):473-83.

[2] Schwabe AD, Peters RS. Familial Mediterranean fever in Armenians: analysis of 100 cases. Medicine (Baltimore), 1974; 53:453–62.

[3] Tufan A, Babaoglu MO, Akdogan A, Yasar U, Calguneri M, Kalyoncu U, Karadag O, Hayran M, Ertenli AI, Bozkurt A, Kiraz S. Association of drug transporter gene ABCB1 (MDR1) 3435C to T polymorphism with colchicine response in familial Mediterranean fever. J Rheumatol, 2007; 34(7):1540-4.

[4] Özçakar B., Yalçınkaya F., Yüksel S., Ekim M. The expanded clinical spectrum of familial Mediterranean fever. Clin Rheumatol, 2007; 26:1557–1560.

[5] Touitou I., The spectrum of Familial Mediterranean Fever (FMF) mutations, Eur J Hum Genet. 9(7) (2001) 473-83.

[6] Booty M., Chae J., Masters S., Remmers E., Barham B., Le JM., Barron KS., Holland SM., Kastner DL., Aksentijevich A. Familial Mediterranean Fever With a Single MEFV Mutation. Where Is the Second Hit? Arthritis & Rheumatism, 2009; 60 (6): 1851–1861.

[7] Berdeli A, Mir S, Nalbantoglu S, Kutukculer N, Sozeri B, Kabasakal Y, Cam S, Solak M. Comprehensive analysis of a large-scale screen for MEFV gene mutations: do they truly provide a "heterozygote advantage" in Turkey? Genet Test Mol Biomarkers. 2011 Jul-Aug;15(7-8):475-82. Epub 2011 Mar 17.

[8] French FMF Consortium. (1997) A candidate gene for familial Mediterranean fever. Nat Genet, 17(1):25-31.

[9] The International FMF consortium. (1997) Ancient missense mutations in a new member of the RoRet gene family are likely to cause familial Mediterranean fever. Cell, 90: 797-807.

[10] Yu JW, Wu J, Zhang Z, Datta P, Ibrahimi I, Taniguchi S, Sagara J, Fernandes-Alnemri T, Alnemri ES., Cryopyrin and pyrin activate caspase-1, but not NF-kappaB, via ASC oligomerization, Cell Death Differ. 2006; 13(2):236-49.

[11] Chae, J.J. Komarow HD, Cheng J, Wood G, Raben N, Liu PP, et al. Targeted disruption of pyrin, the FMF protein, causes heightened sensitivity to endotoxin and a defect in macrophage apoptosis. Mol. Cell 2003; 11, 591–604.

[12] Diaz, A., Hu, C., Kastner, D. L., et al. Arthritis Rheum, 2004, 50: 3679–3689.

[13] Centola M, Wood G, Frucht DM, et al. The gene for familial Mediterranean fever, MEFV, is expressed in early leukocyte development and is regulated in response to inflammatory mediators. Blood, 2000, 95:3223–31.

[14] Gumucio DL, Diaz A, Schaner P, et al. Fire and ICE: the role of pyrin domain-containing proteins in inflammation and apoptosis. Clin Exp Rheumatol, 2002, 26:S45-53.

[15] Masumoto J, Dowds TA, Schaner P, et al. ASC is an activating adaptor for NF-kappa B and caspase-8-dependent apoptosis. Biochem Biophys Res Commun, 2003,303(1):69-73.

[16] Marek-Yagel D, Berkun Y, Padeh S, Abu A, Reznik-Wolf H, Livneh A, Pras M, Pras E. Clinical disease among patients heterozygous for familial Mediterranean fever. Arthritis Rheum. 2009 Jun;60(6):1862-6.

[17] Shoham, N.G. et al. Pyrin binds the PSTPIP1/CD2BP1 protein, defining familial Mediterranean fever and PAPA syndrome as disorders in the same pathway. Proc. Natl. Acad. Sci, 2003, 100:13501–13506.

[18] Jeru I, Papin S, L'Hoste S, et al. Interaction of pyrin with 14.3.3 in an isoform-specific and phosphorylation-dependent manner regulates its translocation to the nucleus. Arthritis Rheum, 2005, 52:1848–1857.

[19] Chae, J.J. et al. The B30.2 domain of pyrin, the familial Mediterranean fever protein, interacts directly with caspase-1 to modulate IL-1b production. Proc. Natl. Acad. Sci, 2006, 103: 9982–9987.

[20] Richards N, Schaner P, Diaz A, et al. Interaction between pyrin and the apoptotic speck protein (ASC) modulates ASC-induced apoptosis. J Biol Chem, 2001,276(42):39320–9.

[21] Balci-Peynircioglu B, Waite AL, Hu C, et al. Pyrin, product of the MEFV locus, interacts with the proapoptotic protein, Siva. J Cell Physiol, 2008.

[22] http://fmf.igh.cnrs.fr/ISSAID/infevers/

[23] Shohat M, Magal N, Shohat T, Chen X, Dagan T, Mimouni A, Danon Y, Lotan R, Ogur G, Sirin A, Schlezinger M, Halpern GJ, Schwabe A, Kastner D, Rotter JI, Fischel-Ghodsian N. Phenotype-genotype correlation in familial Mediterranean fever: evidence for an association between Met694Val and amyloidosis. Eur J Hum Genet. 1999 Apr;7(3):287-92.

[24] Yalçinkaya F, Tekin M, Cakar N, Akar E, Akar N, Tümer N. Familial Mediterranean fever and systemic amyloidosis in untreated Turkish patients. QJM. 2000 Oct;93(10):681-4.

[25] Papadopoulos V, Mitroulis I, Giaglis S. MEFV heterogeneity in Turkish Familial Mediterranean Fever patients. Mol Biol Rep. 2010 Jan;37(1):355-8.

[26] Cazeneuve C, Papin S, Jéru I, et al. Subcellular localisation of marenostrin/pyrin isoforms carrying the most common mutations involved in familial Mediterranean fever in the presence or absence of its binding partner ASC. J Med Genet, 2004, 41(3):e24.

[27] Gershoni-Baruch R, Brik R, Shinawi M, et al. The differential contribution of MEFV mutant alleles to the clinical profile of familial Mediterranean fever. Eur J Hum Genet, 2002,10:145–9.

[28] Livneh A, Langevitz P, Zemer D, et al. Criteria for the diagnosis of FMF. Arthritis Rheum, 1997, 40(10):1879-85.

[29] Singh-Grewal D, Chaitow J, Aksentijevich I, et al. Coexistent MEFV and CIAS1 mutations manifesting as familial Mediterranean fever plus deafness [letter]. Ann Rheum Dis, 2007, 66:1541.

[30] Ozen S, Berdeli A, Türel B, et al. Arg753Gln TLR-2 polymorphism in familial mediterranean fever: linking the environment to the phenotype in a monogenic inflammatory disease. J Rheumatol, 2006, 33(12):2498-500. Epub 2006 Oct 1.

[31] Samuels J, Ozen S. Familial Mediterranean fever and the other autoinflammatory syndromes: evaluation of the patient with recurrent fever. Curr Opin Rheumatol 2006;18:108–17.

[32] http://coot.embl.de/PolyPhen.

[33] Kastner DL, Aksentijevich I. Intermittent and periodic arthritis syndromes. In: Koopman WJ, Moreland LW, editors. Arthritis and allied conditions: a textbook of rheumatology. 15th ed. Philadelphia: Lippincott Williams & Wilkins; 2005. p. 1411–61.

[34] Stojanov S, Kastner DL. Familial autoinflammatory diseases: genetics, pathogenesis and treatment. Curr Opin Rheumatol, 2005, 17:586–99.

[35] Touitou I, Perez C, Dumont B, et al. Refractory auto-inflammatory syndrome associated with digenic transmission of low-penetrance tumour necrosis factor receptor-associated periodic syndrome and cryopyrin-associated periodic syndrome mutations. Ann Rheum Dis, 2006, 65:1530–1.

[36] Helen J Lachmann and Philip N Hawkins. Developments in the scientific and clinical understanding of autoinflammatory disorders. 2009.

[37] Bell J. Predicting disease using genomics. Nature, 2004, 429:453–6.

[38] Shaw PJ, Lukens JR, Burns S, et al. Cutting Edge: critical role for PYCARD/ASC in the development of experimental autoimmune encephalomyelitis. J Immunol, 2010, 184:4610–4.

**Chapter 8** 

© 2012 Seki et al., licensee InTech. This is an open access chapter distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


#### **Pathophysiological Roles of Mutations in the Electrogenic Na+-HCO3- Cotransporter NBCe1**

George Seki, Shoko Horita, Masashi Suzuki, Osamu Yamazaki and Hideomi Yamada

Additional information is available at the end of the chapter

http://dx.doi.org/10.5772/39225

## **1. Introduction**


The electrogenic Na+-HCO3- cotransporter NBCe1, belonging to the solute carrier 4 (SLC4) family, plays essential roles in the regulation of extracellular and intracellular pH [1,2]. Consistent with an essential role of NBCe1 in bicarbonate absorption from renal proximal tubules, homozygous mutations in NBCe1 cause proximal renal tubular acidosis (pRTA) [3-11]. These pRTA patients with NBCe1 mutations invariably present with ocular abnormalities such as band keratopathy, cataract, and glaucoma, indicating that NBCe1 also plays important roles in the maintenance of ocular homeostasis [12,13]. Some pRTA patients also have migraine, suggesting that NBCe1 may also contribute to pH regulation in the brain [10]. In addition, mouse models of NBCe1 deficiency have been developed [11,14].

In this review, we summarize recent data on the pathophysiological roles of NBCe1 mutations.

## **2. Physiological roles of NBCe1 in kidney and pancreas**

There are at least five mammalian NBCe1 variants, NBCe1A through NBCe1E as shown in Figure 1 [15,16]. NBCe1B differs from NBCe1A at the N-terminus, where the first 85 amino acids of NBCe1B replace the first 41 amino acids of NBCe1A [17]. NBCe1C differs from NBCe1B at the C-terminus, where the last 61 amino acids of NBCe1C replace the last 46 amino acids of NBCe1B [18]. NBCe1D and NBCe1E, identified from mouse reproductive tract tissues, contain a deletion of 9 amino acids in exon 6 of NBCe1A and NBCe1B, respectively [16].
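The terminal substitutions and the 9-aa deletion described above fix the length of each variant relative to NBCe1A. The short sketch below works through that arithmetic; it deliberately uses NBCe1A as a zero reference rather than assuming an absolute protein length, and the function name is illustrative only.

```python
# Sketch: net length of each NBCe1 variant relative to NBCe1A,
# derived only from the substitutions and deletion stated in the text.
N_TERM_SWAP = 85 - 41    # NBCe1B: first 85 aa replace the first 41 aa of NBCe1A
C_TERM_SWAP = 61 - 46    # NBCe1C: last 61 aa replace the last 46 aa of NBCe1B
EXON6_DELETION = -9      # NBCe1D and NBCe1E lack 9 aa in exon 6


def relative_lengths(base_length=0):
    """Return variant lengths, with NBCe1A set to base_length (0 = differences only)."""
    a = base_length
    b = a + N_TERM_SWAP
    c = b + C_TERM_SWAP
    d = a + EXON6_DELETION   # NBCe1D is derived from NBCe1A
    e = b + EXON6_DELETION   # NBCe1E is derived from NBCe1B
    return {"NBCe1A": a, "NBCe1B": b, "NBCe1C": c, "NBCe1D": d, "NBCe1E": e}


print(relative_lengths())
# prints: {'NBCe1A': 0, 'NBCe1B': 44, 'NBCe1C': 59, 'NBCe1D': -9, 'NBCe1E': 35}
```

Passing a known NBCe1A length as `base_length` would convert these differences into absolute lengths.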

Among these variants, NBCe1C is predominantly expressed in brain, but its physiological roles remain speculative [18]. NBCe1B is widely expressed in several tissues including pancreatic ducts, intestinal tracts, ocular tissues, and brain [2,12,13,19-22]. In the basolateral membranes of pancreatic ducts NBCe1B is thought to mediate bicarbonate uptake into cells, which may be essential for bicarbonate secretion from the pancreas [23-25]. Consistent with this view, some pRTA patients with NBCe1 mutations presented with an elevated serum amylase level [3,7]. However, none of these patients presented with a distinct form of pancreatitis. Probably, other acid/base transporters such as Na+/H+ exchanger 1 (NHE1) or H+-ATPase in the basolateral membranes of pancreatic duct cells could at least partially compensate for the NBCe1 inactivation [26].


**Figure 1.** Structures of NBCe1 variants. Numbers in boxes indicate the numbers of amino acids in the N- or C-terminus. Note that NBCe1D and NBCe1E lack 9 amino acids (9-aa) in exon 6 of NBCe1A and NBCe1B, respectively. TMD: transmembrane domain.

NBCe1A is predominantly expressed in the basolateral membranes of renal proximal tubules, where it mediates bicarbonate exit from cells [2,27]. The opposite transport directions of NBCe1A in kidney and NBCe1B in pancreas may be related to their different stoichiometric ratios. Thus, NBCe1A in renal proximal tubules *in vivo* functions with a 1Na+ to 3HCO3- stoichiometry, whereas NBCe1B in pancreatic ducts may function with a 1Na+ to 2HCO3- stoichiometry [23,28]. However, these differences in transport stoichiometry may not be due to intrinsic properties of the NBCe1 variants, but may rather reflect environmental factors such as incubation conditions or cell types. Indeed, NBCe1A in isolated renal proximal tubules can function with either a 1Na+ to 2HCO3- or a 1Na+ to 3HCO3- stoichiometry depending on the incubation conditions [29-31]. Such changes in the transport stoichiometry of NBCe1A can also be induced in *Xenopus* oocytes [32]. Moreover, NBCe1B may function with a 1Na+ to 2HCO3- stoichiometry in cultured pancreatic duct cells, but with a 1Na+ to 3HCO3- stoichiometry when expressed in cultured renal proximal tubular cells [33]. Regarding the electrogenicity of NBCe1A, recent work by Chen and Boron suggests that the predicted fourth extracellular loop, corresponding to amino acids 704 to 735, may have an important role [34]. They found that replacing these residues with the corresponding residues of the electroneutral Na+-HCO3- cotransporter NBCn1-A creates an electroneutral NBC.
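The link between stoichiometry and transport direction can be illustrated with the standard thermodynamic reversal potential of a cotransporter that carries 1 Na+ together with n HCO3-: E_rev = RT/((n-1)F) · ln([Na+]i[HCO3-]i^n / ([Na+]o[HCO3-]o^n)). The sketch below evaluates this expression for n = 2 and n = 3. The ion concentrations, temperature, and membrane potential are illustrative values chosen for this example, not measurements from the cited studies, and the function names are hypothetical.

```python
import math

R = 8.314      # gas constant, J/(mol*K)
F = 96485.0    # Faraday constant, C/mol


def reversal_potential_mV(n, na_in, na_out, hco3_in, hco3_out, temp_K=310.0):
    """Reversal potential (mV) of a 1 Na+ : n HCO3- cotransporter (n > 1)."""
    if n <= 1:
        raise ValueError("an electrogenic Na+-HCO3- cotransporter needs n > 1")
    ratio = (na_in * hco3_in ** n) / (na_out * hco3_out ** n)
    return 1000.0 * R * temp_K / ((n - 1) * F) * math.log(ratio)


def net_direction(n, vm_mV, **conc):
    """Net flux direction at membrane potential vm_mV.

    Each cycle moves net negative charge, so a membrane potential more
    negative than E_rev drives efflux and a less negative one drives influx.
    """
    e_rev = reversal_potential_mV(n, **conc)
    if vm_mV < e_rev:
        return "efflux"
    if vm_mV > e_rev:
        return "influx"
    return "equilibrium"


# Illustrative (assumed) concentrations in mM and an assumed potential of -70 mV.
CONC = dict(na_in=15.0, na_out=145.0, hco3_in=12.0, hco3_out=24.0)
for n in (2, 3):
    print(n, round(reversal_potential_mV(n, **CONC), 1), net_direction(n, -70.0, **CONC))
```

With these assumed values the 1:2 mode reverses at roughly -98 mV and the 1:3 mode at roughly -58 mV, so at -70 mV the former takes bicarbonate up and the latter exports it. This matches the qualitative picture in the text, although the real direction depends on the actual gradients and membrane potential of each cell type.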

Although the basolateral membranes of renal proximal tubules are known to contain several bicarbonate transporters, such as Na+-dependent and Na+-independent Cl-/HCO3- exchangers [35,36], NBCe1A seems to play an essential role in bicarbonate absorption in this nephron segment. Consistent with this view, homozygous inactivating mutations in NBCe1A cause severe pRTA, with the blood bicarbonate concentration often below 10 mM [3-11]. Functional deletion of NBCe1 in mice produces even more severe acidemia, with a blood bicarbonate concentration of around 5 mM [11,14]. By contrast, functional deletion of the Cl-/HCO3- exchanger AE1, which is responsible for the majority of basolateral bicarbonate exit from α-intercalated duct cells, produces only moderate acidemia in mice, with a blood bicarbonate concentration of around 17 mM [37]. This probably reflects the much higher bicarbonate-absorbing capacity of renal proximal tubules compared with renal distal tubules.

## **3. NBCe1 mutations and pRTA**

168 Mutations in Human Genetic Disease

compensate for the NBCe1 inactivation [26].

respectively. TMD: transmembrane domain.

may function with 1Na+ to 2HCO3-

function with 1Na+ to 3HCO3-

with 1Na+ to 3HCO3-

1Na+ to 2HCO3-

pancreatic ducts, intestinal tracts, ocular tissues, and brain [2,12,13,19-22]. In the basolateral membranes of pancreatic ducts NBCe1B is thought to mediate bicarbonate uptake into cells, which may be essential for the bicarbonate secretion from pancreas [23-25]. Consistent with this view, some pRTA patients with NBCe1 mutations presented with an elevated serum amylase level [3,7]. However, none of these patients presented with a distinct form of pancreatitis. Probably, other acid/base transporters such as Na+/H+ exchanger 1 (NHE1) or H+-ATPase in the basolateral membranes of pancreatic duct cells could at least partially

**Figure 1.** Structures of NBCe1 variants. Numbers of boxes indicate numbers of amino acids in N- or Cterminus. Note that NBCe1D and NBCe1E lack 9 amino acids (9-aa) in exon 6 of NBCe1A and NBCe1B,


To date, 12 homozygous mutations in NBCe1 have been identified in pRTA patients with ocular abnormalities, as shown in Figure 2 [3-11].

**Figure 2.** NBCe1 topology and pRTA-related mutations. Numbers in circles correspond to Q29X, R298S, S427L, T485S, G486R, R510H, W516X, L522P, N721TfsX29, A799V, R881C, and S982NfsX4. White numbers in black circles indicate mutations associated with migraine.

They include eight missense mutations (R298S, S427L, T485S, G486R, R510H, L522P, A799V, and R881C), two nonsense mutations (Q29X and W516X), and two frameshift mutations (N721TfsX29 and S982NfsX4). Except for the NBCe1A-specific mutation Q29X, which is expected to yield non-functional NBCe1A but leave both NBCe1B and NBCe1C intact [4], all the mutations lie in the regions common to the NBCe1 variants. The C-terminal mutation S982NfsX4 is expected to introduce a frameshift in exon 23 and a premature stop codon in both NBCe1A (S982NfsX4) and NBCe1B (S1026NfsX4), yielding mutant proteins with 51 fewer amino acids than the wild-type proteins. On the other hand, this mutation abolishes the translation of NBCe1C, the C-terminal variant skipping exon 24 [10,18].
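The mutation labels themselves encode the mutation class and, for truncating mutations, the predicted length of the product. The following is a minimal sketch of that bookkeeping. It assumes the older protein-level notation used in this chapter, in which `fsX<k>` places the stop codon at the k-th codon counted from the first altered residue. The wild-type length of 1035 residues for NBCe1A is an assumption that is not stated in the text, chosen because it agrees with the 51-residue shortening reported for S982NfsX4. The helper names are hypothetical.

```python
import re
from collections import Counter

# The 12 pRTA-related mutations listed in the text (NBCe1A numbering).
MUTATIONS = [
    "Q29X", "R298S", "S427L", "T485S", "G486R", "R510H",
    "W516X", "L522P", "N721TfsX29", "A799V", "R881C", "S982NfsX4",
]

# Reference residue, position, new residue (X = stop), optional fsX<k> suffix.
PATTERN = re.compile(r"^([A-Z])(\d+)([A-Z])(?:fsX(\d+))?$")

WT_LENGTH_NBCE1A = 1035  # assumed wild-type length, not given in the text


def parse(label):
    """Split a label into (position, new residue, frameshift length or None)."""
    match = PATTERN.match(label)
    if match is None:
        raise ValueError(f"unrecognised mutation label: {label}")
    _ref, pos, alt, fs_len = match.groups()
    return int(pos), alt, (int(fs_len) if fs_len is not None else None)


def classify(label):
    """Return 'missense', 'nonsense', or 'frameshift'."""
    _pos, alt, fs_len = parse(label)
    if fs_len is not None:
        return "frameshift"
    return "nonsense" if alt == "X" else "missense"


def truncated_length(label):
    """Predicted product length for truncating mutations, None for missense."""
    pos, alt, fs_len = parse(label)
    if fs_len is not None:
        # The stop codon sits at position pos + fs_len - 1, so the last
        # translated residue is one position earlier.
        return pos + fs_len - 2
    if alt == "X":
        return pos - 1
    return None


print(Counter(classify(m) for m in MUTATIONS))
print("residues lost in S982NfsX4:", WT_LENGTH_NBCE1A - truncated_length("S982NfsX4"))
```

The counts reproduce the eight missense, two nonsense, and two frameshift mutations listed above, and under the assumed wild-type length the S982NfsX4 product is 51 residues shorter than wild-type NBCe1A.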


Topological analysis using the substituted cysteine accessibility method suggests that most of these mutations are buried in the protein complex/lipid bilayer, where they perform important structural roles [38]. In particular, the amino acid substitution analysis revealed that Thr485 might reside in a special position, which seems to require the OH group of the side chain to maintain a normal conformation of NBCe1A. Based on homology modeling to the crystallized cytoplasmic domain structure of AE1, Arg298 in the N-terminal cytoplasmic domain of NBCe1A was also predicted to reside in a solvent-inaccessible subsurface pocket and to associate with Glu91 or Glu295 via H-bonding and charge-charge interactions [39]. This unusual continuous chain of interconnected polar residues may be essential for the HCO3- transporting ability of SLC4 proteins. Parker *et al*. recently found that, in addition to a per-molecule transport defect as previously reported [7], the NBCe1 A799V mutant has an unusual HCO3- -independent conductance that, if associated with mutant NBCe1 in muscle cells, could contribute to the occurrence of hypokalemic paralysis in the affected individual [40,41].

Functional analyses using different expression systems indicate that at least a 50% reduction in NBCe1A activity would be required to induce severe pRTA [3,7,9]. However, no tight relationship exists between the degree of NBCe1A inactivation and the severity of acidemia, suggesting the involvement of other factors in the etiology of pRTA. Indeed, several mutants are found to display abnormal trafficking in mammalian cells [10,42,43]. As will be discussed later, defective membrane expression of NBCe1B in astrocytes may be responsible for the occurrence of migraine [10].

## **4. Physiological roles of NBCe1 in ocular homeostasis**

The presence of NBCe1-like activity has been reported in several ocular tissues. Among these tissues, the physiological role of NBCe1 is best established in the corneal endothelium. The corneal endothelium is known to mediate the electrogenic transport of sodium and bicarbonate into the aqueous humor, and this process is considered to be essential for corneal hydration and transparency [44]. Several lines of evidence suggest that NBCe1 is responsible for a majority of this transport. For example, Jentsch *et al*. found an electrogenic sodium-coupled bicarbonate cotransport activity compatible with NBCe1 in cultured bovine corneal endothelial cells [45]. Usui *et al*. later found functional and molecular evidence for NBCe1 in cultured human corneal endothelial cells [46]. Immunohistological analysis confirmed the expression of NBCe1 in rat, human, and bovine corneal endothelium [12,13,47]. Furthermore, most of the pRTA patients with NBCe1 mutations presented with band keratopathy. The reduction of bicarbonate efflux by NBCe1 mutations may increase the local pH within the corneal stroma, which may facilitate local Ca2+ deposition resulting in band keratopathy [13].

Immunohistological analysis also detected the expression of NBCe1 in rat and human lens epithelium [12,13]. Functional analysis in cultured human lens epithelial cells revealed the presence of Cl- -independent, electrogenic Na+-HCO3- cotransporter activity. This transport activity was largely suppressed by adenovirus-mediated transfer of a specific hammerhead ribozyme against NBCe1, consistent with a major role of NBCe1 in overall bicarbonate transport by the lens epithelium [13]. The lens is an avascular tissue, and transport by the lens epithelium may be essential for the maintenance of lens homeostasis and integrity [48]. A study in lens epithelial cell layers indeed detected active fluid transport from their anterior to posterior sides against a hydrostatic pressure [49]. The transport activity of NBCe1 in lens epithelium is therefore probably essential for lens homeostasis and transparency. Indeed, the pRTA patients with NBCe1 mutations often presented with cataracts.

Most of the pRTA patients with NBCe1 mutations also presented with glaucoma. Immunohistological analysis detected the expression of NBCe1 in human trabecular meshwork cells [13]. Electrogenic transport activity compatible with NBCe1 was also reported in human trabecular meshwork cells [50]. Because the trabecular meshwork is the main site for aqueous outflow in the human eye [51], the inactivation of NBCe1 in trabecular meshwork cells may be responsible for the high-tension glaucoma usually observed in the pRTA patients with homozygous NBCe1 mutations [10]. On the other hand, NBCe1 expression was also detected in retina [12,52]. Interestingly, some of the family members carrying the heterozygous NBCe1 S982NfsX4 mutation, which has a dominant negative effect as will be discussed later, presented with normal-tension glaucoma without pRTA [10]. This type of glaucoma may be caused by dysregulation of extracellular pH in retina, because NBCe1 in retinal Müller cells may protect against excessive synaptic activity by counteracting the light-induced extracellular alkalosis [12,52,53].

NBCe1 was also found in human and rat pigmented and nonpigmented ciliary epithelial cells [12,13]. In addition to Na+/H+ and anion exchangers [54], NBCe1 may also be involved in the influx and efflux of bicarbonate into and from these tissues, thereby contributing to the initial step of aqueous humor formation [55].

Regarding the NBCe1 variants expressed in ocular tissues, several studies suggest that NBCe1B is the predominant variant [12,47]. However, both NBCe1A and NBCe1B are indeed expressed in several ocular tissues [13,46]. Consistent with the latter view, the pRTA patient carrying the homozygous Q29X mutation, which inactivates NBCe1A but leaves NBCe1B and NBCe1C intact, presented with bilateral high-tension glaucoma [4]. She did not have band keratopathy or cataract.

## **5. NBCe1 mutations and migraine**

It is known that pH in the brain shows rapid changes in response to electrical activity. These changes in local pH may have an important influence on neurobiological responses by modifying numerous enzymes, ion channels, transporters, and receptors [19].

Among several acid/base transporters expressed in the brain, NBCe1 is intensively expressed in olfactory bulb, hippocampal dentate gyrus, and cerebellum, localizing in both glial cells and neurons [56]. Although a large number of transporters may be involved in the pH homeostasis of the brain interstitial space, acid secretion by glial cells via the inward electrogenic Na+-HCO3- cotransporter NBCe1B may have a significant role in the prevention of excessive neural activities. In fact, alkalosis in extracellular spaces is generally associated with enhanced neuronal excitability, while acidosis is known to suppress neural activity [19]. A recent study using NBCe1 knockout (KO) mice confirmed that NBCe1 mediates a depolarization-induced alkalinization (DIA) response in astrocytes [57]. This study revealed that NBCe1 also contributes partially to a DIA response in hippocampal neurons [57]. Bevensee *et al*. initially reported that the expression of NBCe1B is more abundant in astrocytes than in neurons, while NBCe1C shows the reverse pattern of expression [18]. However, the expression of NBCe1C was also found in rat astrocytes [22]. Despite the intensive expression of NBCe1 in brain and its potential contribution to extracellular pH regulation in brain, the physiological significance of NBCe1 in brain has remained speculative. However, recent work revealed an unrecognized association of migraine with NBCe1 mutations [10].


Migraine is a common, disabling, multifactorial disorder, affecting more than 10% of the population, with women more affected than men [58]. Although genetic factors play a substantial role in ordinary migraine, the genetic basis has been established only in familial hemiplegic migraine (FHM), a rare autosomal dominant subtype of migraine with aura. In addition to a headache phase similar to that found in ordinary migraine, FHM patients experience prolonged hemiparesis [59]. Thus far, three genes have been identified as the genetic basis for FHM: *CACNA1A* encoding the α1 subunit of voltage-gated neuronal Cav2.1 calcium channels [60], *ATP1A2* encoding the α2 subunit of Na+/K+ ATPase [61], and *SCN1A* encoding the neuronal voltage-gated sodium channel Nav1.1 [62]. These mutations are thought to cause migraine by enhancing neuronal excitability [63].

We recently identified two sisters with pRTA, ocular abnormalities and hemiplegic migraine. Genetic analysis excluded pathological mutations in *CACNA1A*, *ATP1A2*, and *SCN1A*, but identified the homozygous S982NfsX4 mutation in the C-terminus of NBCe1 [10]. Several heterozygous members of the family also presented with glaucoma and migraine with or without aura. This mutant showed a normal electrogenic activity in *Xenopus* oocytes. When expressed in mammalian cells, however, the S982NfsX4 mutant showed almost no transport activity due to a predominant retention in the endoplasmic reticulum (ER). Several mutant proteins that are retained in the ER are known to exert a dominant negative effect by forming hetero-oligomer complexes with wild-type proteins [64], and NBCe1 can also form oligomer complexes [65]. Indeed, co-expression analysis uncovered a dominant negative effect of the mutant through hetero-oligomer formation with wild-type NBCe1, which may be responsible for the occurrence of migraine and glaucoma in the heterozygous family members. To further substantiate NBCe1 mutations as a cause of migraine, we re-investigated the other pRTA pedigrees with distinct NBCe1 mutations, and found 4 additional homozygous patients with migraine: hemiplegic migraine with episodic ataxia in L522P [8], migraine with aura in N721TfsX29 [6], and migraine without aura in R510H and R881C [3,7]. Transient expression of GFP-tagged NBCe1B constructs carrying these mutations in C6 glioma cells revealed a remarkable coincidence between the apparent lack of membrane expression and the occurrence of migraine. From these and other results, we concluded that the near total loss of NBCe1B activity in astrocytes can cause migraine, potentially through dysregulation of synaptic pH [10]. We cannot exclude the possibility that the inactivation of NBCe1C is also involved in the pathogenesis of migraine.
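The quantitative consequence of a dominant negative effect through hetero-oligomer formation can be sketched with simple combinatorics. The example below is an illustration only, resting on assumptions that are not established in the text: subunits assemble at random, wild-type and mutant alleles are expressed equally in heterozygotes, and every oligomer containing at least one mutant subunit is retained in the ER. The oligomer size is left as a parameter, with a dimer as the default, and the function name is hypothetical.

```python
def functional_fraction(wt_share, subunits=2):
    """Fraction of oligomers made only of wild-type subunits.

    Assumes random assembly and that a single mutant subunit is enough
    to retain the whole complex in the ER (illustrative assumptions).
    """
    if not 0.0 <= wt_share <= 1.0:
        raise ValueError("wt_share must lie between 0 and 1")
    if subunits < 1:
        raise ValueError("subunits must be at least 1")
    return wt_share ** subunits


# Heterozygous carrier with equal expression of both alleles (assumed).
for n in (1, 2, 4):
    share = functional_fraction(0.5, subunits=n)
    print(f"{n} subunit(s) per complex: {share:.4f} of complexes are all wild-type")
```

With one subunit per complex the calculation reduces to simple loss of one allele (half the normal amount), while for a dimer only a quarter of complexes would be fully wild-type. Under these assumptions, heterozygous carriers of a trafficking-defective mutant could therefore lose more than half of their surface NBCe1 activity.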


Cerebral cortical hyperexcitability causing cortical spreading depression (CSD) seems to be the underlying pathophysiological mechanism of migraine aura [63]. In general, neuronal firing may lead to a rise in extracellular K+ concentration and further depolarization, but uptake of K+ into astrocytes can counteract this process. Therefore, enhanced neurotransmitter release by *CACNA1A* mutations, excessive neuronal firing by *SCN1A* mutations, or impaired clearance of K+ and/or glutamate by *ATP1A2* mutations can all induce CSD [63]. Neuronal excitation may also elicit an initial extracellular alkalosis, probably mediated by Ca2+/H+ exchange [19]. Upon depolarization, however, glial cells secrete acid via the inward electrogenic Na+-HCO3- cotransporter NBCe1, i.e. DIA, overwhelming the initial extracellular alkalosis. Under normal conditions, the net extracellular acidosis due to DIA makes surrounding neuronal cells less excitable, because protons suppress excitatory NMDA receptors, with a steep sensitivity in the physiological range of extracellular pH [19]. Absence of DIA due to defective membrane expression of NBCe1 in astrocytes may cause a positive feedback loop of increased neuronal activity leading to further NMDA-mediated neuronal hyperactivity, causing complete depolarization of a sizable population of brain cells, i.e. CSD. We therefore think that migraine associated with NBCe1 mutations represents a primary headache most likely caused by dysfunctional local pH regulation in the brain, as shown in Figure 3.

**Figure 3.** Migraine-associated transporters. While *SCN1A* and *CACNA1A* may directly regulate neuron excitation, *ATP1A2* may regulate neuron excitation indirectly via uptake of K+ and/or glutamate into astrocytes. On the other hand, NBCe1-mediated uptake of HCO3- into astrocytes may also regulate neuron excitation by affecting pH-sensitive NMDA receptors.

## **6. Roles of N-terminal sequences in NBCe1 functions**

When expressed in *Xenopus* oocytes, NBCe1B and NBCe1C showed much lower activities than NBCe1A [66-68]. Deletion of an 87-amino acid sequence from the cytoplasmic N-terminus markedly enhanced the activities of both NBCe1B and NBCe1C by more than 3-fold, indicating that this sequence contains an autoinhibitory domain [66,68]. On the other hand, this sequence also contains a binding domain for inositol 1,4,5-trisphosphate receptor (IP3R)-binding protein released with IP3 (IRBIT). IRBIT is dissociated from IP3R in the presence of physiological concentrations of IP3, a process that has an important role in the regulation of IP3R functions [69,70].

Pathophysiological Roles of Mutations in the Electrogenic Na+-HCO3

retardation, hyperaldosteronism, anemia and splenomegaly, abnormal enamel mineralization, intestinal obstruction, and early death before weaning. Splenomegaly might be due to hemolytic anemia due to severe acidemia. The white pulp and the red pulp were severely disrupted in spleen of KO mice. A significant reduction in the cAMP-stimulated short circuit current was detected in colon of KO mice in the presence of a carbonic

A homozygous NBCe1 W516X mutation was identified in a girl with severe pRTA (blood

growth retardation, hyperaldosteronism, anemia and splenomegaly, and early death before weaning [11]. Due to the process of nonsense-mediated decay, the expression of NBCe1 mRNA was halved in the heterozygous and virtually absent in the homozygous W516X KI mice. The NBCe1 activity in isolated renal proximal tubules from the homozygous KI mice was severely reduced to less than 20% of the activity in tubules from wild-type mice. The rate of bicarbonate absorption in the homozygous KI mice was also markedly reduced to less than 20% of that in wild-type mice, confirming the indispensable role of NBCe1 in bicarbonate absorption from renal proximal tubules. Alkali therapy was effective in prolonging the survival, and partially improving growth retardation and bone abnormalities of the homozygous KI mice. The prolonged survival time by alkali therapy uncovered the development of corneal opacities due to corneal edema in the homozygous KI mice. These results confirmed that the normal NBCe1 activity in corneal endothelium is essential for the maintenance of corneal transparency not only in humans but also in

Unlike NBCe1 KO and W516X KI mice, NHE3 KO mice showed only a mild acidemia with

Na+/H+ exchanger type 3 (NHE3) has been considered to mediate a majority of proton secretion into lumen [77]. However, functional analysis using isolated renal proximal tubules from NHE3 KO mice revealed the residual amiloride-sensitive NHE activity, which corresponded to approximately 50% of the wild-type activity [78]. This residual NHE activity, which could represent NHE8 [79], might be able to at least partially compensate for the loss of NHE3 activity. In contrast to such an effective compensation mechanism in the

basolateral membranes of renal proximal tubules [35,36] may be unable to compensate for

George Seki, Shoko Horita, Masashi Suzuki, Osamu Yamazaki and Hideomi Yamada

*Department of Internal Medicine, Faculty of Medicine, University of Tokyo, Japan* 

level of around 21 mM [76]. In the apical membranes of renal proximal tubules,

/HCO3-

exchangers in the

 concentration of 10 mM), growth retardation, and the typical ocular abnormalities including band keratopathy, cataracts, and glaucoma [11]. Homozygous W516X KI mice

anhydrase inhibitor acetazolamide, which might reduce the availability of HCO3-

KO mice exhibited severe metabolic acidosis (blood HCO3-

also presented with severe metabolic acidosis (blood HCO3-

apical membranes, Na+-dependent and Na+-independent Cl-

HCO3-

mice [11].

blood HCO3-

the loss of NBCe1A activity.

**Author details** 


concentration of 5.3 mM), growth

.

concentration of 3.9 mM),

Cotransporter NBCe1 175

We and others found that IRBIT binds to and activates NBCe1B and NBCe1C expressed in *Xenopus* oocytes [67,71]. Because this binding requires the cytoplasmic sequence of a 62-amino acid sequence in the N-terminus of NBCe1B and NBCe1C, IRBIT does not bind to NBCe1A that lacks this sequence [67]. Co-expression of IRBIT markedly activates the NBCe1B activity by several-fold. Because this stimulation is not associated with the significant changes in the amount of NBCe1B expressed in the plasma membranes of *Xenopus* oocytes, IRBIT may induce the stimulation of per-molecule activity of NBCe1B [67,68]. Interestingly, Lee et al. found that a mutant IRBIT lacking a protein phophatase-1 (PP-1) binding site stimulates NBCe1B to a 50% greater than can be achieved by the removal of autoinhibitory domain [68]. These results suggest that the stimulatory mechanism of IRBIT may involve not only the neutralization of autoinhibitory domain but also other factors.

The stimulation of NBCe1B by IRBIT has also been confirmed in pancreatic ducts *in vivo* [25]. Thus, in secretory epithelia such as pancreatic ducts, IRBIT has a central role in fluid and bicarbonate secretion by activating both NBCe1B and the cystic fibrosis transmembrane conductance regulator (CFTR) [25]. A subsequent study revealed that the with-no-lysine (WNK) kinases act as scaffolds to recruit Ste20-related proline/alanine-rich kinase (SPAK), which phosphorylates CFTR and NBCe1B, reducing their surface expression. In addition to directly activating NBCe1B and CFTR, IRBIT opposed the effects of WNKs and SPAK by recruiting PP-1 to dephosphorylate CFTR and NBCe1B, restoring their surface expression [72]. In contrast to these complex modes of IRBIT-mediated transport stimulation in secretory epithelia, the dephosphorylation of IRBIT by PP-1 may instead partially suppress the stimulatory effect of IRBIT on NBCe1B in *Xenopus* oocytes, which do not express WNKs or SPAK [68,73].

The injection of phosphatidylinositol 4,5-bisphosphate (PIP2) into *Xenopus* oocytes stimulated the whole-cell currents of NBCe1B and NBCe1C [74]. IRBIT reduced the PIP2-induced stimulation of NBCe1B and NBCe1C, suggesting that IRBIT and PIP2 may compete with one another in stimulating NBCe1B and NBCe1C [71]. In addition to regulation by the binding of IRBIT or PIP2, the N-terminus of NBCe1B and NBCe1C may also play a role in the inhibition by intracellular Mg2+ [75].

#### **7. Phenotypes of NBCe1-deficient mice**

Two types of NBCe1-deficient mice, NBCe1 KO and W516X knock-in (KI) mice, have been produced [11,14]. Both types of mice show severe acidosis and early lethality. NBCe1 KO mice exhibited severe metabolic acidosis (blood HCO3- concentration of 5.3 mM), growth retardation, hyperaldosteronism, anemia and splenomegaly, abnormal enamel mineralization, intestinal obstruction, and early death before weaning. The splenomegaly might be due to hemolytic anemia caused by severe acidemia. The white pulp and the red pulp were severely disrupted in the spleen of KO mice. A significant reduction in the cAMP-stimulated short-circuit current was detected in the colon of KO mice in the presence of the carbonic anhydrase inhibitor acetazolamide, which might reduce the availability of HCO3-.

A homozygous NBCe1 W516X mutation was identified in a girl with severe pRTA (blood HCO3- concentration of 10 mM), growth retardation, and the typical ocular abnormalities including band keratopathy, cataracts, and glaucoma [11]. Homozygous W516X KI mice also presented with severe metabolic acidosis (blood HCO3- concentration of 3.9 mM), growth retardation, hyperaldosteronism, anemia and splenomegaly, and early death before weaning [11]. Owing to nonsense-mediated decay, the expression of NBCe1 mRNA was halved in heterozygous and virtually absent in homozygous W516X KI mice. The NBCe1 activity in isolated renal proximal tubules from the homozygous KI mice was severely reduced, to less than 20% of the activity in tubules from wild-type mice. The rate of bicarbonate absorption in the homozygous KI mice was also markedly reduced, to less than 20% of that in wild-type mice, confirming the indispensable role of NBCe1 in bicarbonate absorption from renal proximal tubules. Alkali therapy was effective in prolonging survival and in partially improving the growth retardation and bone abnormalities of the homozygous KI mice. The survival time prolonged by alkali therapy uncovered the development of corneal opacities due to corneal edema in the homozygous KI mice. These results confirmed that normal NBCe1 activity in the corneal endothelium is essential for the maintenance of corneal transparency not only in humans but also in mice [11].

Unlike NBCe1 KO and W516X KI mice, NHE3 KO mice showed only a mild acidemia, with a blood HCO3- level of around 21 mM [76]. In the apical membranes of renal proximal tubules, Na+/H+ exchanger type 3 (NHE3) has been considered to mediate the majority of proton secretion into the lumen [77]. However, functional analysis using isolated renal proximal tubules from NHE3 KO mice revealed residual amiloride-sensitive NHE activity, which corresponded to approximately 50% of the wild-type activity [78]. This residual NHE activity, which could represent NHE8 [79], might be able to at least partially compensate for the loss of NHE3 activity. In contrast to such an effective compensation mechanism in the apical membranes, the Na+-dependent and Na+-independent Cl-/HCO3- exchangers in the basolateral membranes of renal proximal tubules [35,36] may be unable to compensate for the loss of NBCe1A activity.

### **Author details**

George Seki, Shoko Horita, Masashi Suzuki, Osamu Yamazaki and Hideomi Yamada

*Department of Internal Medicine, Faculty of Medicine, University of Tokyo, Japan*

#### **8. References**

[1] Romero MF, Hediger MA, Boulpaep EL, Boron WF. Expression cloning and characterization of a renal electrogenic Na+/HCO3- cotransporter. Nature 1997; 387: 409-413.

[2] Romero MF, Boron WF. Electrogenic Na+/HCO3- cotransporters: cloning and physiology. Annu Rev Physiol 1999; 61: 699-723.

[3] Igarashi T, Inatomi J, Sekine T, *et al.* Mutations in SLC4A4 cause permanent isolated proximal renal tubular acidosis with ocular abnormalities. Nat Genet 1999; 23: 264-266.

[4] Igarashi T, Inatomi J, Sekine T, *et al.* Novel nonsense mutation in the Na+/HCO3- cotransporter gene (SLC4A4) in a patient with permanent isolated proximal renal tubular acidosis and bilateral glaucoma. J Am Soc Nephrol 2001; 12: 713-718.

[5] Dinour D, Chang MH, Satoh J, *et al.* A novel missense mutation in the sodium bicarbonate cotransporter (NBCe1/SLC4A4) causes proximal tubular acidosis and glaucoma through ion transport defects. J Biol Chem 2004; 279: 52238-52246.

[6] Inatomi J, Horita S, Braverman N, *et al.* Mutational and functional analysis of SLC4A4 in a patient with proximal renal tubular acidosis. Pflugers Arch 2004; 448: 438-444.

[7] Horita S, Yamada H, Inatomi J, *et al.* Functional analysis of NBC1 mutants associated with proximal renal tubular acidosis and ocular abnormalities. J Am Soc Nephrol 2005; 16: 2270-2278.

[8] Demirci FY, Chang MH, Mah TS, Romero MF, Gorin MB. Proximal renal tubular acidosis and ocular pathology: a novel missense mutation in the gene (SLC4A4) for sodium bicarbonate cotransporter protein (NBCe1). Mol Vis 2006; 12: 324-330.

[9] Suzuki M, Vaisbich MH, Yamada H, *et al.* Functional analysis of a novel missense NBC1 mutation and of other mutations causing proximal renal tubular acidosis. Pflugers Arch 2008; 455: 583-593.

[10] Suzuki M, Van Paesschen W, Stalmans I, *et al.* Defective membrane expression of the Na+-HCO3- cotransporter NBCe1 is associated with familial migraine. Proc Natl Acad Sci U S A 2010; 107: 15963-15968.

[11] Lo YF, Yang SS, Seki G, *et al.* Severe metabolic acidosis causes early lethality in NBC1 W516X knock-in mice as a model of human isolated proximal renal tubular acidosis. Kidney Int 2011; 79: 730-741.

[12] Bok D, Schibler MJ, Pushkin A, *et al.* Immunolocalization of electrogenic sodium-bicarbonate cotransporters pNBC1 and kNBC1 in the rat eye. Am J Physiol Renal Physiol 2001; 281: F920-F935.

[13] Usui T, Hara M, Satoh H, *et al.* Molecular basis of ocular abnormalities associated with proximal renal tubular acidosis. J Clin Invest 2001; 108: 107-115.

[14] Gawenis LR, Bradford EM, Prasad V, *et al.* Colonic anion secretory defects and metabolic acidosis in mice lacking the NBC1 Na+/HCO3- cotransporter. J Biol Chem 2007; 282: 9042-9052.

[15] Boron WF, Chen L, Parker MD. Modular structure of sodium-coupled bicarbonate transporters. J Exp Biol 2009; 212: 1697-1706.

[16] Liu Y, Xu JY, Wang DK, Wang L, Chen LM. Cloning and identification of two novel NBCe1 splice variants from mouse reproductive tract tissues: a comparative study of NCBT genes. Genomics 2011; 98: 112-119.

[17] Abuladze N, Song M, Pushkin A, *et al.* Structural organization of the human NBC1 gene: kNBC1 is transcribed from an alternative promoter in intron 3. Gene 2000; 251: 109-122.

[18] Bevensee MO, Schmitt BM, Choi I, Romero MF, Boron WF. An electrogenic Na+-HCO3- cotransporter (NBC) with a novel COOH-terminus, cloned from rat brain. Am J Physiol Cell Physiol 2000; 278: C1200-C1211.

[19] Chesler M. Regulation and modulation of pH in the brain. Physiol Rev 2003; 83: 1183-1221.

[20] Marino CR, Jeanes V, Boron WF, Schmitt BM. Expression and distribution of the Na+-HCO3- cotransporter in human pancreas. Am J Physiol 1999; 277: G487-G494.

[21] Satoh H, Moriyama N, Hara C, *et al.* Localization of Na+-HCO3- cotransporter (NBC-1) variants in rat and human pancreas. Am J Physiol Cell Physiol 2003; 284: C729-C737.

[22] Majumdar D, Maunsbach AB, Shacka JJ, *et al.* Localization of electrogenic Na/bicarbonate cotransporter NBCe1 variants in rat brain. Neuroscience 2008; 155: 818-832.

[23] Ishiguro H, Steward MC, Lindsay AR, Case RM. Accumulation of intracellular HCO3- by Na+-HCO3- cotransport in interlobular ducts from guinea-pig pancreas. J Physiol 1996; 495 (Pt 1): 169-178.

[24] Ishiguro H, Steward MC, Wilson RW, Case RM. Bicarbonate secretion in interlobular ducts from guinea-pig pancreas. J Physiol 1996; 495 (Pt 1): 179-191.

[25] Yang D, Shcheynikov N, Zeng W, *et al.* IRBIT coordinates epithelial fluid and HCO3- secretion by stimulating the transporters pNBC1 and CFTR in the murine pancreatic duct. J Clin Invest 2009; 119: 193-202.

[26] Steward MC, Ishiguro H, Case RM. Mechanisms of bicarbonate secretion in the pancreatic duct. Annu Rev Physiol 2005; 67: 377-409.

[27] Boron WF. Acid-base transport by the renal proximal tubule. J Am Soc Nephrol 2006; 17: 2368-2382.

[28] Yoshitomi K, Burckhardt BC, Fromter E. Rheogenic sodium-bicarbonate cotransport in the peritubular cell membrane of rat renal proximal tubule. Pflugers Arch 1985; 405: 360-366.

[29] Seki G, Coppola S, Fromter E. The Na+-HCO3- cotransporter operates with a coupling ratio of 2 HCO3- to 1 Na+ in isolated rabbit renal proximal tubule. Pflugers Arch 1993; 425: 409-416.

[30] Seki G, Coppola S, Yoshitomi K, *et al.* On the mechanism of bicarbonate exit from renal proximal tubular cells. Kidney Int 1996; 49: 1671-1677.

[31] Muller-Berger S, Nesterov VV, Fromter E. Partial recovery of in vivo function by improved incubation conditions of isolated renal proximal tubule. II. Change of Na-HCO3 cotransport stoichiometry and of response to acetazolamide. Pflugers Arch 1997; 434: 383-391.

[32] Muller-Berger S, Ducoudret O, Diakov A, Fromter E. The renal Na-HCO3-cotransporter expressed in Xenopus laevis oocytes: change in stoichiometry in response to elevation of cytosolic Ca2+ concentration. Pflugers Arch 2001; 442: 718-728.

[33] Gross E, Hawkins K, Abuladze N, *et al.* The stoichiometry of the electrogenic sodium bicarbonate cotransporter NBC1 is cell-type dependent. J Physiol 2001; 531: 597-603.

[34] Chen LM, Liu Y, Boron WF. Role of an extracellular loop in determining the stoichiometry of Na+-HCO3- cotransporters. J Physiol 2011; 589: 877-890.

[35] Preisig PA, Alpern RJ. Basolateral membrane H-OH-HCO3 transport in the proximal tubule. Am J Physiol 1989; 256: F751-F765.

[36] Seki G, Fromter E. Acetazolamide inhibition of basolateral base exit in rabbit renal proximal tubule S2 segment. Pflugers Arch 1992; 422: 60-65.

[37] Stehberger PA, Shmukler BE, Stuart-Tilley AK, Peters LL, Alper SL, Wagner CA. Distal renal tubular acidosis in mice lacking the AE1 (band3) Cl-/HCO3- exchanger (slc4a1). J Am Soc Nephrol 2007; 18: 1408-1418.

[38] Zhu Q, Kao L, Azimov R, *et al.* Topological location and structural importance of the NBCe1-A residues mutated in proximal renal tubular acidosis. J Biol Chem 2010; 285: 13416-13426.

[39] Chang MH, DiPiero J, Sonnichsen FD, Romero MF. Entry to "formula tunnel" revealed by SLC4A4 human mutation and structural model. J Biol Chem 2008; 283: 18402-18410.

[40] Deda G, Ekim M, Guven A, Karagol U, Tumer N. Hypopotassemic paralysis: a rare presentation of proximal renal tubular acidosis. J Child Neurol 2001; 16: 770-771.

[41] Parker MD, Qin X, Williamson RC, Toye AM, Boron WF. HCO3--independent conductance with a mutant Na/HCO3 cotransporter (SLC4A4) in a case of proximal renal tubular acidosis with hypokalemic paralysis. J Physiol 2012: (in press) doi: 10.1113/jphysiol.2011.224733

[42] Li HC, Szigligeti P, Worrell RT, Matthews JB, Conforti L, Soleimani M. Missense mutations in Na+:HCO3- cotransporter NBC1 show abnormal trafficking in polarized kidney cells: a basis of proximal renal tubular acidosis. Am J Physiol Renal Physiol 2005; 289: F61-F71.

[43] Toye AM, Parker MD, Daly CM, *et al.* The human NBCe1-A mutant R881C, associated with proximal renal tubular acidosis, retains function but is mistargeted in polarized renal epithelia. Am J Physiol Cell Physiol 2006; 291: C788-C801.

[44] Hodson S, Miller F. The bicarbonate ion pump in the endothelium which regulates the hydration of rabbit cornea. J Physiol 1976; 263: 563-577.

[45] Jentsch TJ, Keller SK, Koch M, Wiederholt M. Evidence for coupled transport of bicarbonate and sodium in cultured bovine corneal endothelial cells. J Membr Biol 1984; 81: 189-204.

[46] Usui T, Seki G, Amano S, *et al.* Functional and molecular evidence for Na+-HCO3- cotransporter in human corneal endothelial cells. Pflugers Arch 1999; 438: 458-462.

[47] Sun XC, Bonanno JA, Jelamskii S, Xie Q. Expression and localization of Na+-HCO3- cotransporter in bovine corneal endothelium. Am J Physiol Cell Physiol 2000; 279: C1648-C1655.

[48] Mathias RT, Rae JL, Baldo GJ. Physiological properties of the normal lens. Physiol Rev 1997; 77: 21-50.

[49] Fischbarg J, Diecke FP, Kuang K, *et al.* Transport of fluid by lens epithelium. Am J Physiol 1999; 276: C548-C557.

[50] Lepple-Wienhues A, Rauch R, Clark AF, Grassmann A, Berweck S, Wiederholt M. Electrophysiological properties of cultured human trabecular meshwork cells. Exp Eye Res 1994; 59: 305-311.

[51] Bill A. Blood circulation and fluid dynamics in the eye. Physiol Rev 1975; 55: 383-417.

[52] Newman EA. Sodium-bicarbonate cotransport in retinal astrocytes and Muller cells of the rat. Glia 1999; 26: 302-308.

[53] Borgula GA, Karwoski CJ, Steinberg RH. Light-evoked changes in extracellular pH in frog retina. Vision Res 1989; 29: 1069-1077.

[54] Counillon L, Touret N, Bidet M, *et al.* Na+/H+ and Cl-/HCO3- antiporters of bovine pigmented ciliary epithelial cells. Pflugers Arch 2000; 440: 667-678.

[55] Shahidullah M, To CH, Pelis RM, Delamere NA. Studies on bicarbonate transporters and carbonic anhydrase in porcine nonpigmented ciliary epithelium. Invest Ophthalmol Vis Sci 2009; 50: 1791-1800.

[56] Schmitt BM, Berger UV, Douglas RM, *et al.* Na/HCO3 cotransporters in rat brain: expression in glia, neurons, and choroid plexus. J Neurosci 2000; 20: 6839-6848.

[57] Svichar N, Esquenazi S, Chen HY, Chesler M. Preemptive regulation of intracellular pH in hippocampal neurons by a dual mechanism of depolarization-induced alkalinization. J Neurosci 2011; 31: 6997-7004.

[58] Lipton RB, Scher AI, Kolodner K, Liberman J, Steiner TJ, Stewart WF. Migraine in the United States: epidemiology and patterns of health care use. Neurology 2002; 58: 885-894.

[59] The International Classification of Headache Disorders: 2nd edition. Cephalalgia 2004; 24 Suppl 1: 9-160.

[60] Ophoff RA, Terwindt GM, Vergouwe MN, *et al.* Familial hemiplegic migraine and episodic ataxia type-2 are caused by mutations in the Ca2+ channel gene CACNL1A4. Cell 1996; 87: 543-552.

[61] De Fusco M, Marconi R, Silvestri L, *et al.* Haploinsufficiency of ATP1A2 encoding the Na+/K+ pump alpha2 subunit associated with familial hemiplegic migraine type 2. Nat Genet 2003; 33: 192-196.

[62] Dichgans M, Freilinger T, Eckstein G, *et al.* Mutation in the neuronal voltage-gated sodium channel SCN1A in familial hemiplegic migraine. Lancet 2005; 366: 371-377.

[63] Goadsby PJ. Recent advances in understanding migraine mechanisms, molecules and therapeutics. Trends Mol Med 2007; 13: 39-44.

[64] Alper SL. Genetic diseases of acid-base transporters. Annu Rev Physiol 2002; 64: 899-923.

[65] Kao L, Sassani P, Azimov R, *et al.* Oligomeric structure and minimal functional unit of the electrogenic sodium bicarbonate cotransporter NBCe1-A. J Biol Chem 2008; 283: 26782-26794.

[66] McAlear SD, Liu X, Williams JB, McNicholas-Bevensee CM, Bevensee MO. Electrogenic Na/HCO3 cotransporter (NBCe1) variants expressed in Xenopus oocytes: functional comparison and roles of the amino and carboxy termini. J Gen Physiol 2006; 127: 639-658.

[67] Shirakabe K, Priori G, Yamada H, *et al.* IRBIT, an inositol 1,4,5-trisphosphate receptor-binding protein, specifically binds to and activates pancreas-type Na+/HCO3- cotransporter 1 (pNBC1). Proc Natl Acad Sci U S A 2006; 103: 9542-9547.

[68] Lee SK, Boron WF, Parker MD. Relief of autoinhibition of the electrogenic Na-HCO3 cotransporter NBCe1-B: role of IRBIT vs. amino-terminal truncation. Am J Physiol Cell Physiol 2012; 302: C518-C526.

[69] Ando H, Mizutani A, Matsu-ura T, Mikoshiba K. IRBIT, a novel inositol 1,4,5-trisphosphate (IP3) receptor-binding protein, is released from the IP3 receptor upon IP3 binding to the receptor. J Biol Chem 2003; 278: 10602-10612.

[70] Ando H, Mizutani A, Kiefer H, Tsuzurugi D, Michikawa T, Mikoshiba K. IRBIT suppresses IP3 receptor activity by competing with IP3 for the common binding site on the IP3 receptor. Mol Cell 2006; 22: 795-806.

[71] Thornell IM, Wu J, Bevensee MO. The IP3 receptor-binding protein IRBIT reduces phosphatidylinositol 4,5-bisphosphate (PIP2) stimulation of Na/bicarbonate cotransporter NBCe1 variants expressed in Xenopus laevis oocytes (Abstract). FASEB J 2010; 24: 815.816.

[72] Yang D, Li Q, So I, *et al.* IRBIT governs epithelial secretion in mice by antagonizing the WNK/SPAK kinase pathway. J Clin Invest 2011; 121: 956-965.

[73] Devogelaere B, Beullens M, Sammels E, *et al.* Protein phosphatase-1 is a novel regulator of the interaction between IRBIT and the inositol 1,4,5-trisphosphate receptor. Biochem J 2007; 407: 303-311.

[74] Wu J, McNicholas CM, Bevensee MO. Phosphatidylinositol 4,5-bisphosphate (PIP2) stimulates the electrogenic Na/HCO3 cotransporter NBCe1-A expressed in Xenopus oocytes. Proc Natl Acad Sci U S A 2009; 106: 14150-14155.

[75] Yamaguchi S, Ishikawa T. The electrogenic Na+-HCO3- cotransporter NBCe1-B is regulated by intracellular Mg2+. Biochem Biophys Res Commun 2008; 376: 100-104.

[76] Schultheis PJ, Clarke LL, Meneton P, *et al.* Renal and intestinal absorptive defects in mice lacking the NHE3 Na+/H+ exchanger. Nat Genet 1998; 19: 282-285.

[77] Alpern RJ. Cell mechanisms of proximal tubule acidification. Physiol Rev 1990; 70: 79-114.

[78] Choi JY, Shah M, Lee MG, *et al.* Novel amiloride-sensitive sodium-dependent proton secretion in the mouse proximal convoluted tubule. J Clin Invest 2000; 105: 1141-1146.

[79] Goyal S, Vanden Heuvel G, Aronson PS. Renal expression of novel Na+/H+ exchanger isoform NHE8. Am J Physiol Renal Physiol 2003; 284: F467-F473.

**Chapter 9**

## **The Mutations and Their Relationships with the Genome and Epigenome, RNAs Editing and Evolution in Eukaryotes**

Daniel Frías-Lasserre

© 2012 Frías-Lasserre, licensee InTech. This is an open access chapter distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


Additional information is available at the end of the chapter

http://dx.doi.org/10.5772/49968

*"Mutations have been crucial for geneticists, as day and night for astronomers. Without the successions of days and night we would not know about stars. Without mutations we would know very little about inheritance and the existence of genes."*

Gustavo Hoecker Salas

(December 5, 1915- March 19, 2008 )

National Prize of Science of Chile in 1989

## **1. Introduction**

The idea of variation in nature is very old. In Heraclitus of Ephesus (504-500 BC) we find the first ideas of change, when he stated: "we never bathed in the same river". However, in the field of biology, the Greeks considered species to be immutable. This concept changed with the first scientific ideas of organic evolution and heredity. Lamarck proposed the first evolutionary theory, in which organisms evolved from simple forms. He also proposed a hereditary model in which environmental influences are very important agents of evolutionary change, and he proposed the theory of acquired characters. With the advent of Mendelism, Lamarck's theory was left behind and all mutations in living organisms were attributed to Mendelian "factors". However, in recent years, with the development of epigenetics, genomic imprinting and horizontal gene transfer, Lamarck's ideas have resurfaced.

The concept of mutation was coined by Hugo De Vries in 1901. He worked with plant species of the genus *Oenothera*, in which he discovered some heritable phenotypic characteristics that he called "mutations"; he called the individuals carrying these phenotypic alterations "mutants". In De Vries's opinion, these mutations give rise to a new species that he named "elementary species" [1], [2]. This gave birth to the saltationist theory of evolution that he described in his book entitled "Mutations". The harmony between the mutation theory and the Mendelian model of heredity, the simplicity of the experimental method and the vast accumulation of supporting data explain its big impact on the biological world [3]. De Vries also ventured a hypothesis: "With the knowledge of the principles of the mutations will be possible in the future to induce mutations artificially" [4]. Wilhelm Johannsen argued that evolution consisted of discontinuous changes between "pure lines" and carried out his classic experiments in the bean *Phaseolus vulgaris*, through which he coined the concepts of phenotype, genotype and gene [5]. Another important step in the advance of genetics as an experimental discipline was the establishment of the relationships between mutations and genes, discovered by Thomas Hunt Morgan in 1939 using *Drosophila* as biological material. Later, Timoféeff-Ressovsky distinguished between mutations at the gene level and chromosomal aberrations. Morgan gave the name mutations to these changes in individual genes with variable effects [6]. Years later, Morgan refined the gene concept as "the hereditary unit indivisible by recombination, located in the loci in a homologous chromosomal pair that can spontaneously mutate and belong to the linkage unit" [7]. In the framework of this concept, genes are located in a fixed position, specifically in a locus, a concept coined by Morgan in 1915, and could change position only by structural chromosomal reorganization [6]. This concept was accepted by the great majority of the scientific community of the time and prevailed until the discovery of transposable genetic elements in the second half of the last century. However, it is necessary to refer to some exceptions to the classic concept of the gene. Richard Goldschmidt, in his book *Theoretical Genetics*, denied the existence of a corpuscular gene; in his opinion, in the chromosome there is only a definite pattern of changes that corresponds to the mutation, and "the mutation create the gene" [8].


gene was simultaneously a unit of mutation and function and was indivisible by recombination [7]. In 1909, Archibald Garrod was interested in explaining the origins and inheritance of human diseases. He was also the first to propose that a gene is directly related to the production of a specific protein, which established the genetic control of some inborn errors of metabolism. He showed that an alteration in an enzyme was linked to amino acid metabolism. In 1941, Beadle and Tatum postulated the "one gene-one enzyme" hypothesis, according to which each gene controls the production, function and specificity of a particular enzyme. Studies conducted in different organisms showed that the loss of the capacity to synthesize a given amino acid is caused by the modification or loss of a single enzyme. This concept was modified by Vernon Ingram, who postulated the "one gene-one polypeptide" hypothesis on the basis of sickle cell anemia in humans. Ingram also postulated that this disease is caused by a single gene mutation that is lethal in homozygotes, who have severe sickle cell anemia, and semilethal in heterozygotes, who show an attenuated form of the disease. Normal homozygous individuals have normally shaped blood cells, and in electrophoretic analysis their hemoglobin migrated differently from that of heterozygous individuals. Fingerprinting showed that the difference between normal and diseased individuals was a single amino acid substitution in the beta chain of the polypeptide: the glutamic acid of normal individuals is replaced by valine in individuals with sickle cell anemia. The codons for valine and glutamic acid differ by only one base. Moreover, the amino acid changes in one chain are independent of changes in the other chain, suggesting that the genes determining the alpha and beta chains are located in different loci.

Thus, one gene codes for one polypeptide, and several polypeptides may be necessary for a functional enzyme of the organism. In 1961, Seymour Benzer, studying the fine structure of genes by using mutants of phage T4 of *E. coli*, used the concept of the cistron for the first time. Within a gene there are different cistrons or "functional units". Benzer confirmed Ingram's hypothesis: the cistron corresponds to a sequence of nucleotides that codes for a polypeptide chain [9, 10].

G.W. Beadle, E.L. Tatum in *Neurospora* ,Lederberg and Tatum in *E.coli* [13].

The ideas about the genetic action and its mutability were complemented by Goldschmidt in1940 [11] who defined the gene on the basis of its physiological action. With the first DNA sequencing by Frederick Sanger , it was clearly demostrated by C. Yanofsky that the gene is a nucleotide sequence that encodes for proteins. Thus, within the genes there are information for the amino acid sequence of the primary structure of protein. [12] Any mutation at nucleotides of a gene may cause an alteration in the primary structure of the protein. Depending on the phenotypic effect causing these mutations can be lethal, semiletal, deletereos or innocuous (silent mutation). Many researchers were interested in inducing mutations with differents agents in plants and animals such as Hermann Muller in *Drosophila* ,Milislav Demerec in bacteria, Áke Gustafsson in barley, George Snell in mice,

An important step in the process of regulation of gene expression were the Jacob and Monod experiments in *E. coli*. Using mutations were able to establish the first model of expression and gene silencing in prokaryotes. Based in pioneering works of Calvin Bridges and Goldschmidt on the effects of homeotic mutation on development in *Drosophila*, García-

species that he named "elementary species" [1, 2]. This gave birth to the saltationist theory of evolution, which he described in his book entitled "Mutations". The harmony between the Mutation Theory and the Mendelian model of heredity, the simplicity of the experimental method and the vast accumulation of supporting data explain its large impact on the biological world [3]. De Vries also ventured a hypothesis: "With knowledge of the principles of mutations, it will be possible in the future to induce mutations artificially" [4]. Wilhelm Johannsen argued that evolution consisted of discontinuous changes between "pure lines" and carried out his classic experiments in the bean *Phaseolus vulgaris*, through which he coined the concepts of phenotype, genotype and gene [5]. Another important step in the advance of genetics as an experimental discipline was the establishment of the relationship between mutations and genes, discovered by Thomas Hunt Morgan in 1939 using *Drosophila* as biological material. Later, Timoféeff-Ressovsky distinguished mutations at the gene level from chromosomal aberrations. Morgan gave the name mutations to these changes in individual genes with variable effects [6]. Years later, Morgan refined the gene concept as "the hereditary unit indivisible by recombination, located in the loci in a homologous chromosomal pair that can spontaneously mutate and belong to the linkage unit" [7]. Within this concept, genes are located in a fixed position, specifically in a locus, a concept coined by Morgan in 1915, and can change position only by structural chromosomal reorganization [6]. This concept was accepted by the great majority of the scientific community of the time and prevailed until the discovery of transposable genetic elements in the second half of the last century. However, some exceptions to the classic concept of the gene must be mentioned. Richard Goldschmidt, in his book *Theoretical Genetics*, denied the existence of a corpuscular gene; in his opinion, the chromosome contains only a definite pattern of changes that corresponds with the mutation, and "the mutation creates the gene" [8].

Mutations have historically been the cornerstone of biological disciplines: in basic science, to understand biodiversity and the evolution of species; in medicine, to explain phenotypic variation and disease; in education, to account for the individual differences found among students within a classroom; and in agriculture and veterinary science, to improve plants and animals useful to man. Mutations have thus allowed the explosive growth of genetics as an experimental science. In multicellular organisms, cell differentiation requires a series of genetic and epigenetic changes. Mutations (epimutations) can also occur post-transcriptionally in the different types of RNA that constitute the epigenome. This article explores this theme in the framework of the adaptation, phenotypic plasticity and evolution of eukaryotes.

#### **2. Mutations at genome level**

At the beginning of genetics as an experimental discipline, mutations were associated with the classic Mendelian genes and, with the advent of molecular genetics, these genetic changes were located in the coding regions of DNA. A gene occupied a definite place in the chromosome that was associated with a well-determined phenotype; thus the gene was simultaneously a unit of mutation and of function and was indivisible by recombination [7]. In 1909, Archibald Garrod set out to explain the origins and inheritance of human diseases. He was the first to propose that a gene is directly related to the production of a specific protein, which establishes the genetic control of some inborn errors of metabolism. He showed that an alteration in an enzyme was linked to amino acid metabolism. In 1941, Beadle and Tatum postulated the "one gene-one enzyme" hypothesis, according to which each gene controls the production, function and specificity of a particular enzyme. Studies conducted in different organisms showed that the inability to synthesize a given amino acid is caused by the modification or loss of a single enzyme. This concept was modified by Vernon Ingram, who postulated the "one gene-one polypeptide" hypothesis on the basis of sickle cell anemia in humans. Ingram also postulated that this disease is caused by a single gene mutation that is lethal in homozygotes, who have severe sickle cell anemia, and semilethal in heterozygotes, who show an attenuated form. Normal homozygous individuals have normally shaped blood cells, and in electrophoretic analysis their hemoglobin migrated differently from that of heterozygous individuals. Fingerprinting showed that the only difference between normal and affected individuals was a single amino acid substitution in one of the beta chains of the polypeptide: the glutamic acid of normal individuals is replaced by valine in individuals with sickle cell anemia. The codons for valine and glutamic acid differ by only one base. Moreover, amino acid changes in one chain are independent of changes in the other, suggesting that the genes determining the alpha and beta chains are located at different loci. Thus one gene codes for one polypeptide, and several polypeptides may be necessary for a functional enzyme. In 1961, Seymour Benzer, studying the fine structure of genes using mutants of phage T4 of *E. coli*, used the concept of the cistron for the first time. Within a gene there are different cistrons or "functional units". Benzer confirmed Ingram's hypothesis: the cistron corresponds to a sequence of nucleotides that codes for a polypeptide chain [9, 10].

The ideas about gene action and mutability were complemented by Goldschmidt in 1940 [11], who defined the gene on the basis of its physiological action. With the first DNA sequencing by Frederick Sanger, it was clearly demonstrated by C. Yanofsky that the gene is a nucleotide sequence that encodes proteins. Thus, genes contain the information for the amino acid sequence of the primary structure of a protein [12]. Any mutation in the nucleotides of a gene may alter the primary structure of the protein. Depending on their phenotypic effect, these mutations can be lethal, semilethal, deleterious or innocuous (silent mutations). Many researchers were interested in inducing mutations with different agents in plants and animals, such as Hermann Muller in *Drosophila*, Milislav Demerec in bacteria, Åke Gustafsson in barley, George Snell in mice, G.W. Beadle and E.L. Tatum in *Neurospora*, and Lederberg and Tatum in *E. coli* [13].
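The link between a single-base change and the primary structure of the protein can be illustrated with a short Python sketch. It is only an illustration, not part of the original chapter: it assumes the standard genetic code written for coding-strand DNA codons, and the function name `classify_point_mutation` is hypothetical. It labels a substitution at the codon level only (silent, missense, nonsense); whether a missense change is lethal, semilethal or deleterious depends on the protein and the organism and cannot be read from the codon. The example uses the classical sickle cell substitution, in which a GAG codon (glutamic acid) becomes GTG (valine).

```python
from itertools import product

# Standard genetic code, listed in the conventional T, C, A, G order
# (first base varies slowest, third base fastest). "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_point_mutation(codon: str, position: int, new_base: str) -> str:
    """Classify a single-base substitution in one coding-strand DNA codon.

    position is 0, 1 or 2 (index of the substituted base within the codon).
    """
    mutant = codon[:position] + new_base + codon[position + 1:]
    before, after = CODON_TABLE[codon], CODON_TABLE[mutant]
    if before == after:
        return "silent"      # same amino acid: primary structure unchanged
    if after == "*":
        return "nonsense"    # premature stop codon: truncated polypeptide
    if before == "*":
        return "stop-loss"   # stop codon replaced by an amino acid
    return "missense"        # one amino acid replaced by another


if __name__ == "__main__":
    # Sickle cell substitution: GAG (Glu) -> GTG (Val), a one-base change.
    print(classify_point_mutation("GAG", 1, "T"))
    # GAG -> GAA still encodes Glu.
    print(classify_point_mutation("GAG", 2, "A"))
    # GAG -> TAG creates a stop codon.
    print(classify_point_mutation("GAG", 0, "T"))
```

The table is built from a single 64-character string so that the mapping can be checked against any published codon table; a different genetic code (for example a mitochondrial one) would only require replacing that string.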

An important step in understanding the regulation of gene expression was the Jacob and Monod experiments in *E. coli*. Using mutations, they were able to establish the first model of gene expression and silencing in prokaryotes. Building on the pioneering work of Calvin Bridges and Goldschmidt on the effects of homeotic mutations on development in *Drosophila*, García-Bellido and Lewis proposed a model of the genetic regulation of development in eukaryotes [14, 15]. Homeotic mutations have been fundamental to explaining the genetic basis of development, adaptation and evolution in eukaryotic organisms. In recent years, however, it has been found that regions of DNA that do not code for proteins are transcribed into an enormous amount of non-coding RNA (ncRNA), which, together with proteins, regulates gene expression. These RNAs, including rRNA and tRNA, together with mRNA and chromatin, are part of the epigenome. Mutations at the level of the epigenome have been called epimutations; they also cause phenotypic changes, including diseases but also evolutionary novelties, which can even be inherited through a non-Mendelian pattern of inheritance. This important topic is explored in the following sections.

#### **3. Epimutations at epigenome level**

The epigenome is a recent concept in genetics that arose from the concept of epigenetics. The epigenome involves chemical changes at the DNA level, such as methylation, as well as histone acetylation, chromatin remodeling and phenotypic changes that originate from ncRNAs [16]. Epigenetics is an older concept, coined in 1942 by Conrad H. Waddington to explain how an adult can be formed from a zygote by cell differentiation and gene regulation. In a multicellular organism, each cell has an epigenotype that is determined by which genes are functioning in that particular cell. The differentiation of multicellular organisms is controlled by epigenetic marks that are transmitted through cell division. It has also been demonstrated, however, that epigenetic changes in the germ line can be inherited transgenerationally. Epigenetics refers to heritable changes in gene expression that do not involve a change in the nucleotide sequence of DNA but only changes in chromatin. These changes alter the capacity of genes to respond to external signals [17]. Epigenetic changes allow heritable or transgenerational modifications of gene expression without the need for mutations at the DNA level, and they do not necessarily follow the Mendelian model of heredity. In the classical Mendelian model, a gene's effects were assumed to be independent of its parental origin, but it is now known that some genes have different effects depending on whether they were inherited via a sperm or an egg. This process is known as genomic imprinting. There is now considerable evidence that genomic imprinting may even influence human behavior. It is known that children who inherit a chromosomal deletion of 15q11-q13 from their father behave differently from children who inherit a similar deletion from their mother [18, 19, 20]. Experimental mouse models also show that in utero or early-life environmental exposures produce effects that can be inherited transgenerationally and are accompanied by epigenetic alterations [21]. These changes in the epigenome have been named "epimutations". In humans, only a few reports have been used to suggest inheritance of epimutations, and the search for such epigenetic inheritance is under way [18]. Some evidence has been described in colorectal cancer [22, 23, 24, 25].

The concepts of epigenetics and epimutation also extend to ncRNAs, which have diverse functions and in the human genome account for about 60% of the total transcriptional output [26, 27, 28, 29, 30]. The ncRNAs include short single-stranded RNAs of 18 to 30 nt, such as microRNAs (miRNAs), small interfering RNAs (siRNAs), small nuclear RNAs (snRNAs), small nucleolar RNAs (snoRNAs) and Piwi-interacting RNAs (piRNAs), as well as long ncRNAs (lncRNAs) of 200-2800 nt. All these ncRNAs form hairpins that are base-paired in some places, similar to tRNAs. The homologies detected between ncRNAs and endogenous viruses, transposons and introns suggest that ncRNAs probably originated from RNA viruses [31]. In the eukaryotic genome, ncRNAs are located in the non-coding regions of mRNAs, in endogenous viruses and in transposons, and are also transcribed from non-coding DNA regions. The ncRNAs are not translated into proteins and take part in a great variety of processes, including genomic imprinting, enhancement of transcriptional regulation, mRNA processing and modification, sex determination by dosage compensation, protein degradation, oncogenic and tumor-suppressive activity, the neural and synaptic plasticity underlying learning, memory and cognitive capacity (by regulating dendrite morphogenesis during early development), and defense against viruses and transposons [28, 29, 30, 32, 33, 34]. Most mRNA stability elements are considered to be located in the 5′ and 3′ untranslated regions (UTRs) of genes, where ncRNAs are located [35, 36]. The following sections detail the features and functions of each class of ncRNA in eukaryotes and describe the effects of mutations on the origin of disease and on the adaptation and evolution of species. Table 1 shows the principal hallmark characteristics of these small and long ncRNAs.

| Name | Length in nucleotides (nt) | Principal functions | References |
|---|---|---|---|
| siRNAs | 21-23 | mRNA cleavage | [41] |
| miRNAs | 21-23 | Regulate developmental timing | [50-52] |
| piRNAs | 29-30 | Transposon silencing in gametes | [61] |
| snRNAs | 90-216 | Efficiency of splicing, maintaining telomeres | [69-70] |
| snoRNAs | < 70 | Guide methylation of rRNAs, tRNAs and other snRNAs | [69,72] |
| lncRNAs | 200-2800 | X chromosome inactivation, human brain development | [76,77,94] |

Note: lncRNAs always act in cis on the chromosome, and small ncRNAs act in trans [76].

**Table 1.** Principal hallmark characteristics of small and long non-coding RNAs

#### **4. The mutations at non-coding RNAs level**

#### **4.1. Short interfering RNAs**


The eukaryotic genome encodes a large number of short interfering RNAs in different cells and tissues, principally miRNAs, siRNAs and piRNAs, which are less than 200 nt long and highly conserved. These short ncRNAs are engaged in specific gene regulation, modulate the development of several eukaryotic organisms, including mammals, and are involved in gene silencing in higher eukaryotes [27, 37]. They act by binding to complementary sites on target mRNAs to induce cleavage or repression of translation in a specific manner. Thus these ncRNAs can participate in the degradation of specific mRNA sequences. Mutations in proteins required for miRNA function or biogenesis can also affect animal development [37, 38, 39, 40]. The target genes and the mechanism of target suppression are generally unknown, because miRNAs have a very short nucleotide sequence and because their base pairing with target mRNAs may be affected by a protein complex [38]. Unlike those of animal miRNAs, the targets of plant miRNAs are more easily identified because of their near-perfect complementarity to the miRNA; plant miRNAs act as siRNAs and destroy their target mRNAs [41]. In plants, miRNA target sites are generally found within the protein-coding segment of the target mRNA, whereas in animals they are found in the 3′ untranslated region (3′ UTR) [40, 41].

MiRNAs and siRNAs are processed from double-stranded RNA precursors of about 70 nt by a specific ribonuclease, DICER, which excises long RNA into short duplexes of 21-23 nucleotides called siRNAs and miRNAs. Only one type of DICER is found in *C. elegans* and humans, indicating that the same DICER acts on both miRNA and siRNA precursors [42, 43]. In *Drosophila*, however, two mutants, Dicer 1 and Dicer 2, have been discovered: Dicer 1 mutations block the processing of miRNA precursors, whereas Dicer 2 mutations block the processing of siRNA precursors [44]. The excised short RNAs associate with ARGONAUTE proteins to constitute an RNA-induced silencing complex (RISC), which targets near-perfectly complementary RNAs for degradation or for translational control [38, 45]. In contrast to DICER, studies in *C. elegans* and in *Drosophila* embryos suggest that the maturation and function of siRNAs and miRNAs have different requirements for Argonaute proteins [45]. Mutations in the proteins required for miRNA function or biogenesis impair animal development [46].

MicroRNAs are highly conserved across a wide range of species, so it is not uncommon for homologies to be described in miRNA binding sites [38, 47]. A large subset of *Drosophila* miRNAs with homologs in the human genome was shown to be perfectly complementary to several classes of sequence motifs previously demonstrated to mediate negative post-transcriptional regulation [48, 49]. The study of miRNA function began with the founding members, lin-4 and let-7, genes that regulate developmental timing and were discovered by molecular analysis in *Caenorhabditis elegans* [50, 51]. Both are 21-22 nt RNAs associated with apparent precursor RNAs with a stem-loop structure, and both mediate post-transcriptional regulation of target mRNAs via imperfectly complementary sites in their 3′ UTRs [37]. MiRNAs play significant regulatory roles in physiological aspects of development and in pathologies in plants, flies, fish and mammals [52]. In *C. elegans*, the lsy-6 miRNA regulates left-right asymmetry in the nervous system [34]; in *Drosophila*, the bantam miRNA controls tissue growth and apoptosis [39], and miR-14 suppresses cell death and is required for normal fat metabolism [53].
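The contrast drawn above between near-perfect complementarity (leading to cleavage of the target) and imperfect complementarity (leading to translational control) can be made concrete with a small Python sketch. It is a toy model under stated assumptions, not a target-prediction method: the 21-nt sequence is invented for illustration and is not a real miRNA, only Watson-Crick pairs are counted (G:U wobble pairs are ignored), and the threshold `max_mismatches` is an arbitrary illustrative parameter. All function names are hypothetical.

```python
# Watson-Crick complement for RNA bases.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Return the perfectly complementary site for an RNA written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def count_mismatches(small_rna: str, target_site: str) -> int:
    """Count unpaired positions when small_rna is aligned antiparallel
    to a target site of the same length (both written 5'->3')."""
    if len(small_rna) != len(target_site):
        raise ValueError("sequences must have the same length")
    # Base i of the small RNA faces base -(i + 1) of the target site.
    return sum(
        COMPLEMENT[a] != b for a, b in zip(small_rna, reversed(target_site))
    )


def predicted_outcome(small_rna: str, target_site: str,
                      max_mismatches: int = 1) -> str:
    """Toy rule: a near-perfect duplex suggests cleavage, otherwise
    translational repression. The threshold is an assumption."""
    if count_mismatches(small_rna, target_site) <= max_mismatches:
        return "cleavage"
    return "translational repression"


if __name__ == "__main__":
    small_rna = "AUGGCUAGCUAGGAUCCGAUA"  # invented 21-nt sequence
    perfect_site = reverse_complement(small_rna)

    # Introduce three mismatches by swapping A<->C and G<->U at fixed sites.
    swap = {"A": "C", "C": "A", "G": "U", "U": "G"}
    imperfect = list(perfect_site)
    for index in (3, 10, 15):
        imperfect[index] = swap[imperfect[index]]
    imperfect_site = "".join(imperfect)

    print(predicted_outcome(small_rna, perfect_site))
    print(predicted_outcome(small_rna, imperfect_site))
```

A realistic analysis would also weight the position of each mismatch and allow bulges and wobble pairs; the sketch only shows why a highly complementary site is easier to recognize computationally than an imperfect one.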
In *Bombyx mori*, miRNAs have been found to be related to the molting stages and, based on the analysis of target genes, it has been hypothesized that miRNAs regulate development at complex stages [54]. In the mouse, miR-375 is a pancreatic islet-specific miRNA that regulates insulin secretion [55], and miR-181 is important in hematopoietic differentiation [56]. In Texel sheep, the myostatin gene GDF8 was identified on chromosome 2. This gene has a major effect on muscle mass and has also been found to be related to a miRNA that is highly expressed in skeletal muscle. A G-to-A transition in the 3′ UTR occurs in one allele of GDF8; this mutation inhibits the production of myostatin, causing muscular hypertrophy [57]. MiRNAs also play a role in the normal development and function of heart muscle in vertebrates. In mouse embryos, overexpression of the miRNA miR-1 in the heart during mid-embryogenesis caused lethality due to cardiomyocyte deficiency and heart failure [58].

There is considerable evidence that mutations in miRNAs cause disease in humans. For example, karyotyping shows that chronic lymphocytic leukemia (CLL) has a genetic basis consisting of a deletion located at chromosome 13q14. This deletion is also associated with other diseases, such as mantle cell lymphoma, multiple myeloma and prostate cancer [59]. In humans, hemizygous and/or homozygous loss at 13q14 has been shown to be the most frequent chromosomal abnormality in CLL. Two miRNA genes, miR15 and miR16, are located within a 30-kb region deleted in CLL, and both are deleted or down-regulated in the majority of CLL cases [42].

In plants, many miRNA targets encode transcription factors that are important in the regulation of morphogenesis and, owing to their high complementarity with their mRNA targets, plant miRNAs act as siRNAs, guiding the destruction of the target mRNA. In plants, miRNA target sites are found principally within the protein-coding segment of the target mRNA, whereas in animals miRNAs act in the 3′ untranslated region (3′ UTR) [40, 41, 60]. A set of 3′ UTR motifs, such as the Brd-box (AGCUUUA), the K-box (CUGUGAUA) and the GY-box (GUCUUCC), were characterized as motifs involved in the negative post-transcriptional regulation of genes in the Enhancer of split and Brd gene complexes of *Drosophila*; the 5′ ends of miRNAs may be important for target recognition [37].

#### **5. Mutations in Piwi interacting non-coding RNAs**

PiRNAs are another class of small ncRNA molecules, 29-30 nt in length, that form the piRNA-induced silencing complex (piRISC) in the germ line of many animal species. Piwi proteins bind to piRNAs, which map to transposons. PiRNAs are important regulators of gametogenesis and have been proposed to play roles in transposon silencing [61].

PiRNAs are produced by the primary processing of single-stranded transcripts of heterochromatic master loci [62]. The piRISC complex protects the integrity of the genome against invasion by transposable elements and other genetic elements, such as viruses, by silencing them. PiRNAs are expressed only in the gonads, especially during spermatogenesis, where they regulate meiosis [63, 64], but they have also been described during oogenesis [61]. As a result of the loss of piRNA silencing, piwi mutations in *Drosophila* lead to overexpression of transposable elements and cause a transposition burst. Female piRNA mutants exhibit two types of abnormalities: overexpression of transposons and severely underdeveloped ovaries [62, 65].

Piwi proteins and piRNAs have conserved functions in transposon silencing in the embryonic male germ line. Piwi proteins are proposed to be piRNA-guided endonucleases that initiate secondary piRNA biogenesis. The biogenesis and amplification of piRNAs are fundamental for the silencing of LINE1 transposons. Experimental data from mice carrying mutations in the Mili and Miwi2 alleles revealed that defective piRNAs result in spermatogenic failure and sterility [66]. The relevance of the non-coding genome in human disease has mainly been studied in the context of the widespread disruption of miRNAs


PiRNAs are produced by the primary processing of single-stranded transcripts of heterochromatic master loci [62]. The piRISC complex protects the integrity of the genome against invasion by transposable elements and other genetic elements such as viruses by silencing them. PiRNAs are expressed only in the gonads, especially during spermatogenesis, where they regulate meiosis [63, 64], but they have also been described during oogenesis [61]. As a result of the loss of piRNA silencing, piwi mutations in *Drosophila* lead to overexpression of transposable elements and cause a transposition burst. Female piRNA mutants exhibit two types of abnormalities: overexpression of transposons and severely underdeveloped ovaries [62, 65].

Piwi proteins and piRNAs have conserved functions in transposon silencing in the embryonic male germ line. Piwi proteins are proposed to be piRNA-guided endonucleases that initiate secondary piRNA biogenesis. The biogenesis and amplification of piRNAs are fundamental for the silencing of LINE1 transposons. Experimental data from mice carrying mutations in the Mili and Miwi2 alleles revealed that defective piRNAs result in spermatogenic failure and sterility [66].

The relevance of the non-coding genome in human disease has mainly been studied in the context of the widespread disruption of miRNA expression and function that is seen in human cancer. At present we are only beginning to understand the nature and extent of the involvement of other ncRNAs in disease: piRNAs, snoRNAs, transcribed ultraconserved regions (T-UCRs) and large intergenic non-coding RNAs (lincRNAs) are emerging as key elements of cellular homeostasis [67]. Genomic imprinting causes parental origin-specific monoallelic gene expression through differential DNA methylation established in the parental germ line. However, the mechanisms by which specific sequences are selectively methylated are not fully understood. Components of the piRNA pathway have been found to be required for de novo methylation of the differentially methylated region (DMR) of the imprinted mouse Rasgrf1 locus, but not of other paternally imprinted loci. A retrotransposon sequence within an ncRNA spanning the DMR was targeted by piRNAs generated from a different locus. A direct repeat in the DMR, which is required for the methylation and imprinting of Rasgrf1, served as a promoter for this RNA. A model has been proposed in which piRNAs and a target RNA direct the sequence-specific methylation of Rasgrf1 [68].


#### **6. Mutations in small nuclear ncRNAs**

SnRNAs are short RNA molecules that are located within the cell nucleus and participate in a variety of processes such as RNA splicing, regulation of transcription factors (7SK RNA) or of RNA polymerase II (B2 RNA), and telomere maintenance [69]. RNA-RNA interactions between snRNAs, or between snRNAs and pre-mRNAs, play critical roles in the accuracy and efficiency of splicing. The snRNAs also combine with protein factors to form RNA-protein complexes called small nuclear ribonucleoproteins (snRNPs). The presence of dynamic RNA-RNA interactions within a ribonucleoprotein (RNP) complex such as the spliceosome suggests that the snRNAs themselves may need to adopt more than one RNA conformation in order to execute their functions during splicing. Not all of these interactions are established simultaneously, nor do they persist once established. Rather, interactions are formed, modified, disrupted and replaced during spliceosome assembly and splicing [70]. Because of the complex structure of the spliceosome and the varied interactions between its subunits, any mutation in the nucleotide sequence of an snRNA alters some of its interactions and functions. Thus, it has been demonstrated in yeast that alternative RNA folding can cause cold-sensitive RNA function and that, in the case of U2 snRNA, for which the potential to form the alternative structure is conserved, disrupting the alternative folding relieves the cold-sensitive defect. This finding suggests that alternative RNA folding may provide a general explanation for the common occurrence of cold-sensitive mutations in RNA and RNA-binding proteins [70]. In the yeast *Schizosaccharomyces pombe* there are pre-mRNA processing (prp) mutants that are temperature sensitive or cold sensitive for growth. Some of these mutants accumulate the U6 snRNA precursor at the nonpermissive temperature [71]. Small nucleolar RNAs (snoRNAs) are ancient ncRNAs that guide the methylation of rRNAs, tRNAs and other snRNAs.
These snoRNAs are less than 70 nt in length and include 10-20 nucleotides of antisense elements for base pairing. rRNA processing involves a number of snoRNAs [69, 72]. These activities involve direct base pairing of the snoRNA with pre-rRNA using different domains. A mutation consisting of a single-nucleotide insertion in the guide domain shifts modification to an adjacent uridine in rRNA, and severely impairs both processing and cell growth [73]. The U3 and U14 snoRNAs have been implicated in processing steps leading to 18S rRNA formation in eukaryotes. In addition, 18S rRNA formation requires U22 snoRNA in vertebrates, and snR10 and snR30 snoRNAs in yeast. The role of snoRNAs in rRNA processing is distinct from the function of the majority of snoRNAs, which serve as guide RNAs for rRNA modification. Mutations in the U3 snoRNA of *Xenopus* were tested for function in oocytes. The results show that U3 mutagenesis uncoupled cleavage at sites 1 and 2, flanking the 5' and 3' ends of 18S rRNA, and generated novel intermediates: 19S and 18.5S pre-rRNAs [74]. A further study revealed that budding yeast snoRNA gene promoters are typically demarcated by a single, precisely positioned binding site for the telomere-associated protein Tbf1, which is required for full snoRNA expression. Tbf1 is known to bind to subtelomeric regions of *S. cerevisiae* chromosomes, where it contributes to the maintenance of telomere length and the regulation of telomeric gene silencing. The subtelomeric binding protein Tbf1 is a global transcriptional activator in budding yeast, where it activates snoRNA genes [75].

#### **7. Mutations in macro or long non-coding RNAs**


Macro or long non-coding RNAs (lncRNAs) are conserved and, unlike the short RNAs, always act in cis on the chromosomes; they range in length from about 200-2800 nt up to several hundred thousand nucleotides. In eukaryotic genomes, and especially in mammals, there are thousands of lncRNAs that are expressed in different cell lines and tissues and exhibit tissue-specific expression patterns. At present the function and stability of only a small number of lncRNAs are known, although it has been assumed that they are generally unstable. Recently, a genome-wide analysis in mouse neuroblastoma cells using a custom ncRNA array determined that lncRNAs show a range of half-lives similar to that of protein-coding transcripts, suggesting that lncRNAs are not generally unstable, that their stability is a regulated process, and that it depends on where in the genome they are located. Thus, intergenic lncRNAs are more stable than those derived from the introns of mRNAs [76]. It is also known that in mammals lncRNAs have different regulatory functions, principally X chromosome inactivation by heterochromatinization: the Xist RNA coats the inactive X chromosome from which it is transcribed. This represents part of the mechanism by which transcriptional silencing is achieved [77]. The roX lncRNAs in flies play a role in dosage compensation in sex determination similar to that of the XIST gene in mammals [78]. LncRNAs are also involved in the regulation of transcriptional and post-transcriptional programs, the regulation of mRNA splicing, epigenetic gene activation in the regulation of the Hox genes that control development, genomic imprinting, the enhancement of gene expression, and the length of telomeres in the chromosomes [79,80,81,82,83,84,85,86,87,88,89,90].

In addition, several lncRNAs have been shown to be misregulated in various diseases including cancer and neurological disorders [83, 91]. One such lncRNA is Malat1 (metastasis-associated lung adenocarcinoma transcript 1). Malat1 is also highly abundant in neurons, and it is enriched only when RNA polymerase II-dependent transcription is active. Knock-down studies revealed that Malat1 modulates the recruitment of SR family pre-mRNA-splicing factors to the transcription site of a transgene array. Malat1 controls the expression of genes involved not only in nuclear processes, but also in the function of the synapse. In cultured hippocampal neurons, knock-down of Malat1 decreases synaptic density, whereas its over-expression results in a cell-autonomous increase in synaptic density. These results suggest that Malat1 regulates synapse formation by modulating the expression of genes involved in synapse formation [91]. LncRNAs are present not only in animals but also in plants, where they are involved in gene silencing and in phenotypic plasticity [92]. In mouse, a lncRNA named Rubie (RNA upstream of BMP4 expressed in inner ear) is associated with malformation of the vestibular apparatus. Rubie is expressed in developing semicircular canals. The SWR/J allele of Rubie was found to be disrupted by an intronic endogenous retrovirus that causes abnormal splicing and premature polyadenylation of the transcript. Rubie lies in the conserved gene desert upstream of Bmp4, within a region previously shown to be important for inner ear expression of Bmp4 [93]. In vertebrates, and specifically in humans, mutations have also been described in transposable elements that are related to neurodegenerative diseases. One such mutation was located in a degenerated long interspersed element (LINE). This mutation is expressed in the brain and causes lethal infantile encephalopathy, suggesting that these repetitive elements are important in human brain development [94].


#### **8. The RNA editing**

Epimutations in ncRNAs are very important for the adaptation of organisms and could also be heritable. Traditionally, mutations have been considered to be nucleotide changes that occur at the DNA level and to be the only new source of genetic variation. However, a special epigenetic regulatory mechanism was discovered in the mitochondria of the protozoan *Trypanosoma*, in which a number of genes are expressed in an unconventional manner: the nucleotide sequence of primary transcripts is modified post-transcriptionally through the insertion or deletion of uridine. This nucleotide alteration was termed RNA editing [95, 96] and should also be considered a "post-transcriptional epimutation". RNA editing has been detected in unicellular and multicellular eukaryotes but not in prokaryotes. After this discovery, it was thought that the process affected only mRNAs, but it is now known that editing also occurs in tRNAs, rRNAs and miRNAs [73, 97, 98, 99]. In humans, RNA editing is a change of adenosine to inosine mediated by the enzyme adenosine deaminase acting on double-stranded RNA, in which the inosine acts as guanosine [73, 98]. In mammals another kind of RNA editing has also been described, consisting of a change of cytosine to uridine [100]. This unexpected epigenetic mechanism, which occurs only in eukaryotes, changes the function of mutations at the DNA level and their importance in the evolution of prokaryotes and eukaryotes. Thus epimutations in ncRNAs are also very important in the adaptation of eukaryotes, especially in the reaction norm and phenotypic plasticity.
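The effect of the two substitution types of editing described above on the message that is read can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions: the codons are toy examples chosen from the standard genetic code and are not taken from the cited studies, the function names `edit_a_to_i`, `edit_c_to_u` and `translate` are hypothetical, and inosine is written directly as the guanosine it is read as.

```python
# Toy codon table: only the four codons used in this illustration
# (standard genetic code).
CODON_TABLE = {
    "CAG": "Gln",
    "CGG": "Arg",
    "CAA": "Gln",
    "UAA": "Stop",
}


def edit_a_to_i(rna, pos):
    """Return the sequence as it is read after A-to-I editing at index pos.

    Assumption of this sketch: inosine is written as G, because it is
    read as guanosine.
    """
    if rna[pos] != "A":
        raise ValueError("A-to-I editing needs an adenosine at the edited position")
    return rna[:pos] + "G" + rna[pos + 1:]


def edit_c_to_u(rna, pos):
    """Return the sequence after C-to-U editing at index pos."""
    if rna[pos] != "C":
        raise ValueError("C-to-U editing needs a cytosine at the edited position")
    return rna[:pos] + "U" + rna[pos + 1:]


def translate(rna):
    """Translate complete codons using the toy table above."""
    return [CODON_TABLE[rna[i:i + 3]] for i in range(0, len(rna) - 2, 3)]


if __name__ == "__main__":
    # A-to-I at the middle position: CAG (Gln) is read as CGG (Arg).
    print(translate("CAG"), "->", translate(edit_a_to_i("CAG", 1)))
    # C-to-U at the first position: CAA (Gln) becomes UAA (stop).
    print(translate("CAA"), "->", translate(edit_c_to_u("CAA", 0)))
```

In both toy cases the DNA sequence is unchanged, yet the edited transcript specifies a different amino acid or a premature stop, which is the sense in which editing behaves like a post-transcriptional epimutation.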


#### **9. The post-transcriptional ncRNA epimutations and their role in the norm of reaction and phenotypic plasticity**

Until recently it was thought that, in eukaryotes, the mutations that matter for the organism were located in the regions of DNA that code for proteins. Under this framework, proteins were the only molecules that regulate the action of genes, and a mutation in a structural gene could change the primary structure of a protein. A single amino acid change could cause a serious disease. With the advances in molecular genetics and the discovery of ncRNAs, we now know that epimutations also occur in ncRNAs and can likewise cause phenotypic changes and diseases. These epimutations are more difficult to interpret at the molecular level because they do not affect the protein sequence. In general, epimutations in ncRNAs alter the RNA structural ensemble formed between ncRNAs and mRNAs and thereby alter the message of genetic information in the cell [101, 102]. As with proteins, epimutations produced in the ncRNAs of cells belonging to different organs and tissues of the eukaryotic body can cause a great variety of illnesses.

The non-coding region of DNA was previously thought to be garbage; we now know it is not. An exception to this earlier view was the contribution of the transposable elements described in maize by Barbara McClintock in 1947 and dubbed controlling elements. The merit of her discovery was the realization that the genome is not static and that there are genes that are unstable in terms of their location in the genome and can promote their own transposition. We now know that these transposable elements are found in unicellular and multicellular organisms and have a viral origin [31]. The discovery of transposable elements and of horizontal gene transfer has also led to the understanding that the genome is a "fluid mosaic of genetic information" of different origins, in which horizontal transfer mediated by viruses and transposons plays an important role in gene flow between organisms that are not necessarily genetically related [31]. Recently, much evidence has accumulated in prokaryotes and eukaryotes that another class of molecular interaction occurs in the regulation of gene action and cellular processes, principally manifested by small ncRNAs that base pair with mRNAs and regulate gene expression post-transcriptionally [101, 103]. NcRNAs are a very good tool for the inactivation of specific messages; for example, some classes of ncRNAs such as siRNAs and miRNAs have been found to act in the regulation of development and cell death. NcRNAs also act in prokaryotes, in the replication and maintenance of extrachromosomal elements, where they have an epistatic effect on any transcriptional signals for their specific mRNAs. Thus, a single ncRNA can regulate multiple genes and have profound effects on cell physiology [104].
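How one small ncRNA can act on several messages through base pairing can be sketched as a simple search for antisense sites. The Python example below is a deliberately simplified illustration, not a target-prediction method: it assumes perfect Watson-Crick complementarity over the whole small RNA (real animal miRNAs pair imperfectly with their targets), and the sequences, the gene names and the helper `find_target_sites` are all hypothetical.

```python
# Watson-Crick partners for RNA bases.
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna):
    """Return the sequence that pairs perfectly with rna, written 5' to 3'."""
    return rna.translate(COMPLEMENT)[::-1]


def find_target_sites(small_rna, mrna):
    """Return the start index of every perfectly complementary site in mrna."""
    site = reverse_complement(small_rna)
    positions = []
    start = mrna.find(site)
    while start != -1:
        positions.append(start)
        start = mrna.find(site, start + 1)
    return positions


if __name__ == "__main__":
    small_rna = "UGAGGUAG"  # toy small ncRNA (hypothetical)
    mrnas = {               # toy messages (hypothetical)
        "gene_a": "AAACUACCUCAGGG",
        "gene_b": "GGGCCCAAAUUU",
        "gene_c": "CUACCUCAUUCUACCUCA",
    }
    for name, sequence in mrnas.items():
        print(name, find_target_sites(small_rna, sequence))
```

In this toy set the same small RNA finds one site in `gene_a`, none in `gene_b` and two in `gene_c`, which illustrates how a single ncRNA could regulate more than one gene.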

#### **10. Conclusions**

Mutations occur not only in structural genes but also in regions that code for ncRNAs, in the mRNA (RNA editing), in introns, and at both ends of the mRNA, specifically in the 3' UTR and 5' UTR regions, where ncRNAs are also located. Thus mRNA is not only an intermediary between DNA and protein, as expressed in Crick's classic Central Dogma of Molecular Biology, but also a relevant source of miRNAs and siRNAs. In addition, transcription of the whole eukaryotic genome generates a large number of different ncRNAs which, together with proteins, regulate the expression of genes. The experimental evidence shows that ncRNAs do not occur randomly in all cells; rather, particular ncRNAs are enriched depending on their function and on the cell in which they act. There is now evidence that environmental and developmental influences have effects on the phenotype. Epigenetic changes at the DNA and RNA level, such as DNA methylation, histone acetylation, epimutation and RNA editing, are important for Darwinian fitness and could be adaptive [105]. Many of these changes are also inherited in a way that differs from the classic Mendelian model of heredity. One of the assumptions of population genetics is that genes are vertically transmitted to the progeny according to the laws of Mendelian inheritance. In this context, and based on Weismann's barrier between somatic and germinal cells, only genetic changes that take place within gametes are inherited by the next generation. However, there is now evidence for a non-Mendelian model of heredity that comes close to a neo-Lamarckian model of inheritance.

This model is based on epigenetic changes induced by the environment, on epimutations at the ncRNA level, on mRNA editing and on horizontal gene transfer. Thus epimutations could be heritable. In this type of heredity there must be no barriers that prevent changes in somatic cells from being integrated into the genomic information that resides in the nucleus of germ cells. Transposable elements, viruses and ncRNAs can act as vectors that incorporate somatic mutations into the genome and epigenome of the germ cells. In this way Weismann's barrier between somatic and germ cells could be evaded through retroviruses [106]. Also, a mutation in piRNAs that block the action of a virus or transposable element of somatic origin could facilitate the negative impact of mobile elements in germ cells, and this change may be inherited.

In humans, it has been postulated that effects on cardiovascular and metabolic function, and elements of the heritable or familial component of susceptibility to cardiovascular disease, obesity and other non-communicable diseases (NCD), can be transmitted across generations by non-genomic means. Inaccurate nutritional cues from the placenta increase the risk of NCD. Endocrine or nutritional interventions during early postnatal life can reverse epigenetic and phenotypic changes induced, for example, by an unbalanced maternal diet during pregnancy. Elucidation of epigenetic processes may permit perinatal identification of the individuals most at risk of later NCD and enable early intervention strategies to reduce such risk [105].

Unlike that of prokaryotes, the eukaryotic genome expresses numerous types of ncRNAs that play a fundamental role in the regulation of gene expression. These small molecules can interact with different kinds of proteins, generating a homeostatic system that can respond quickly to environmental changes. Both classes of molecules, proteins and ncRNAs, are the manifestation of a great amount of information accumulated within the genetic and epigenetic programs. Epigenetic plasticity protects individuals from environmental changes and explains the classic concepts of reaction norm and phenotypic plasticity, whose genetic basis had previously been poorly explained; we now know that there is epigenetic control of these phenotypic changes. These ncRNAs also contribute to the processing of information in at least two ways: a) storing a large amount of information in small molecules at minimal energy cost; b) rapid acquisition of information from the environment, with a rapid response and adaptation. Furthermore, ncRNAs appear to accelerate the evolution of an organism's information content and functional computational system. This picture adds a new dimension to information processing in the brain [70] and in cells of other tissues, where ncRNAs can mitigate the negative effects of the environment, increasing adaptability and accelerating organic evolution.

## **Author details**

Daniel Frías-Lasserre *Institute of Entomology, Universidad Metropolitana de Ciencias de la Educación, Santiago, Chile* 

#### **Acknowledgements**

Financed by project code B-12-1, Direction of Extension of the Metropolitan University of Educational Sciences, Santiago, Chile.

#### **11. References**

[1] Blakeslee AF (1933) The work of Professor Hugo De Vries. Scientific Monthly 36: 378-380.

[2] Blakeslee AF (1935) Hugo De Vries 1848-1935. Science 81: 581-582.

[3] Shull GH (1933) Hugo De Vries at eighty-five. Journal of Heredity 24 (1): 3-6.

[4] Gustafsson A (1963) Mutations and the Concept of Variability: 89-104. Recent Plant Breeding Research. Almqvist & Wiksell / John Wiley & Sons.

[5] Sarkar S (1999) From the Reaktionsnorm to the Adaptive Norm: The Norm of Reaction, 1909-1960. Biology and Philosophy 14: 235-252.

[6] Timoféeff-Ressovsky NW (1939) Les mécanismes des mutations et la structure du gène. Actualités Scientifiques et Industrielles, 812. Génétique. Exposés publiés sous la direction de B Ephrussi, Institut de Biologie Physico-Chimique, Paris. Hermann et Cie, Editeurs.

[7] Morgan TH (1934) La relación de la genética con la Medicina y la Fisiología. Conferencia Nobel presentada en Estocolmo el 4 de Junio de 1934. Genetica Suplement. 2: 627-631.

[8] Lima-de-Faria (1957) Goldschmidt's interpretation of the gene concept and the problem of chromosome organization. Hereditas 43: 462-465.

[9] Benzer S (1955) Fine structure of a genetic region in bacteriophage. Proceedings of the National Academy of Sciences 41: 344-354.

[10] Benzer S (1959) On the topology of the genetic fine structure. Proceedings of the National Academy of Sciences 45: 1607-1620.

[11] Goldschmidt RB (1940) Chromosomes and genes. American Association for the Advancement of Science 14: 56-66.

[12] Yanofsky C (1967) Gene structure and protein structure. Scientific American 216: 80-94.

[13] Bonner DM (1948) Genes as determiners of cellular biochemistry. Science 108: 735-739.

[14] García-Bellido A (1977) Homeotic and atavic mutation in insects. American Zoologist 17: 613-629.

[15] Lewis EB (1978) A gene complex controlling segmentation in *Drosophila*. Nature 276: 565-570.

[16] Ahmad A, Zhang Y, Cao XF (2010) Decoding the epigenetic language of plant development. Molecular Plant 3: 719-728.

[17] Koerner MV, Pauler FM, Huang R, Barlow DP (2009) The function of non-coding RNAs in genomic imprinting. Development 136: 1771-1783.

[18] Haig D (2000) Genomic imprinting, sex-biased dispersal, and social behavior. Annals of the New York Academy of Sciences 907: 149-163.

[19] Hulten M, Armstrong S, Challinor P, Gould C, Hardy G, Leedham P, Lee T, McKeown C (1991) Genomic imprinting in an Angelman and Prader-Willi translocation family. Lancet 338: 638-639.

[20] Skuse DH, James RS, Bishop DV, Coppin B, Dalton P, Aamodt-Leeper G, Bacarese-Hamilton M, Creswell C, McGurk R, Jacobs PA (1997) Evidence from Turner's syndrome of an imprinted X-linked locus affecting cognitive function. Nature 387 (6634): 705-708.

[21] Nilsson E, Larsen G, Manikkam M, Guerrero-Bosagna C, Savenkova MI, Skinner MK (2012) Environmentally Induced Epigenetic Transgenerational Inheritance of Ovarian Disease. PLoS ONE 7(5): e36129. doi:10.1371/journal.pone.0036129.

[22] Chan TI, Yuen ST, Kong CK, Chan YW, Chan ASY (2006) Heritable germline epimutation of MSH2 in a family with hereditary nonpolyposis colorectal cancer. Nature Genetics 38: 1178-1183.

[23] Hitchins MP, Wong JJ, Suthers G, Suter CM, Martin DI, Hawkins NJ, Ward RL (2007) Inheritance of a cancer associated MLH1 germ-line epimutation. New England Journal of Medicine 356: 697-705.

[24] Mercer TR, Dinger ME, Mattick JS (2009) Long non-coding RNAs: insights into functions. Nature Reviews Genetics 10: 155-159.

[25] Eddy SR (2001) Non-coding RNA genes and the modern RNA world. Nature Reviews Genetics 2(12): 919-929.

[26] Mattick JS, Makunin IV (2006) Non-coding RNA. Human Molecular Genetics 15 (Spec No 1): R17-R29.

[27] Storz G (2002) An expanding universe of noncoding RNAs. Science 296: 1260-1263.

[28] Ryan J, Taft RJ, Pang KC, Mercer TR, Dinger M, Mattick JS (2002) Non-coding RNAs: regulators of disease. Journal of Pathology 220: 126-139.

[29] Taft RJ, Pang KC, Mercer TR, Dinger M, Mattick JS (2010) Non-coding RNAs: regulators of disease. Journal of Pathology 220: 126-139.

[30] Sánchez L (2008) Sex determining mechanism in insects. International Journal of Developmental Biology 52: 837-856.

[31] Frías-Lasserre DA (2012) Non Coding RNAs and Viruses in the Framework of the Phylogeny of the Genes, Epigenesis and Heredity. International Journal of Molecular Sciences 13(1): 477-490.

[32] Azzalin CM, Reichenbach P, Khoriauli L, Giulotto E, Lingner J (2007) Telomeric repeat containing RNA and RNA surveillance factors at mammalian chromosome ends. Science 318: 798-801.

[33] Orom UA, Shiekhattar R (2011) Long non coding RNAs and enhancers. Current Opinion in Genetics & Development 21: 194-198.

[34] Johnston RJ, Hobert O (2003) A microRNA controlling left/right neuronal asymmetry in *Caenorhabditis elegans*. Nature 426: 845-849.

[35] Holcik M, Liebhaber SA (1997) Four highly stable eukaryotic mRNAs assemble 3' untranslated region RNA-protein complexes sharing cis and trans components. Proceedings of the National Academy of Sciences 94 (6): 2410-2414.

[36] Mazumder B, Seshadri V, Fox P (2003) Translational control by the 3′-UTR: the ends specify the means. Trends in Biochemical Sciences 28: 91-98.

[37] Lai EC (2003) MicroRNAs: Runts of the Genome Assert Themselves. Current Biology 13: R925-R936.

[38] Enright AJ, John B, Gaul U, Tuschl T, Sander C, Marks DS (2003) MicroRNA targets in *Drosophila*. Genome Biology 5: R1.

[39] Brennecke J, Hipfner DR, Stark A, Russell RB, Cohen SM (2003) *bantam* encodes a developmentally regulated microRNA that controls cell proliferation and regulates the pro-apoptotic gene *hid* in *Drosophila*. Cell 113: 25-36.

[40] Lee RC, Feinbaum RL, Ambros V (1993) The *C. elegans* heterochronic gene *lin-4* encodes small RNAs with antisense complementarity to *lin-14*. Cell 75: 843-854.

[41] Rhoades MW, Reinhart BJ, Lim LP, Burge CB, Bartel DP (2002) Prediction of plant microRNA targets. Cell 110: 513-520.

[42] Calin GA, Dumitru CD, Shimizu M, Bichi R, Zupo S, Noch E, Aldler H, Rattan S, Keating M, Rai K, Rassenti L, Kipps T, Negrini M, Bullrich F, Croce CM (2002) Frequent deletions and down-regulation of micro-RNA genes *miR15* and *miR16* at 13q14 in chronic lymphocytic leukemia. Proceedings of the National Academy of Sciences USA 99(24): 15524-15529.

[43] Grishok A, Pasquinelli AE, Conte D, Li N, Parrish S, Ha I, Baillie DL, Fire A, Ruvkun G, Mello CC (2001) Genes and mechanisms related to RNA interference regulate expression of the small temporal RNAs that control *C. elegans* developmental timing. Cell 106: 23-34.

[44] Lee YS, Nakahara K, Pham JW, Kim K, He Z, Sontheimer EJ, Carthew RW (2004) Distinct roles for *Drosophila* Dicer-1 and Dicer-2 in the siRNA/miRNA silencing pathways. Cell 117: 69-81.

[45] Okamura K, Ishizuka A, Siomi H, Siomi M (2004) Distinct roles for Argonaute proteins in small RNA-directed RNA cleavage pathways. Genes & Development 18: 1655-1666. Hamilton AJ, Baulcombe DC (1999) A species of small antisense RNA in posttranscriptional gene silencing in plants. Science 286: 950-952.

[46] Ketting RF, Fischer SE, Bernstein E, Sijen T, Hannon GJ, Plasterk RHA (2001) Dicer functions in RNA interference and in synthesis of small RNA involved in developmental timing in *C. elegans*. Genes & Development 15: 2654-2659.

[47] Moss EG, Tang L (2003) Conservation of the heterochronic regulator Lin-28, its developmental expression and microRNA complementary sites. Developmental Biology 258: 432-442.

[48] Lagos-Quintana M, Rauhut R, Meyer J, Borkhardt A, Tuschl T (2003) New microRNAs from mouse and human. RNA 9: 175-179.

[49] Landthaler M, Yalcin A, Tuschl T (2004) The human DiGeorge Syndrome critical region gene 8 and its *D. melanogaster* homolog are required for miRNA biogenesis. Current Biology 14: 2162-2167.

[50] Chalfie M, Horvitz HR, Sulston JE (1981) Mutations that lead to reiteration in the cell lineages of *C. elegans*. Cell 24: 59-69.

[51] Ambros V, Horvitz HR (1984) Heterochronic mutants of the nematode *Caenorhabditis elegans*. Science 226: 409-416.

[52] Carrington JC, Ambros V (2003) Role of microRNAs in plant and animal development. Science 301: 336-338.

[53] Xu P, Vernooy SY, Guo M, Hay BA (2003) The *Drosophila* microRNA Mir-14 suppresses cell death and is required for normal fat metabolism. Current Biology 13: 790-795.

[54] Yu X, Zhou Q, Li S-C, Luo Q, Cai Y, et al. (2008) The Silkworm (*Bombyx mori*) microRNAs and Their Expressions in Multiple Developmental Stages. PLoS ONE 3(8): e2997. doi: 10.1371/journal.pone.0002997.

[55] Poy MN, Eliasson L, Krutzfeldt J, Kuwajima S, Ma X, MacDonald PE, Pfeffer S, Tuschl T, Rajewsky N, Rorsman P, Stoffel M (2004) A pancreatic islet-specific microRNA regulates insulin secretion. Nature 432: 226-230.

[56] Chen CZ, Li L, Lodish HF, Bartel DP (2004) MicroRNAs modulate hematopoietic lineage differentiation. Science 303: 83-86.

[57] Clop A, Marcq F, Takeda H, Pirottin D, Tordoi X, Bibe B, Bouix J, Caiment F, Elsen JM, Eychenne F, Larzul C, Laville E, Meish F, Milenkovic D, Tobin J, Charlier C, George M (2006) Nature Genetics 38: 813-818.

[58] Zhao Y, Ransom JF, Li A, Vedantham V, von Drehle M, Muth AN, Tsuchihashi T, McManus MT, Schwartz RJ, Srivastava D (2007) Dysregulation of cardiogenesis, cardiac conduction, and cell cycle in mice lacking miRNA-1. Cell 129: 303-317.

[59] Dong JT, Boyd JC, Frierson HF Jr (2001) Loss of heterozygosity at 13q14 and 13q21 in high grade, high stage prostate cancer. Prostate 49: 166-171.

[60] Xie Z, Kasschau KD, Carrington JC (2003) Negative feedback regulation of dicer-like1 in *Arabidopsis* by microRNA-guided mRNA degradation. Current Biology 13: 784-789.

[61] Wilczynska A, Minshall N, Armisen J, Miska EA, Standart N (2009) Two Piwi proteins, Xiwi and Xili, are expressed in the *Xenopus* female germline. RNA 15: 337-345.

[62] Malone CD, Brennecke J, Dus M, Stark A, McCombie R, Sachidanandam R, Hannon GJ (2009) Specialized piRNA pathways act in germline and somatic tissues of the *Drosophila* ovary. Cell 137: 522-535.

[63] Kim VN (2006) Small RNAs just got bigger: Piwi-interacting RNAs (piRNAs) in mammalian testes. Genes & Development 20: 1993-1997.

[64] Siomi MC, Sato K, Pezic D, Aravin AA (2011) PIWI-interacting small RNAs: the vanguard of genome defence. Nature Reviews Molecular Cell Biology 12: 246-258.

[65] Sarot E, Payen-Groschêne G, Bucheton A, Pélisson A (2004) Evidence for a piwi-dependent RNA silencing of the gypsy endogenous retrovirus by the *Drosophila melanogaster* flamenco gene. Genetics 166: 1313-1321.

[66] De Fazio S, Bartonicek N, Di Giacomo M, Abreu-Goodger C, Sankar A, Funaya C, Antony C, Moreira PN, Enright AJ, O'Carroll D (2011) The endonuclease activity of Mili fuels piRNA amplification that silences LINE1 elements. Nature (2011 Oct 23). doi: 10.1038/nature10547.

[67] Esteller M (2011) Non-coding RNAs in human disease. Nature Reviews Genetics 12: 861-874.

[68] Watanabe T, Tomizawa S, Mitsuya K, Totoki Y, Yukamoto Y, Kuramochi-Miyagawa S, Lida N, Hoki Y, Murphy PJ, Toyoda A, Gotoh K, Hiura H, Arima T, Fujiyama A, Sado T, Shibata T, Nakano T, Lin H, Ichiyanagi K, Soloway P, Sasaki H (2011) Role for piRNAs and Noncoding RNA in de Novo DNA Methylation of the Imprinted Mouse *Rasgrf1* Locus. Science 332: 848-852.

[69] Laurent GS, Wahlestedt C (2007) Noncoding RNAs: couplers of analog and digital information in nervous system function? Trends in Neurosciences 30: 612-621.

[70] Zavanelli MI, Britton JS, Igel AH, Ares M (1994) Mutations in an Essential U2 Small Nuclear RNA Structure Cause Cold-Sensitive U2 Small Nuclear Ribonucleoprotein Function by Favoring Competing Alternative U2 RNA Structures. Molecular and Cellular Biology 14: 1689-1697.


[71] Urushiyama S, Tani T, Ohshima Y (1996) Isolation of novel pre-mRNA splicing mutants of *Schizosaccharomyces pombe*. Molecular and General Genetics 253: 118-127.

[72] Kiss T (2002) Small nucleolar RNAs: an abundant group of noncoding RNAs with diverse cellular functions. Cell 109: 145-148.

[73] Liang H, Landweber LF (2007) Hypothesis: RNA editing of microRNA target sites in humans? RNA 13: 463-467.

[74] Borovjagin A, Gerbi SA (2001) *Xenopus* U3 snoRNA GAC-Box A′ and Box A Sequences Play Distinct Functional Roles in rRNA Processing. Molecular and Cellular Biology 21: 6210-6221.

[75] Preti M, Ribeyre C, Pascali C, Dieci G (2010) The Telomere-Binding Protein Tbf1 Demarcates snoRNA Gene Promoters in *Saccharomyces cerevisiae*. Molecular Cell 38: 614-620.

[76] Clark MB, Johnston RL, Inostroza-Ponta M, Fox AH, Fortini E, Moscato P, Dinger ME, Mattick JS (2012) Genome-wide analysis of long noncoding RNA stability. Genome Research. http://www.genome.org/cgi/doi/10.1101/gr.131037.111.

[77] Tian D, Sun S, Lee JT (2010) The long non-coding RNA, Jpx, is a molecular switch for X chromosome inactivation. Cell 143: 390-403.

[78] Deng X, Meller VH (2006) Non-coding RNA in fly dosage compensation. Trends in Biochemical Sciences 31: 526-532.

[79] Koerner MV, Pauler FM, Huang R, Barlow DP (2009) The function of non-coding RNAs in genomic imprinting. Development 136: 1771-1783.

[80] Satterlee JS, Barbee S, Jin P, Krichevsky A, Salama S, Schratt G, Wu DY (2007) Noncoding RNAs in the brain. Journal of Neuroscience 27: 11856-11859.

[81] Dinger ME, Amaral PP, Mercer TR, Pang KC, Bruce SJ, Gardiner BB, Askarian-Amiri ME, Ru K, Soldà G, Simons C, Sunkin SM, Crowe ML, Grimmond SM, Perkins AC, Mattick JS (2008) Long noncoding RNAs in mouse embryonic stem cell pluripotency and differentiation. Genome Research 18: 1433-1445.

[82] Dinger ME, Pang KC, Mercer TR, Crowe ML, Grimmond SM, Mattick JS (2009) NRED: a database of long noncoding RNA expression. Nucleic Acids Research 37: D122-D126.

[83] Taft RJ, Pang KC, Mercer TR, Dinger M, Mattick JS (2010) Non-coding RNAs: regulators of disease. Journal of Pathology 220: 126-139.

[84] Thakur N, Tiwari VK, Thomassin H, Pandey RR, Kanduri M, Gondor A, Grange T, Ohlsson R, Kanduri C (2004) An antisense RNA regulates the bidirectional silencing property of the Kcnq1 imprinting control region. Molecular and Cellular Biology 24: 7855-7862.

[85] Kurokawa R (2011) Long non-coding RNAs as a regulator for transcription. Progress in Molecular and Subcellular Biology 51: 29-41.

[86] Raman RP, Kanduri C (2011) Transcriptional and post transcriptional programming by long noncoding RNAs. Progress in Molecular and Subcellular Biology 51: 1-27.

[87] Azzalin CM, Reichenbach P, Khoriauli L, Giulotto E, Lingner J (2007) Telomeric repeat containing RNA and RNA surveillance factors at mammalian chromosome ends. Science 318: 798-801.

[88] Schoeftner S, Blasco MA (2007) Developmentally regulated transcription of mammalian telomeres by DNA dependent RNA polymerase II. Nature Cell Biology 10: 228-236.

[89] Arora R, Brun CMC, Azzalin CM (2011) TERRA: long noncoding RNAs at eukaryotic telomeres. Progress in Molecular and Subcellular Biology 51: 65-94.

[90] Orom UA, Shiekhattar R (2011) Long non-coding RNAs and enhancers. Current Opinion in Genetics & Development 21: 194-198.

[91] Bernard D, Prasanth KV, Tripathi V, Colasse S, Nakamura T, Xuan Z, Zhang MQ, Sedel F, Jourdren L, Coulpier F, Triller A, Spector DL, Bessis A (2010) A long nuclear-retained non-coding RNA regulates synaptogenesis by modulating gene expression. The EMBO Journal 29: 3082-3093.

[92] De Lucia F, Dean C (2011) Long non-coding RNAs and chromatin regulation. Current Opinion in Plant Biology 14: 168-173. Epub 2010.

[93] Roberts KA, Abraira VE, Tucker AF, Goodrich LV, Andrews NC (2012) Mutation of *Rubie*, a Novel Long Non-Coding RNA Located Upstream of *Bmp4*, Causes Vestibular Malformation in Mice. PLoS ONE 7(1): e29495. doi:10.1371/journal.pone.0029495.

Cartault F, Munier P, Benko E, Desguerre I, Hanein S, Boddaert N, Bandiera S, Vellayoudom J, Krejbich-Trotot P, Bintner M, Hoarau JJ, Girard M, Génin E, de Lonlay P, Fourmaintraux A, Naville M, Rodriguez D, Feingold J, Renouil M, Munnich A, Westhof E, Fähling M, Lyonnet S, Henrion-Caude A (2012) Mutation in a primate-conserved retrotransposon reveals a noncoding RNA as a mediator of infantile encephalopathy. www.pnas.org/cgi/doi/10.1073/pnas.111159610.

[94] Benne R (1992) RNA editing in trypanosomes. Molecular Biology Reports 16: 217-227.

[95] Benne R, Van der Burg J, Brakenhoff JPJ, Sloof P, Van Boom JH, Tromp MC (1986) Major transcript of the frameshifted *coxII* gene from trypanosome mitochondria contains four nucleotides that are not encoded in the DNA. Cell 46: 819-826.

[96] Rubio MAT, Pastar I, Gaston KW, Ragone FL, Janzen CJ, Cross GAM, Papavasiliou FN, Alfonzo JD (2007) An adenosine-to-inosine tRNA editing enzyme that can perform C-to-U deamination of DNA. PNAS 104: 7821-7826.

[97] Luciano DJ, Mirsky H, Vendetti NJ, Maas S (2004) RNA editing of miRNA precursor. RNA 10: 1174-1177.

[98] Gott JM, Emeson RB (2000) Functions and mechanisms of RNA editing. Annual Review of Genetics 34: 499-531.

[99] Blanc V, Davidson NO (2003) C-to-U RNA Editing: Mechanisms Leading to Genetic Diversity. The Journal of Biological Chemistry 278: 1395-1398.


[100] Shimoni Y, Friedlander G, Hertzroni G, Niv G, Altuvia S, Biham O, Margalit H (2007) Regulation of gene expression by small non-coding RNAs: a quantitative view. Molecular Systems Biology 3: 138.

[101] Halvorsen M, Martin JS, Broadaway S, Laederach A (2010) Disease-Associated Mutations that Alter the RNA Structural Ensemble. PLoS Genetics 6(8): e1001074. doi: 10.1371/journal.pgen.1001074.

[102] Storz G, Altuvia S, Wassarman KM (2005) An abundance of RNA regulators. Annual Review of Biochemistry 74: 199-217.

[103] Gottesman S (2005) Micros for microbes: non-coding regulatory RNAs in bacteria. Trends in Genetics 21(7): 399-404.

[104] Hanson M, Godfrey KM, Lillycrop KA, Burdge GC, Gluckman PD (2011) Developmental plasticity and developmental origins of non-communicable disease: theoretical considerations and epigenetic mechanisms. Progress in Biophysics and Molecular Biology 106: 272-280.

[105] Steele EJ, Lindley RA, Blanden RV (1998) Lamarck's signature. Allen and Unwin, Sydney, Australia, 286 pp.

**Chapter 10** 

## **Screening of Gene Mutations in Lung Cancer for Qualification to Molecularly Targeted Therapies**

Paweł Krawczyk, Tomasz Kucharczyk and Kamila Wojas-Krawczyk

Additional information is available at the end of the chapter

http://dx.doi.org/10.5772/48689

© 2012 Krawczyk et al., licensee InTech. This is an open access chapter distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

## **1. Introduction**

In many developed countries, non-small-cell lung cancer (NSCLC), which accounts for approximately 85% of lung cancers, is the leading cause of death in patients with malignant neoplasms. Depending on the patient's medical status, surgical resection is possible in the early stages of NSCLC. Regrettably, only 15-30% of newly diagnosed NSCLC cases qualify for surgery. Therefore, chemotherapy and radiotherapy play the dominant role in the multidisciplinary treatment of patients with NSCLC and small-cell lung cancer (SCLC). Unfortunately, both treatment options have limited efficacy in locally advanced and metastatic lung cancer [1]. Molecularly targeted therapies offer new possibilities for lung cancer treatment in genetically predisposed patients. Within the next few years, personalised therapy for the whole lung cancer population, based on screening for different gene mutations, will become a reality.

Cancer development usually depends on the strong carcinogenic effect that substances in cigarette smoke exert on bronchial epithelial cells. These carcinogens cause genetic disorders that lead to preinvasive changes: squamous dysplasia, which precedes carcinoma *in situ* and squamous cell carcinoma, and atypical adenomatous hyperplasia (AAH), which precedes adenocarcinoma. Preinvasive cells and cancer cells alike are characterised by large genomic changes. Comparative genomic hybridisation (CGH) studies have identified chromosomal aberrations, particularly amplifications and deletions, in lung cancer cells. Cancer cells exhibit deletions of the short arm of chromosome 17, with loss of the *p53* gene (deletion of 17(p12-13)), and of the short arm of chromosome 9, with loss of the *p16* gene (*CDKN2A*) (deletion of 9(p21-22)). Both are tumour suppressor genes, and the lack of their protein products allows aneuploid cancer cells to survive and accumulate serious chromosomal aberrations such as deletions at 3(p14-21), 8(p21-23), 13(q14) and 13(q22-24), allelic losses at 9(p21) and 13(q24), and gains at 1(q21-31), 3(q21-22), 3(q25-27), 5(p13-14), 8(q23-24) and 7(p12). Deletions cause abnormal expression or impaired function of tumour suppressor genes such as *RB1*, *FHIT*, *RASSF1A*, *SEMA3B* and *PTEN*. Gains of chromosomal regions that contain oncogenes, in turn, are associated with overexpression or increased activity of the *MYC*, *KRAS*, *EGFR*, *CCND1*, *MCM2*, *RUVBL1*, *SOX2* and *BCL2* genes [2, 3]. Moreover, cell subclones with new genetic abnormalities may become dominant within metastases or within persistent or recurrent cancer deposits through selective pressures exerted by chemotherapy or molecularly targeted therapy [4].

© 2012 Krawczyk et al., licensee InTech. This is an open access chapter distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.



Deletion of the short arm of chromosome 17, with loss of the *p53* gene, is the most frequent disturbance in lung cancer (50-70%). Compared with adenocarcinoma (AC), squamous cell carcinoma (SCC) of the lung shows higher frequencies of deletions at chromosomal regions 3(p14-21), 8(p21-23), 17(p13) (*p53* gene), 13(q14) (*RB1* gene) and 9(p21) (*CDKN2A* gene), and of amplification of 3(q21-22) (*SOX2* gene). Amplifications of 7(p11) and 14(q13), which increase gene dosage and protein expression of epidermal growth factor receptor (EGFR) and of thyroid transcription factor-1/NK2 homeobox-1 (TITF-1/NKX2-1), respectively, are prevalent in lung adenocarcinoma [2, 3].

## **2. Genetic mutations in lung cancer cells**

Apart from chromosomal aberrations, single-gene mutations can appear in lung cancer cells. These mutations can be detected with molecular biology techniques. They rarely occur simultaneously in one cancer cell (in less than 3% of tumour cells). They affect genes important for correct proliferation, differentiation and cell growth, such as oncogenes and genes for signalling proteins involved in the complicated network of intracellular signal transmission (predominantly genes for tyrosine and serine-threonine kinases).

The most important genetic disturbances observed in NSCLC cells are point mutations (single-nucleotide substitutions), small deletions or insertions (from a few to a few dozen base pairs) and fusion genes formed by translocation of gene fragments, usually within a single chromosome. Some of these alterations change the structure of proteins that play an important role in oncogenesis (missense mutations), others shift the expression of oncogenes and suppressor genes, while some remain silent. Such processes lead to protein malfunction: they can increase or decrease protein expression or alter normal enzyme activity.
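The classes of small alteration named above can be told apart from the reference and altered alleles alone. The following is a minimal sketch, assuming alleles given as plain nucleotide strings from a coding region; the function name `classify_variant` and the in-frame/frameshift refinement are illustrative additions, not part of the chapter.

```python
def classify_variant(ref: str, alt: str) -> str:
    """Classify a small variant from its reference and altered alleles.

    Hypothetical helper for illustration; assumes both alleles are
    non-empty nucleotide strings taken from a coding sequence.
    """
    if ref == alt:
        return "no change"
    if len(ref) == len(alt):
        # Same length: a substitution of one or several bases
        return "point mutation" if len(ref) == 1 else "multi-nucleotide substitution"
    kind = "deletion" if len(ref) > len(alt) else "insertion"
    # A length change that is not a multiple of 3 shifts the reading frame
    frame = "in-frame" if abs(len(ref) - len(alt)) % 3 == 0 else "frameshift"
    return f"{frame} {kind}"


print(classify_variant("T", "G"))             # single-nucleotide substitution
print(classify_variant("G" + "A" * 15, "G"))  # 15 bp removed
print(classify_variant("C", "CAG"))           # 2 bp added
```

Fusion genes are not covered by this sketch, because they cannot be recognised from a single pair of short alleles.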

Driver mutations accumulate in different genes depending on the history of tumour exposure to carcinogens. Failure of DNA repair and progressive genetic instability lead to the appearance of mutations that drive cancer development, growth and metastasis [4]. The molecular type of lung cancer is partially consistent with the histological type of the tumour. Although some driver mutations are extremely rare, important mutations remain undetected in only 20% of NSCLC tumours. Small-cell lung cancer is less well characterised in terms of the incidence of genetic mutations. By 2011, 1738 mutated genes and tens of thousands of different mutations had been identified in NSCLC [2, 5, 6, 7]. Figure 1 shows the percentage of tumours with identified mutations in all histological types of NSCLC.


**Figure 1.** The percentage of NSCLC tumours with identified mutations in different genes (*EGFR* gene amplification and polysomy, as well as *p53* gene abnormalities, which are common in NSCLC tumours, have not been included on the graph).

NSCLC is a heterogeneous aggregate of histological subtypes that have traditionally been grouped together because of similar treatment outcomes. Ideally, a tumour classification system should include morphological and genetic distinctions between tumour types, which would help to define specific subsets of patients responsive to particular molecularly targeted treatments. In terms of genetic mutations, squamous cell carcinomas are the least well described: no mutations have been detected in over 50% of the tumours screened so far (Fig. 2). Adenocarcinomas, on the other hand, are considerably better described, and screening fails to find any mutation in only 20% of tumours. In 10-15% of patients, who are non-smokers (or light smokers and former smokers), adenocarcinoma may develop regardless of tobacco smoking. In almost all of these tumours genetic mutations have been found, mostly in the epidermal growth factor receptor (*EGFR*) and *KRAS* genes, together with the *EML4-ALK* fusion gene (Fig. 3). Accumulated evidence suggests that lung cancers in ever-smokers and never-smokers follow distinct molecular pathways and may therefore respond to distinct therapies. One could speculate that non-small-cell lung cancers in ever- and never-smokers are two distinct disorders in terms of their molecular profile and treatment planning [5, 6, 7, 8, 9, 10, 11, 12].

The most frequent irregularities found in squamous cell carcinoma patients are amplification of the gene for fibroblast growth factor receptor type 1 (*FGFR1*) and *p53* gene abnormalities. These disturbances can overlap with other mutations. In SCC it is extremely rare to detect *EGFR*, *PTEN*, *ERBB2* (*HER2*), *PIK3CA*, *DDR2* or *BRAF* mutations, which are more typical of adenocarcinoma [5, 6, 7].


**Figure 2.** The percentage of SCC tumours with identified mutations in different genes (*p53* gene abnormalities which are common in SCC tumours have not been included on the graph)

**Figure 3.** The percentage of AC tumours with identified mutations in different genes (*EGFR* gene amplification and polysomy have not been included on the graph).

Among patients with adenocarcinoma, the most frequently detected irregularities are in the *EGFR* gene, the Kirsten rat sarcoma viral oncogene homolog (*KRAS*) gene and the *p53* gene. In the non-smoking Caucasian population, activating mutations in the *EGFR* gene occur with a frequency of over 50%. The most common mutations in this gene are small (9-21 base pair) deletions in exon 19 (48% of detected mutations) and the missense mutation L858R in exon 21 (41% of detected mutations). Other substitutions in exons 18-21 and insertions or duplications in exon 20 are rare, but they do occur [2, 9, 13].


Mutations in exons 18-21 of the *EGFR* gene affect the tyrosine kinase domain of the EGF receptor. Excessive EGFR tyrosine kinase activity leads to hyperphosphorylation of intracellular signalling proteins of the Pi3K/Akt and RAS/RAF/MAPK pathways without the receptor having to be activated by its specific ligand, EGF. Activation of the Pi3K/Akt pathway stimulates transcription factors such as STAT and drives excessive proliferation of cancer cells. Mutations in the *EGFR* gene are most common in papillary AC, less frequent in adenocarcinoma with "lepidic predominant" growth and least frequent in solid AC [2, 13].

Mutations in the *KRAS* gene are also common and are detected in 15-25% of adenocarcinoma cases. *KRAS*, which encodes a low-molecular-weight guanosine triphosphatase (GTPase), is considered the most frequently mutated oncogene in lung AC arising in patients with a history of smoking. Most *KRAS* mutations replace glycine in codon 12 with another amino acid such as valine, aspartic acid or glutamic acid. Less frequent mutations affect codons 13 and 61. The mutation reduces GTPase activity, with subsequent potent activation of mitogenic and proliferative signalling through the RAF/MEK/ERK/MAPK cascade. Mutations in the *KRAS* gene are most common in solid mucinous adenocarcinoma and in acinar adenocarcinoma [2, 5, 6, 7].
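How a point mutation in codon 12 changes the encoded amino acid can be illustrated by translating every codon that lies one base change away from the wild-type codon. The sketch below is illustrative only: it assumes that the wild-type glycine codon at position 12 is GGT and uses the standard genetic code; the names `CODON_TABLE` and `single_base_substitutions` are hypothetical.

```python
# Standard genetic code, codons enumerated in T, C, A, G order
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def single_base_substitutions(codon: str) -> dict[str, str]:
    """Map every codon one base change away from `codon` to its amino acid."""
    result = {}
    for pos in range(3):
        for base in BASES:
            if base != codon[pos]:
                mutant = codon[:pos] + base + codon[pos + 1:]
                result[mutant] = CODON_TABLE[mutant]
    return result


WILD_TYPE_CODON_12 = "GGT"  # assumption: glycine codon at position 12
neighbours = single_base_substitutions(WILD_TYPE_CODON_12)
# Keep only substitutions that change the amino acid (missense changes)
missense = {
    codon: aa
    for codon, aa in neighbours.items()
    if aa != CODON_TABLE[WILD_TYPE_CODON_12]
}
print(missense)
```

Under this assumption, valine (GTT) and aspartic acid (GAT) are each a single base change away from glycine, whereas a glutamic acid codon (GAA or GAG) differs from GGT at two positions.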

Other mutations detected in more than 2% of adenocarcinomas include the *EML4-ALK* fusion gene, the V600E substitution in the *BRAF* oncogene, substitutions in codons 542, 545 and 1047 of the *PIK3CA* oncogene, insertions in exon 20 of the *ERBB2* gene, polysomy of the *FGFR1* gene and amplification of the *cMET* gene. The anaplastic lymphoma kinase (*ALK*) and echinoderm microtubule-associated protein-like 4 (*EML4*) genes are both located on chromosome 2p, and their fusion involves small inversions within this region. The *EML4-ALK* fusion results in constitutive activation of the ALK kinase. The fusion gene is found in 2-4% of lung adenocarcinomas, and more often in signet ring cell carcinoma (<15%), in younger patients and in never- or light smokers. It is mutually exclusive with *EGFR* and *KRAS* gene mutations. Recently, new fusion genes have been discovered in lung adenocarcinomas, including fusions of kinesin family member 5B (*KIF5B*) with the ret proto-oncogene (*RET*), of coiled-coil domain containing protein 6 (*CCDC6*) with *RET*, and of *ALK* with c-ros oncogene 1 receptor tyrosine kinase (*ROS1*) [2, 5, 6, 7, 14, 15].

Information about the mutations mentioned above comes from large databases such as the *Catalogue of Somatic Mutations in Cancer* (COSMIC), *My Cancer Genome* and *The Cancer Genome Atlas*, and from the results obtained by the American *Lung Cancer Mutation Consortium* (LCMC) [5, 6, 7, 16].




## **3. Molecular biology methods in lung cancer diagnostics**

Mutation testing has become essential in clinical practice for deciding on treatment options for patients with non-small-cell lung carcinoma. Unfortunately, the NSCLC tumours on which molecular diagnostics is performed are highly heterogeneous, and the cytological and histological material is often insufficient to complete the analysis (because of a small percentage of cancer cells or DNA fragmentation during paraffin embedding). Direct sequencing is still frequently used despite its low sensitivity and despite being time-consuming and labour-intensive. However, direct sequencing and particularly next-generation sequencing (technologies based on reversible dye terminators, sequencing by ligation and pyrosequencing) remain the methods for high-throughput screening for unknown mutations. Microarrays containing oligonucleotide mutation probes are emerging as useful platforms for diagnosing multiple genetic abnormalities in cancer cells [17, 18].
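Why a small percentage of cancer cells limits a low-sensitivity method can be shown with simple arithmetic. The sketch below is illustrative only: it assumes a heterozygous mutation, two gene copies in every cell and no mutant copies in normal cells. The function names and the detection limits passed in the usage lines are hypothetical placeholders, not values taken from this chapter.

```python
def expected_mutant_fraction(tumour_cell_fraction: float,
                             mutant_copies: int = 1,
                             copies_per_cell: int = 2) -> float:
    """Expected fraction of mutant alleles in DNA from a mixed sample.

    Assumes tumour and normal cells carry the same number of gene copies
    and that only tumour cells carry mutant copies.
    """
    if not 0.0 <= tumour_cell_fraction <= 1.0:
        raise ValueError("tumour_cell_fraction must be between 0 and 1")
    return tumour_cell_fraction * mutant_copies / copies_per_cell


def is_detectable(tumour_cell_fraction: float, detection_limit: float) -> bool:
    """True if the expected mutant fraction reaches the assay's limit."""
    return expected_mutant_fraction(tumour_cell_fraction) >= detection_limit


# A sample with 30% cancer cells: about 15% of alleles carry the mutation
print(expected_mutant_fraction(0.30))
print(is_detectable(0.30, detection_limit=0.20))  # hypothetical low-sensitivity assay
print(is_detectable(0.30, detection_limit=0.01))  # hypothetical high-sensitivity assay
```

The same sample can therefore give a negative result with one method and a positive result with another, which is one reason the choice of technique matters for heterogeneous material.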

The multiplex SNaPshot PCR (minisequencing) technique is a PCR (polymerase chain reaction)-based assay for the detection of known mutations. A specific primer that anneals immediately adjacent to the mutated position is extended by one base using fluorescently labelled ddNTPs, which are detected by capillary electrophoresis. No further extension is possible once the ddNTP has been incorporated. This kind of reaction is used more and more frequently because it detects many known mutations quickly and sensitively in a single assay [19].
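The single-base extension step can be sketched in a few lines. This is a simplified illustration, not a model of the commercial assay: it assumes the primer has the same sequence as the stretch of the sense strand that ends just before the interrogated position, so that the incorporated ddNTP corresponds to the next sense-strand base. The function name `extended_base` and the short sequences are made up for the example.

```python
def extended_base(sense_strand: str, primer: str) -> str:
    """Return the single base added to the primer in a minisequencing reaction.

    Simplification: the primer is written in sense-strand orientation, so the
    ddNTP incorporated is the sense-strand base that follows the primer.
    """
    start = sense_strand.find(primer)
    if start == -1:
        raise ValueError("primer does not match the sequence")
    site = start + len(primer)
    if site >= len(sense_strand):
        raise ValueError("no template base after the primer")
    return sense_strand[site]


# Illustrative sequences that differ at one position (G versus T)
wild_type = "GTTGGAGCTGGTGGCGTAG"
mutant = "GTTGGAGCTTGTGGCGTAG"
primer = "GTTGGAGCT"  # ends immediately before the variable position

print(extended_base(wild_type, primer))
print(extended_base(mutant, primer))
```

Because each ddNTP carries a different fluorescent label, the two samples would give peaks of different colour at the same primer length. In a multiplex assay, primers of different lengths would allow several positions to be read in one electrophoresis run.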

Recent advances in molecular techniques have enabled the design of sensitive detection assays based on quantitative real-time PCR, although usually with a limited degree of mutation coverage. These techniques include allele-specific PCR (ASP-PCR), amplification refractory mutation system PCR (ARMS-PCR), clamp PCR and mutant-enriched PCR (ME-PCR). The most frequently used is ARMS-PCR, which can detect a known SNP (single nucleotide polymorphism). It consists of two complementary reactions: one contains an ARMS primer specific for the normal DNA sequence, which cannot amplify mutant DNA at a given locus, and the other contains a mutant-specific primer, which cannot amplify normal DNA. High resolution melting (HRM) real-time PCR is another technique that may allow fast screening for mutations. Real-time PCR technology itself is highly flexible, and many alternative instruments and fluorescent probe systems have been developed recently [17, 18].
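The logic of the two complementary ARMS-PCR reactions can be written as a small decision function. This is a minimal sketch, assuming each reaction has already been scored as amplified or not amplified; the function name and the wording of the results are illustrative and do not come from any assay documentation.

```python
def interpret_arms(normal_amplified: bool, mutant_amplified: bool) -> str:
    """Interpret the two complementary ARMS-PCR reactions for one locus.

    normal_amplified: result of the reaction with the normal-specific primer
    mutant_amplified: result of the reaction with the mutant-specific primer
    """
    if normal_amplified and mutant_amplified:
        return "normal and mutant alleles present"
    if mutant_amplified:
        return "mutant allele only"
    if normal_amplified:
        return "normal allele only"
    # Neither primer gave a product, so the sample cannot be interpreted
    return "no amplification, result not interpretable"


print(interpret_arms(normal_amplified=True, mutant_amplified=True))
print(interpret_arms(normal_amplified=True, mutant_amplified=False))
```

In tumour material that also contains normal cells, a mutation-positive sample would typically fall into the first category, since the normal-specific reaction amplifies DNA from the non-cancer cells.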

Polysomy, gene amplifications and fusion genes are detected with molecular probes labelled with different fluorochromes and the fluorescence *in situ* hybridisation (FISH) technique. Silver *in situ* hybridisation (SISH) and chromogenic *in situ* hybridisation (CISH) are related techniques that label only one gene fragment. FISH requires assessment of the number of signals from labelled genes and chromosome fragments under a fluorescence microscope, whereas SISH and CISH staining can be analysed under a light microscope [17, 18].

Routine genetic testing for somatic mutations in lung cancer biopsies is becoming the standard for providing optimal patient care. However, it is unclear whether this testing should be routine for all lung cancer patients, because the prevalence of the most common mutations is very low, especially in heavy smokers with squamous cell carcinoma. Moreover, the large number of molecular biology methods and the variety of biological material acquired from patients create a critical need for robust, well-validated diagnostic tests and equipment that are both sensitive and specific for mutations. An *In Vitro* Diagnostic Medical Device (IVD) is defined in Directive 98/79/EC of the European Parliament and of the Council as any medical device which is a reagent, calibrator, control material, kit, equipment or system, whether used alone or in combination, intended by the manufacturer to be used *in vitro* for the examination of specimens, including blood and tissue donations, derived from the human body, for the purpose of providing information concerning the pathological state and congenital abnormalities of patients or of monitoring therapeutic effect. IVD equipment bears the CE marking in accordance with European product safety regulations [20].

## **4. Molecularly targeted therapies in lung cancer**


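Molecularly targeted drugs are directed against abnormal proteins and other molecules that are specific to cancer cells and participate in metabolic pathways. Excessive activation of those pathways is essential for the growth and unrestrained proliferation of cancer cells, and blocking them inhibits cell division and induces apoptosis. Molecularly targeted drugs therefore show high efficacy in two groups of patients:

1. if the mutation of the gene encoding a signalling pathway protein results in excessive activity while also changing the protein's structure in a way that allows more effective binding of the drug (e.g. activating mutations in the *EGFR* gene and the efficacy of tyrosine kinase inhibitors of EGFR),
2. if the mutation of the gene encoding a signalling pathway protein results in excessive activity of the pathway, and blocking that pathway impairs tumour cell proliferation regardless of how well the drug matches the target protein; this can be achieved at two levels:
	- a. direct blocking of the abnormal protein
	- b. blocking of subsequent signalling pathway proteins stimulated by the abnormal protein [21].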

Therefore, many of the therapies currently under development target several signalling proteins, especially tyrosine kinase receptors (e.g. EGFR, HER2, HER3, IGF-1R, cMET) or proteins in downstream signalling pathway (RAS/RAF/MAPK/mTOR and Pi3K/AKT) [19, 21].

Excessive stimulation of the epidermal growth factor receptor (EGFR) increases proliferation of cancer cells in different kinds of tumours, including non-small-cell lung cancer (NSCLC). The cell growth signal is transmitted from EGFR (HER1), after its heterodimerisation with another member of the HER family (ERBB2 – HER2, HER3 or HER4), through phosphorylation of the Pi3K/AKT and RAS/RAF/MAPK/mTOR pathways. The phosphorylation takes place due to EGFR tyrosine kinase activity, which transfers a phosphate group from ATP to tyrosine residues of substrate proteins, yielding ADP. Tyrosine kinases are a part not only of EGFR but also of other cell receptors and signalling proteins. Disordered phosphorylation initiated by the EGFR tyrosine kinase is associated with the development of NSCLC that is independent of tobacco smoke carcinogens. EGFR function may be blocked with small molecule tyrosine kinase inhibitors (TKI) or with monoclonal antibodies (such as cetuximab) that bind to the extracellular domain of EGFR. Inhibition of tyrosine kinase function by TKI-EGFR is much more effective if the amino acid structure of the enzyme is altered by activating mutations in the *EGFR* gene (described in the previous section). Cetuximab, on the other hand, is more effective when EGFR is highly expressed on the cancer cell surface [2, 9, 11, 12, 21, 22].

| Study | Patients with *EGFR* mutation | Treatment arms | Response rate | Median PFS | PFS (favouring TKI-EGFR) |
|---|---|---|---|---|---|
| IPASS | 216 Asian patients | gefitinib vs. paclitaxel/carboplatin | 71% vs. 47% | 9,5 vs. 6,3 months | HR=0,48 (95% CI: 0,36-0,64) |
| JP 0056 (NEJ 002) | 200 North-East Japan patients | gefitinib vs. paclitaxel/carboplatin | 74% vs. 31% | 10,8 vs. 5,4 months | HR=0,31 (95% CI: 0,22-0,41) |
| WJTOG 3405 | 177 Asian patients | gefitinib vs. docetaxel/carboplatin | 62% vs. 32% | 9,2 vs. 6,3 months | HR=0,49 (95% CI: 0,34-0,71) |
| OPTIMAL | 165 Asian patients | erlotinib vs. gemcitabine/carboplatin | 83% vs. 36% | 13,1 vs. 4,6 months | HR=0,16 (95% CI: 0,10-0,26) |
| EURTAC | 170 Caucasian patients | erlotinib vs. platinum doublet | 58% vs. 15% | 9,7 vs. 5,2 months | HR=0,42 (95% CI: 0,27-0,64) |

**Table 1.** Prospective, randomised studies of efficacy of first-line TKI-EGFR and standard chemotherapy in patients with *EGFR* gene mutations [12].
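A hazard ratio below 1 favours the TKI-EGFR arm, and one minus the hazard ratio gives the relative reduction in the hazard of progression. The Python sketch below recomputes that reduction for the five PFS hazard ratios reported in Table 1. The names `TABLE_1_PFS_HAZARD_RATIOS` and `hazard_reduction_percent` are hypothetical, and the decimal commas of the table are written as decimal points.

```python
# PFS hazard ratios reported in Table 1 (TKI-EGFR vs. chemotherapy).
TABLE_1_PFS_HAZARD_RATIOS = [0.48, 0.31, 0.49, 0.16, 0.42]


def hazard_reduction_percent(hazard_ratio: float) -> int:
    """Relative reduction in the hazard of progression, in whole percent."""
    if hazard_ratio < 0:
        raise ValueError("a hazard ratio cannot be negative")
    return round(100 * (1 - hazard_ratio))


if __name__ == "__main__":
    for hr in TABLE_1_PFS_HAZARD_RATIOS:
        print(f"HR={hr}: hazard of progression lower by {hazard_reduction_percent(hr)}%")
```

A reduction in hazard describes the rate of progression over time, so it should not be read as the same percentage gain in median PFS.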

At the moment, two reversible EGFR TKIs are in use: gefitinib and erlotinib. The phase III IPASS study, carried out among Asian patients (a population in which up to 40% of NSCLC patients carry *EGFR* gene mutations), demonstrated higher efficacy of gefitinib than of chemotherapy consisting of carboplatin and paclitaxel in patients with activating *EGFR* gene mutations (71,2% response rate, longer progression-free survival (PFS) of up to 12 months and significant improvement in quality of life, but without prolongation of overall survival (OS)). However, among patients with the wild-type *EGFR* gene, first-line treatment of advanced NSCLC with gefitinib was ineffective. The study included more than 1 200 adenocarcinoma patients, with a retrospective biomarker analysis performed on 437 tumour samples with evaluable *EGFR* gene mutation data. Mutations in the *EGFR* gene were identified in 261 (59,7%) of these patients. Later studies comparing erlotinib or gefitinib with standard chemotherapy (NEJ 002, WJTOG 3405, OPTIMAL and EURTAC) showed that EGFR TKIs are effective in first-line treatment, but only in patients with activating mutations in the *EGFR* gene (Table 1). Moreover, the OPTIMAL study showed that patients with a deletion in exon 19 had longer median PFS than those with the L858R substitution in exon 21 of the *EGFR* gene. However, the IPASS and WJTOG 3405 studies did not confirm these observations [2, 11, 12, 21, 22, 23].

The BR.21 study examined the effectiveness of erlotinib monotherapy as second- or third-line therapy in patients with advanced NSCLC. Erlotinib prolonged PFS and improved quality of life compared with best supportive care in the whole patient group, but an objective response was achieved in only 10% of patients. Patients with *EGFR* gene amplification, detected with the FISH technique, responded more frequently to erlotinib. Of the 159 tumours analysed in the BR.21 study, 61 (38,4%) were positive for an increased *EGFR* gene copy number. Response rates were 21% and 5% in FISH-positive and FISH-negative patients, respectively. This benefit seemed to extend to survival (HR=0,43; p=0,004). It is not certain whether this result was related to underestimation of *EGFR* gene mutations in FISH-positive patients due to the use of the sequencing method for *EGFR* gene mutation analysis. The INTEREST study supported this suggestion, demonstrating the superiority of gefitinib over docetaxel in second-line treatment of patients with an activating mutation of the *EGFR* gene. The use of reversible TKI-EGFR in second- and third-line treatment of patients without activating mutations in the *EGFR* gene is controversial [2, 11, 12, 21, 22, 23].
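The percentages quoted for the IPASS and BR.21 biomarker analyses follow directly from the reported counts. A minimal Python check, using a hypothetical helper named `percentage`, could look like this:

```python
def percentage(count: int, total: int) -> float:
    """Return count/total as a percentage rounded to one decimal place."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100 * count / total, 1)


if __name__ == "__main__":
    # IPASS: EGFR mutations in 261 of 437 evaluable tumour samples.
    print(percentage(261, 437))
    # BR.21: increased EGFR gene copy number in 61 of 159 tumours.
    print(percentage(61, 159))
```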




The phase III SATURN study was designed to examine the effect of erlotinib as maintenance therapy in patients who had clinical benefit after 4 cycles of standard chemotherapy. In all patients, erlotinib significantly prolonged PFS (HR=0,71; p<0,0001) and improved the response rate (11,9% vs. 5,4%) compared with best supportive care. However, significantly prolonged PFS with erlotinib was observed mainly in the group of patients whose tumours had an *EGFR* mutation (HR=0,10; p<0,0001) [2, 11, 12, 23].

Despite controversial clinical trial results, the National Comprehensive Cancer Network (NCCN) recognises that the presence of EGFR-activating mutations represents a "critical" biomarker for appropriate patient selection for TKI-EGFR therapy [24].

Some genetic irregularities may be responsible for primary or secondary resistance to reversible TKI-EGFR and for disease progression even after more than ten months of therapy. The wild-type *EGFR* gene and *KRAS* gene mutations are associated with intrinsic TKI-EGFR resistance. Moreover, mutations in the *KRAS* and *EGFR* genes do not occur simultaneously in the same cancer cell. Patients with a mutated *KRAS* gene experience better PFS with standard chemotherapy than with TKI-EGFR therapy. However, a subgroup of 90 patients from the SATURN study who had a *KRAS* mutation showed no significant difference in PFS between the erlotinib arm and the placebo arm. Although *KRAS* mutation has been associated with clinical outcomes with cetuximab in colorectal cancer, no association was reported from analyses of clinical studies of cetuximab in combination with chemotherapy in patients with NSCLC. Currently, *KRAS* mutation testing is not recommended in molecular diagnosis of NSCLC patients [11, 12].


Secondary resistance to reversible TKI-EGFR is connected with the inability to extend overall survival with erlotinib or gefitinib therapy. The mechanisms underlying resistance to reversible EGFR TKIs include amplification of the *IGF1R* and *MET* genes, as well as mutations in exon 20 of the *EGFR* and *HER2* genes. The presence of such abnormalities may have a pivotal role in qualification for novel therapies, currently in their last phase of clinical trials. Inhibitors of insulin-like growth factor 1 receptor (IGF-1R), both small molecules and monoclonal antibodies, and inhibitors of the hepatocyte growth factor receptor (cMET) (e.g. tivantinib – ARQ-197 or MetMAb) may be used in some patients treated with reversible TKI-EGFR in whom resistance to the therapy has occurred through an alternative route of Pi3K/AKT pathway stimulation created by overexpression of IGF1R and cMET (Figure 4) [25, 26, 27].

The occurrence of the T790M mutation in exon 20 of the *EGFR* gene and of mutations in exon 20 of the *HER2* gene may be important for proper qualification for treatment with irreversible EGFR TKIs. Drugs like afatinib (BIBW-2992), PF-00299804 or neratinib (HKI-272) may be effective in cases of resistance to reversible TKI-EGFR when a secondary mutation is present (e.g. T790M). The action of afatinib persists until the EGFR protein is removed from the cancer cell surface. Furthermore, afatinib also blocks the HER2 and HER4 proteins, which are preferential heterodimerisation partners for EGFR during stimulation by EGF. In the LUX-Lung 1 study, the efficacy of afatinib (prolongation of PFS) was demonstrated as a rescue treatment after failure of erlotinib or gefitinib if the duration of second-line TKI-EGFR treatment exceeded 24 weeks (HR=0,38, p<0,0001). Irreversible TKI-EGFR may also be more effective than reversible TKI-EGFR in first-line treatment of patients with activating mutations of the *EGFR* gene. In the LUX-Lung 2 study, 129 patients with activating *EGFR* mutations and no previous TKI-EGFR treatment received afatinib as a single agent. The overall response rate was 60%, with a promising PFS of 14 months. The LUX-Lung 3 and LUX-Lung 6 studies are designed to compare the effectiveness of afatinib with that of chemotherapy based on pemetrexed and cisplatin or on gemcitabine and cisplatin in patients with *EGFR* mutations. As first-line treatment of patients with a known *EGFR* mutation, PF-00299804 showed encouraging efficacy, which exceeded that of erlotinib. In patients with T790M and T854A mutations in the *EGFR* gene, combining irreversible TKI-EGFR therapy with a monoclonal antibody against EGFR (cetuximab) may also be reasonable [11, 12, 25, 26, 27].

Great hopes for the development of lung adenocarcinoma therapy are attached to phase III studies of a novel small molecule, molecularly targeted drug: crizotinib, an inhibitor of ALK, ROS1 and cMET. Crizotinib is particularly active in patients with the *EML4-ALK* fusion gene, inducing disease control in up to 90% of such patients and prolonging their overall survival. Among patients with the *EML4-ALK* fusion gene treated with crizotinib, 77% survived more than 1 year and 64% survived more than 2 years. Newly defined kinase fusions (KIF5B with RET and ROS1 with ALK and with other fusion partners) may also be promising targets for molecular therapies [11, 12, 14, 15, 26, 27].


**Figure 4.** *EGFR* pathway components and possible applications of new molecularly targeted therapies in resistance to reversible TKI-EGFR.

Drugs inhibiting neoangiogenesis within the tumour have also found an application in molecularly targeted therapy of patients with NSCLC. These drugs are bevacizumab, a monoclonal antibody directed against vascular endothelial growth factor (VEGF), and small molecule drugs inhibiting the tyrosine kinase functions of VEGFR, PDGFR, FGFR, RET and c-Kit (vargatef, sunitinib) [26, 27].

The American *Lung Cancer Mutation Consortium* (LCMC) screened NSCLC tumour samples not only for *EGFR* and *ALK* abnormalities but for a panel of known alterations: *KRAS, EGFR, EML4-ALK, BRAF, HER2, PIK3CA, NRAS, MEK1, AKT1* and *MET* gene irregularities. Mutations were found in 54% (280/516) of completely tested tumours, analysed in 15 certified genetic laboratories. Mutation screening serves not only research purposes but is also designed to identify patients who might benefit from molecularly targeted therapies. Molecular testing can identify the mutations associated with response or resistance to targeted therapies [16]. Nowadays, we have an opportunity to match molecularly targeted therapies with the structure of proteins that take part in the signalling pathways of neoplastic cells. The efficacy of tyrosine kinase inhibitors of EGFR (erlotinib, gefitinib) and ALK (crizotinib) in NSCLC patients bearing *EGFR* or *ALK* activating mutations is an example of such a relationship. These observations create new possibilities for personalisation of known molecularly targeted therapies (registered and tested in clinical trials) in a large population of NSCLC patients [16]. The LCMC idea was used to describe the potential therapeutic options for NSCLC patients based on the presence of mutations in cancer cells. Similarly, the BATTLE program at the M.D. Anderson Cancer Centre in Houston assessed biomarker-guided treatment in patients with previously treated, advanced NSCLC and biopsy-amenable disease. For this purpose, cancer gene databases should be created to determine what is known about germline and somatic gene variants as well as treatment options and their outcomes. According to recent cancer genomic knowledge, clinical trials of novel molecularly targeted drugs could be offered to cancer patients with a relatively poor prognosis who are unlikely to benefit from standard therapy, and to patients who are more likely to benefit from a novel therapy due to the presence of tumour genetic abnormalities that predict sensitivity, lack of resistance or toxicity of a treatment (Table 2) [4, 16, 19, 26, 27].

| Genetic abnormality | Treatment | Mechanism of action |
|---|---|---|
| activating mutation of *EGFR* | erlotinib or gefitinib | small molecule, reversible TKI-EGFR |
| activating mutation of *EGFR* | erlotinib + OSI-906 or MM-121 or MK-0646 | small molecule, reversible TKI-EGFR + small molecule TKI IGF-1R or fully human monoclonal antibody against ErbB3 |
| *KRAS* mutation; *MET* amplification | erlotinib + tivantinib (ARQ-197) or onartuzumab (MetMAb); JTP-74057 (GSK1120212) | small molecule TKI-EGFR + small molecule TKI cMET or monovalent (one-armed) monoclonal antibody against cMET; small molecule inhibitor of MEK 1/2 serine/threonine kinase |
| fusion gene *EML4-ALK* and fusion genes with *ROS1* gene component; *ROS1* mutation | crizotinib, AP-26113, LDK-378, AF-802 | small molecule TKI of ALK, ROS1 and cMET; small molecule TKI of ALK and EGFR; small molecule TKI of ALK |
| *NRAS, MEK1* or *BRAF* mutation | GSK-1120212 | small molecule inhibitor of MEK 1/2 serine/threonine kinase |
| *BRAF, NRAS* mutation | GSK-2118436; vemurafenib (PLX-4032) | small molecule inhibitor of BRAF serine/threonine kinase |
| mutation in exon 20 of *EGFR* (e.g. T790M); *HER2* mutation | afatinib (BIBW2992), neratinib, PF299804, CI-1033, EKB-569, AV-412/MP-412, lapatinib | small molecule, irreversible TKI of pan-HER; small molecule, irreversible TKI of EGFR and HER2 |
| *PIK3CA* mutation | BEZ-235, GDC-0491, SAR-245409, BKM-120, BYL-716, OSI-027, PX-866, MK-8669 | small molecule inhibitor of mTOR and PI3K kinases; small molecule inhibitor of pan-PI3K; small molecule selective inhibitor of PI3Kα |
| *MEK1* mutation | JTP-74057 (GSK-1120212); selumetinib (AZD-6244), GDC-0973, MEK-162, MSC-1936369B | small molecule inhibitor of MEK 1/2 serine/threonine kinase (MAPK/ERK kinase 1/2 kinases) |
| *DDR2* mutation (S768R) | erlotinib + dasatinib or nilotinib | small molecule inhibitor of BCR-ABL, SRC, c-Kit, EPH and PDGFRβ |
| *FGFR* amplification | PD-173074, ponatinib (AP24534), BGJ-398, FP-1039 | small molecule TKI of FGFR and VEGFR; small molecule kinase inhibitor of native and mutated BCR-ABL, VEGFR2, FGFR1, PDGFRα, mutated FLT3 and LYN; small molecule TKI of FGFRs; monoclonal antibody against FGFR1 |
| *PDGFR* amplification, *PDGFR* mutation, *c-Kit* mutation | MEDI-575, IMC-3G3, sunitinib, sorafenib, OSI-930, pazopanib (votrient) | monoclonal antibody against PDGFRα; small molecule inhibitors of kinases of VEGFR1-3, RET, c-Kit, PDGFRα and β |
| *FGFR* and/or *PDGFR* amplification | intedanib (BIBF-1120), dovitinib (TKI258) | small molecule inhibitor of angiokinase (FGFR, PDGFR, VEGFR) |
| *AKT1* mutation | MK-2206, GSK-2110183 | AKT inhibitors |
| BRCA1 deficiency | olaparib + cisplatin | small molecule inhibitor of poly(ADP-ribose) polymerase (PARP) |

**Table 2.** An example of qualification possibilities for molecularly targeted therapies based on NSCLC cell molecular signature (in most countries gefitinib, erlotinib and crizotinib are the only registered drugs in NSCLC therapy; other indications for therapy are hypothetical and are based only on the results of early clinical trials).
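The pairing of genetic abnormalities with treatments in Table 2 amounts to a lookup from a tumour's molecular signature to candidate drugs. The Python sketch below encodes only three such pairings taken from the text and the table. The dictionary keys, the name `candidate_treatments` and the choice of a flat list as the result are assumptions made for illustration, not part of any laboratory software.

```python
# Three pairings from Table 2, keyed by hypothetical labels for the abnormality.
TABLE_2_EXCERPT = {
    "EGFR activating mutation": ["erlotinib", "gefitinib"],
    "EML4-ALK fusion gene": ["crizotinib"],
    "AKT1 mutation": ["MK-2206", "GSK-2110183"],
}


def candidate_treatments(abnormalities: list[str]) -> list[str]:
    """Collect the treatments listed for each abnormality found in a tumour.

    Abnormalities missing from the excerpt contribute nothing; the order of
    the input is preserved and duplicate drugs are dropped.
    """
    treatments: list[str] = []
    for abnormality in abnormalities:
        for drug in TABLE_2_EXCERPT.get(abnormality, []):
            if drug not in treatments:
                treatments.append(drug)
    return treatments


if __name__ == "__main__":
    print(candidate_treatments(["EML4-ALK fusion gene", "AKT1 mutation"]))
```

As the table caption notes, only gefitinib, erlotinib and crizotinib are registered in most countries, so a fuller version of such a lookup would need to record the registration status of each drug as well.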

#### **5. Summary**

It is worth remembering that the presence of mutations may overlap with much more severe genetic abnormalities of lung cancer cells. These irregularities result in profound changes in the ability of cancer cells to proliferate and, in effect, make them invulnerable to selective molecularly targeted therapies. Therefore, at present only a few of the above-mentioned drugs may be used in lung cancer patients instead of standard chemotherapy. In most cases, molecularly targeted therapies will find an application in patients who have already exhausted all forms of standard chemotherapy.

Screening of Gene Mutations in Lung Cancer for Qualification to Molecularly Targeted Therapies 215

[8] Subramanian J, Govindan R (2007) Lung cancer in never smokers: a review. J. clin. oncol.

[9] Rudin CM, Avila-Tang E, Harris CC, Herman JG, Hirsch FR, Pao W, Schwartz AG, Vahakangas KH, Samet JM (2009) Lung cancer in never smokers: molecular profiles and

[10] Begun S (2012) Molecular changes in smoking-related lung cancer. Expert rev. mol.

[11] Kulesza P, Ramchandran K, Patel JD (2011) Emerging concepts in pathology and molecular biology of advanced non-small cell lung cancer. Am j clin pathol. 136: 228-

[12] Dienstmann R, Martinez P, Felip E (2011) Personalizing therapy with targeted agents in

[13] Yun CH, Boggon TJ, Li Y (2007) Structures of lung cancer-derived EGFR mutants and inhibitor complexes: Mechanism of activation and insights into differential inhibitor

[14] Pao W, Hutchinson KE (2012) Summary of KIF5B-RET fusions in individuals with lung

[15] Takeuchi K, Soda M, Togashi Y, Suzuki R, Sakata S, Hatano S, Asaka R, Hamanaka W, Ninomiya, Uehara H, Choi YL, Satoh Y, Okumura S, Nakagawa K, Mano H, Ishikawa Y (2012) RET, ROS1 and ALK fusions in lung cancer. Nature med. 18: 378-

[17] Felip E, Gridelli C, Baas P (2011) Metastatic non-small-cell lung cancer: consensus on pathology and molecular tests, first-line, second-line, and third-line therapy. 1st ESMO Consensus Conference in Lung Cancer; Lugano 2010. Ann. oncol.

[18] Pirker R, Herth FJF, Kerr KM (2010) Consensus for EGFR mutation testing in nonsmall cell lung cancer. Results from a European Workshop. J. thorac. oncol. 5(10): 1706-

[19] Heist RS, Engelman JA (2012) SnapShot: non-small cell lung cancer. Cancer cell. 21: 448-

[21] Salgia R, Hensing T, Campbell N (2011) Personalized treatment of lung cancer. Semin.

[22] Sun S, Schiller JH, Spinola M, Minna JD (2007) New molecularly targeted therapies for

[23] Janku F, Garrido-Laguna I, Petruzelka LB, Stewart DJ, Kurzrock R (2011) Novel therapeutic targets in non-small cell lung cancer. J. thorac. oncol. 6(9): 1601-1612. [24] National Comprehensive Cancer Network. http://www.nccn.org/professionals/

[20] Eur-Lex Access to European Union law. http://eur-lex.europa.eu/LexUriServ/

therapeutic implications. Clin. cancer res. 15(18): 5646-5661.

non-small cell lung cancer. Oncotarget 2: 135-177.

[16] Lung Cancer Mutation Consortium. http://www.golcmc.com/

sensitivity. Cancer cell. 11: 217-227.

cancer. Nature med. 18: 349-351.

doi:10.1093/annonc/mdr150

25(5): 561-570.

diagn. 12: 93-106.

238.

381.

1713.

448.e2

oncol. 38: 274-283.

lung cancer. J. clin. invest. 117(10): 2740-2750.

Multiple genetic alterations in lung cancer tumours and different targeted therapies based on appropriate molecular status of patients are still under investigation. However, the problems with proper obtaining and storage of tumour tissue for molecular testing as well as choosing adequate molecular methods for gene mutation screening is still open for discussion.

## **Author details**

Paweł Krawczyk *Corresponding author Department of Pneumonology, Oncology and Allergology, Medical University of Lublin, Lublin, Poland* 

Tomasz Kucharczyk and Kamila Wojas-Krawczyk *Department of Pneumonology, Oncology and Allergology, Medical University of Lublin, Lublin, Poland* 


**Chapter 11** 

© 2012 Wang and Zhong, licensee InTech. This is an open access chapter distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


## **Clinical and Genetic Heterogeneity of Autism**

Yu Wang and Nanbert Zhong

Additional information is available at the end of the chapter

http://dx.doi.org/10.5772/48700

## **1. Introduction**


Autism (MIM 209850) comprises a heterogeneous group of disorders with a complex genetic etiology, characterized by impairments in reciprocal social communication and the presence of restricted, repetitive and stereotyped patterns of behavior [1]. With onset before age 3 and a prevalence as high as 0.9–2.6% [2,3], autism occurs predominantly in males, with a male-to-female ratio of 4 to 1. It is one of the leading causes of childhood disability and inflicts serious suffering and burden on the family and society [4].

Diagnosis of autism is based on expert observation and assessment of behavior and cognition, not on etiology or pathogenic mechanism. This is further emphasized by the current trend in the DSM-V, in which the category of Asperger syndrome is removed and the diagnostic criteria for autism are modified under the new heading of autism spectrum disorder (ASD). The change in diagnostic criteria is not based on known similarities or differences in causation between these clinically defined categories, but rather on the consensus of opinions of expert clinicians. Several diagnostic instruments are available for autism. Two are commonly used in autism research: the Autism Diagnostic Interview-Revised (ADI-R), a semi-structured parent interview [5], and the Autism Diagnostic Observation Schedule (ADOS), which uses observation of and interaction with the child [6]. The Childhood Autism Rating Scale (CARS) is widely used in clinical environments to assess the severity of autism based on observation of children [7]. The M-CHAT was developed in the late 1990s as a first-stage screening tool for ASD in toddlers aged 18 to 24 months, with a sensitivity of 0.87 and a specificity of 0.99 in American children [8, 9].
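To make the screening figures concrete, the sketch below applies Bayes' rule to the M-CHAT sensitivity (0.87) and specificity (0.99) quoted above, using the prevalence range of 0.9–2.6% from the first paragraph. Combining these numbers is an assumption made for illustration only: the accuracy figures come from a specific study population, and the function name `predictive_values` is hypothetical.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) of a screening test using Bayes' rule."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    true_neg = specificity * (1.0 - prevalence)
    false_neg = (1.0 - sensitivity) * prevalence
    ppv = true_pos / (true_pos + false_pos)  # P(ASD | positive screen)
    npv = true_neg / (true_neg + false_neg)  # P(no ASD | negative screen)
    return ppv, npv


# Figures quoted in the text; pairing them is an illustrative assumption.
SENSITIVITY = 0.87
SPECIFICITY = 0.99
PREVALENCE_RANGE = (0.009, 0.026)

if __name__ == "__main__":
    for prevalence in PREVALENCE_RANGE:
        ppv, npv = predictive_values(SENSITIVITY, SPECIFICITY, prevalence)
        print(f"prevalence {prevalence:.1%}: PPV {ppv:.2f}, NPV {npv:.3f}")
```

Under these assumptions, even a highly specific first-stage screen yields a substantial share of false positives at low prevalence, which is consistent with its use as a first stage followed by diagnostic assessment.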

## **2. Clinical heterogeneity of ASD**

Autistic conditions are a spectrum of disorders, rather than a single distinct clinical disorder, which means that the symptoms can be present in a variety of combinations and with a range of severity. The disease has variable cognitive manifestations, ranging from a non-verbal child with mental retardation to a high-functioning college student with above-average IQ but inadequate social skills [10]. Clinically, autism falls into three major categories: idiopathic autism, autism spectrum disorder (ASD), and syndromic autism, which usually results from an identified syndrome with known genetic etiology. Traditionally, ASD includes autism; Asperger syndrome, in which language appears normal; Rett syndrome; and pervasive developmental disorder not otherwise specified (PDD-NOS), in which children meet some but not all criteria for autism. Rett syndrome (RTT), occurring almost exclusively in females, is characterized by developmental arrest between 5 and 18 months of age, followed by regression of acquired skills, loss of speech, stereotypic movements (classically of the hands), microcephaly, seizures, and intellectual difficulties. These disorders share deficits in social communication and show variability in the language and repetitive behavior domains [1]. Autistic individuals may have symptoms that are independent of the diagnosis. Mental retardation is present in approximately 75% of cases of autism, seizures in 15 to 30% of cases, attention deficit hyperactivity disorder (ADHD) in 59–75% of cases, schizophrenia (SZ) in 5% of cases, obsessive-compulsive disorder (OCD) in about 60% of cases and electroencephalographic abnormalities in 20 to 50% of cases [11]. In addition, approximately 15 to 37% of cases of autism have a comorbid medical condition such as epilepsy, sensory abnormalities, motor abnormalities, sleep disturbances, and gastrointestinal symptoms. Five to 14% of cases have a known genetic disorder or chromosomal anomaly. The four most common conditions associated with autistic phenotypes are fragile X syndrome, tuberous sclerosis, 15q duplications, and untreated phenylketonuria. Other conditions associated with autistic phenotypes include Angelman syndrome, Cowden disease, Smith-Lemli-Opitz syndrome, cortical dysplasia-focal epilepsy (CDFE) syndrome, neurofibromatosis, and X-linked mental retardation.

## **3. Autism is a complex genetic disorder**

It is widely held that autism is largely genetic in origin; several dozen autism susceptibility genes have been identified in the past decade, collectively accounting for about 20% of autistic cases. There is strong evidence from twin and family studies for the importance of complex genetic factors in the development of autism [12, 13]. Family studies have shown that the recurrence rate of autism in siblings of an affected proband is as high as 8–10% [12, 14]. Thus, the recurrence risk in siblings is roughly 100 times higher than that found in the general population. The substantial degree of familial clustering in ASD could reflect shared environmental factors, but twin studies strongly point to genetics. Several epidemiological studies among sex-matched twins have clearly demonstrated significant differences in concordance rates between monozygotic (MZ) and dizygotic (DZ) twins. The largest of these studies [15] found that 60% of the MZ pairs were concordant for autism compared with none of the DZ pairs, suggesting a heritability estimate of >90% assuming a multifactorial threshold model. This pattern is observed in every twin study of autism and is broadly consistent with heritability estimates of about 70–80% [15, 16]. One exception is a very recent study with a large sample of twins, which, despite showing a concordance of about 0.6 for MZ twins and 0.25 for DZ twins, concludes that shared environment plays a larger role than genetic factors [17]. However, how a shared environment could play a larger role than genetics is not clear. Moreover, studies in families show that first-degree relatives of an autistic proband have a markedly increased risk for autism relative to the population, consistent with the strong familial or genetic effect observed in twins [18]. This is not to dispute the role of the environment but to emphasize that genes play an important role. Similar to other common diseases with genetic contributions, autism was thought to fit a model in which multiple variants, each with small to moderate effect sizes, interact with each other and perhaps, in some cases, with environmental factors to lead to autism; a situation referred to as complex genetics [13].
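The "roughly 100 times" comparison is a ratio of the sibling recurrence rate to the population prevalence, often written as the sibling relative risk. The sketch below shows the arithmetic. The baseline prevalence behind the ratio is not stated alongside it in the text, so the 0.1% used here is a hypothetical value chosen only because it reproduces a 100-fold ratio at a 10% recurrence rate; the function names are likewise illustrative.

```python
def sibling_relative_risk(sibling_recurrence, population_prevalence):
    """Ratio of sibling recurrence risk to population prevalence."""
    if not 0.0 < population_prevalence <= 1.0:
        raise ValueError("population_prevalence must be in (0, 1]")
    return sibling_recurrence / population_prevalence


def implied_prevalence(sibling_recurrence, relative_risk):
    """Population prevalence implied by a recurrence rate and a risk ratio."""
    if relative_risk <= 0.0:
        raise ValueError("relative_risk must be positive")
    return sibling_recurrence / relative_risk


if __name__ == "__main__":
    # Recurrence rates of 8-10% are from the text; 0.001 is an assumed baseline.
    for recurrence in (0.08, 0.10):
        ratio = sibling_relative_risk(recurrence, 0.001)
        baseline = implied_prevalence(recurrence, 100.0)
        print(f"recurrence {recurrence:.0%}: ratio {ratio:.0f}, "
              f"baseline implied by a 100-fold ratio {baseline:.2%}")
```

The ratio is very sensitive to the baseline chosen, so any quoted fold-increase should be read together with the prevalence estimate it was computed against.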

## **4. Genetic heterogeneity of autism**

Although autism is highly heritable, the identification of candidate genes has been hindered by the heterogeneity of the disease. Autism genetics is highly complex, involving many genes/loci and different types of genetic variation, including translocation, deletion, single nucleotide polymorphism (SNP) and copy number variation (CNV) [13, 19, 20]. The most obvious general conclusion from all of the published genetic studies is the extraordinary etiological heterogeneity of autism. No specific gene accounts for the majority of autism; rather, even the most common genetic forms account for no more than 1–2% of cases [21]. Further, these genes, including those mentioned earlier, represent a diversity of molecular mechanisms that include cell adhesion, neurotransmission, synaptic structure, RNA processing/splicing, and activity-dependent protein translation. Genetic heterogeneity of autistic cases has been documented by the identification of single-gene mutations and genomic variations including CNV. The mutant genes identified in autistic patients are *FMR1, MECP2, CNTNAP2, PTEN, DHCR7, CACNA1C, UBE3A, TSC2, NF1, ARX, NLGN3, NLGN4, NRXN1, FOXP1, FOXP2, GRIK2,* and *SHANK3* (Table 1). Genomic variations, including copy number deletions or duplications at 1q21.2, 1q42.2, 2q31.1, 3p25.3, 7q11.23, 7q22.1, 7q36.3, 11q13.3, 12q14.2, 15q11-13, 16p11.2, 16q13.3, 17q11.2, 17q12, 17q21.32, 22q13.33, or Xp22.11, may also be associated with autism.

## **5. Genotype/phenotype correlation in ASD**

The genetic and phenotypic heterogeneity of autism, with its number of underlying pathogenic mechanisms, is highlighted in this review. There are at least three phenotypic presentations with distinct genetic underpinnings: (1) autism with a syndromic phenotype characterized by rare, single-gene defects (Table 2); (2) broad autistic phenotypes caused by genetic variations in single or multiple genes, each variation being common and distributed continuously in the general population but resulting in variable clinical phenotypes when a certain threshold is reached through complex gene-gene and gene-environment interactions; and (3) severe and specific phenotypes caused by *de novo* mutations in the patient or by mutations transmitted through asymptomatic carriers (Table 3) [48, 49]. Understanding the neurobiological processes by which genotypes lead to phenotypes, along with advances in developmental neuroscience and in the study of neuronal networks at the cellular and molecular level, is paving the way for translational research involving targeted interventions in affected molecular pathways and early intervention programs that promote normal brain responses to stimuli and alter the developmental trajectory [50]. Recent genetic results have improved our knowledge of the genetic basis of autism. Nevertheless, identification of phenotypic markers remains challenging because of phenotypic and genotypic heterogeneity.

| Gene | Genetic alteration | Location | Reference |
|---|---|---|---|
| *FMR1* | The number of CGG repeats in *FMR1* alleles is classified as intermediate mutation (45 to 55), premutation (55 to 200), or full mutation (>200) | 5' untranslated region | 22-24 |
| *MECP2* | T158M, T158A | Missense mutation | 25 |
| *CNTNAP2* | 3709delG | Exon 22 | 26 |
| | G731S, I869T; R1119H, D1129H, I1253T, T1278I | Exon 14, 17; exon 20, 21, 23, 24 | 27 |
| | H275A | Exon 6 | 28 |
| | CNV (microdeletion) | Promoter | 29 |
| *PTEN* | Deletion | Exon 2 | 30 |
| *CACNA1C* | G406R | Missense mutation | 31 |
| *UBE3A* | D15S122 | 5' end of *UBE3A* | 32, 33 |
| *TSC2* | SNP | Intron 4, 9; exon 40 | 34 |
| *NF1* | SNP | Intron 27 | 35 |
| *NLGN3* | R451C | Missense mutation | 36, 37 |
| *NLGN4* | 1186insT | Frameshift mutation | 37 |
| *NRXN1* | *De novo* 320-kb deletion | Promoter and initial coding exons | 38, 39 |
| | Missense structural variant | Neurexin 1β signal peptide region | 40 |
| *FOXP1* | *De novo* intragenic deletion | Exons 4-14 | 41 |
| *FOXP2* | Del CAA | Exon 5 | 42, 43 |
| | Frequency of the TT allele | Intron 15 | |
| *GRIK2* | SNP | M867I | 44 |
| *SHANK3* | *De novo* Q321R | Stop codon | 45 |
| | 1-bp insertion | Exon 11 | 46 |
| | *De novo* 7.9-Mb deletion | 22q13.2-qter | 47 |

**Table 1.** Genetic alterations identified in autism

| Gene/loci | Chromosome | Phenotype (human/mouse) | Mechanism involved | Risk of autism | Reference |
|---|---|---|---|---|---|
| *CNTNAP2* | 7q35-q36.1 | Recessive EPI syndrome, ASD, ADHD, TS, OCD | Chromosomal rearrangements and large deletions, disruption of the transcription factor *FOXP2*, SNP | Not conclusive | 51-54 |
| *CHD7* | 8q12.1 | CHARGE | Mutations/deletions of gene *CHD7*; chromatin remodeling; disruption of the transcription factor *FOXP2*; SNP | 15–50% | 55, 56 |
| *TSC1* | 9q34.13 | Tuberous sclerosis type I | Mutation in gene *TSC1* and subsequent hyperactivation of the downstream mTOR pathway, resulting in increased cell growth and proliferation | Not conclusive | 57 |
| *PTEN* | 10q23.31 | Cowden disease | Mutation of gene *PTEN* | Not conclusive | 30 |
| *DHCR7* | 11q13.4 | Smith-Lemli-Opitz syndrome | Mutations of gene *DHCR7*, leading to a deficiency of cholesterol synthesis and an accumulation of 7-dehydrocholesterol | 15–50% | 58-60 |
| *CACNA1C* | 12p13.33 | Timothy syndrome | Missense mutations in the calcium channel gene *CACNA1H* | 3% | 61, 62 |
| *UBE3A* | 15q11.2 | Angelman syndrome | Maternal deletion, paternal UPD, deletions and epimutations at IC, mutations of *UBE3A* | Not conclusive | 32, 33 |
| | | | Lack of expression of maternally expressed gene *UBE3A* | Not conclusive | 63 |
| *TSC2* | 16p13.3 | Tuberous sclerosis type II | Mutation in gene *TSC2* and subsequent hyperactivation of the downstream mTOR pathway, resulting in increased cell growth and proliferation | Not conclusive | 57 |
| *NF1* | 17q11.2 | Neurofibromatosis | Polymorphisms within intron 27, including the (AAAT)n and two (CA)n | Not conclusive | 35 |
| *DMD* | Xp21.2 | Duchenne muscular dystrophy | Mutations of *DMD* gene resulting in absence of dystrophin protein | Not conclusive | 64 |
| *ARX* | Xp21.3 | LIS, XLID, EPI, ASD | Naturally occurring mutations. Nonsense mutations, polyalanine tract expansions and missense mutations | Not conclusive | 65 |
| *FMR1* | Xq27.3 | Fragile X syndrome | CGG repeat expansion and DNA methylation of *FMR1* gene, reduced *FMR1* expression | 60–67% in males, 23% in females | 66 |
| *MECP2* | Xq28 | Rett syndrome | Mutations in *MECP2* and *CDKL5* | Overlap in symptoms (infancy) | 67, 68 |

Abbreviations: LIS, lissencephaly; XLID, X-linked intellectual disability; EPI, epilepsy; OCD, obsessive compulsive disorder; TS, Tourette syndrome; ADHD, attention deficit hyperactivity disorder.

**Table 2.** Autism plus syndromic ASD caused by rare, single-gene disorders
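The CGG repeat classes listed for *FMR1* in Table 1 can be written as a small lookup, sketched below. Two points are assumptions rather than statements from the table: the quoted ranges share the boundary value 55, which is assigned here to the premutation class, and counts below 45 are given the placeholder label "below intermediate range" because the table does not name that class. The function name `classify_fmr1_cgg` is hypothetical.

```python
def classify_fmr1_cgg(repeats):
    """Classify an FMR1 allele by CGG repeat count using the Table 1 ranges.

    Assumptions: a count of exactly 55 is treated as premutation, and
    counts below 45 get a placeholder label not taken from the table.
    """
    if repeats < 0:
        raise ValueError("repeat count cannot be negative")
    if repeats > 200:
        return "full mutation"
    if repeats >= 55:
        return "premutation"
    if repeats >= 45:
        return "intermediate"
    return "below intermediate range"


if __name__ == "__main__":
    for count in (30, 50, 55, 120, 200, 250):
        print(count, classify_fmr1_cgg(count))
```

Checking the largest class first keeps each branch to a single comparison and makes the handling of the shared boundary explicit.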


| Gene | Chromosome | Phenotype (human/mouse) | Mechanism involved in ASD | Reference |
|---|---|---|---|---|
| *NRXN1* | 2p16.3 | ASD, ID | *De novo* 320-kb deletion that removes the promoter and initial coding exons of the *NRXN1* gene, resulting in deletion of neurexin 1α | 39 |
| | | | Missense structural variants in the neurexin 1β signal peptide region | 40 |
| | | | Translocations and intragenic rearrangements in or near *NRXN1* gene | 71, 72 |
| *FOXP1* | 3p13 | ID, ASD, SLI | *De novo* intragenic deletion encompassing exons 4-14 of *FOXP1*, *de novo* nonsense mutation (c.1573C>T) in the conserved forkhead DNA-binding domain | 73 |
| *GRIK2* | 6q16.3 | ASD, recessive ID | SNP1 and SNP2 of gene *GRIK2* were associated with autism | 74 |
| *FOXP2* | 7q31.1 | ASD, SLI | Directly binds intron 1 of the *CNTNAP2* gene and regulates its expression | 74 |
| | 11p15.5 | Beckwith-Wiedemann syndrome | Overexpression of paternally expressed *IGF2*, due to a gain of DNA methylation at paternal allele of *IC1* and suppression of maternally expressed suppressing factor *CDKN1C* | 75 |
| | 15q11-q13 | Prader-Willi syndrome | Paternal deletions, maternal UPD at 15q11–13, deletions and epimutations of *IC*, translocations disrupting *SNRPN* | 76, 77 |
| | | Maternal duplication of 15q11-13 region | Maternal duplications of 15q11-13 region | 78 |
| *SHANK3* | 22q13.33 | ASD | Mutation at an intronic donor splice site, one missense mutation in the coding region | 79 |
| *NLGN4X* | Xp22.32-p22.31 | ASD, ID, TS, ADHD | | |
| *NLGN3* | Xq13.1 | ASD | R451C mutation within the esterase domain of neuroligin 3 | 36, 37 |

Abbreviations: ID, intellectual disability; SCZ, schizophrenia; TS, Tourette syndrome; SLI, speech and language impairment; ADHD, attention deficit hyperactivity disorder

**Table 3.** Severe and specific phenotype with rare variants of genes

## **6. Copy number variation (CNV): A paradigm shift in autism**

The strong genetic contribution shown in family studies and the association of cytogenetic changes, but apparent lack of common risk factors in autism, led to a hypothesis that rare sub-microscopic unbalanced changes in the form of CNVs likely contribute to the autism phenotype. With the development of microarrays capable of scanning the genome at submicroscopic resolution, there is accumulating evidence that multiple CNVs contribute to the genetic vulnerability to autism [80]. *De novo* CNVs have been identified in up to 7–10% of sporadic autism cases [81, 82] but are less frequent in multiplex families, in which CNVs account for only about 2% of families screened [80, 83]. This could possibly suggest different genetic liabilities in simplex and multiplex autism. Recurrent CNVs at 15q11-13 (1-3% of autism patients), 16p11 (1% of autism patients), and 22q11-13 have been confirmed in multiple studies [80, 83-86]. This hypothesis has also proven largely successful in identifying autism-susceptibility candidate genes, including gains and losses at *SHANK2* [87], *SHANK3* [88], *NRXN1* [13], *NLGN3* and *NLGN4* [37], and *PTCHD1* [89, 90]. Neurexins and neuroligins are synaptic cell-adhesion molecules (CAMs) that connect pre- and postsynaptic neurons at synapses, mediate trans-synaptic signaling, and shape neural network properties by specifying synaptic functions. The Shank family of proteins provides scaffolding for signaling molecules in the postsynaptic density of glutamatergic synapses. Genes encoding CAMs play crucial roles in modulating or fine-tuning synaptic formation and synaptic specification. The localization of these molecules and their interacting proteins at the synapse is shown in Figure 1.

**Figure 1.** Localization of cell-adhesion molecules and their interacting proteins at the synapse. Proteins associated with ASD are underlined.

It is apparent that many different loci, each with a presumably unique yet subtle contribution to neurodevelopment, underlie the phenotype of autism. These observations have resulted in a paradigm shift away from the previously held "common disease-common variant" hypothesis to a "common disease-rare variant" model for the genetic architecture of autism. The central tenet of this model suggests a role for multiple, rare, highly penetrant, genetic risk factors for ASD, many of which are in the form of CNV. To make sense of the contribution of CNVs to autism, a "threshold" model has been proposed [80]. The model posits that different CNVs exhibit different penetrance depending on the dosage sensitivity and function (relative to autism) of the gene(s) they affect. Some CNVs have a large impact

Frameshift mutation (1186insT) 37

CNV 69, 70

se)

SCZ, Language delay

**Figure 1.** Localization of cell-adhesion molecules and their interacting proteins at the synapse. Proteins associated with ASD are underlined.

It is apparent that many different loci, each with a presumably unique yet subtle contribution to neurodevelopment, underlie the phenotype of autism. These observations have resulted in a paradigm shift away from the previously held "common disease-common variant" hypothesis to a "common disease-rare variant" model for the genetic architecture of autism. The central tenet of this model suggests a role for multiple, rare, highly penetrant, genetic risk factors for ASD, many of which are in the form of CNV. To make sense of the contribution of CNVs to autism, a "threshold" model has been proposed [80]. The model posits that different CNVs exhibit different penetrance depending on the dosage sensitivity and function (relative to autism) of the gene(s) they affect. Some CNVs have a large impact on autism susceptibility and these are typically *de novo* in origin, cause more severe autistic symptoms, are more prevalent among sporadic forms of autism, and are less influenced by other factors like gender and parent of origin. Other CNVs have moderate or mild effects that probably require other genetic (or non-genetic) factors to take the phenotype across the autistic threshold.
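The "threshold" model described above can be sketched numerically. The following Python snippet is a minimal illustration under stated assumptions, not the model of reference [80]: the effect sizes, the threshold value, and the names `Variant`, `liability` and `crosses_threshold` are all hypothetical and chosen only to show how a large-effect CNV can cross the threshold alone while a mild-effect CNV needs additional genetic or non-genetic contributions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    name: str
    effect: float  # hypothetical contribution to liability (arbitrary units)


THRESHOLD = 1.0  # hypothetical liability threshold (arbitrary units)


def liability(variants, background=0.0):
    """Sum variant effects plus other genetic or non-genetic background factors."""
    return background + sum(v.effect for v in variants)


def crosses_threshold(variants, background=0.0, threshold=THRESHOLD):
    """Return True when total liability reaches the threshold."""
    return liability(variants, background) >= threshold


# Illustrative values only; they are not estimates from any study.
large = Variant("large-effect de novo CNV", 1.2)
mild = Variant("mild-effect inherited CNV", 0.4)

print(crosses_threshold([large]))                 # large effect alone
print(crosses_threshold([mild]))                  # mild effect alone
print(crosses_threshold([mild], background=0.7))  # mild effect plus other factors
```

In this toy setting the large-effect variant is sufficient by itself, whereas the mild-effect variant reaches the threshold only when the background term is added.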


#### **7. Epigenetics plays an important role in autism**

In addition to structural genetic factors that play causative roles in autism, environmental factors also play an important role by influencing fetal or early postnatal brain development, directly or *via* epigenetic modifications. Epigenetic modifications include cytosine methylation, post-translational modification of histones, small interfering RNA and genomic imprinting. The involvement of epigenetic factors in autism is demonstrated by the central role of epigenetic regulatory mechanisms in the pathogenesis of Rett syndrome and fragile X syndrome (FXS), both of which are monogenic disorders resulting from single-gene defects and are commonly associated with autism [38-40]. FXS results from an expansion of CGG triplet repeats in the 5' untranslated region of the *FMR1* gene, which encodes FMRP (fragile X mental retardation protein). FMRP is proposed to act as a translational regulator of specific mRNAs in the brain and to be involved in synaptic development and maturation through its nucleo-cytoplasmic shuttling activity as an RNA-binding protein. It has been shown that FMRP uses its arginine-glycine-glycine (RGG) box domain to bind a subset of mRNA targets that form a G-quadruplex structure. FMRP has also been shown to undergo the post-translational modifications of arginine methylation and phosphorylation [91, 92]. Our recent study demonstrated that alterations of methylation patterns at the *Neurex1* and *ENO2* loci are associated with autism [Wang and Zhong, manuscript in preparation].
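The CGG triplet expansion mentioned above can be made concrete with a small sketch. The following Python snippet counts the longest uninterrupted run of CGG units in a DNA string. The function name `longest_cgg_run` and the toy sequence are assumptions made for illustration only; the sequence is not real *FMR1* sequence, and repeat sizing in practice relies on laboratory assays rather than simple string matching.

```python
import re


def longest_cgg_run(seq: str) -> int:
    """Return the number of repeat units in the longest uninterrupted CGG tract."""
    runs = re.findall(r"(?:CGG)+", seq.upper())
    return max((len(run) // 3 for run in runs), default=0)


# Toy sequence (hypothetical): two CGG tracts separated by one non-CGG triplet.
toy = "GCGT" + "CGG" * 5 + "AGG" + "CGG" * 8 + "CTG"

print(longest_cgg_run(toy))  # 8
```

The regular expression matches only in-frame, back-to-back CGG units, so the interrupting triplet splits the toy sequence into tracts of five and eight units.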

Genomic imprinting is the classic example of regulation of gene expression *via* epigenetic modifications, such as hypermethylation, that lead to parent-of-origin-specific gene expression. In addition, a growing number of genes that are not imprinted are regulated by DNA methylation, including Reelin (*RELN*) [41, 93-96], which has been considered a candidate gene for autism. Several of the linkage peaks overlap or are in close proximity to regions that are subject to genomic imprinting on chromosomes 15q11-13, 7q21-31.31, 7q32.3-36.3 and possibly 4q21-31, 11p11.2-13 and 13q12.3, with the loci on chromosomes 15q and 7q providing the most compelling evidence for a combination of genetic and epigenetic factors that confer risk for autism [97-101]. Genes in the imprinted cluster on chromosome 15q11–13 include *MKRN3, ZNF127AS, MAGE12, NDN, ATP10A, GABRA5, GABRB3,* and *GABRG3* [102, 103]. Genes in the imprinted cluster on chromosome 7q21.3 include *SGCE, PEG10, PPP1R9A, DLX5, CALCR, ASB4, PON1, PON2,* and *PON3* [104, 105].

Research has recently focused on the connections between the immune system and the early development of the brain, including their possible role in the development of autism [106]. Immune aberrations consistent with a deregulated immune response may target neuronal development and differentiation [107, 108]. Our study has suggested that close contact with natural rubber latex (NRL) could trigger an immunoreaction to *Hevea brasiliensis* (Hev-b) proteins in NRL and result in autism [109]. This led us to hypothesize that immune reactions triggered by environmental factors could damage synapse formation and neuronal connections, resulting either in loss of the normal structure or function of synaptic proteins encoded by genes such as *NLGNs*, *NRXN1*, *CNTNAPs* and *SHANKs*, or in deregulation of the expression of *FMR1*, *PTEN*, *FOXPs* and *GRIK2*.

#### **8. Converging molecular pathways of autism**

A fundamental question is whether autism represents an etiologically heterogeneous disorder in which a myriad of genetic or environmental risk factors perturb common underlying molecular pathways in the brain [110]. Two recent studies have suggested that there could be convergence at the level of molecular mechanisms in autism. The first study identified protein interactors of known autism or autism-associated genes [111]. This interactome revealed several novel interactions, including one between two autism candidate genes, *SHANK3* and *TSC1*. The biological pathways identified in this study include synapse, cytoskeleton and GTPase signaling, demonstrating a remarkable overlap with those identified by gene expression analysis. The second study, an analysis of gene expression in postmortem autism brain, provides strong evidence for a shared set of molecular alterations in a majority of cases of autism. These alterations included disruption of the normal gene expression pattern that differentiates frontal and temporal lobes, and two groups of genes deregulated in autistic brains: one related to neuronal function and the other related to immune/inflammatory responses [111]. Genes associated with neuronal function were enriched in metabolic signal pathways, providing evidence that these changes were causal rather than a consequence of the disease [112]. In contrast, the immune/inflammatory changes did not show a strong genetic signal, indicating a non-genetic etiology for this process and implicating environmental or epigenetic factors instead. These results provide strong evidence for converging molecular abnormalities in autism and implicate transcriptional and splicing deregulation as underlying mechanisms of neuronal dysfunction in this disorder.

#### **9. In summary**


Autism is a heterogeneous set of brain developmental disorders with complex genetics, involving interactions between genetic, epigenetic and environmental factors. The heterogeneous genetics involves many genes and loci and different types of genetic variation, such as deletions, translocations, SNPs and CNVs. Recent studies have also suggested that there could be convergence at the level of molecular mechanisms in autism. Although the genetic basis is well documented, the phenotypic and genotypic heterogeneity means that correspondences between genotype and phenotype have yet to be well established.

## **Author details**

Yu Wang<sup>1</sup>, Nanbert Zhong<sup>1,2,3,\*</sup>

*<sup>1</sup>Shanghai Children's Hospital Affiliated to Shanghai Jiaotong University, Shanghai, China*

*<sup>2</sup>Peking University Center of Medical Genetics, Beijing, China*

*<sup>3</sup>New York State Institute for Basic Research in Developmental Disabilities, Staten Island, New York, USA*

<sup>\*</sup> Corresponding Author

## **Acknowledgement**

This work was supported in part by the "973" program (2012CB517905) granted by the Chinese Ministry of Science and Technology, the Shanghai Municipal Department of Science and Technology (2009JC1412600), and the New York State Office of People with Developmental Disabilities (OPWDD).

## **10. References**

[1] Geschwind DH (2009) Advances in autism. Annu Rev Med. 60: 367–380.

[2] Kogan MD, Blumberg SJ, Schieve LA (2007) Prevalence of parent-reported diagnosis of autism spectrum disorder among children in the US. Pediatrics. 124: 1395–1403.

[3] Kim YS, Leventhal L, Koh YJ (2011) Prevalence of autism spectrum disorders in a total population sample. Am. J. Psychiatry. 168: 904–912.

[4] Ganz ML (2006) The Costs of Autism. In Moldin, SO and Rubenstein, JLR (eds), Understanding Autism: from Basic Neuroscience to Treatment. CRC Press, Boca Raton, FL, pp. 476–498.

[5] Lord C, Pickles A, McLennan J (1997) Diagnosing autism: analyses of data from the Autism Diagnostic Interview. Autism Dev Disord. 27: 501–517.

[6] Lord C, Risi S, Lambrecht L (2000) The autism diagnostic observation schedule-generic: a standard measure of social and communication deficits associated with the spectrum of autism. Autism Dev Disord. 30: 205–223.

[7] Schopler E, Reichler R, Renner BR (1991) The childhood autism rating scale. Los Angeles: Western Psychological Services; 1988, Psychol Monogr. 117: 313–357.

[8] Robins D, Fein D, Barton M, Green J (2001) The Modified Checklist for Autism in Toddlers: an initial study investigating the early detection of autism and pervasive developmental disorders. Autism Dev Disord. 31: 131–151.

[9] Pinto MJ, Levy S (2004) Early diagnosis of autism spectrum disorders. Curr Treat Options Neurol. 6: 391–400.

[10] Gillberg C and Coleman M (2000) The biology of autistic syndromes, 3rd ed. Mac Keith, London. 22p.

[11] Fombonne E (2001) Is there an epidemic of autism? Pediatrics. 107: 411–412.

[12] Szatmari P, Jones MB, Zwaigenbaum L (1998) Genetics of autism: overview and new directions. J Autism and Dev Disord. 28: 351–368.

[13] Abrahams BS, Geschwind DH (2008) Advances in autism genetics: on the threshold of a new neurobiology. Nat Rev Genet. 9: 341–355.

[14] Zwaigenbaum L, Bryson S, Roberts W (2005) Behavioral markers of autism in the first year of life. Intern J. Dev Neurosci. 23: 143–152.

[15] Bailey A, Le Couteur A, Gottesman I (1995) Autism as a strongly genetic disorder: Evidence from a British twin study. Psychological Medicine. 25: 63–77.

[16] Rosenberg RE, Law JK, Yenokyan G (2009) Characteristics and concordance of autism spectrum disorders among 277 twin pairs. Arch Pediatr Adolesc Med. 163: 907–914.

[17] Hallmayer J, Cleveland S, Torres A (2011) Genetic heritability and shared environmental factors among twin pairs with autism. Arch Gen Psychiatry. 68: 1095–1102.

[18] Bolton P, Macdonald H, Pickles A (1994) A case-control family history study of autism. Child Psychol Psychiatry. 35: 877–900.

[19] Glessner JT, Wang K, Cai G (2009) Autism genome-wide copy number variation reveals ubiquitin and neuronal genes. Nature. 459: 569–573.

[20] Wang K, Zhang H, Ma D (2009) Common genetic variants on 5p14.1 associate with autism spectrum disorders. Nature. 459: 528–533.

[21] Bucan M, Abrahams BS, Wang K (2009) Genome-wide analyses of exonic copy number variants in a family-based study point to novel autism susceptibility genes. PLoS Genet. 5: e1000536.

[22] Maddalena A, Richards CS, McGinniss MJ (2001) Technical standards and guidelines for Fragile X: The first of a series of disease-specific supplements to the Standards and Guidelines for Clinical Genetics Laboratories of the American College of Medical Genetics. Quality assurance subcommittee of the laboratory practice committee. Genet Med. 3: 200–205.

[23] Pfeiffer BE, Huber KM (2009) The state of synapses in fragile X syndrome. Neuroscientist. 15: 549–567.

[24] Tan H, Li H, Jin P (2009) RNA-mediated pathogenesis in fragile X-associated disorders. Neurosci Lett. 466: 103–108.

[25] Goffin D, Allen M, Zhang L (2011) Rett syndrome mutation MeCP2 T158A disrupts DNA binding, protein stability and ERP responses. Nat Neurosci. 15: 274–283.

[26] Strauss KA, Puffenberger EG, Huentelman MJ (2006) Recessive symptomatic focal epilepsy and mutant contactin-associated protein-like 2. N Engl J Med. 354: 1370–1377.

[27] Bakkaloglu B, O'Roak BJ, Louvi A (2008) Molecular cytogenetic analysis and resequencing of contactin associated protein-like 2 in autism spectrum disorders. Hum Genet. 82: 165–173.

[28] O'Roak BJ, Deriziotis P, Lee C (2011) Exome sequencing in sporadic autism spectrum disorders identifies severe de novo mutations. Nat Genet. 46: 585–589.

[29] Nord AS, Roeb W, Dickel DE (2011) Reduced transcript expression of genes affected by inherited and de novo CNVs in autism. Eur J Hum Genet. 19: 727–731.


[30] Conti S, Condò M, Posar A (2011) Phosphatase and Tensin Homolog (PTEN) Gene Mutations and Autism: Literature review and a case report of a patient with Cowden Syndrome, Autistic Disorder and Epilepsy. J. Child Neurol. 29: 123–126.

[31] Splawski I, Yoo DS, Stotz SC (2006) CACNA1H mutations in autism spectrum disorders. J. Biol Chem. 281: 22085–22091.

[32] Guffanti G, Strik Lievers L, Bonati MT (2011) Role of UBE3A and ATP10A genes in autism susceptibility region 15q11-q13 in an Italian population: a positive replication for UBE3A. Psychiatry Res. 185: 33–38.

[33] Nurmi EL, Bradford Y, Chen Y (2001) Linkage disequilibrium at the Angelman syndrome gene UBE3A in autism families. Genomics. 77: 105–113.

[34] Serajee FJ, Nabi R, Zhong H (2003) Association of INPP1, PIK3CG, and TSC2 gene variants with autistic disorder: Implications for phosphatidylinositol. J Med Genet. 40: 119–123.

[35] Marui T, Hashimoto O, Nanba E (2004) Association between the Neurofibromatosis-1 (NF1) locus and autism in the Japanese population. Am J Med Genet B Neuropsychiatr Genet. 131B: 43–47.

[36] Jamain S, Quach H, Betancur C (2003) Mutations of the X-linked genes encoding neuroligins NLGN3 and NLGN4 are associated with autism. Nat Genet. 34: 27–29.

[37] Comoletti D, De Jaco A, Jennings LL (2004) The Arg451Cys-neuroligin-3 mutation associated with autism reveals a defect in protein processing. J Neurosci. 24: 4889–4893.

[38] Friedman JM, Baross A, Delaney AD (2006) Oligonucleotide microarray analysis of genomic imbalance in children with mental retardation. Am J Hum Genet. 79: 500–513.

[39] Zahir FR, Baross A, Delaney AD (2008) A patient with vertebral, cognitive and behavioural abnormalities and a de novo deletion of NRXN1a. Med Genet. 45: 239–243.

[40] Feng J, Schroer R, Yan J (2006) High frequency of neurexin 1 signal peptide structural variants in patients with autism. Neurosci Lett. 409: 10–13.

[41] Hamdan FF, Daoud H, Rochefort D (2010) De novo mutations in FOXP1 in cases with intellectual disability, autism, and language impairment. Am J Hum Genet. 87: 671–678.

[42] Li H, Yamagata T, Mori M (2005) Absence of causative mutations and presence of autism-related allele in FOXP2 in Japanese autistic patients. Brain Dev. 27: 207–210.

[43] Mukamel Z, Konopka G, Wexler E (2011) Regulation of MET by FOXP2, genes implicated in higher cognitive dysfunction and autism risk. J Neurosci. 31: 11437–11442.

[44] Jamain S, Betancur C, Quach H (2002) Linkage and association of the glutamate receptor 6 gene with autism. Mol Psychiatry. 7: 302–310.

[45] Durand CM, Perroy J, Loll F (2012) SHANK3 mutations identified in autism lead to modification of dendritic spine morphology via an actin-dependent mechanism. Mol Psychiatry. 17: 71–84.

[46] Kolevzon A, Cai G, Soorya L (2011) Analysis of a purported SHANK3 mutation in a boy with autism: clinical impact of rare variant research in neurodevelopmental disabilities. Brain Res. 1380: 98–105.

[47] Chen CP, Lin SP, Chern SR (2010) A de novo 7.9 Mb deletion in 22q13.2→qter in a boy with autistic features, epilepsy, developmental delay, atopic dermatitis and abnormal immunological findings. Eur J Med Genet. 53: 329–332.

[48] Chiocchetti A, Klauck SM (2011) Genetic analyses for identifying molecular mechanisms in autism spectrum disorders. Encephale. 37: 68–74.

[49] Bonnet-Brilhault F (2011) Genotype/phenotype correlation in autism: genetic models and phenotypic characterization. Encephale. 37: 68–74.

[50] Eapen V (2011) Genetic basis of autism: is there a way forward? Curr Opin Psychiatry. 24: 226–236.

[51] Vernes SC, Newbury DF, Abrahams BS (2008) A functional genetic link between distinct developmental language disorders. N Engl J Med. 359: 2337–2345.

[52] Newbury DF, Paracchini S, Scerri TS (2011) Investigation of dyslexia and SLI risk variants in reading- and language-impaired subjects. Behav Genet. 41: 90–104.

[53] Poot M, Beyer V, Schwaab I (2010) Disruption of CNTNAP2 and additional structural genome changes in a boy with speech delay and autism spectrum disorder. Neurogenetics. 11: 81–89.

[54] Sehested LT, Møller RS, Bache I (2010) Deletion of 7q34-q36.2 in two siblings with mental retardation, language delay, primary amenorrhea, dysmorphic features. Am J Med Genet. 152A: 3115–3119.

[55] Teramitsu I, Kudo LC, London SE (2004) Parallel FoxP1 and FoxP2 expression in songbird and human brain predicts functional interaction. Neurosci. 24: 3152–3163.

[56] Panaitof SC, Abrahams BS, Dong H (2010) Language-related Cntnap2 gene is differentially expressed in sexually dimorphic song nuclei essential for vocal learning in songbirds. Comp. Neurol. 518: 1995–2018.

[57] Shoubridge C, Tan MH, Fullston T (2010) Mutations in the nuclear localization sequence of the Aristaless related homeobox; sequestration of mutant ARX with IPO13 disrupts normal subcellular distribution of the transcription factor and retards cell division. Pathogenetics. 3: 1.

[58] Hartshorne TS, Grialou TL, Parker KR (2005) Autistic-like behavior in CHARGE syndrome. Am J Med Genet A. 133A: 257–261.

[59] Johansson M, Rastam M, Billstedt E (2006) Autism spectrum disorders and underlying brain pathology in CHARGE association. Dev Med Child Neurol. 48: 40–50.

[60] Smith IM, Nichols SL, Issekutz K (2005) Behavioral profiles and symptoms of autism in CHARGE syndrome: preliminary Canadian epidemiological data. Am J Med Genet A. 133A: 248–256.

[61] Skuse DH, James RS, Bishop DV (1997) Evidence from Turner's syndrome of an imprinted X-linked locus affecting cognitive function. Nature. 387: 705–708.

[62] Bianconi SE, Conley SK, Keil MF (2011) Adrenal function in Smith-Lemli-Opitz syndrome. Am J Med Genet A. 155A: 2732–2738.

[63] Depil K, Beyl S, Stary-Weinzinger A (2011) Timothy mutation disrupts the link between activation and inactivation in Ca(V)1.2 protein. J Biol Chem. 286: 31557–31564.

[64] Klymiuk N, Thirion C, Burkhardt K (2011) 238 tailored pig model of Duchenne muscular dystrophy. Reprod Fertil Dev. 24: 231.

[65] Valerio N, Romina M, Paolo C (2009) Recent advances in neurobiology of Tuberous Sclerosis Complex. Brain Dev. 31: 104–113.


[66] Bianconi SE, Conley SK, Keil MF (2011) Adrenal function in Smith-Lemli-Opitz syndrome. Am J Med Genet A. J. 155A: 2732-2738.

Clinical and Genetic Heterogeneity of Autism 231

[84] Szatmari P, Paterson AD, Zwaigenbaum L (2007) Mapping autism risk loci using

[85] Weiss LA, Shen Y, Korn JM (2008) Association between microdeletion and

[86] Kumar RA, KaraMohamed S, Sudi J (2008) Recurrent 16p11.2 microdeletions in autism.

[87] Berkel S, Marshall CR, Weiss B (2010) Mutations in the SHANK2 synaptic scaffolding gene in autism spectrum disorder and mental retardation.Nature Genetics. 42: 489–491. [88] Durand CM, Betancur C, Boeckers TM (2007) Mutations in the gene encoding the synaptic scaffolding protein SHANK3 are associated with autism spectrum disorders.

[89] Pinto D, Pagnamenta AT, Klei L (2010) Functional impact of global rare copy number

[90] Noor A, Whibley A, Marshall CR (2010) Disruption at the PTCHD1 locus on Xp22.11 in autism spectrum disorder and intellectual disability. Sci Transl Med. 2: 49ra68. [91] Auerbach BD, Osterweil EK, Bear MF(2011) Mutations causing syndromic autism

[92] Evans TL, Blice-Baum AC, Mihailescu MR (2012) Analysis of the Fragile X mental retardation protein isoforms 1, 2 and 3 interactions with the G-quadruplex forming

[93] Noh JS, Sharma RP, Veldic M (2005) DNA methyltransferase1 regulates reelin mRNA expression in mouse primary cortical cultures. Proc Natl Acad Sci USA. 102: 1749–1754. [94] Grayson DR, Chen Y, Costa E (2006) The human reelin gene: Transcription factors (t), repressors (2) and the methylation switch(t/2) in schizophrenia. Pharmacol. Ther. 111:

[95] Sato N, Fukushima N, Chang R (2006) Differential and epigenetic gene expression profiling identifies frequent disruption of the RELN pathway in pancreatic

[96] Serajee FJ, Zhong H, Mahbubul AH (2006) Association of Reelin gene polymorphisms

[97] Numachi Y, Yoshida S, Yamashita M (2004) Psychostimulant alters expression of DNA

[98] Huang CH, Chen CH. (2006) Absence of association of a polymorphic GGC repeat at the 50 untranslated region of the reelin gene with schizophrenia. Psychiatry Res. 142:

[99] Skaar DA, Shao Y, Haines JL (2005) Analysis of the RELN gene as a genetic risk factor

[100] Li J, Nguyen L, Gleason C (2004) Lack of evidence for an association between WNT2 and RELN polymorphisms and autism. Am J Med Genet B Neuropsychiatr. Genet. 126:

[101] Bonora E, Beyer KS, Lamb JA (2003) Analysis of reelin as a candidate gene for autism.

methyltransferase mRNA in the rat brain. Ann. NY Acad Sci. 1025: 102–109.

genetic linkage and chromosomal rearrangements. Nat Genet. 39: 319-328.

microduplication at 16p11.2 and autism. N Engl J Med. 358: 667-675.

variation in autism spectrum disorder. Nature. 466: 368-372.

define an axis of synaptic pathophysiology. Nature. 480: 63-68.

semaphorin 3F mRNA. Mol Biosyst. 8: 642-649.

cancers.Gastroenterology. 30: 548–565.

for autism. Mol. Psychiatry. 10: 563–571.

Mol. Psychiatry. 8: 885–892.

with autism. Genomics. 87: 75–83.

Hum Mol Genet. 17: 628-638.

Nature Genetics. 39: 25–27.

272–286.

89–92.

51–57.


[84] Szatmari P, Paterson AD, Zwaigenbaum L (2007) Mapping autism risk loci using genetic linkage and chromosomal rearrangements. Nat Genet. 39: 319-328.

230 Mutations in Human Genetic Disease

483.

131: 565-579.

Genet A. 140: 1136-1142.

conditions. Nature. 16: 919–923.

[66] Bianconi SE, Conley SK, Keil MF (2011) Adrenal function in Smith-Lemli-Opitz

[67] Coutinho AM, Oliveira G, Katz C (2007) MECP2 coding sequence and 3'UTR variation in 172 unrelated autistic patients. Am J Med Genet B Neuropsychiatr Genet.144B: 475-

[68] Shibayama A, Cook EH, Feng J (2004) MECP2 structural and 3'-UTR variants in schizophrenia, autism and other psychiatricdiseases: a possible association with autism.

[69] Glessner JT, Wang K, Cai G (2009) Autism genome-wide copy number variation reveals

[70] Szatmari P, Paterson AD, Zwaigenbaum L (2007) Mapping autism risk loci using

[71] Kim HG, Kishikawa S, Higgins AW (2008) Disruption of neurexin 1 associated with

[72] Wisniowiecka KB, Nesteruk M, Peters SU (2010) Intragenic rearrangementsin NRXN1 in three families with autismspectrum disorder, developmental delay, and speech

[73] Hamdan FF, Daoud H, Rochefort D (2010) De novo mutations in FOXP1 in cases with intellectual disability, autism, and language impairment. Am J Hum Genet. 87: 671-678. [74] Casey JP, Magalhaes T, Conroy JM (2011) Regan RA novel approach of homozygous haplotype sharing identifies candidate genes in autism spectrum disorder. Hum Genet.

[75] Kent L, Bowdin S, Kirby GA (2008) Beckwith Weidemann syndrome: a behavioral phenotype-genotype study.Am J Med Genet B Neuropsychiatr Genet. 147B: 1295-1297. [76] Descheemaeker MJ, Govers V, Vermeulen PJ (2006) Pervasive developmental disorders in Prader-Willi syndrome: the Leuven experience in 59 subjects and controls. Am J Med

[77] Veltman MW, Thompson RJ, Roberts SE (2004) Prader-Willi Syndrome-a study comparing deletion and uniparental disomy cases with reference to autism spectrum

[78] Hogart A, Wu D, Lasalle JM (2010) The comorbidity of autism with the genomic

[79] Gauthier J, Champagne N, Lafrenière RG (2010). De novo mutations in the gene Encoding the synaptic scaffolding protein SHANK3 in patients ascertained for

[80] Cook EH, Scherer SW (2008) Copynumber variations associated with neuropsychiatric

[81] Sebat J, Lakshmi B, Malhotra D (2007) Strong association of de novo copy number

[82] Marshall CR, Noor A, Vincent JB (2008) Structural variation of chromosomes in autism

[83] Morrow EM, Yoo SY, Flavell SW (2008) Identifying autism loci and genes by tracing

genetic linkage and chromosomal rearrangements.Nat Genet. 39: 319–328.

syndrome. Am J Med Genet A. J. 155A: 2732-2738.

Am J Med Genet B Neuropsychiatr Genet. 128B: 50-53.

autism spectrum disorder. Am J Hum Genet. 82: 199–207.

delay. Am J Med Genet B Neuropsychiatr Genet. 153B: 983–993.

ubiquitin and neuronal genes.Nature. 459: 569–573.

disorders. Eur Child Adolesc Psychiatry. 13: 42-50.

schizophrenia. Proc Natl Acad Sci. 107: 7863-7868.

mutations with autism. Science. 316: 445-449.

recent shared ancestry. Science. 321: 218-223.

spectrum disorder. Am J Hum Genet. 82: 477-488.

disorders of chromosome 15q11.2-q13. Neurobiol Dis. 38: 181-191.


[102] Lee S, Walker CL, Karten B (2005) Essential role for the Prader-Willi syndrome protein necdin in axonal outgrowth. Hum Mol Genet. 14: 627–637.

**Chapter 12**


©2012 Li et al., licensee InTech. This is an open access chapter distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


## **Bioinformatics Approaches to the Functional Profiling of Genetic Variants**

Biao Li, Predrag Radivojac and Sean Mooney

Additional information is available at the end of the chapter

http://dx.doi.org/10.5772/45900

#### **1. Introduction**

232 Mutations in Human Genetic Disease

46: 1239–1247.

2349–2362.

151–157.

Mol Neurosci. 43: 443-452.

[102] Lee S, Walker CL, Karten B (2005) Essential role for the Prader-Willi syndrome protein

[103] Kashiwagi A, Meguro M, Hoshiya H (2003) Predominant maternal expression of the

[104] Draganov DI, Teiber JF, Speelman A (2005) Human paraoxonases (PON1, PON2 and PON3) are lactonases with overlapping and distinct substrate specificities. Lipid Res.

[105] Terry-Lorenzo RT, RoadcapDW, Otsuka T (2005) Neurabin/protein phosphatase-1 complex regulates dendritic spine morphogenesis and maturation. Mol Biol Cell. 16:

[106] Croen LA, Grether JK, Yoshida CK (2005) Maternal autoimmune diseases, asthma, and allergies, and childhood autism spectrum disorders. Arch Pediatr Adolesc Med. 159:

[107] Braunschweig D, Ashwood P, Krakowiak P (2008) Autism: maternally derived

[108] Singer HS, Morris CM, Gause CD (2008) Antibodies against fetal brain in sera of others

[109] Shen C, Zhao XL, Zhong N. (2010) A proteomic investigation of B lymphocytes in an autisc faily: A pilot study of exposure to natural rubber latx (NRL) may lead to autism. J

[110] Glessner JT, Wang K, Cai G (2009) Autism genome-wide copy number variation

[111] Sakai Y, Shaw CA, Dawson BC (2011) Protein interactome reveals converging

[112] Voineagu I, Wang X, Johnston P (2011) Transcriptomic analysis of autistic brain

antibodies specific for fetal brain proteins. NeuroToxicology. 29: 226–231.

molecular pathways among autism disorders. Sci Transl Med. 3: 86ra49.

with autistic children. Neuroimmunol. 194: 165–172.

reveals ubiquitin and neuronal genes. Nature. 459: 569–573.

reveals convergent molecular pathology. Nature. 474: 380–384.

mouse Atp10c in hippocampus and olfactory bulb. Hum Genet. 48: 194–198.

necdin in axonal outgrowth. Hum Mol Genet. 14: 627–637.

In the search for genetic mutations that confer susceptibility to human diseases, researchers take either genome-wide approaches or candidate gene approaches [1]. Traditional techniques in both approaches, such as chromosomal scans of pedigree data and case-control designs for a small number of genes of interest, have limitations in either achieving high enough resolution to identify specific genes or obtaining whole-genome coverage. Discoveries from pedigree linkage usually pointed to one or a few chromosomal regions related to the phenotype of interest, and these regions generally harbor many (perhaps hundreds of) genes, which rendered pinpointing the actual genetic causes a daunting task. Association studies, on the other hand, typically focused on a couple of genes, some of which may participate in the same pathway, and the number of interrogated variants was always experimentally manageable. However, technical advances have brought high-throughput approaches within the reach of more and more scientists, and the volume of variants that researchers can interrogate by genotyping arrays and next-generation sequencing is increasing at an exponential pace. A recent build (build 135) of dbSNP, a large public-domain database of single-nucleotide polymorphisms (SNPs), hosts more than 41.7 million validated human mutations, and with ongoing large-scale efforts such as the 1000 Genomes Project [2], that number is poised to grow significantly larger.

Of all genomic variants, those occurring in the protein-coding genes and resulting in amino acid substitutions hold special interest, as we have more knowledge about coding genes and their products than other genomic elements. Amino acid substitutions, or nonsynonymous SNPs (nsSNPs), not only change primary protein sequence but also have the potential for altering protein structure and disrupting or creating functional sites. These consequences can be tested experimentally, although doing so is costly and time-consuming.

Currently, about 1.2 million nsSNPs have been mapped to NCBI RefSeq proteins (2012/06), but we only have knowledge of a small fraction of them. The Human Gene Mutation Database (HGMD; [3]) logs roughly 69,000 nsSNPs that are associated with diseases or traits; UniProt documents 37,000 nsSNPs as being neutral. For every six nsSNPs deposited in the public databases, five have no disease or phenotype association. This gap will grow even larger as emerging personal genome projects (www.personalgenomes.org) and whole-exome sequencing [4, 5] discover more rare variants.


Accompanying the compilation of a myriad of variants, a natural question arises about interpreting them in the context of human health. More specifically, how do we assess the disease risk for individual variants based on available biomedical information? Population studies, such as genome-wide association studies, have in recent years provided estimates of an odds ratio by comparing the frequencies of hundreds of thousands of genomic variants between disease/trait patients and healthy controls. One centralized resource, namely the Catalog of Published Genome-Wide Association Studies from the National Human Genome Research Institute [6], has collected published association studies involving at least 100,000 variants from 2008. The latest version (2012/06) records 8,063 significant mutation-trait associations from 1,287 studies. Most of these associations present a modest effect size with a median odds ratio (OR) of 1.36 (interquartile range [IQR]: 1.19–2.02). One clear observation from these studies is that the majority of variants occur in non-coding regions where the two most frequent locations are intergenic regions (43 percent) and introns (40 percent). In sharp contrast, only 368 nsSNPs associated with 177 diseases/traits were reported, with a slightly stronger effect size: a median OR of 1.52 (IQR: 1.21–3.33). This examination makes clear that the number of cohort studies will not keep pace with the increase in nsSNP data generation, suggesting that computational approaches may provide an important aid to our understanding of mutation-disease relationships.
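As a worked illustration of the odds ratio that such studies report, the short Python sketch below computes an OR from a 2×2 table of carrier counts. The function name and the counts are hypothetical, chosen only so that the result falls near the median OR quoted above; they are not taken from the catalog.

```python
def odds_ratio(case_carriers, case_noncarriers, control_carriers, control_noncarriers):
    """Odds of carrying the variant among cases divided by the odds among controls."""
    counts = (case_carriers, case_noncarriers, control_carriers, control_noncarriers)
    if any(c <= 0 for c in counts):
        raise ValueError("all four cell counts must be positive")
    case_odds = case_carriers / case_noncarriers
    control_odds = control_carriers / control_noncarriers
    return case_odds / control_odds


# Hypothetical counts: 300 of 1,000 cases and 240 of 1,000 controls carry the variant.
print(round(odds_ratio(300, 700, 240, 760), 2))  # 1.36
```

An OR of 1 means the variant is equally common in both groups; values modestly above 1, as here, correspond to the small effect sizes typical of the reported associations.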

Among all genome-level characteristics, scientists have collected the most knowledge about protein-coding genes, and they have published many investigations into the impacts of missense variants. By mapping disease-associated nsSNPs and amino acid changes without disease annotations onto multispecies sequence alignments, researchers have observed that mutations related to monogenic diseases occur significantly more frequently at slow-evolving positions, while neutral nsSNPs are enriched at fast-evolving positions [7, 8]. This observation suggests that evolutionary rate could act as an indicator for discriminating disease-associated from neutral mutations. Also, the availability of crystal structures for numerous proteins gives us an opportunity to examine nsSNP consequences in their steric context. For example, p53, a well-studied tumor suppressor protein, is involved in many critical cell processes, such as DNA repair and cell-cycle regulation; p53 is inactive in half of all cancers [9]. Six mutation hot spots, such as R175H, R273H, and R282W, have been mapped to the p53 DNA-binding core domain that is critical to its activation, and most of them destabilize the protein structure, leading to the degradation of p53 [10]. Intriguingly, certain mutations introduced into the mutant p53 can counteract this reduced stability and potentially rescue its functionality [11]. For example, nsSNP N268D in mutant p53 results in a hydrogen bond that bridges two strands and ultimately leads to an increase in thermodynamic stability. Finally, nsSNPs can influence a broad array of functional sites, including protein- and ligand-binding sites, catalytic residues, and numerous post-translational modification (PTM) sites. N-linked glycosylation, one type of PTM, is essential for the folding of some proteins. Proteins subjected to N-linked glycosylation contain an NX[ST] motif recognized by enzymes. For example, amino acid substitution T183A, identified in the prion protein (PRNP), can cause spongiform encephalopathy by disrupting the consensus sequence NX[ST] through the loss of the threonine [12].
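The loss of an NX[ST] motif through a single substitution can be checked mechanically. The following is a minimal Python sketch, assuming the motif exactly as written above (N, any residue, then S or T); the function names and the eight-residue peptide are hypothetical and serve only as an illustration. The peptide is not the PRNP sequence.

```python
import re

# Motif as written in the text: N, any residue, then S or T.
# The lookahead lets overlapping motifs be reported as well.
SEQUON = re.compile(r"(?=N[A-Z][ST])")


def sequon_starts(seq):
    """Return the 1-based positions of the N in every N-X-[ST] match."""
    return {m.start() + 1 for m in SEQUON.finditer(seq)}


def apply_substitution(seq, pos, ref, alt):
    """Apply a 1-based substitution such as T6A after checking the reference residue."""
    if seq[pos - 1] != ref:
        raise ValueError(f"expected {ref} at position {pos}, found {seq[pos - 1]}")
    return seq[:pos - 1] + alt + seq[pos:]


def lost_sequons(seq, pos, ref, alt):
    """Motif positions present in the wild type but absent from the mutant."""
    mutant = apply_substitution(seq, pos, ref, alt)
    return sorted(sequon_starts(seq) - sequon_starts(mutant))


toy = "MKANITQL"  # hypothetical peptide with one motif, N4-I5-T6
print(lost_sequons(toy, 6, "T", "A"))  # [4]
print(lost_sequons(toy, 2, "K", "A"))  # []
```

Replacing the threonine removes the only match, which mirrors the mechanism described for T183A; a substitution outside the motif leaves it intact.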

Many computational tools that aim to predict whether nsSNPs cause disease are based on evolutionary characteristics, structural consequences, or functional impact, alone or in combination. One early and established method, SIFT (sort intolerant from tolerant substitutions; [13]), estimates whether a mutation predisposes to disease solely by exploiting conservation information from sequence homology. Another well-known tool, PolyPhen-2 [14], combines sequence alignment with predicted physicochemical features derived from the protein sequence in a naive Bayes classifier.
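To make the notion of conservation concrete, the toy Python sketch below scores each column of a small, made-up alignment by its Shannon entropy and reports how often a given residue occurs in a column. It only illustrates the kind of signal that conservation-based methods exploit; it is not an implementation of SIFT or PolyPhen-2, and the alignment, function names, and choice of entropy as the score are all assumptions.

```python
import math
from collections import Counter


def column_entropy(column):
    """Shannon entropy in bits of one alignment column; gap characters are ignored."""
    residues = [aa for aa in column if aa != "-"]
    if not residues:
        raise ValueError("column contains only gaps")
    total = len(residues)
    return sum((n / total) * math.log2(total / n) for n in Counter(residues).values())


def residue_frequency(column, aa):
    """Fraction of non-gap residues in the column that equal aa."""
    residues = [r for r in column if r != "-"]
    if not residues:
        raise ValueError("column contains only gaps")
    return residues.count(aa) / len(residues)


# Hypothetical alignment of four homologous sequences.
alignment = ["MKTAYL", "MKTSYL", "MRTGYL", "MKTPYL"]
columns = list(zip(*alignment))

for position, column in enumerate(columns, start=1):
    print(position, "".join(column), round(column_entropy(column), 2))
```

Columns with entropy near zero are fully conserved, so a substitution there introduces a residue never seen among the homologs; columns with high entropy tolerate many residues, and a substitution there is less surprising.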

In this chapter, we discuss the structural and functional impact of nsSNPs on the underlying proteins. We will provide concrete examples of both aspects, showing mechanisms through which amino acid substitutions affect proteins and contribute to disease phenotypes. We describe algorithms for predicting stability changes and for assigning probabilities to putative phosphorylation sites. We then apply these concepts/tools to the problem of distinguishing deleterious mutations from neutral ones. Finally, we will present another nsSNP prediction approach, MutPred, and apply it to a subset of dbSNP. Through these efforts, we aim to characterize a variety of computational approaches to the problem of inferring disease consequences for genetic variants, and demonstrate that these approaches are fruitful.

#### **2. Structural impact of mutations**


A classic disease that results from protein structural change via amino acid substitution is sickle cell anemia [15]. Replacement of a hydrophilic glutamic acid residue with a strongly hydrophobic valine at the sixth position of hemoglobin subunit beta causes the protein to aggregate and form rigid molecules, which in turn deform the red blood cells into a sickle-like shape [16]. The sickle cells die prematurely, resulting in anemia. Other structural abnormalities that nsSNPs can induce include changes in secondary structure, gain or loss of protein stability, and alterations of other physicochemical properties. In this section, we illustrate two mutations in a cancer-related gene, BRCA1, then describe an algorithm for predicting protein stability, and finally discuss its application to discriminating neutral from deleterious mutations.

BRCA1 is a well-known suppressor of breast and ovarian tumors. Its two C-terminal sequence repeats (BRCT) are essential for BRCA1's function, since stop-codon mutations and missense substitutions in these regions have been observed in breast cancer patients [17, 18]. The crystal structure of the BRCT segments [19] shows that the two domains pack against each other in tandem, with one helix of the N-terminal domain and two helices of the C-terminal domain forming an inner-domain interaction surface (Figure 1).

Two amino acid substitutions occur on this interface: A1708E, located near the end of the *α*1 helix, and M1775R, located near the beginning of the *α*2 helix. At position 1708, the mutant glutamic acid is much larger than the original alanine (molecular weight of 147 versus 89) and introduces a negative charge. Because A1708 lies near the center of the interaction surface, the compact core cannot accommodate this mutation sterically. Thus, A1708E would destabilize the BRCT interaction. R1775, on the other hand, could be placed spatially on the edge of the BRCT interface, but it positions a positive charge against the nearby R1835. Thus, both mutations would destabilize the BRCT core, through either steric incompatibility or disruption of electrostatic interactions [19]. This explanation found support from a mutation sensitivity assay that measures the stability of the inner-domain interaction when subjected to proteolytic degradation. The wild-type protein resists digestion by trypsin, elastase, and chymotrypsin, whereas the M1775R mutant was partially degraded and the A1708E mutant was almost completely degraded [19]. The BRCT structure and in vitro experiments suggest that the genetic variants A1708E and M1775R cause the BRCA1 defect by destabilizing its inner-domain interaction.

**Figure 1.** The crystal structure of human BRCT domains (PDB ID: 1JNX). The N-terminus is shown in blue; the C-terminus, in red. Residues A1708 and M1775 are depicted as ball and stick models. Three helices, *α*1 from the N-terminus and both *α*2 and *α*3 from the C-terminus, pack into a hydrophobic core that is important to the folding of BRCT domains.

From this example, we can see that crystal structure can be a powerful tool in interpreting possible consequences of nsSNPs by physicochemical principles. However, we cannot reasonably expect every protein and its mutants to have high-resolution three-dimensional (3D) structures or homology models available, either because of difficulties in structural determination, such as for membrane proteins, or because some proteins are intrinsically disordered [20].

To overcome this severe limitation, many computational tools aiming to predict structural properties use sequence information as input, either by direct use of the sequence or through derived features such as amino acid composition and sequence motifs. Here, we describe a stability prediction method proposed by [21], namely MUpro, which is based on a machine learning technique, the support vector machine (SVM), and which achieved good performance.

In traditional molecular dynamics simulation, potential functions from a force field were usually calculated to obtain ΔΔ*G*, which was mainly influenced by interactions between nonlocal amino acids [22]. Although it is generally difficult, if not impossible, to infer protein structural architecture accurately based solely on amino acid sequence, pioneering work [23, 24] showed that protein sequence was effective in the prediction of secondary structure and solvent accessibility. MUpro fits a set of features derived from protein sequence to experimental stability data by nonlinear transformation through an SVM. The ProTherm database [25] collects from the literature a range of experimentally measured thermodynamic parameters, such as Gibbs free energy changes for wild-type and mutant proteins, together with experimental conditions, including pH and temperature. From ProTherm, MUpro used protein sequences and mutations, along with numeric energy changes, for training and testing.

MUpro adopted a standard binary encoding scheme in feature generation by selecting a window centered on the mutant position and then encoding each amino acid in the window as a vector of 20 elements. In such a vector, each element corresponds to one of the 20 standard amino acids and takes a value of 1 if the corresponding amino acid is identical to the one observed, and 0 otherwise. MUpro considered a window of seven amino acids for each mutation, thereby representing the feature set by a 140-element vector. The first 20-element vector records information about the wild-type and mutant amino acids at the mutant position, and the final six vectors document the six flanking amino acids.

In a two-dimensional space, linear classifiers are designed to separate two classes of data points by a straight line. As illustrated in Figure 2 (left plot), any line passing through the space between the two parallel lines separates the blue points (one class) from the orange points (the other class) perfectly, and thus would be a valid choice for linear classification. SVM algorithms [26], however, select the dashed line, which is equidistant from the two parallel lines, as the class boundary. In other words, SVMs find the separator that maximizes its distance to the nearest data points. Figure 2 shows the margin *m* between the two classes, which is the optimization objective of the SVM algorithm. A larger *m* is expected to give the classifier better generalization, that is, better performance on new, unseen data points.

[Figure 2: left panel, linear separation of two classes with margin *m*; right panel, scatter plot with axes labeled "MUpro Prediction" and "Experimental Energy Change", both ranging from −10 to 5.]

**Figure 2.** The left plot illustrates a linear classification on separable data with two classes (blue and orange). The class boundary (dashed line) is the middle line between two parallel lines. The right plot shows MUpro predictions against experimental values for 1,008 nsSNPs; points on the diagonal represent exact predictions.
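The seven-residue window encoding that yields a 140-element vector can be sketched in a few lines of Python. This is an illustrative reconstruction under stated assumptions, not MUpro's code: the text does not spell out how a single 20-element vector holds both the wild-type and the mutant residue, so the sketch assumes −1 for the wild type and +1 for the mutant; it also assumes that flanking positions beyond the sequence ends are encoded as all zeros. The function names and the example sequence are hypothetical.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def one_hot(aa):
    """20-element indicator vector; all zeros for a missing or non-standard residue."""
    vec = [0] * len(AMINO_ACIDS)
    if aa in INDEX:
        vec[INDEX[aa]] = 1
    return vec


def encode_mutation(seq, pos, wild, mutant, window=7):
    """Encode a substitution at 1-based position pos as a window * 20 feature vector."""
    if seq[pos - 1] != wild:
        raise ValueError(f"expected {wild} at position {pos}, found {seq[pos - 1]}")
    if wild == mutant:
        raise ValueError("wild-type and mutant residues must differ")
    half = window // 2

    # Assumption: the central vector marks the wild type with -1 and the mutant with +1.
    center = [0] * len(AMINO_ACIDS)
    center[INDEX[wild]] = -1
    center[INDEX[mutant]] = 1

    features = list(center)
    # Flanking residues: the three to the left, then the three to the right.
    for offset in list(range(-half, 0)) + list(range(1, half + 1)):
        i = pos - 1 + offset
        aa = seq[i] if 0 <= i < len(seq) else None  # assumption: zero padding at the ends
        features.extend(one_hot(aa))
    return features


vector = encode_mutation("MKTAYLAGE", 4, "A", "E")  # hypothetical sequence, A4E
print(len(vector))  # 140
```

Each flanking residue contributes exactly one nonzero element, so the vector is very sparse; the SVM then has to learn which combinations of central change and local sequence context go with stabilizing or destabilizing energy changes.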


In traditional molecular dynamics simulation, potential functions from a force field were usually calculated to obtain ΔΔ*G*, which was mainly influenced by interactions between nonlocal amino acids [22]. Although it is generally difficult, if not completely impossible, to infer protein structural architecture accurately based solely on amino acid sequence, pioneering work from [23, 24] showed that protein sequence was effective in the prediction

that is important to the folding of BRCT domains.

its inner-domain interaction.

achieved good performance.

disordered [20].

MUpro adopted a standard binary classification scheme in feature generation by selecting a window centered on a mutant position and then encoding each amino acid in the window as a vector of 20 elements. In this kind of vector, each element corresponds to one of 20 standard amino acids and takes a value of 1 if the corresponding amino acid is identical to the one observed or else 0. MUpro considered a window of seven amino acids for each mutation, thereby representing the feature set by a 140-element vector. The first 20-element vector records information about wild-type and mutant amino acids at the mutant position, and the final six vectors document the six flanking amino acids.
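The window encoding described above can be sketched in a few lines of Python. This is a minimal illustration rather than MUpro's actual code: the function names, the 0-based `position` argument, the choice to flag both the wild-type and the mutant residue with a 1 in the first block, and the all-zero padding beyond the sequence ends are assumptions made here.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids
AA_INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def one_hot(residues):
    """Return a 20-element 0/1 vector with a 1 for each residue given."""
    vec = [0] * len(AMINO_ACIDS)
    for aa in residues:
        vec[AA_INDEX[aa]] = 1
    return vec


def encode_mutation(sequence, position, mutant, window=7):
    """Encode a substitution as a (window * 20)-element feature vector.

    The first 20 elements describe the mutated position (wild-type and
    mutant residues are both flagged, an assumption of this sketch); the
    remaining blocks describe the flanking residues from left to right.
    Flanking positions beyond the sequence ends stay all zeros.
    """
    half = window // 2
    features = one_hot([sequence[position], mutant])
    for offset in range(-half, half + 1):
        if offset == 0:
            continue  # the mutated position is already encoded
        idx = position + offset
        if 0 <= idx < len(sequence):
            features.extend(one_hot([sequence[idx]]))
        else:
            features.extend([0] * len(AMINO_ACIDS))
    return features


# Hypothetical sequence and substitution (Y at index 4 replaced by F)
vector = encode_mutation("MKTAYIAKQR", position=4, mutant="F")
print(len(vector))
```

With a window of seven residues the vector has 7 × 20 = 140 elements, matching the dimension quoted in the text.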

In a two-dimensional space, linear classifiers are designed to separate two classes of data points by a straight line. As illustrated in Figure 2 (left plot), any line passing through the space between the two parallel lines separates the blue points (one class) from the orange points (the other class) perfectly, and thus would be a valid choice for linear classification. However, SVM algorithms [26] select the dashed line, which is equidistant from the two parallel lines, as the class boundary. In other words, SVMs find the separator that maximizes the margin, that is, its distance to the nearest data points. Figure 2 shows the margin *m* between the two classes, which is the quantity optimized by the SVM algorithm. A larger *m* is expected to give the classifier better generalization, which measures how well the classifier performs on new, unseen data points.

**Figure 2.** The left plot illustrates a linear classification on separable data with two classes (blue and orange). The class boundary (dashed line) is the middle line between two parallel lines. The right plot shows MUpro predictions against experimental values for 1,008 nsSNPs; points on the diagonal represent exact predictions.


When data sets overlap, SVMs still try to optimize a new objective function that considers both *m* and penalties from misclassification. Regardless of the separability of the data, *m* depends only on points located on the parallel lines (completely separable) or points located between them (partially separable). These points are called support vectors.
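For reference, the trade-off between the margin and the misclassification penalties is commonly written as the following optimization problem. This is the textbook soft-margin formulation, not notation taken from this chapter: the weight vector $\mathbf{w}$, offset $b$, slack variables $\xi\_i$, class labels $y\_i \in \{-1, +1\}$, and penalty constant $C$ are symbols introduced here for illustration.

$$\min\_{\mathbf{w},\, b,\, \boldsymbol{\xi}} \; \frac{1}{2} \lVert \mathbf{w} \rVert^2 + C \sum\_{i=1}^{n} \xi\_i \quad \text{subject to} \quad y\_i \left( \mathbf{w}^{\top} \mathbf{x}\_i + b \right) \ge 1 - \xi\_i, \quad \xi\_i \ge 0.$$

Under this convention the distance between the two parallel lines is $2 / \lVert \mathbf{w} \rVert$, so minimizing $\lVert \mathbf{w} \rVert^2$ maximizes the margin, while each $\xi\_i > 0$ marks a point that lies inside the margin or on the wrong side of the boundary. Only points with an active constraint influence the solution, which is why they are singled out as support vectors.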


Besides data classification, SVMs can perform regression for data points with continuous response values, where the objective function measures the difference between prediction and actual values. But unlike typical linear regression, SVM regressions do not penalize differences falling within a predefined range.
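The difference from ordinary regression can be made concrete with the so-called epsilon-insensitive loss, which is the standard choice in SVM regression. The sketch below is illustrative only; the tube width `epsilon=0.5` and the sample values are arbitrary assumptions, not settings reported for MUpro.

```python
def epsilon_insensitive_loss(predicted, actual, epsilon=0.5):
    """Zero inside the +/- epsilon tube, growing linearly outside it."""
    return max(0.0, abs(predicted - actual) - epsilon)


def squared_loss(predicted, actual):
    """Ordinary least-squares loss, shown for comparison."""
    return (predicted - actual) ** 2


# Hypothetical predictions for a true value of 1.0
for predicted in (1.2, 1.6, 2.5):
    print(predicted,
          epsilon_insensitive_loss(predicted, actual=1.0),
          squared_loss(predicted, actual=1.0))
```

A prediction of 1.2 falls inside the tube and costs nothing under the epsilon-insensitive loss, whereas the squared loss penalizes every deviation, however small.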

The abilities of SVMs, however, go beyond linear classification and regression. By projecting the original data points into higher dimensional spaces, SVMs actually create additional, and usually more complex, features from the input points. By using the same linear settings as described above in these newly high-dimensional spaces, SVMs can effectively capture highly nonlinear relationships among data which otherwise would be missed.
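In practice this projection is rarely carried out explicitly; a kernel function computes the inner product in the higher-dimensional space directly from the original points. The following sketch checks this for a degree-2 polynomial kernel on two-dimensional points. The kernel and the sample points are chosen purely for illustration; the chapter does not state which kernel MUpro used.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def feature_map(x):
    """Explicit degree-2 map of a 2-D point into a 3-D feature space."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2) * x1 * x2, x2 * x2)


def poly_kernel(x, z):
    """Homogeneous degree-2 polynomial kernel, computed in the original space."""
    return dot(x, z) ** 2


x, z = (1.0, 2.0), (3.0, -1.0)
explicit = dot(feature_map(x), feature_map(z))  # inner product after mapping
implicit = poly_kernel(x, z)                    # same value without mapping
print(explicit, implicit)
```

Both routes give the same number (up to floating-point rounding), so a linear separator in the mapped space corresponds to a quadratic boundary in the original one.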

MUpro applied a popular SVM implementation, SVMlight [27], to carry out both classification of the sign of the energy change and regression of its value. On 1,008 training mutations, MUpro performed rather well against true energy changes, with a root-mean-square deviation (RMSD) of 0.39 (Figure 2, right plot). Moreover, its predictions were more accurate when the actual stability change between wild-type and mutant proteins was small. Generally, MUpro tended to underestimate larger energy changes.
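As a reminder of how the RMSD between predicted and experimental energy changes is obtained, here is a minimal sketch. The numbers are invented for illustration and are not taken from the MUpro evaluation.

```python
import math


def rmsd(predicted, observed):
    """Root-mean-square deviation between two equal-length sequences."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical energy changes, for illustration only
observed = [-1.0, 0.5, 2.0, -3.0]
predicted = [-0.8, 0.4, 1.5, -2.2]
print(round(rmsd(predicted, observed), 3))
```

Because the deviations are squared before averaging, a few badly underestimated large energy changes can dominate the RMSD even when most predictions are close.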

In one early comprehensive examination of the effects of nsSNPs on protein function, [28] catalogued nsSNP effects according to structural and sequence changes caused by the introduction of mutant amino acids. That study extracted 262 disease-causing missense variants from the HGMD and 42 neutral variants from hypertension-associated genes. Proteins harboring these variants either had 3D structures deposited in the Protein Data Bank (PDB) or had homologs of known structure with a sequence similarity of at least 40 percent. The authors then modeled both wild-type and mutant protein structures based on the available 3D structures. By examining a broad range of physicochemical parameters in the resulting models, including loss of hydrogen bonds, loss of a salt bridge, over-packing, and disruption of binding, Wang *et al.* could compare the distributions of effects observed in disease-causing and neutral variants (Table 1). Their results clearly demonstrated that loss of stability accounts for many more disease-causing variants than neutral variants (83 versus 26 percent) and that 70 percent of neutral variants cause no measurable effects on the protein structure.



| Effect | Disease (%) | Neutral (%) |
|---|---|---|
| Stability | 83 | 26 |
| Ligand binding | 5 | 2 |
| Other | 2 | 2 |
| No effect | 10 | 70 |

**Table 1.** Percentage of effects from missense variants on protein function (adapted from Figure 2 in [28])

This survey suggests that nsSNPs giving rise to stability changes are more likely to be disease-related than not, and this property might be useful in distinguishing disease-causing from neutral nsSNPs. Moreover, computational tools like MUpro that predict stability greatly facilitate this task, because they can be applied to virtually any protein with an available sequence.

#### **3. Functional impact of mutations**

Besides structural consequences, variants can disrupt molecular functional sites, such as catalytic residues and DNA/protein binding sites, which are usually position-specific or share consensus motifs. Those disruptions, however, do not necessarily involve disruption of structure. A prominent class of sites that variants can affect consists of diverse PTM sites, of which some of the most frequent types are phosphorylation, glycosylation, acetylation, methylation, and ubiquitination. PTMs play an important role in cellular signal transduction and regulation, and the activation and inactivation of certain key proteins rely on precise modulation of PTMs. For instance, without environmental stress, p53 is suppressed through ubiquitination catalyzed by E3 ubiquitin ligases, while in the presence of stress, such as DNA damage, p53 is activated by a variety of PTMs, including acetylation and phosphorylation, on its flexible DNA-binding domain [29]. PTM sites and flanking residues generally form consensus sequences with a high degree of variety, and therefore variants within these enzyme-specific motifs could abolish known functionalities or create new ones. This section starts by detailing two concrete examples of functional changes due to variants, continues with a description of DisPhos (Disorder-enhanced Phosphorylation sites predictor), an established phosphorylation predictor, and then explains how the concepts of gain and loss of phosphorylation can be used to analyze a cancer data set.

FGFR2 (fibroblast growth factor receptor 2), one of four members of the FGFR family of receptor tyrosine kinases, plays an important role in transmembrane signal transduction. Recent research identified one missense mutation, A628T, as being involved in LADD syndrome through severely impairing the kinase activity of FGFR2 [30]. Residue A628 is in the center of the catalytic pocket in the tyrosine kinase domain of FGFR2. A mutant structure, A628T-FGFR2 [31], reveals that the substitution of the smaller amino acid alanine at position 628 with the larger, polar threonine pushes one of the key residues, R630, out of the catalytic pocket; that movement disrupts the hydrogen bond between D626 and R630 that exists in the wild-type structure (Figure 3, left). Although the position of D626 remains almost unchanged, R630 is too far away from the catalytic pocket and fails to stabilize the interaction with substrates, which greatly compromises the catalytic ability of FGFR2. Compared with wild-type FGFR2, the A628T-FGFR2 mutant has roughly the same structure but highly reduced kinase activity.

It has been observed that amino acid substitutions at non-PTM sites can spread their influence to neighboring PTM sites on the same protein. One such example is PTPS, human 6-pyruvoyl-tetrahydropterin synthase, which catalyzes triphosphate elimination. PTPS participates in the biosynthetic pathway for tetrahydrobiopterin (BH4). Lack of PTPS catalytic activity causes a deficiency of BH4, which in turn leads to hyperphenylalaninemia (HPA), an autosomal recessive disorder. Missense mutation R16C was associated with HPA and resulted in reduced activity of PTPS [32]. Moreover, phosphorylation of S19 on PTPS is required for maximal enzyme activity [33]. So how does R16C affect phosphorylation on S19? There are multiple potential explanations. One is that the structure of PTPS shows the exposure of both R16 and S19 on the surface of the protein (Figure 3, right; [34]), where they form the consensus sequence R16XXS19 for cGMP protein kinase II. The substitution C16 disrupts this kinase-recognizable motif and thus hinders phosphorylation, which ultimately leads to the inactivation of PTPS. Another explanation is that the removal of R16 prevents a salt bridge between it and the phosphate group, once attached, which in turn results in the loss of stability of the modified protein.

**Figure 3.** The crystal structure of the catalytic pocket of the A628T-FGFR2 mutant (left, PDB ID: 3B2T) and ribbon view of human PTPS structure (right, PDB ID: 3I2B). In both cases, the N-terminus is colored in blue and the C-terminus in red. Residues of interest are depicted as ball and stick models.

| Residue | Positive sites (P) | Negative sites (N) | N/P ratio |
|---|---|---|---|
| S | 613 | 10,798 | 17.6 |
| T | 140 | 9,051 | 64.7 |
| Y | 136 | 5,103 | 37.5 |

**Table 2.** Data sets used in DisPhos (adapted from Table 1 in [37])

| Type | Features | Dimension |
|---|---|---|
| Amino acid composition | Binary coding | 480 |
| Amino acid frequency | Binary coding | 20 |
| Disorder | VLXT, VL2, VLV, VLC, VLS | 5 |
| Secondary structure | Helix, loop and sheet | 7 |
| Sequence property | Complexity and flexibility | 2 |
| Residue property | Net charge, aromatic content, hydrophobic moment, hydrophobicity, exposed/buried | 5 |

**Table 3.** Descriptive and predicted features used in DisPhos training.

As with the stability prediction tool MUpro, described in the previous section, experimental difficulties have promoted the development of computational approaches to estimating many common PTM sites based on protein sequence. For the prediction of phosphorylation, DisPhos differs from other available methods like NetPhos [35] and ScanSite [36], since its model explicitly includes a range of characteristic features from the predicted disorder region around the phosphorylation site [37].

In some cases, researchers have found phosphorylation sites located on intrinsically disordered regions or have observed disorder-to-order or order-to-disorder conformational changes upon phosphorylation [38]. DisPhos exploited such observations by integrating predicted disorder information with the motif profile to improve its predictive performance.

Because phosphorylation occurs on residues S, T, and Y (S/T/Y), DisPhos assembled three pairs of positive-negative data sets, with each pair corresponding to one residue-specific predictor. First, it extracted proteins with phosphorylation annotations from UniProt (Universal Protein Resource); it then combined these data with data from Phospho.ELM [39]. DisPhos placed a 25-residue segment centered on each annotated S/T/Y into a positive set, while placing a segment of the same length around every non-annotated S/T/Y on the same protein into a negative set. To reduce the sequence bias caused by homologs or duplications, DisPhos kept only entries with a pairwise sequence similarity of less than 30 percent, which means that it allowed at most seven matching residues in an ungapped alignment of two 25-residue segments. Because the number of experimentally verified phosphorylation sites is small, the filtered data sets were highly unbalanced (Table 2).
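The construction of positive and negative segments can be sketched as follows. This is an illustration of the idea, not the DisPhos code: the function name, the 0-based indexing, the padding of sites near the termini with `-`, and the toy sequence are all assumptions, and the similarity filter is not shown.

```python
def extract_segments(sequence, annotated_positions, flank=12, targets="STY"):
    """Split S/T/Y sites into positive and negative 25-residue segments.

    annotated_positions holds 0-based indices of known phosphorylation
    sites. Sites closer than `flank` residues to either terminus are
    padded with '-' (an assumption of this sketch).
    """
    positives, negatives = [], []
    padded = "-" * flank + sequence + "-" * flank
    for i, residue in enumerate(sequence):
        if residue not in targets:
            continue
        # Residue i sits at index i + flank in the padded string, so this
        # slice is centered on it and has length 2 * flank + 1.
        segment = padded[i:i + 2 * flank + 1]
        if i in annotated_positions:
            positives.append((i, segment))
        else:
            negatives.append((i, segment))
    return positives, negatives


sequence = "MSGRKTPLYASDE"   # hypothetical protein sequence
annotated = {1}              # hypothetical annotated site (S at index 1)
positives, negatives = extract_segments(sequence, annotated)
print(len(positives), len(negatives))
```

Because every non-annotated S/T/Y becomes a negative example, the negative sets grow much faster than the positive ones, which is the imbalance summarized in Table 2.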

DisPhos used a broad range of features to discriminate positive from negative sites (Table 3). To cope with the high-dimensional yet sparse feature space, DisPhos performed feature selection by applying a permutation test to binary features and principal component analysis (PCA) to continuous features, and then fitted logistic regression models to the transformed data sets.
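A permutation test for a single binary feature might look like the sketch below. The chapter does not specify the test statistic or the number of permutations, so the absolute difference in feature frequency between the two classes, the 2,000 permutations, and the toy data are assumptions made here for illustration.

```python
import random


def permutation_test(feature_values, labels, n_permutations=2000, seed=0):
    """Estimate a two-sided p-value for a binary feature.

    The statistic is the absolute difference in the feature's frequency
    between positive (label 1) and negative (label 0) examples.
    """
    def statistic(current_labels):
        pos = [f for f, y in zip(feature_values, current_labels) if y == 1]
        neg = [f for f, y in zip(feature_values, current_labels) if y == 0]
        return abs(sum(pos) / len(pos) - sum(neg) / len(neg))

    rng = random.Random(seed)
    observed = statistic(labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)  # break any link between feature and label
        if statistic(shuffled) >= observed:
            hits += 1
    # The add-one correction avoids reporting an impossible p-value of 0.
    return (hits + 1) / (n_permutations + 1)


# Toy data: 10 positive and 10 negative sites
labels = [1] * 10 + [0] * 10
informative = [1] * 9 + [0] + [0] * 9 + [1]   # mostly present in positives
uninformative = [1, 0] * 10                   # equally frequent in both classes
print(permutation_test(informative, labels),
      permutation_test(uninformative, labels))
```

Features whose p-value exceeds a chosen threshold would be dropped before model fitting; the threshold itself is another design choice that is not given in the text.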

Generally, binary classifiers work best in settings of balanced or close to balanced data sets in terms of accuracy, sensitivity, and specificity. For a classification in which the class boundary is determined by a solution that maximizes accuracy–the default configuration for many popular classifiers–training on highly unbalanced data sets inevitably results in extreme values for sensitivity or specificity, ultimately leading to poor generalization. DisPhos adopted an ensemble strategy to correct this issue in the S/T/Y data sets.
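The chapter does not describe the ensemble in detail. One common way to build such an ensemble, shown here purely as an assumed illustration, is to train each member on all positive examples plus an equally sized random sample of negatives and then to average the members' predicted probabilities. The function names and the stand-in models below are hypothetical.

```python
import random


def balanced_subsets(positives, negatives, n_models=5, seed=0):
    """Yield one balanced training set per ensemble member.

    Each set pairs all positives with an equally sized random sample of
    negatives, drawn without replacement.
    """
    rng = random.Random(seed)
    for _ in range(n_models):
        sampled = rng.sample(negatives, k=len(positives))
        yield positives + sampled


def ensemble_probability(models, example):
    """Average the probabilities returned by the individual models."""
    return sum(model(example) for model in models) / len(models)


# Toy data: 3 positive and 50 negative examples, identified by integers
positives = [1, 2, 3]
negatives = list(range(100, 150))
subsets = list(balanced_subsets(positives, negatives))
print([len(s) for s in subsets])

# Stand-in models that return fixed probabilities (hypothetical)
models = [lambda example: 0.2, lambda example: 0.6]
print(ensemble_probability(models, example=None))
```

Each member sees a balanced data set, so its decision boundary is not pulled toward the majority class, while averaging across members recovers information from negatives that any single subsample leaves out.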

The combination of data filtering, feature selection, and sophisticated training and test configurations enabled DisPhos to achieve accuracy ranges between 70 and 80 percent, an improvement over the accuracy of other similar predictors. Moreover, the features derived from disorder predictions improved the accuracy by two percent on average, and these improvements showed the usefulness of disorder features in the prediction of phosphorylation sites.

DisPhos represents outcomes as probabilities, which quantitatively measure the likelihood that the underlying residues are phosphorylation sites. This characteristic facilitated the definition of gain and loss of phosphorylation for a specific site [40], and since these concepts can be interpreted readily, they may help provide insight into the underlying molecular mechanisms of mutations associated with diseases. Actually, the definitions of gain and loss are not limited to phosphorylation sites and can apply just as well to many other functional and structural properties.

Using bioinformatics tools that predict functional and structural attributes on both wild-type and mutant protein sequences provides us with two probabilistic estimates for a property *p*, $P(p = 1 \text{ at } s\_i^w)$ and $P(p = 1 \text{ at } s\_i^m)$, at site $s\_i$, with $s\_i^w$ denoting a wild-type site and $s\_i^m$ denoting a mutant site. Then, conceptually, we have

$$P(\text{loss of property } p \text{ at site } s\_i) = P(p = 1 \text{ at } s\_i^w \text{ AND } p = 0 \text{ at } s\_i^m). \tag{1}$$

Given that $s\_i^w$ and $s\_i^m$ belong to different molecules, we assume that the events $p = 1$ at $s\_i^w$ and $p = 0$ at $s\_i^m$ are not linked by any underlying process and can be treated as independent. Therefore, we can expand the right-hand side of equation (1) as a product:

$$\begin{split} P(p = 1 \text{ at } s\_i^w \text{ AND } p = 0 \text{ at } s\_i^m) &= P(p = 1 \text{ at } s\_i^w) \cdot P(p = 0 \text{ at } s\_i^m) \\ &= P(p = 1 \text{ at } s\_i^w) \cdot [1 - P(p = 1 \text{ at } s\_i^m)] \end{split} \tag{2}$$

Phosphorylaiton change Disease nsSNPs Control nsSNPs *P*-value Gain 1.91 0.86 0.014 Loss 1.70 1.50 0.59 **Table 4.** Percentage of mutations predicted to have undergone gain or loss of phosphorylation. *P*-values

Bioinformatics Approaches to the Functional Pro ling of Genetic Variants 243


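Given that *s<sup>w</sup>* and *s<sup>m</sup>* are actually different molecules, we treat *P*(*p* = 1 at *s<sub>i</sub><sup>w</sup>*) and *P*(*p* = 0 at *s<sub>i</sub><sup>m</sup>*) as independent, that is, not linked by any underlying process. Therefore, we can expand the right-hand side of equation (1) as a product:

$$P(p = 1 \text{ at } s\_i^w \text{ AND } p = 0 \text{ at } s\_i^m) = P(p = 1 \text{ at } s\_i^w) \cdot P(p = 0 \text{ at } s\_i^m) = P(p = 1 \text{ at } s\_i^w) \cdot [1 - P(p = 1 \text{ at } s\_i^m)] \tag{2}$$

By substituting equation (1) with equation (2), we get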

$$P(\text{loss of property } p \text{ at site } s\_i) = P(p = 1 \text{ at } s\_i^w) \cdot [1 - P(p = 1 \text{ at } s\_i^m)] \tag{3}$$

Likewise, we can define gain of a property as

$$P(\text{gain of property } p \text{ at site } s\_i) = \left[1 - P(p = 1 \text{ at } s\_i^{\text{w}})\right] \cdot P(p = 1 \text{ at } s\_i^{\text{m}}) \tag{4}$$

Figure 4 shows the contour of gain of a property. Note that gain/loss can be nonzero even if the predictions for the property are the same for the wild-type and mutant sequences: when both predictions equal *x*, the value is *x*(1 − *x*), which rises from 0 to 0.25 as *x* goes from 0 to 0.5.
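Equations (3) and (4) translate directly into code. The following minimal sketch assumes that a property predictor has already produced the two probabilities for a site; the function names and the numeric inputs are illustrative and do not come from MutPred itself.

```python
def loss_of_property(p_wild: float, p_mutant: float) -> float:
    """Equation (3): property present on the wild type and absent on the mutant."""
    return p_wild * (1.0 - p_mutant)


def gain_of_property(p_wild: float, p_mutant: float) -> float:
    """Equation (4): property absent on the wild type and present on the mutant."""
    return (1.0 - p_wild) * p_mutant


# Hypothetical predictor outputs for one site (illustrative values only).
p_wild, p_mutant = 0.2, 0.9
print("gain:", gain_of_property(p_wild, p_mutant))
print("loss:", loss_of_property(p_wild, p_mutant))

# Equal predictions still give a nonzero score: x * (1 - x) peaks at x = 0.5.
print("gain at equal predictions of 0.5:", gain_of_property(0.5, 0.5))
```

Because the two probabilities are multiplied as independent events, gain and loss are generally not symmetric, and neither is zero unless one of the two factors is zero.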

**Figure 4.** The contour of gain of a property as a function of the predicted probability on the mutant sequence (*x*-axis, *P*(mutant)) and on the wild-type sequence (*y*-axis, *P*(wild)). The dashed line denotes sites with equal probabilities for the two types of sequences.

Radivojac *et al*. [40] showed one application of gain and loss of phosphorylation. Their study collected 1,099 breast and colorectal cancer nsSNPs occurring on 847 proteins from a large-scale cancer-tumor-sequencing project [41]. The authors then built a matched control set by randomly mutating the same 847 wild-type proteins at the codon level. They calculated gain and loss of phosphorylation for each mutation in both data sets and found that disease-associated nsSNPs were significantly more likely to be involved in adding new phosphorylation sites (Table 4).


**Table 4.** Percentage of mutations predicted to have undergone gain or loss of phosphorylation. *P*-values were computed by *t*-test.

This survey showed how the concepts of gain and loss of phosphorylation could distinguish cancer-associated from neutral somatic mutations; it also suggested that they could serve as useful features for discriminating between general disease-related nsSNPs and neutral ones.

### **4. Mutation prediction: MutPred**


In light of the above observations on the wide variety of consequences of a single mutation, we developed a large range of features for each variant and employed a popular machine learning technique, random forest, to distinguish disease-associated mutations from neutral ones. We called the model MutPred [42].

In a supervised learning scenario, we collected two sets of disease-associated mutations. One set came from the HGMD [3], in which 95 percent of mutations were annotated to monogenic diseases. We extracted the other set from a cancer-sequencing project [41]. We also created two corresponding control data sets (Table 5). For the HGMD data, we took a set of variants from UniProt that were annotated as polymorphisms to serve as controls (SPP). For the cancer data, we identified all neutral mutations that occurred on the same proteins observed in the cancer data set and used them as the cancer controls. On average, HGMD proteins harbored 7.3 times as many variants as SPP proteins, while we observed a much less dramatic difference between the cancer data set and its controls.


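| Data set | Mutations | Proteins | Type |
| --- | --- | --- | --- |
| HGMD | 39,218 | 1,879 | Disease |
| SPP | 26,439 | 9,305 | Neutral |
| Cancer | 653 | 519 | Disease |
| Cancer control | 1,016 | 312 | Neutral |

**Table 5.** Summary of disease and neutral data sets.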
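The "7.3 times" figure can be checked against the Table 5 counts. The short sketch below assumes that the figure is the ratio of mean mutations per protein between a disease set and its control; that reading is an interpretation, not something the text states explicitly.

```python
# Counts copied from Table 5: (mutations, proteins).
data_sets = {
    "HGMD": (39_218, 1_879),
    "SPP": (26_439, 9_305),
    "Cancer": (653, 519),
    "Cancer control": (1_016, 312),
}


def variants_per_protein(name: str) -> float:
    """Mean number of mutations per protein in the named data set."""
    mutations, proteins = data_sets[name]
    return mutations / proteins


hgmd_vs_spp = variants_per_protein("HGMD") / variants_per_protein("SPP")
cancer_vs_control = (
    variants_per_protein("Cancer") / variants_per_protein("Cancer control")
)

print(f"HGMD vs. SPP: {hgmd_vs_spp:.2f}")
print(f"Cancer vs. cancer control: {cancer_vs_control:.2f}")
```

Under this reading, the HGMD-to-SPP ratio comes out slightly above 7.3, while the cancer set and its control differ by a much smaller factor, in the opposite direction.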

We generated a total of 130 numeric attributes based on protein sequences for each mutation and used them as input to a random forest classifier. These attributes can be divided into three major types (Table 6). Beyond those listed in Table 6, evolutionary attributes include position-specific scoring matrices (PSSMs) generated by PSI-BLAST, Pfam domain profiles, and transition frequencies from SNAP [43].

As the PTPS example shows, the influence of nsSNPs could spread to neighboring PTM sites. Accordingly, we expanded the definitions for gain/loss of structural and functional properties to pick up the largest gain/loss changes within an 11-residue window centered on the mutant position.
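The windowed definition can be sketched as follows. This is a minimal illustration, assuming per-residue predictions are available for the wild-type and mutant sequences and that the window is simply truncated at the sequence ends; both the helper names and the truncation rule are assumptions.

```python
from typing import List, Sequence


def gain_scores(p_wild: Sequence[float], p_mutant: Sequence[float]) -> List[float]:
    """Per-residue gain of a property, following equation (4)."""
    return [(1.0 - w) * m for w, m in zip(p_wild, p_mutant)]


def window_max(scores: Sequence[float], position: int, window: int = 11) -> float:
    """Largest score within `window` residues centered on `position` (0-based).

    The window is truncated at the sequence ends (an assumption of this sketch).
    """
    half = window // 2
    start = max(0, position - half)
    end = min(len(scores), position + half + 1)
    return max(scores[start:end])


# Hypothetical predictions for a 15-residue stretch; the mutation is at index 7.
p_wild = [0.1] * 15
p_mutant = [0.1] * 15
p_mutant[4] = 0.8    # neighboring site inside the 11-residue window (indices 2..12)
p_mutant[0] = 0.95   # site outside the window, so it is ignored

gains = gain_scores(p_wild, p_mutant)
print(window_max(gains, position=7))
```

Here the mutated position itself shows almost no gain, but the window picks up the much larger change three residues upstream, which is the behavior the PTPS example motivates.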

Random forest is an ensemble learning technique based on a population of binary decision trees, each of which is grown on a random subset of the features and a bootstrapped sample of the training data [54]. For classification, the outcome is the majority vote of the individual trees.
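The three ingredients named above (bootstrapped samples, random feature subsets, and majority voting) can be sketched in a few lines. This toy version is not the MutPred implementation: it grows one-split trees (decision stumps) instead of full decision trees, and the data, names, and parameters are all hypothetical.

```python
import random
from collections import Counter


def train_stump(rows, labels, feature_ids):
    """Pick the (feature, threshold) split with the fewest training errors."""
    best = None
    for f in feature_ids:
        for threshold in sorted({row[f] for row in rows}):
            for high_label in (0, 1):
                preds = [high_label if row[f] > threshold else 1 - high_label
                         for row in rows]
                errors = sum(p != y for p, y in zip(preds, labels))
                if best is None or errors < best[0]:
                    best = (errors, f, threshold, high_label)
    _, f, threshold, high_label = best
    return lambda row: high_label if row[f] > threshold else 1 - high_label


def train_forest(rows, labels, n_trees=25, n_features=1, seed=0):
    rng = random.Random(seed)
    n = len(rows)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]            # bootstrap sample
        feats = rng.sample(range(len(rows[0])), n_features)   # random feature subset
        forest.append(train_stump([rows[i] for i in idx],
                                  [labels[i] for i in idx], feats))
    return forest


def predict(forest, row):
    """Majority vote over the individual trees."""
    votes = Counter(tree(row) for tree in forest)
    return votes.most_common(1)[0][0]


# Toy data: two numeric attributes per mutation, 1 = disease, 0 = neutral.
rows = [(0.9, 0.8), (0.8, 0.7), (0.7, 0.9), (0.85, 0.75),
        (0.2, 0.1), (0.1, 0.3), (0.3, 0.2), (0.15, 0.25)]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

forest = train_forest(rows, labels)
print(predict(forest, (0.95, 0.95)), predict(forest, (0.05, 0.05)))
```

An odd number of trees avoids tied votes in the two-class case. A production classifier would use full trees and far more attributes, as described in the text.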


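| Type | Property | Software |
| --- | --- | --- |
| Functional properties | DNA-binding residues | DBS-PRED [44] |
| | Catalytic residues | † |
| | MoRFs | [45] |
| | Phosphorylation sites | DisPhos [37] |
| | Methylation sites | [46] |
| | Glycosylation sites | † |
| | Ubiquitination sites | [47] |
| Structure and dynamics | Secondary structure | PHD/Prof [48] |
| | Solvent accessibility | PHD/Prof [48] |
| | Stability | MUpro [21] |
| | Intrinsic disorder | DISPROT [49] |
| | B-factor | [50] |
| | Transmembrane helix | HMMTOP [51] |
| | Coiled-coil structure | marcoil [52] |
| Evolutionary information | Sequence conservation | SIFT [13] |
| | Conservation index‡ | [53] |

**Table 6.** Major attributes used in MutPred. † Unpublished in-house program. ‡ Used in the latest version of MutPred.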


Compared to a single decision tree, each tree within a random forest uses only part of the features and samples, which keeps the correlation among trees low and effectively reduces the overall variance of the model. Moreover, random forests inherit some attractive properties from decision trees, such as robustness to outliers and ease of interpretation.

In our model, we specified 1,000 trees to build the classifier separating disease from neutral mutations. The classifier was more accurate on the HGMD data than on the somatic cancer data, suggesting that MutPred is better suited to monogenic disease-related mutations than to somatic cancer mutations (Table 7). This is likely due to the large number of passenger (non-causative) variants in cancer tissue sequencing data sets. In terms of area under the ROC curve (AUC), MutPred achieved 0.86 on the HGMD and 0.69 on the cancer data sets (Figure 5, left).
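For readers less familiar with the AUC, it equals the probability that a randomly chosen disease variant receives a higher score than a randomly chosen neutral one. The sketch below computes it by direct pairwise comparison; the scores are made up for illustration and are not MutPred outputs.

```python
def auc(disease_scores, neutral_scores):
    """Probability that a disease variant outscores a neutral one (ties count 0.5)."""
    wins = 0.0
    for d in disease_scores:
        for n in neutral_scores:
            if d > n:
                wins += 1.0
            elif d == n:
                wins += 0.5
    return wins / (len(disease_scores) * len(neutral_scores))


# Hypothetical classifier scores (illustrative values only).
disease = [0.9, 0.8, 0.6, 0.4]
neutral = [0.7, 0.3, 0.2, 0.1]
print(auc(disease, neutral))
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect separation, which puts the reported 0.86 and 0.69 in context. The pairwise loop is quadratic in the number of variants; for data sets the size of HGMD a rank-based formulation would be preferable.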


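| Data set | Sensitivity | Specificity | Accuracy |
| --- | --- | --- | --- |
| HGMD | 76.8 | 79.0 | 77.7 |
| Cancer | 60.9 | 68.4 | 65.5 |

**Table 7.** Classification performance (in percent) for the HGMD and cancer data sets.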
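The three columns of Table 7 are linked: accuracy is the class-size-weighted mean of sensitivity and specificity. The sketch below checks this using the reported sensitivity and specificity (76.8 and 79.0 percent for HGMD, 60.9 and 68.4 percent for cancer) together with the data set sizes from Table 5. It assumes that performance was measured over the full data sets, which the text does not state explicitly.

```python
def accuracy(sensitivity, specificity, n_disease, n_neutral):
    """Overall accuracy as the class-size-weighted mean of the two rates."""
    correct = sensitivity * n_disease + specificity * n_neutral
    return correct / (n_disease + n_neutral)


# Rates from Table 7, class sizes from Table 5 (disease set, control set).
hgmd = accuracy(0.768, 0.790, n_disease=39_218, n_neutral=26_439)
cancer = accuracy(0.609, 0.684, n_disease=653, n_neutral=1_016)

print(f"HGMD: {100 * hgmd:.1f}%  Cancer: {100 * cancer:.1f}%")
```

Under that assumption both values agree with the accuracy column after rounding, so the table is internally consistent.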

MutPred not only provides comparable predictions of a mutation's predisposition to cause disease [55], it also allows estimation of the significance level for an individual gain or loss of a property (Figure 5, right). It is reasonable to assume that the distribution of property *p* in the neutral data set provides an unbiased approximation of the true null distribution, given that UniProt provided the largest available set of curated neutral variants. Therefore, we could generate hypotheses about the molecular mechanism underlying variants at three different confidence levels: (1) actionable hypotheses: 0.78 ≥ MutPred score *>* 0.5 AND property score *<* 0.05; (2) confident hypotheses: MutPred score *>* 0.78 AND 0.01 ≤ property score *<* 0.05; (3) very confident hypotheses: MutPred score *>* 0.78 AND property score *<* 0.01, where 0.78 corresponds to a specificity of 0.95 on the HGMD data set.

**Figure 5.** The Receiver Operating Characteristic (ROC) curves for HGMD and cancer data sets (left), and example distributions of gain/loss property *p* in neutral and disease sets (green and red, respectively; right). An empirical distribution of the putatively neutral substitutions can be used to define a threshold *r* on the false positive rate that, in turn, can be used to accept/reject the null hypothesis on new substitutions. The area shaded in green represents the *P*-value threshold (corresponding to the score *r*) that is used by MutPred to hypothesize molecular cause of disease. A particular area under the right tail of the neutral distribution is referred to as the property score.
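The property score and the three confidence levels can be sketched as follows. The property score is taken here to be the empirical right-tail fraction of the neutral distribution, as the Figure 5 caption describes; the function names and the synthetic neutral values are hypothetical.

```python
def property_score(value, neutral_values):
    """Empirical right-tail P-value: share of neutral gain/loss values >= `value`."""
    return sum(v >= value for v in neutral_values) / len(neutral_values)


def hypothesis_level(mutpred_score, prop_score):
    """Map a (MutPred score, property score) pair to a confidence level, or None."""
    if mutpred_score > 0.78:
        if prop_score < 0.01:
            return "very confident"
        if prop_score < 0.05:
            return "confident"
    elif mutpred_score > 0.5 and prop_score < 0.05:
        return "actionable"
    return None


# Synthetic neutral distribution: 1,000 evenly spaced gain/loss values.
neutral = [i / 1000 for i in range(1000)]

score = property_score(0.995, neutral)
print(score, hypothesis_level(0.85, score))
print(hypothesis_level(0.85, 0.03))
print(hypothesis_level(0.60, 0.03))
print(hypothesis_level(0.60, 0.20))
```

A mutation with a high MutPred score but no property below the 0.05 threshold is still predicted deleterious; it simply receives no mechanistic hypothesis.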

We applied MutPred to 203,899 nsSNPs deposited in dbSNP (build 135) and examined the score distribution and the most frequent hypotheses behind predicted deleterious mutations. Overall, 35 percent of mutations received scores higher than 0.5, and we classified them as disease-associated (Figure 6). Of these deleterious mutations, 19.6 percent received at least one functional or structural hypothesis about the possible molecular mechanism. The top three hypotheses all pointed to structural changes: gain of disorder (9.7 percent), loss of stability (8.5 percent), and loss of disorder (6.2 percent). This result agrees with [28], at least in the sense that these changes are the most frequently seen. On the other hand, common functional alterations involved in disease included loss of MoRF binding (6.0 percent), gain of methylation (5.9 percent), and gain of catalytic residue (5.6 percent).

### **5. Conclusion**


**Figure 6.** The distribution of MutPred scores for nsSNPs from dbSNP (left), and the top ten hypotheses for disease-associated mutations (right). The density on the left is a normalized frequency, so that the total area in the bar plot equals one. The right panel shows the percentage of mutations for, in order, gain of disorder, loss of stability, loss of disorder, loss of MoRF binding, gain of methylation, gain of catalytic residue, gain of MoRF binding, loss of catalytic residue, gain of ubiquitination, and loss of methylation.

Understanding mutation data generated in biomedical research stimulates the development of computational methods. Previous studies have revealed structural and functional impacts of variants on the underlying proteins, and research has shown that these impacts can differentiate between disease-associated and neutral mutations. Most current prediction tools have taken advantage of these characteristics, along with evolutionary information readily available from sequence alignment. Such tools have demonstrated impressive classification accuracy on monogenic disease-associated mutations but have performed less well on cancer somatic mutations. One explanation for this discrepancy, from an evolutionary perspective, is that cancers usually arise late in life, so they are subjected to less purifying selection. This makes conservation information less useful in cancers than in monogenic diseases [56]. This field faces two immediate challenges: (1) How can we improve the performance of these tools on somatic mutations? If the consensus opinion holds that tools depending on evolutionary knowledge are less effective on somatic mutations than on monogenic-disease-related mutations, it seems that research should explore other avenues. Inclusion of the mutation context in the model (e.g., pathways containing disease proteins) might offer a starting point for new directions. (2) How can we more accurately elucidate the molecular mechanisms for predicted deleterious mutations? MutPred has demonstrated this concept through definitions of gain/loss of individual properties. Similar features should be considered once they prove capable of reliably discriminating between disease-associated and neutral mutations. By continuously improving our computational tools, we can obtain a better and more accurate understanding of biology and human health.

### **Author details**

Biao Li *The Buck Institute for Research on Aging, Novato, CA 94945, USA*

Predrag Radivojac *Indiana University, Bloomington, IN 47405, USA*

Sean Mooney *The Buck Institute for Research on Aging, Novato, CA 94945, USA*

### **6. References**

[1] David Altshuler, Mark J. Daly, and Eric S. Lander. Genetic mapping in human disease. *Science*, 322(5903):881–888, 2008.

[2] 1000 Genomes Project Consortium. A map of human genome variation from population-scale sequencing. *Nature*, 467(7319):1061–73, 2010.

[3] Peter D Stenson, Matthew Mort, Edward V Ball, Katy Howells, Andrew D Phillips, Nick St Thomas, and David N Cooper. The human gene mutation database: 2008 update. *Genome Med*, 1(1):13, 2009.

[4] Jamie K Teer and James C Mullikin. Exome sequencing: the sweet spot before whole genomes. *Hum Mol Genet*, 19(R2):R145–51, 2010.

[5] Jens G. Lohr, Petar Stojanov, Michael S. Lawrence, Daniel Auclair, Bjoern Chapuy, Carrie Sougnez, Peter Cruz-Gordillo, Birgit Knoechel, Yan W. Asmann, Susan L. Slager, Anne J. Novak, Ahmet Dogan, Stephen M. Ansell, Brian K. Link, Lihua Zou, Joshua Gould, Gordon Saksena, Nicolas Stransky, Claudia Rangel-Escareño, Juan Carlos Fernandez-Lopez, Alfredo Hidalgo-Miranda, Jorge Melendez-Zajgla, Enrique Hernández-Lemus, Angela Schwarz-Cruz y Celis, Ivan Imaz-Rosshandler, Akinyemi I. Ojesina, Joonil Jung, Chandra S. Pedamallu, Eric S. Lander, Thomas M. Habermann, James R. Cerhan, Margaret A. Shipp, Gad Getz, and Todd R. Golub. Discovery and prioritization of somatic mutations in diffuse large b-cell lymphoma (dlbcl) by whole-exome sequencing. *Proceedings of the National Academy of Sciences*, 109(10):3879–3884, 2012.

[6] Lucia A. Hindorff, Praveen Sethupathy, Heather A. Junkins, Erin M. Ramos, Jayashri P. Mehta, Francis S. Collins, and Teri A. Manolio. Potential etiologic and functional implications of genome-wide association loci for human diseases and traits. *Proceedings of the National Academy of Sciences*, 106(23):9362–9367, 2009.

[7] C D Bottema, R P Ketterling, S Ii, H S Yoon, J A Phillips, 3rd, and S S Sommer. Missense mutations and evolutionary conservation of amino acids: evidence that many of the amino acids in factor ix function as "spacer" elements. *Am J Hum Genet*, 49(4):820–38, Oct 1991.

[8] M. P. Miller and S. Kumar. Understanding human disease mutations through the use of interspecific genetic variation. *Hum Mol Genet*, 10(21):2319–28, 2001.

[9] C Prives. How loops, beta sheets, and alpha helices help us to understand p53. *Cell*, 78(4):543–6, 1994.

[10] Y Cho, S Gorina, PD Jeffrey, and NP Pavletich. Crystal structure of a p53 tumor suppressor-dna complex: understanding tumorigenic mutations. *Science*, 265(5170):346–355, 1994.

[11] Andreas C. Joerger, Mark D. Allen, and Alan R. Fersht. Crystal structure of a superstable mutant of human p53 core domain. *Journal of Biological Chemistry*, 279(2):1291–1296, 2004.

[12] E Grasbon-Frodl, Holger Lorenz, U Mann, R M Nitsch, Otto Windl, and H A Kretzschmar. Loss of glycosylation associated with the t183a mutation in human prion disease. *Acta Neuropathol*, 108(6):476–84, Dec 2004.

[13] P C Ng and S Henikoff. Predicting deleterious amino acid substitutions. *Genome Res*, 11(5):863–874, 2001.

[14] Ivan A Adzhubei, Steffen Schmidt, Leonid Peshkin, Vasily E Ramensky, Anna Gerasimova, Peer Bork, Alexey S Kondrashov, and Shamil R Sunyaev. A method and server for predicting damaging missense mutations. *Nat Methods*, 7(4):248–9, 2010.

[15] L Pauling and H A Itano. Sickle cell anemia a molecular disease. *Science*, 110(2865):543–8, 1949.

[16] B C Wishner, K B Ward, E E Lattman, and W E Love. Crystal structure of sickle-cell deoxyhemoglobin at 5 Å resolution. *J Mol Biol*, 98(1):179–94, 1975.

[17] Y Miki, J Swensen, D Shattuck-Eidens, P A Futreal, K Harshman, S Tavtigian, Q Liu, C Cochran, L M Bennett, and W Ding. A strong candidate for the breast and ovarian cancer susceptibility gene brca1. *Science*, 266(5182):66–71, 1994.

[18] L S Friedman, E A Ostermeyer, C I Szabo, P Dowd, E D Lynch, S E Rowell, and M C King. Confirmation of brca1 by analysis of germline mutations linked to breast and ovarian cancer in ten families. *Nat Genet*, 8(4):399–404, 1994.

[19] R S Williams, R Green, and J N Glover. Crystal structure of the brct repeat region from the breast cancer-associated protein brca1. *Nat Struct Biol*, 8(10):838–42, 2001.

[20] A K Dunker, J D Lawson, C J Brown, R M Williams, P Romero, J S Oh, C J Oldfield, A M Campen, C M Ratliff, K W Hipps, J Ausio, M S Nissen, R Reeves, C Kang, C R Kissinger, R W Bailey, M D Griswold, W Chiu, E C Garner, and Z Obradovic. Intrinsically disordered protein. *J Mol Graph Model*, 19(1):26–59, 2001.

[21] Jianlin Cheng, Arlo Randall, and Pierre Baldi. Prediction of protein stability changes for single-site mutations using support vector machines. *Proteins*, 62(4):1125–1132, 2006.

[22] D Gilis and M Rooman. Predicting protein stability changes upon mutation using database-derived potentials: solvent accessibility determines the importance of local versus non-local interactions along the sequence. *J Mol Biol*, 272(2):276–90, 1997.

[23] P Y Chou and G D Fasman. Prediction of protein conformation. *Biochemistry*, 13(2):222–45, Jan 1974.

[24] N Qian and T J Sejnowski. Predicting the secondary structure of globular proteins using neural network models. *J Mol Biol*, 202(4):865–84, Aug 1988.

[25] M D Shaji Kumar, K Abdulla Bava, M Michael Gromiha, Ponraj Prabakaran, Koji Kitajima, Hatsuho Uedaira, and Akinori Sarai. Protherm and pronit: thermodynamic databases for proteins and protein-nucleic acid interactions. *Nucleic Acids Res*, 34(Database issue):D204–6, 2006.

[26] Trevor Hastie, Robert Tibshirani, and J. H Friedman. *The elements of statistical learning: data mining, inference, and prediction*. Springer series in statistics. Springer, New York, NY, 2nd edition, 2009.

[27] Thorsten Joachims. *Learning to classify text using support vector machines*, volume SECS 668. Kluwer Academic Publishers, Boston, 2002.

[28] Z Wang and J Moult. Snps, protein structure, and disease. *Hum Mutat*, 17(4):263–270, 2001.

[29] Christopher L Brooks and Wei Gu. p53 ubiquitination: Mdm2 and beyond. *Mol Cell*, 21(3):307–15, 2006.

[30] Imad Shams, Edyta Rohmann, Veraragavan P Eswarakumar, Erin D Lew, Satoru Yuzawa, Bernd Wollnik, Joseph Schlessinger, and Irit Lax. Lacrimo-auriculo-dento-digital syndrome is caused by reduced activity of the fibroblast growth factor 10 (fgf10)-fgf receptor 2 signaling pathway. *Mol Cell Biol*, 27(19):6903–12, 2007.

[31] Erin D Lew, Jae Hyun Bae, Edyta Rohmann, Bernd Wollnik, and Joseph Schlessinger. Structural basis for reduced fgfr2 activity in ladd syndrome: Implications for fgfr autoinhibition and activation. *Proc Natl Acad Sci USA*, 104(50):19802–7, 2007.

[32] B Thöny, W Leimbacher, N Blau, A Harvie, and C W Heizmann. Hyperphenylalaninemia due to defects in tetrahydrobiopterin metabolism: molecular characterization of mutations in 6-pyruvoyl-tetrahydropterin synthase. *Am J Hum Genet*, 54(5):782–92, 1994.

[33] T Scherer-Oppliger, W Leimbacher, N Blau, and B Thöny. Serine 19 of human 6-pyruvoyltetrahydropterin synthase is phosphorylated by cgmp protein kinase ii. *J Biol Chem*, 274(44):31341–8, 1999.

[34] T Oppliger, B Thöny, H Nar, D Bürgisser, R Huber, C W Heizmann, and N Blau. Structural and functional consequences of mutations in 6-pyruvoyltetrahydropterin synthase causing hyperphenylalaninemia in humans. Phosphorylation is a requirement for in vivo activity. *J Biol Chem*, 270(49):29498–506, 1995.

[35] N Blom, S Gammeltoft, and S Brunak. Sequence and structure-based prediction of eukaryotic protein phosphorylation sites. *J Mol Biol*, 294(5):1351–62, 1999.

[36] M B Yaffe, G G Leparc, J Lai, T Obata, S Volinia, and L C Cantley. A motif-based profile scanning approach for genome-wide prediction of signaling pathways. *Nat Biotechnol*, 19(4):348–53, 2001.

[37] Lilia M Iakoucheva, Predrag Radivojac, Celeste J Brown, Timothy R O'Connor, Jason G Sikes, Zoran Obradovic, and A Keith Dunker. The importance of intrinsic disorder for protein phosphorylation. *Nucleic Acids Res*, 32(3):1037–1049, 2004.

[38] D P Teufel, M Bycroft, and A R Fersht. Regulation by phosphorylation of the relative affinities of the n-terminal transactivation domains of p53 for p300 domains and mdm2. *Oncogene*, 28(20):2112–8, 2009.

[39] Holger Dinkel, Claudia Chica, Allegra Via, Cathryn M Gould, Lars J Jensen, Toby J Gibson, and Francesca Diella. Phospho.elm: a database of phosphorylation sites–update 2011. *Nucleic Acids Res*, 39(Database issue):D261–7, 2011.

[40] Predrag Radivojac, Peter H Baenziger, Maricel G Kann, Matthew E Mort, Matthew W Hahn, and Sean D Mooney. Gain and loss of phosphorylation sites in human cancer. *Bioinformatics*, 24(16):i241–7, 2008.

[41] Tobias Sjöblom, Siân Jones, Laura D Wood, D Williams Parsons, Jimmy Lin, Thomas D Barber, Diana Mandelker, Rebecca J Leary, Janine Ptak, Natalie Silliman, Steve Szabo, Phillip Buckhaults, Christopher Farrell, Paul Meeh, Sanford D Markowitz, Joseph Willis, Dawn Dawson, James K V Willson, Adi F Gazdar, James Hartigan, Leo Wu, Changsheng Liu, Giovanni Parmigiani, Ben Ho Park, Kurtis E Bachman, Nickolas Papadopoulos, Bert Vogelstein, Kenneth W Kinzler, and Victor E Velculescu. The consensus coding sequences of human breast and colorectal cancers. *Science*, 314(5797):268–74, 2006.

[42] Biao Li, Vidhya G Krishnan, Matthew E Mort, Fuxiao Xin, Kishore K Kamati, David N Cooper, Sean D Mooney, and Predrag Radivojac. Automated inference of molecular mechanisms of disease from amino acid substitutions. *Bioinformatics*, 25(21):2744–50, 2009.

[43] Yana Bromberg and Burkhard Rost. Snap: predict effect of non-synonymous polymorphisms on function. *Nucleic Acids Res*, 35(11):3823–35, 2007.


© 2012 Sassolas et al., licensee InTech. This is an open access chapter distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

## **Anderson's Disease/Chylomicron Retention Disease and Mutations in the** *SAR1B* **Gene**

A. Sassolas, M. Di Filippo, L.P. Aggerbeck, N. Peretti and M.E. Samson-Bouma

Additional information is available at the end of the chapter

http://dx.doi.org/10.5772/45976

## **1. Introduction**

Anderson's Disease (AD)/Chylomicron Retention Disease (CMRD) (OMIM #607689) is a rare autosomal recessively inherited lipid malabsorption syndrome characterized by hypocholesterolemia associated with failure to thrive, diarrhea, steatorrhea and abdominal distension that presents most frequently in young infants. Charlotte Anderson first published a description of the disorder in 1961 [1] based upon observations of a young girl of seven months of age who manifested a characteristic macroscopic and microscopic appearance of the intestinal mucosa which was filled with fat. Forty two years later, in 2003, Jones and colleagues [2], in 8 families, identified mutations in the *SAR1B* gene, which encodes the intracellular trafficking protein SAR1b, and proposed that this was the molecular defect in the disorder. The disease is very rare. From the first clinical description of the disease up to the identification of the causal gene, only 39 patients from 24 families were described in the literature [3-21]. From 2003 to the present, 23 new patients from 14 additional families have been identified. In all, 16 different mutations in the *SAR1B* gene now have been described in 34 patients from 21 families [2, 22-27]. Here, we provide an overview of this disease, including the description of 4 new patients from 3 new families (one new mutation), and we describe the predicted molecular impact on the SAR1b protein of novel or previously-described mutations in the *SAR1B* gene.

## **2. Clinical features**

The first symptoms of AD/CMRD, which most frequently occur within a few months after birth, consist of failure to thrive, diarrhea with steatorrhea and abdominal distension. Of the 62 patients described in the literature, only 4 were diagnosed as adults; two sisters presented with diarrhea that was found to have begun in infancy [21, 23], the third adult had severe neurological signs in infancy [6] and the past medical history of the last adult revealed some


Anderson's Disease (AD)/Chylomicron Retention Disease (CMRD) (OMIM #607689) is a rare, autosomal recessive lipid malabsorption syndrome characterized by hypocholesterolemia associated with failure to thrive, diarrhea, steatorrhea and abdominal distension that presents most frequently in young infants. Charlotte Anderson first published a description of the disorder in 1961 [1], based upon observations of a seven-month-old girl whose intestinal mucosa had a characteristic macroscopic and microscopic appearance and was filled with fat. Forty-two years later, in 2003, Jones and colleagues [2] identified mutations in the *SAR1B* gene, which encodes the intracellular trafficking protein SAR1b, in 8 families and proposed that this was the molecular defect in the disorder. The disease is very rare. From the first clinical description of the disease up to the identification of the causal gene, only 39 patients from 24 families were described in the literature [3-21]. From 2003 to the present, 23 new patients from 14 additional families have been identified. In all, 16 different mutations in the *SAR1B* gene have now been described in 34 patients from 21 families [2, 22-27]. Here, we provide an overview of this disease, including the description of 4 new patients from 3 new families (one new mutation), and we describe the predicted molecular impact on the SAR1b protein of novel or previously described mutations in the *SAR1B* gene.

## **2. Clinical features**

The first symptoms of AD/CMRD, which most frequently occur within a few months after birth, consist of failure to thrive, diarrhea with steatorrhea and abdominal distension. Of the 62 patients described in the literature, only 4 were diagnosed as adults; two sisters presented with diarrhea that was found to have begun in infancy [21, 23], the third adult had severe neurological signs in infancy [6] and the past medical history of the last adult revealed some clumsiness in walking and running and very loose bowel movements in infancy [7]. These patients may have spontaneously avoided the fat in their diets to minimize symptoms. Nonspecific malabsorptive diarrhea is present in almost all cases, with steatorrhea, even when a low fat diet is observed [28]. The diagnosis is sometimes delayed (often for several years) because the symptoms are nonspecific and are attributed to other causes of chronic diarrhea (cystic fibrosis or coeliac disease). Thus, 39/45 patients exhibited the first symptoms before one year of age, whereas only 21/52 received the proper diagnosis without undue delay. As a consequence of diarrhea, failure to thrive (-1 to -4 SD for height and/or weight) is also frequent (45/51 patients) and persists if a low fat diet is not instituted. Other digestive symptoms, such as vomiting or a grossly distended abdomen, are commonly observed. Usually, if a low fat diet supplemented with lipid-soluble vitamins is instituted, growth resumes; however, some patients with a delayed diagnosis do not attain a normal height and weight [29]. Tolerance to fat in the diet has been reported in a few cases [14, 16, 22, 24, 27]; however, in most instances, diarrhea begins again when fat is reintroduced in the diet [29].
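The proportions quoted in this paragraph can be checked against the percentages reported in Table 1. The following is a minimal sketch in Python; the dictionary labels and the helper name `percent` are illustrative choices, not part of the original study, and the counts are simply those quoted in the text.

```python
from fractions import Fraction

# Patient counts quoted in the text:
# (patients with the feature, patients for whom the information is available).
COUNTS = {
    "first symptoms before one year of age": (39, 45),
    "proper diagnosis without undue delay": (21, 52),
    "failure to thrive": (45, 51),
}


def percent(numerator: int, denominator: int) -> int:
    """Return the proportion as a percentage rounded to the nearest integer."""
    return round(100 * Fraction(numerator, denominator))


# Patients reported before (39) and after (23) the identification of the causal gene.
total_patients = 39 + 23
print(f"total patients described: {total_patients}")

for label, (n, d) in COUNTS.items():
    print(f"{label}: {n}/{d} = {percent(n, d)}%")
```

The rounded values for onset before one year and for failure to thrive can be compared directly with the "All published cases" column of Table 1. The denominators differ from the total of 62 patients, presumably because each item was not documented for every published case.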

| Patients data | All published cases | Published cases with mutations |
|---|---|---|
| N | 62 | 34 |
| Age at onset | 56% < 3 months, 87% < 1 year | 53% < 3 months, 84% < 1 year |
| Age at diagnosis | 60% > 1 year, 23% > 10 years | 50% > 1 year, 23% > 10 years |
| Major clinical data | 90% diarrhea, 88% failure to thrive | 90% diarrhea, 57% failure to thrive |
| TC (mM) | n=54, M=1.75 (0.86-3.38) | n=34, M=1.81 (1.11-2.82) |
| TG (mM) | n=48, M=0.87 (0.36-2.06) | n=33, M=0.92 (0.36-1.98) |
| HDLc (mM) | n=26, M=0.49 (0.32-0.83) | n=23, M=0.50 (0.32-0.83) |
| LDLc (mM) | n=26, M=0.87 (0.26-1.61) | n=23, M=0.88 (0.31-1.61) |
| apoB (g/l) | n=37, M=0.44 (0.20-0.82) | n=21, M=0.49 (0.20-0.82) |
| apoA1 (g/l) | n=31, M=0.52 (0.26-0.90) | n=18, M=0.52 (0.38-0.90) |
| Vitamin E (µM) | n=43, M=2.74 (0-11.3) | n=23, M=2.81 (0-7.6) |

(TC: total cholesterol, TG: triglycerides, HDLc: HDL cholesterol, LDLc: LDL cholesterol; n: number of patients with data; M: mean, with the range in parentheses)

**Table 1.** Mean data for all the published cases

Hepatic and neurological abnormalities, although sometimes reported in young patients, generally are late manifestations, particularly when the diagnosis and the implementation of dietary vitamin supplements are delayed. Several cases of transient hepatomegaly have been described [6, 9, 11, 16, 17, 22] and one or both aminotransferases (ASAT and ALAT) are frequently reported to be increased (13/15 patients of Charcosset [22]), but confirmed hepatic steatosis is infrequent (three cases described) [11, 25]. However, no instance of cirrhosis has been reported. In young adults or older patients, neurological abnormalities consist mainly of areflexia [11, 12, 14, 22]. In some cases, more severe neurological degeneration consisting of ataxia, sensory neuropathy and/or tremor has been reported [6, 7, 11, 19]. Mild defects in color vision and retinal function also have been observed [11, 14, 28] but no retinitis pigmentosa has been reported. Acanthocytosis is very rare and usually transient [6, 12, 17, 27].

Mild muscular abnormalities have been described in several patients and consist mainly of muscular pain and cramps; one patient was described with myopathy [6]. Creatine kinase (CK) levels are often found to be elevated (1.5 to 2.5 times normal) [23, 27]. Jones et al. (2003) have shown that high levels of *SAR1B* mRNA expression occur in tissues other than intestine [2] and, therefore, extra-intestinal clinical manifestations might occur in AD/CMRD. Silvain et al. have described a cardiomyopathy in an adult and documented the accumulation of lipids in some muscle fibers [23]. Consequently, clinical evaluation and follow-up of these patients should include CK levels and cardiac examination.

Poor mineralization and delayed bone maturation may be present and vitamin D levels may be normal or decreased [5, 12, 18, 21, 23, 28]. Several patients also have exhibited associated infectious diseases [14, 16].

AD/CMRD patients exhibit a particular recessive hypocholesterolemia which differs from other familial hypocholesterolemias. The hypocholesterolemia manifests itself by a decrease of plasma LDL (LDLc) and HDL (HDLc) cholesterol (both by approximately 50%) associated with a normal level of triglycerides (Table 1). The severe decrease of HDLc (the mean level in patients is 0.49 mM) associated with a normal triglyceride level is pathognomonic of AD, provided that all the secondary causes of malabsorption, such as celiac disease, exocrine pancreatic insufficiency (cystic fibrosis or Shwachman-Diamond syndrome) and McKusick syndrome (small height and malabsorption with exactly the same lipid profile as AD), have been ruled out. Further, other causes of familial hypocholesterolemia must be carefully ruled out; for example, some patients with AD/CMRD have low levels of triglycerides and high levels of HDLc that are similar to those found in atypical abetalipoproteinemia [30, 31] or homozygous hypobetalipoproteinemia (data not shown). Plasma levels of vitamin E, measured before supplementation in patients diagnosed during the last decade, are usually low or very low (but detectable, from 0.5 to 6.8 µM; 3 of 19 patients had undetectable levels). In patients described previously, the undetectable levels were probably due to technical limitations (reported values range from 0.23 to 11.3 µM, and 13 of 28 patients had undetectable levels). Mild decreases of vitamin A have also been found [5, 6, 11, 12, 18, 21, 24, 27], but the levels of other fat-soluble vitamins are normal in most of the AD/CMRD patients.




In most cases, an essential fatty acid (FA) deficiency has not been investigated; nevertheless, a decrease of linoleic acid (C18:2 n-6) and normal levels of n-3 FA have been found in two series of patients [10, 28]. For all the patients, the lipid profiles of the heterozygous parents were normal.

Four new cases of AD/CMRD in 3 families have recently been discovered (Tables 2 and 3). All the individuals presented with diarrhea and failure to thrive (4/4 patients). Interestingly, one of the patients presented with tremor at diagnosis (Table 2). The plasma lipid and vitamin E levels vary widely, in particular the triglyceride and the total and LDL cholesterol values, which is another characteristic of AD.

The inability of the enterocytes to secrete chylomicrons and apoB48 after a fat load is a common clinical feature of AD/CMRD, ABL (abetalipoproteinemia) and, generally, homozygous FHBL (familial hypobetalipoproteinemia). When observed with video-endoscopy, the intestine of AD/CMRD patients shows a white mucosa ("*gelée blanche*"). This typical white stippling, like hoar frost, covers the mucosal surface of the small intestine (Fig 1A, B) even in the fasted state, in contrast to healthy individuals. When intestinal biopsies from patients who have fasted are observed by light microscopy, they appear to have a normal number of *villi* of appropriate length. However, the enterocytes are overloaded with birefringent droplets in the cytoplasm (Fig 1 C, D) [1, 5, 6, 8, 9, 11, 12, 14, 16-18, 20, 25, 27]. These droplets are present, mainly, in the enterocytes of the upper one-third of the *villus* and they stain positively with oil red O, indicating that they are fat droplets (mainly triglyceride) (Fig 1D, E). In some cases, the droplets are seen to be present preferentially on one side of the *villus* as opposed to both sides, whereas, in other cases (or sometimes in the same case), they may be present on both sides [32]. When the biopsies are examined by electron microscopy, two types of lipid-containing structures are, in fact, observed in the cytoplasm, which alter the normal architecture of the cells. Very large lipid droplets (1025 nm average diameter), not in a membrane-bound compartment, are present along with smaller lipoprotein-sized particles (305 nm average diameter) which are present in membrane-bound structures (Fig 2 A, B) [32]. This is in contrast to enterocytes in biopsies from patients with ABL, which exhibit only (or predominantly) the very large lipid droplets, whereas the smaller lipoprotein-sized particles, in membrane-bound structures, are absent. In the enterocytes of both AD/CMRD and ABL patients, the Golgi apparatus is often distended but it is, generally, empty and free of lipoprotein-like particles. Further, in AD/CMRD, lipoprotein-like particles are observed, although in only a few cases, in the intercellular spaces between the enterocytes, in contrast to ABL where they are never observed in intercellular spaces.

Intestinal endoscopy after a 12-hour fast. In contrast to what is observed in a normal subject (A), video-endoscopy of the duodenum of patient AD2 (B) shows the typical "white hoary frosting" on the small intestinal mucosa. In contrast with a normal subject (C), light microscopy of the duodenal biopsy from AD2 (D) shows the typical vacuolated enterocytes (black arrows) that stain positively with oil red O (E, black arrows). Note the typical heterogeneous aspect of the villi, either fat loaded (black arrows) or without lipid droplets (white arrows). Goblet cells are normal (D, arrow g). (C ×100; D ×400; E ×200).

**Figure 1.** Intestinal endoscopy after a 12-hour fast (A, B, C, D, E) (from A. Georges [27])

In addition to the lipid profiles of the patient and the parents, the diagnosis is supported by the absence of secretion of chylomicrons after a fat load, the presence of white duodenal mucosa upon endoscopy, the presence of cytosolic lipid droplets and lipoprotein-sized particles in the enterocytes of the intestinal biopsy and, finally, the discovery of a mutation in the *SAR1B* gene. It should be noted, however, that the AD/CMRD phenotype has been observed in patients in whom no mutation is found in the coding sequence of the *SAR1B* gene ([33] and unpublished data).

## **3. Functions of the SAR1B protein**


SAR1 is a well-known GTPase (guanosine triphosphatase) which belongs to the ARF (ADP-ribosylation factor) family of small GTPases [34, 35]. SAR1 initiates the assembly of COPII (coat protein complex II) in the endoplasmic reticulum (ER) by binding to SEC12. Then, SAR1-GDP is converted into SAR1-GTP, which undergoes a large conformational change in the two switch regions. The residue Threonine 56, in switch 1, forms bonds to the γ phosphate and Mg²⁺, and the residue Glycine 78, in switch 2, binds to the γ phosphate. The movements expose the amino-terminal, amphipathic α1 helix ("the membrane anchor"), which then inserts into the ER membrane [36]. Mg²⁺ has an important regulatory role in this conformational change, mostly related to switch 1 [37]. The membrane-bound SAR1 recruits SEC23-SEC24 and triggers the formation of the pre-budding complex, which then recruits SEC13-SEC31 to form the COPII vesicle [36, 38]. SEC24 interacts with specific cargo proteins and concentrates them into the COPII vesicle [39]. SAR1 GTP hydrolysis is stimulated by SEC23 and SEC31 and permits vesicle fission, allowing transport to the Golgi, and eventual disassembly of the coat for recycling of the components [40-42]. SED4p, a protein with 45% homology to SEC12p, accelerates the dissociation of SEC23-24 from the membrane if no cargo is transported with COPII vesicles, and it has been proposed that this restricted disassembly might play a role in concentrating cargoes into COPII vesicles [43].
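The order of events in the COPII cycle described above can be summarized as a small state sequence. The following Python sketch is only a reading aid: the stage names, the `TRIGGERS` table and the `advance` helper are hypothetical labels introduced here, and closing the cycle back to SAR1-GDP after uncoating is an assumption based on the statement that the coat components are recycled.

```python
from enum import Enum


class CopiiStage(Enum):
    """Stages of the COPII cycle, in the order given in the text (names are illustrative)."""
    SAR1_GDP = 1              # inactive SAR1-GDP
    SAR1_GTP_ON_ER = 2        # SAR1-GTP with the alpha-1 helix inserted into the ER membrane
    PREBUDDING_COMPLEX = 3    # SEC23-SEC24 recruited; SEC24 concentrates cargo
    COPII_VESICLE = 4         # SEC13-SEC31 recruited
    FISSION_AND_UNCOATING = 5 # after GTP hydrolysis


# Event that, according to the text, leads out of each stage.
TRIGGERS = {
    CopiiStage.SAR1_GDP: "binding to SEC12 and conversion of SAR1-GDP into SAR1-GTP",
    CopiiStage.SAR1_GTP_ON_ER: "recruitment of SEC23-SEC24",
    CopiiStage.PREBUDDING_COMPLEX: "recruitment of SEC13-SEC31",
    CopiiStage.COPII_VESICLE: "GTP hydrolysis stimulated by SEC23 and SEC31",
    CopiiStage.FISSION_AND_UNCOATING: "recycling of the coat components (assumed to close the cycle)",
}


def advance(stage: CopiiStage) -> CopiiStage:
    """Return the stage that follows `stage`, wrapping around after uncoating."""
    stages = list(CopiiStage)
    return stages[(stages.index(stage) + 1) % len(stages)]


stage = CopiiStage.SAR1_GDP
for _ in range(len(CopiiStage)):
    print(f"{stage.name}: {TRIGGERS[stage]}")
    stage = advance(stage)
```

The sketch deliberately leaves out the modulatory role of SED4p and the conformational details of the switch regions, which do not fit a simple linear sequence.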

**Electron microscopy of duodenal biopsies of patients with AD**. As shown for AD3 (A, B, C) and AD2 (D, E), two types of particles are apparent in the enterocytes in these patients (A, D): large lipid droplets, free in the cytoplasm (L), and smaller, lipoprotein-sized particles (Lp), surrounded by a membrane. A higher magnification shows in (B) some individual lipoprotein-sized particles surrounded by a membrane (\*) near a Golgi apparatus (G) which appears distended but devoid of particles and in (C, E) numerous lipoprotein-sized particles accumulated in a membrane-bound compartment (membrane, white arrow). The intercellular spaces are empty. The cell nucleus is labelled N.

**Figure 2.** Electron microscopy of duodenal biopsies of patients with AD (from A. Georges [27])

**Figure 3.** Sequence alignment of SAR1B protein with functional regions

The typical size of the COPII vesicles ranges from 60 to 70 nm in diameter, which would appear to prohibit these vesicles from carrying chylomicrons (250 nm average diameter) from the ER to the Golgi apparatus [44]. Another vesicle (350-500 nm in diameter), the prechylomicron transport vesicle (PCTV), has been shown to be able to transport chylomicrons [45]. The PCTV is composed of several proteins: VAMP7 (vesicle-associated membrane protein 7), which is the v-SNARE (vesicle-associated soluble N-ethylmaleimide-sensitive factor attachment protein receptor), apoprotein B48 (a cargo), FABP1 (also called liver fatty acid-binding protein, LFABP) (budding initiator), the fatty acid transporter CD36 (a fatty acid translocase) and the COPII proteins [46]. PCTV budding does not require GTP (and, consequently, SAR1) but rather ATP [44]. Further, VAMP7 is necessary for the fusion of the PCTV with the Golgi [44, 47]. The role of SAR1 in the budding of PCTV has been clarified, recently, in an elegant study by Siddiqi and Mansbach (2012) [47]. They showed that the binding of FABP1 to intestinal ER generates PCTV. A cytosolic multi-protein complex (composed of SAR1b, SEC13 and SVIP (Small VCP/p97-Interacting Protein)) binds all the FABP1, which is subsequently liberated by the phosphorylation of SAR1b by PKCζ (Protein Kinase C Zeta).
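The size argument made in this section (COPII vesicles of 60 to 70 nm, chylomicrons of 250 nm average diameter, PCTVs of 350 to 500 nm) can be written as a simple comparison. The sketch below is in Python; the constant names and the `can_carry` helper are illustrative, and the fit criterion (largest reported vesicle diameter greater than the cargo diameter, ignoring membrane thickness and cargo deformability) is a simplifying assumption, not a result from the cited studies.

```python
# Diameters quoted in the text, in nanometres.
COPII_VESICLE_NM = (60, 70)      # typical range for COPII vesicles
PCTV_NM = (350, 500)             # prechylomicron transport vesicle
CHYLOMICRON_MEAN_NM = 250        # average chylomicron diameter


def can_carry(vesicle_range_nm: tuple[int, int], cargo_diameter_nm: float) -> bool:
    """Crude geometric check: the largest reported vesicle must exceed the cargo diameter."""
    _, largest = vesicle_range_nm
    return largest > cargo_diameter_nm


for name, size_range in (("COPII vesicle", COPII_VESICLE_NM), ("PCTV", PCTV_NM)):
    verdict = "could" if can_carry(size_range, CHYLOMICRON_MEAN_NM) else "could not"
    print(f"{name} {size_range} nm {verdict} enclose a {CHYLOMICRON_MEAN_NM} nm chylomicron")
```

With the figures quoted above, the check fails for a typical COPII vesicle and passes for a PCTV, which is the reasoning behind the proposal that chylomicrons leave the ER in PCTVs.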


These findings raise a number of questions as to the mechanism by which *SAR1B* gene mutations could affect PCTV transport to produce AD/CMRD. In particular, it is not clear how mutations located in regions involved in the binding and hydrolysis of GDP/GTP (for which the effect on COPII-mediated transport is evident) would affect PCTV transport (see below: Predicted impact of the mutations). Since SAR1b plays a role in both vesicle budding and vesicle fusion to the Golgi apparatus, further studies will be necessary to completely understand the apparently multiple roles that SAR1b plays in PCTV transport. Recently, Jin et al. showed that ubiquitylation by CUL3-KLHL12 allows the formation of COPII vesicles of a size sufficient to transport collagen (300-400 nm) [48]. It would be of interest to know whether this mechanism could also permit the transport of chylomicrons. These recent data provide novel insights into the possible mechanisms for the transport of chylomicrons (either by PCTVs or by COPII vesicles). They are all the more interesting because impaired COPII function results not only in AD/CMRD but also in collagen deposition defects [49] and cranio-lenticulo-sutural dysplasia (SEC23A mutation). However, given the ubiquitous expression and essential roles of COPII components such as SAR1 and SEC23, as well as of other proteins involved in trafficking between ER and Golgi, it is still not entirely clear how mutations in these proteins produce diseases with such marked tissue-specific effects and low incidence.

## **4. Structure of the SAR1b protein**

Although the SAR1 protein is included in the GTPase superfamily (and, in particular, the RAS superfamily), members of which are present in most living cells, from bacteria to vertebrates, it is only slightly related to other RAS or ARF proteins and is distant from the RAB/YPT1/SEC4 subclass [50, 51]. SAR1 is evolutionarily conserved and appears to be present in all eukaryotes. However, whereas yeast and insects have a single SAR1 protein, higher organisms express two forms, SAR1b and SAR1a (both with 198 amino acids), which differ by 20 amino-acid residues [52]. The function of SAR1a has not yet been elucidated and, to date, no variant in the *SAR1A* gene has been described. The sequence alignment of SAR1b with SAR1p (Figure 3) illustrates the regions that are highly conserved across species and shows the functional motifs in SAR1b that participate in vesicle budding, in GDP/GTP binding and hydrolysis, and in interactions with other COP proteins.

Five X-ray crystallographic-derived structures for SAR1b bound to GDP or GTP, alone or complexed with other COPII components, have been deposited in the Protein Data Bank. Three of these structures are derived from *S. cerevisiae* (yeast) recombinant protein and two from *Cricetulus griseus* (hamster) recombinant protein. These structures provide insights into the structural changes that SAR1b may undergo upon GDP/GTP binding as well as demonstrating which parts of the protein constitute interfaces with other COPII components. No X-ray derived structures of SAR1b complexed with components of the PCTV are available to our knowledge. There is also one X-ray derived structure for SAR1a using human recombinant protein.

Using the 1F6B model of *Cricetulus griseus* SAR1b [53] (which lacks the first twelve AA) and Swiss-PdbViewer, two residues were modified (I80V, V163I) in order to produce a structural model having a sequence identical to that of human SAR1b. In yellow: **β strands**. In blue: **α helices**. In white: **loops**. In green: **GDP**.


**Figure 4.** Three dimensional structure of SAR1B protein

The X-ray structures show that SAR1b has six central β strands (5 parallel, β2 antiparallel) that are sandwiched between three α helices on each side (Figure 4). In SAR1-GDP (the inactive form), the α1 helix is retracted into a pocket formed by the β2-β3 hairpin. The β strands 1-2-3 are approximately parallel to the membrane, allowing their juxtaposition with it (the N- and C-termini and the β2-β3 hairpin would participate in this membrane interaction) [36]. The Mg2+ ion is coordinated by an oxygen atom of the phosphate of the GDP and the hydroxyl oxygen of Threonine residue 39 (in SAR1-GDP) [37]. Many H bonds stabilize the structure and could be altered by mutations (see discussion below), for example those of Ser 179 with Asp 137 and Leu 181 [2].

The X-ray data also provide insights into the roles played by the different parts of the structure in SAR1b functions (see the protein alignment, Figure 3). The amino- (N) terminal part of SAR1b contains the STAR (SAR1 NH2 Terminal Activation Recruitment) motif, a hydrophobic sequence of amino acids (AA 1-9) that recruits SEC12 and is a structure different from that of other ARF superfamily GTPases, and the α1 amphipathic helix (AA 15-19, residues VLNFL). The role of the α1 amphipathic helix is fundamental, as demonstrated by the loss of all export activity of SAR1B following the substitution of its 4 hydrophobic AA by 4 alanines [53]. Between the STAR motif and the α1 helix, a short domain (AA 9-14, YSGFS) participates in deforming the ER membrane [38]. Three other regions contact the membrane: one each in the N- (AA 1-25) and carboxyl- (C) (AA 195-198) termini and a central motif in the β2-β3 strand (AA 65-70) [36, 38]. There is one motif that recognizes the guanine base (AA 134-137, NKXD) and there are two active sites for GTP hydrolysis (AA 32-38, motif GXXXXGK, and AA 75-78, motif DXXG) [54]. Close to the GTP hydrolysis site, Threonine 39 is a highly conserved residue and the substitution T39N inhibits SAR1 function by interfering with activation by SEC12 [53].


| *SAR1B* exon (cDNA positions) | DNA variant | Protein mutation | Ethnic origin | Family number | Sex | Status | Age dg | References |
|---|---|---|---|---|---|---|---|---|
| **exon 2** **(1-58 bp)** | c.32 G>A | p.G11D | Thai | 1 | M | comp Hz | 11m | 24 |
| | c.-4482\_58+1406 del 5946 ins 15bp (*named* del exon 2) | p.M1\_H43del | Algerian | 2 | F | Ho | 6y | 22 |
| | c.-4482\_58+1406 del 5946 ins 15bp (*named* del exon 2) | p.M1\_H43del | Algerian | 2 | M | Ho | 8y | 22 |
| **exon 3** **(59-178 bp)** | c.83\_84 delTG | p.L28RfsX34 | French Canadian | 3 | F | comp Hz | ? | 2, 11 |
| | c.83\_84 delTG | p.L28RfsX34 | Moroccan | 4 | F | Ho | 7m | 25 |
| | c.83\_84 delTG | p.L28RfsX34 | Moroccan | 5 | F | Ho | 8m | 27 |
| | c.92 T>C | p.L31P | Moroccan | 6 | M | Ho | 3m | *this article* |
| | c.92 T>C | p.L31P | Moroccan | 6 | M | Ho | 15y | *this article* |
| | c.109 G>A | p.G37R | Algerian | 7 | F | Ho | 3.5y | 2, 13 |
| | c.109 G>A | p.G37R | Algerian | 7 | M | Ho | 3m | 2, 13 |
| | c.109 G>A | p.G37R | Moroccan | 8 | M | Ho | 3y | 2, 12 |
| | c.142 delG | p.D48TfsX17 | Turkish | 9 | M | Ho | 10m | 27 |
| | c.142 delG | p.D48TfsX17 | Turkish | 9 | F | Ho | 1m | 27 |
| **exon 4** **(179-244 bp)** | c.184 G>A | p.E62K | Tunisian | 10 | F | Ho | 7y | 26 |
| | c.224 A>G | p.D75G | Thai | 1 | M | comp Hz | *(see family 1, exon 2)* | |
| **exon 6** **(349-480 bp)** | c.349-1 G>C | p.S117K160del | Italian | 11 | M | Ho | 12y | 2, 19 |
| | c.349-1 G>C | p.S117K160del | Italian | 11 | M | Ho | 19y | 2, 19 |
| | c.364 G>T | p.E122X | Turkish | 12 | M | Ho | 3m | 22 |
| | c.364 G>T | p.E122X | Turkish | 12 | F | Ho | 6y | 22 |
| | c.364 G>T | p.E122X | Turkish | 12 | F | Ho | 8y | 22 |
| | c.364 G>T | p.E122X | Turkish | 12 | M | Ho | 11y | 22 |
| | c.409 G>A | p.D137N | French Canadian | 13 | M | Ho | ? | 2, 11 |
| | c.409 G>A | p.D137N | French Canadian | 13 | F | Ho | ? | 2, 11 |
| | c.409 G>A | p.D137N | French Canadian | 3 | F | comp Hz | *(see family 3, exon 3)* | |
| | c.409 G>A | p.D137N | French Canadian | 14 | M | Ho | 3m | 22 |
| | c.409 G>A | p.D137N | French Canadian | 14 | M | Ho | 2m | 22 |
| | c.409 G>A | p.D137N | French Canadian | 15 | M | Ho | 3m | 22 |
| | c.409 G>A | p.D137N | French Canadian | 16 | F | comp Hz | 2w | 22 |
| | c.409 G>A | p.D137N | French Canadian | 16 | M | comp Hz | 3.5m | 22 |
| | c.409 G>A | p.D137N | French Canadian | 17 | F | Ho | 50y | *this article* |
| | c.409 G>A | p.D137N | Caucasian | 18 | M | Ho | 8m | *this article* |
| **exon 7** **(481-597 bp)** | c.499 G>T | p.E167X | Caucasian | 19 | F | Ho | 34y | 21, 23 |
| | c.499 G>T | p.E167X | Caucasian | 19 | F | Ho | 38y | 23 |
| | c.536 G>T | p.S179I | Pakistani | 20 | F | comp Hz | 6m | 2 |
| | c.537 T>A | p.S179R | French Canadian | 16 | F | comp Hz | *(see family 16, exon 6)* | |
| | c.537 T>A | p.S179R | French Canadian | 18 | M | comp Hz | *(see family 16, exon 6)* | |
| | c.537 T>A | p.S179R | French Canadian | 21 | F | Ho | 10y | 22 |
| | c.537 T>A | p.S179R | French Canadian | 21 | M | Ho | 2m | 22 |
| | c.537 T>A | p.S179R | French Canadian | 22 | F | Ho | 5m | 22 |
| | c.542 T>C | p.L181P | Pakistani | 20 | F | comp Hz | *(see family 20, exon 7)* | |
| | c.554 G>T | p.G185V | Portuguese | 23 | F | Ho | 2y | 22 |
| | c.555-557 dupTTAC | p.G187LfsX13 | Turkish | 24 | F | Ho | 1y | 2, 16 |
| | c.555-557 dupTTAC | p.G187LfsX13 | Turkish | 24 | M | Ho | 1y | 2, 16 |

(Ho: homozygous, comp Hz: compound heterozygous, age dg: age at diagnosis, m: months, y: years, w: weeks)

**Table 2.** All published mutations in *SAR1B* gene

The two switch regions (AA 48-59 and AA 78-94) contain two very important residues, the Threonine at position 56 and the Glycine at position 78, respectively [53]. A second unique structural region of SAR1, not observed in the ARF GTPases, is a long surface-exposed loop (AA 156-171) which connects the α4 helix and the β6 strand and which regulates the function of SAR1b. The substitution Thr158Ala abolishes the activity of SAR1 [53]. A specific C-terminal motif (AA 171-181, PXEVFMC/VSV/L), present in the β6 strand, targets SAR1b to the ER [55].
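The residue ranges quoted above can be gathered into a small lookup, which makes it easy to see which functional regions a given position falls into. The following is a minimal sketch, not part of the original analysis: the region labels and the helper name `regions_at` are illustrative assumptions, and the ranges are simply those given in the text (1-based, inclusive).

```python
# Residue ranges (1-based, inclusive) as quoted in the text.
# The labels are illustrative names chosen for this sketch.
SAR1B_LENGTH = 198

SAR1B_REGIONS = [
    ("STAR motif (SEC12 recruitment)", 1, 9),
    ("membrane-deforming domain YSGFS", 9, 14),
    ("alpha1 amphipathic helix", 15, 19),
    ("N-terminal membrane contact", 1, 25),
    ("GTP hydrolysis site GXXXXGK", 32, 38),
    ("switch region (AA 48-59)", 48, 59),
    ("beta2-beta3 membrane contact", 65, 70),
    ("GTP hydrolysis site DXXG", 75, 78),
    ("switch region (AA 78-94)", 78, 94),
    ("guanine recognition motif NKXD", 134, 137),
    ("surface-exposed loop", 156, 171),
    ("ER-targeting C-terminal motif", 171, 181),
    ("C-terminal membrane contact", 195, 198),
]


def regions_at(position):
    """Return the labels of all listed regions that contain a residue position."""
    if not 1 <= position <= SAR1B_LENGTH:
        raise ValueError(f"position {position} is outside 1-{SAR1B_LENGTH}")
    return [name for name, start, end in SAR1B_REGIONS if start <= position <= end]


if __name__ == "__main__":
    # Positions of the missense mutations listed in Table 2.
    for pos in (11, 31, 37, 62, 75, 137, 179, 181, 185):
        hits = regions_at(pos)
        print(pos, ", ".join(hits) if hits else "no listed region")
```

A position that returns no region is not necessarily unimportant; it only means that none of the ranges quoted in the text covers it.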

The three-dimensional structure was obtained by crystallography [36, 53] and then by a computational approach. In the crystal (which lacks the first nine residues and residues 48-55), SAR1-GDP appeared as a dimer [37, 53]. No data are available on the in vivo GTPase activity of this dimeric form. Moreover, Long et al. (2010) showed that SAR1b may function as a monomer [56], so we will consider only the monomeric form.

## **5. Predicted impact of the mutations in the** *SAR1B* **gene on the structure and function of the protein**

Currently, including the 4 new cases belonging to 3 new families reported here (one new missense mutation), mutations in the *SAR1B* gene have been established for 43 individuals with AD/CMRD (belonging to 24 families). There are only 17 unique mutations. The majority of individuals are homozygous for their mutation (38/43) and 5 individuals from 4 families are compound heterozygous. There are a total of 7 nonsense and 10 missense mutations (Table 2). Since structural information concerning SAR1b in PCTV vesicles is not available, the discussion of the possible effects of *SAR1B* gene mutations upon protein function will be limited to the COPII vesicle transport system.

Recently, we identified the same mutation (del exon 2) as in the Algerian family (family 2) in 3 patients from 2 Tunisian families (to be published).



#### **5.1. Nonsense mutations**

Among the seven nonsense mutations, one deletes exon 2 (c.-4482\_58+1406 del 5946 ins 15bp, named "del exon 2") and one eliminates exon 6 (p.S117K160del); two introduce stop codons (p.E122X, p.E167X), which lead to truncated proteins; and two deletions and an insertion produce frameshifts followed by stop codons (p.L28RfsX34, p.D48TfsX17, p.G187LfsX13), leading to truncated proteins with modified C-terminal sequences. The major deletion (5943bp) of exon 2 (family 2 and the new Tunisian patients) potentially leads to 4 different proteins [22], each of which lacks part of the N-terminus. The largest fragment lacks the first 43 residues, including the STAR motif, the α1 helix, the active site for GTP hydrolysis and Threonine 39. The deletion of exon 6 eliminates the recognition site for the guanine base (AA 134-137), thus abolishing the function of SAR1b. The five other nonsense mutations (resulting in stop codons) produce truncated proteins lacking the C-terminus. The shortest fragment is predicted to have about 33 AA and the longest contains 187/198 AA but, interestingly, all are predicted to abolish the function of the protein in the same manner. This suggests that the C-terminal part of the protein plays a major role in the function of SAR1.
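For the two simple stop-codon mutations, the size of the truncated product follows directly from the position of the stop. The sketch below illustrates only that arithmetic; the function name `truncated_length` is a hypothetical helper, and frameshift variants are deliberately rejected because their product length depends on how the "fsX" notation is interpreted.

```python
import re

SAR1B_LENGTH = 198  # full-length protein, as stated in the text


def truncated_length(protein_change):
    """Number of residues retained for a simple stop-gain variant such as 'p.E122X'."""
    match = re.fullmatch(r"p\.([A-Z])(\d+)X", protein_change)
    if match is None:
        raise ValueError(f"not a simple stop-gain variant: {protein_change}")
    stop_position = int(match.group(2))
    # The codon at stop_position becomes a stop, so translation ends one residue earlier.
    return stop_position - 1


if __name__ == "__main__":
    for variant in ("p.E122X", "p.E167X"):
        kept = truncated_length(variant)
        print(f"{variant}: {kept} of {SAR1B_LENGTH} residues retained, "
              f"{SAR1B_LENGTH - kept} C-terminal residues lost")
```

Under this reading, both products lack the C-terminal motif (AA 171-181) and the C-terminal membrane contact (AA 195-198) described in the section on the structure of SAR1b.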


Using the 1F6B model of *Cricetulus griseus* SAR1b [53] and Swiss-PdbViewer.

**Figure 5.** Localization of missense mutations in the three-dimensional structure of SAR1B

#### **5.2. Missense mutations**

The Swiss-PdbViewer 3.1 program ([57], available at http://www.expasy.org/spdbv/) was used to calculate atomic-resolution structural models for SAR1b carrying missense mutations (Table 3). First, using the PDB model 1F6B [53] of *Cricetulus griseus* SAR1b (which lacks the first twelve AA and residues 48-55), two residues were modified (I80V, V163I) in order to produce a structural model having a sequence identical to that of human SAR1b. The effects of the AD/CMRD missense mutations on this "humanized" structure were then modelled.
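At the sequence level, the "humanization" step amounts to applying point substitutions to the template while checking that the expected residue is actually present at each position. The following is a minimal sketch of that bookkeeping only; it does not reproduce the structural modelling done in Swiss-PdbViewer, the helper name `apply_substitutions` is hypothetical, and the demonstration uses a made-up toy sequence rather than the real SAR1b sequence.

```python
def apply_substitutions(sequence, substitutions):
    """Apply point substitutions such as 'I80V' (1-based positions) to a sequence.

    Each substitution is checked against the reference residue before it is
    applied, so a numbering offset or a wrong template is detected early.
    """
    residues = list(sequence)
    for sub in substitutions:
        ref, pos, alt = sub[0], int(sub[1:-1]), sub[-1]
        if not 1 <= pos <= len(residues):
            raise ValueError(f"{sub}: position {pos} is outside the sequence")
        if residues[pos - 1] != ref:
            raise ValueError(
                f"{sub}: expected {ref} at position {pos}, found {residues[pos - 1]}"
            )
        residues[pos - 1] = alt
    return "".join(residues)


if __name__ == "__main__":
    # Toy 10-residue sequence for illustration only, NOT the SAR1b sequence.
    toy_template = "MKTAIGLVSA"
    print(apply_substitutions(toy_template, ["I5V", "V8I"]))
```

With a real template, the positions would have to be expressed in the numbering of the full-length protein, since the 1F6B coordinates lack the first twelve residues.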

All the missense mutations are located on the exterior of the three-dimensional structure, in strategic places near the recognition, binding and hydrolysis sites for the guanine base (in the N- and C-termini), and/or affect a highly conserved residue in SAR1/Arf proteins. From the N- to the C-terminus, the predicted effects may be summarized as follows (Figure 3). The p.G11D mutation is located in the membrane-interacting site (anchorage of the N-terminal part of the molecule) and probably prevents binding to SEC12 and fixation to the ER membrane. The substitution G11P, associated with Y9F and S14F, has been described as being deleterious for vesicle release [38]; however, no model is available for the p.G11D mutation (since the coordinates of the first 12 residues of the protein could not be established by the X-ray study leading to the 1F6B structure). The new mutation p.L31P affects the AA just before the active site for GTP hydrolysis and could decrease GTP hydrolysis. The substitution of a linear residue (leucine) by a cyclic one (proline) could lead to steric hindrance (Figure 5). The p.G37R and the p.D75G mutations are located in the two different GTP hydrolysis sites. Replacement of glycine 37 by arginine creates steric hindrances with C178 and N134, and the replacement of aspartic acid 75 by glycine abolishes the H bond with Lys38. All four of these mutations reduce or eliminate the affinity of SAR1b for GDP/GTP and are expected to


It has been suggested previously [22] that there is no apparent correlation between the genotype and the phenotype in AD/CMRD patients. For example, patients (from different families) with the same homozygous *SAR1B* mutation (for example the D137N mutation) exhibit different lipid profiles and vitamin E levels as do patients from the same families with the same mutations (the E122X and the S179R mutations). It is possible that modifier genes could be a cause of the different phenotypes. For example, a decrease in the transcriptional factor SREBP (Sterol Regulatory Element Binding Protein) has been shown to block the incorporation of SCAP (SREBP chaperone) in COPII vesicles and an acute depletion of


affect the stability of the protein. The substitution p.E62K affects a well-conserved AA that is one of the residues forming the interface with SEC23 [36], abolishes the H-bond with Glu63 and is predicted to be deleterious by *in silico* analysis (PolyPhen, available at http://genetics.bwh.harvard.edu/pph2/ [58], and SIFT, available at http://sift.jcvi.org/ [59-63]). An H-bond with the guanine in the guanine recognition site is abolished by the p.D137N mutation (Figure 5). Similarly, the p.S179I and p.S179R mutations abolish the H-bonds with Asp 137 and with the guanine base. The substitution of leucine by proline (L181P) leads to steric hindrance with the guanine base, and p.G185V modifies a highly conserved residue in the Arf/Sar1 family and is predicted to be deleterious by *in silico* analysis (PolyPhen, SIFT). The last four mutations modify the α helix and β strands in the C-terminus and could affect the stability as well as the conformation of the protein.
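Table 3 also lists a Grantham distance for each substitution, a measure of the physicochemical difference between the original and the replacing amino acid. As a rough illustration, these distances can be binned into classes. The sketch below uses a commonly cited four-class convention (boundaries at 50, 100 and 150); this binning is an assumption introduced here, not a classification used in the chapter, and the helper name `grantham_class` is hypothetical.

```python
# Grantham distances for the ten missense mutations, as listed in Table 3.
GRANTHAM_DISTANCE = {
    "G11D": 94, "L31P": 98, "G37R": 125, "E62K": 56, "D75G": 94,
    "D137N": 23, "S179I": 142, "S179R": 110, "L181P": 98, "G185V": 109,
}


def grantham_class(distance):
    """Bin a Grantham distance using an assumed, commonly cited convention."""
    if distance <= 50:
        return "conservative"
    if distance <= 100:
        return "moderately conservative"
    if distance <= 150:
        return "moderately radical"
    return "radical"


if __name__ == "__main__":
    for mutation, distance in sorted(GRANTHAM_DISTANCE.items(), key=lambda kv: kv[1]):
        print(f"{mutation}: {distance} ({grantham_class(distance)})")
```

The distance alone says little about pathogenicity: p.D137N has the smallest distance in the table, yet the modelling described above predicts that it abolishes an H-bond with the guanine base.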

## **6. Possible founder effects:**



| Mutation | Energy kJ/mol **a** | Grantham distance **b** | Consequence of mutation on protein (residue concerned) **a** | Residue conservation **c** (Sar1b prot. / Sar1 prot. / Sar1-Arf family prot. / small GTP binding prot.) | PolyPhen prediction (score) **d** | SIFT prediction (score) **e** |
|---|---|---|---|---|---|---|
| wild type | | | | | | |
| G11D | no modelling | 94 | no modelling | ? 0 0 0 | possibly damaging (0.927) | affect protein (0.03) |
| L31P | -6560 | 98 | steric hindrance (Val97) | + c c c | probably damaging (1.0) | affect protein (0.02) |
| G37R | 75988 | 125 | steric hindrances (Asn134, Cys178) | + + + + | probably damaging (1.0) | affect protein (0.00) |
| E62K | -7777 | 56 | loss of one H-bond (Glu63) | + c + 0 | possibly damaging (0.955) | affect protein (0.00) |
| D75G | -9406 | 94 | loss of one H-bond (Lys38) | + + + + | probably damaging (0.99) | affect protein (0.00) |
| D137N | -10086 | 23 | loss of one H-bond (GDP) | + + + + | probably damaging (1.0) | affect protein (0.00) |
| S179I | -8867 | 142 | loss of one weak H-bond (GDP) and one H-bond (Asp137) | + + s 0 | probably damaging (1.0) | affect protein (0.00) |
| S179R | -7550 | 110 | loss of one weak H-bond (GDP) and one H-bond (Asp137) | + s 0 | probably damaging (1.0) | affect protein (0.00) |
| L181P | -9023 | 98 | steric hindrances with GDP | c 0 0 0 | benign (0.281) | affect protein (0.00) |
| G185V | 127288 | 109 | steric hindrance (Met177) | + + + 0 | probably damaging (1.0) | affect protein (0.00) |

**a** Swiss-PdbViewer 3.7, based upon the template 1F6B lacking the first 12 residues of SAR1b *C.g.* (resolution: 1.70 Å, R value: 0.220, homology 98.9%), modified (p.I80V and p.V163I: homology 100%)

**b** Grantham distance (Alamut)

**c** http://www.ebi.ac.uk/clustalw/ ; residue conservation: +, identical; c, conserved substitution; s, semi-conserved substitution

**d** PolyPhen-2 v2.2.2r395 http://genetics.bwh.harvard.edu/pph/

**e** Sorting Intolerant From Tolerant: http://blocks.fhcrc.org/sift/SIFT.html

**Table 3.** Molecular impact of missense mutations

> Founder effects are likely in the North African and French Canadian families (Table 2); it is likely that the same founder effect is responsible for the mutations of the North African patients (del exon 2, c.109 G>A). However, it is more uncertain for the c.409G>A and c.83\_84delTG mutations, since the pedigrees of these families are not available. Perhaps there are hot spots, or different founder effects at the same place in the gene.

## **7. The biological and clinical impact of SAR1b mutations:**

Table 4 provides the lipid profiles of the patients in whom mutations in *SAR1B* have been established. As is typical for individuals affected with AD/CMRD, the mean values of total and HDL-cholesterol, apoAI and apoB are decreased, LDL-cholesterol is mildly decreased and triglycerides are in the normal range; however, there is a large range of values for each of these parameters. As previously discussed, some patients present with low triglycerides or apoB levels and could be confused with atypical abetalipoproteinemia (families 7, 12), and those with normal HDL cholesterol (family 10) could be confused with heterozygous FHBL. In homozygous patients, missense mutations are more frequent (12 families) than nonsense mutations (8) and are as severe as nonsense mutations, except for the patient in family 10 (p.E62K), who has a normal HDL cholesterol level. The clinical data are not different among patients with different mutations. Several patients were diagnosed late (as teenagers or adults), probably because of a mild intestinal syndrome and false diagnoses. Nevertheless, among the late diagnoses (10 patients after 10 years of age), only 3 have a missense mutation.

It has been suggested previously [22] that there is no apparent correlation between the genotype and the phenotype in AD/CMRD patients. For example, patients from different families with the same homozygous *SAR1B* mutation (for example the D137N mutation) exhibit different lipid profiles and vitamin E levels, as do patients from the same families with the same mutations (the E122X and the S179R mutations). It is possible that modifier genes could be a cause of the different phenotypes. For example, a decrease in the transcription factor SREBP (Sterol Regulatory Element Binding Protein) has been shown to block the incorporation of SCAP (SREBP chaperone) in COPII vesicles, and an acute depletion of cellular cholesterol concentration has been shown to decrease COPII transport [64, 65]. Other genes that modulate cholesterol homeostasis could interfere, such as *MTTP* (microsomal triglyceride transfer protein), *APOB* and *ABCG5/G8* (ATP Binding Cassette G5/G8).
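The within-family variability mentioned above can be made concrete with a small calculation. The sketch below summarizes total cholesterol (TC, mM) for the four patients of family 12 (p.E122X), using the values listed in Table 4. It assumes that TC is the first numeric column of each row in that table; the helper names `to_float` and `summarize` are hypothetical and serve only this illustration.

```python
from statistics import mean

# Total cholesterol (TC, mM) of the four family-12 patients (p.E122X),
# copied from Table 4 with the decimal commas used there.
# ASSUMPTION: TC is the first numeric value of each table row.
tc_family_12 = ["1,99", "1,26", "1,37", "1,36"]


def to_float(text: str) -> float:
    """Convert a value written with a decimal comma to a float."""
    return float(text.replace(",", "."))


def summarize(values):
    """Return mean, minimum and maximum of a list of comma-decimal strings."""
    numbers = [to_float(v) for v in values]
    return {"mean": mean(numbers), "min": min(numbers), "max": max(numbers)}


if __name__ == "__main__":
    print(summarize(tc_family_12))
```

For these four relatives carrying the same mutation, TC ranges from 1.26 to 1.99 mM around a mean of about 1.50 mM, which illustrates the spread described in the text.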


Recently a polymorphism of *PCSK9* (proprotein convertase subtilisin/kexin type 9), p.L15\_16insL, has been reported in an AD patient [27]. This polymorphism is frequent (25% heterozygous in normal individuals and 34% in cases of HBL) and weakly hypocholesterolemic (-14%) [66]. Further, mutations or polymorphisms in other COPII and PCTV genes could contribute to the different phenotypes by modifying the network of all their corresponding proteins. However, none of these mutations have been described in cases of AD/CMRD. The search for polymorphisms in multiple proteins is very time-consuming but could be facilitated by the new sequencing methods. Rare polymorphisms in the coding regions of the *SAR1B* and *SAR1A* genes have been described, but none of these has been observed in the *SAR1A* gene in any of our patients and only one (heterozygous) polymorphism has been found in the *SAR1B* gene (L45L) in our patients. This polymorphism is found with the same frequency in the patients as in normal individuals (0,18 versus 0,19, respectively). The impact of this polymorphism has not been studied.


## **8. Management of AD/CMRD (for details, see the guidelines of Peretti, 2010 [29])**

Treatment consists primarily of a low-fat diet, with the appropriate amounts of n-6 and n-3 fatty acids, supplemented with fat-soluble vitamins. The failure to thrive of the children is the most important clinical feature, and catch-up growth is not observed systematically [29]. The neurological and ophthalmological complications may be less severe than in other familial hypocholesterolemias and may depend upon the levels of the fat-soluble vitamins and when vitamin supplementation is instituted. Myolysis and cardiac abnormalities have been observed in some AD/CMRD patients [23]; consequently, measurement of the serum CK level should be included in the evaluation and follow-up of the patients. A moderate degree of fatty liver is common, but until now no case of cirrhosis has been published.

| Mutation | Ethnic origin | Family | Sex | Status | TC, TG, HDLc, LDLc, apoB, apoA1, vitE | References |
|---|---|---|---|---|---|---|
| p.G11D | Thai | 1 | M | comp Hz | 1,81 1,29 0,54 0,43 1,5 | 24 |
| p.M1\_H43del | Algerian | 2 | F | Ho | 2,01 1,44 0,32 1,04 0,5 0,42 3,3 | 22 |
| p.M1\_H43del | Algerian | 2 | M | Ho | 2,32 0,78 0,4 1,57 0,55 0,45 2,6 | 22 |
| p.L28R fsX34 | French Canadian | 3 | F | comp Hz | 2,2 0,73 1,4 | 2, 11 |
| p.L28R fsX34 | Moroccan | 4 | F | Ho | 1,45 0,77 0,36 0,73 0,39 0,4 1,2 | 25 |
| p.L28R fsX34 | Moroccan | 5 | F | Ho | 2,31 1,36 0,7 1 0,82 0,5 2,4 | 27 |
| p.L31P | Moroccan | 6 | M | Ho | 1,96 0,89 0,77 0,79 0,37 0,91 1,34 | *this article* |
| p.L31P | Moroccan | 6 | M | Ho | 2,09 0,93 0,59 1,31 3,75 | *this article* |
| p.G37R | Algerian | 7 | F | Ho | 1,26 0,67 0,2 0,39 7,6 | 2, 13 |
| p.G37R | Algerian | 7 | M | Ho | 1,79 1,44 0,33 0,38 | 2, 13 |
| p.G37R | Moroccan | 8 | M | Ho | 1,55 0,59 0,36 0,64 2,9 | 2, 12 |
| p.D48T fsX17 | Turkish | 9 | M | Ho | 2,61 1,24 0,57 1,48 0,56 0,7 4,4 | 27 |
| p.D48T fsX17 | Turkish | 9 | F | Ho | 2,72 1,36 0,83 1,28 0,43 0,9 6,8 | 27 |
| p.E62K | Tunisian | 10 | F | Ho | 2,59 1,3 1,14 0,4 | 26 |
| p.D75G | Thai | 1 | M | comp Hz | | 24 |
| p.S117K160del | Italian | 11 | M | Ho | 2,07 0,94 0,52 0,78 1 | 2, 19 |
| p.S117K160del | Italian | 11 | M | Ho | 2,43 1,28 0,7 1,22 5 | 2, 19 |
| p.E122X | Turkish | 12 | M | Ho | 1,99 0,43 0,57 1,23 0,36 4,71 | 22 |
| p.E122X | Turkish | 12 | F | Ho | 1,26 0,5 0,53 0,51 0,38 0,43 0,88 | 22 |
| p.E122X | Turkish | 12 | F | Ho | 1,37 0,72 0,39 0,66 0,33 0,51 1,44 | 22 |
| p.E122X | Turkish | 12 | M | Ho | 1,36 0,45 0,45 0,71 0,35 0,59 1,42 | 22 |
| p.D137N | French Canadian | 13 | M | Ho | 1,85 0,94 0 | 2, 11 |
| p.D137N | French Canadian | 13 | F | Ho | 2,08 0,59 1,6 | 2, 11 |
| p.D137N | French Canadian | 2 | F | comp Hz | | 2, 11 |
| p.D137N | French Canadian | 14 | M | Ho | 1,3 0,45 0,49 0,61 | 22 |
| p.D137N | French Canadian | 14 | M | Ho | 0,86 0,37 0,38 0,31 | 22 |
| p.D137N | French Canadian | 15 | M | Ho | 1,24 0,82 0,41 0,46 | 22 |
| p.D137N | French Canadian | 16 | F | comp Hz | 1,39 0,91 0,36 0,62 | 22 |
| p.D137N | French Canadian | 16 | M | comp Hz | 1,11 0,54 0,45 0,42 | 22 |
| p.D137N | French Canadian | 17 | F | Ho | 2,52 1,35 0,53 1,38 | *this article* |
| p.D137N | Caucasian | 18 | M | Ho | 1,41 0,85 0,35 0,68 0,24 0,57 2,5 | *this article* |
| p.E167X | Caucasian | 19 | F | Ho | 1,86 0,43 0,44 0,57 <1 | 21, 23 |
| p.E167X | Caucasian | 19 | F | Ho | 2,15 0,36 0,55 0,62 <1 | 23 |
| p.S179I | Pakistani | 20 | F | comp Hz | 1,4 0,79 0,44 0,6 0,59 | 2 |
| p.S179R | French Canadian | 16 | F | comp Hz | | 22 |
| p.S179R | French Canadian | 16 | M | comp Hz | | 22 |
| p.S179R | French Canadian | 21 | F | Ho | 2,82 1,36 0,59 1,61 | 22 |
| p.S179R | French Canadian | 21 | M | Ho | 1,5 0,78 0,56 0,59 | 22 |
| p.S179R | French Canadian | 22 | F | Ho | 1,78 1,28 0,56 | 22 |
| p.L181P | Pakistani | 20 | F | comp Hz | | 2 |
| p.G185V | Portuguese | 23 | F | Ho | 2,36 1,98 0,49 0,98 0,61 0,46 2,5 | 22 |
| p.G187L fsX13 | Turkish | 24 | F | Ho | 2 1,6 0,7 0,5 6,6 | 2, 16 |
| p.G187L fsX13 | Turkish | 24 | M | Ho | 1,5 1,5 0,5 0,5 3,6 | 2, 16 |

(TC total cholesterol, TG triglycerides, LDLc LDL cholesterol, HDLc HDL cholesterol: mM; apoB, apoA1: g/l; vitE vitamin E: µM)

**Table 4.** Biological data in described cases with mutations

## **9. Conclusions and future prospects**

Significant advances in the diagnosis of AD/CMRD and in the understanding of lipoprotein secretion have occurred over the last decade. However, many questions remain to be answered. SAR1b is a ubiquitous protein, essential for the trafficking of proteins between the ER and the Golgi. Why do the mutations in *SAR1B* that have been reported to date apparently affect only the intestine and the transport of chylomicrons in the enterocyte? Although an increase of *SAR1A* mRNA was measured in enterocytes containing mutated *SAR1B* [27], the AD/CMRD phenotype was still manifested by a lack of chylomicron secretion. Under what conditions, if any, could SAR1a replace SAR1b? Is SAR1a the veritable GTPase for COPII vesicles? Do some mutations or polymorphisms in other regulator genes explain the lack of correlation between genotype and phenotype in AD/CMRD? There are some CMRD patients without mutations of the *SAR1B, SAR1A, VAMP7* or *MTTP* genes (unpublished data). What gene mutations could explain the AD/CMRD phenotype in these patients? Novel technologies (such as whole exome and whole genome sequencing) may provide a better understanding of this disease and open novel diagnostic approaches.

## **Author details**

A. Sassolas1,2, M. Di Filippo1,2, L.P. Aggerbeck3, N. Peretti2,4 and M.E. Samson-Bouma5

*1 Department of Biochemistry, GHE, Hospices Civils de Lyon, France*

*2 INSERM U1060 CarMeN, University of Lyon, Lyon, France*

*3 INSERM UMR-S747, University Paris Descartes, Paris, France*

*4 Department of Pediatric Gastroenterology, GHE, Hospices Civils de Lyon, Lyon, France*

*5 INSERM U698, University Denis Diderot, Centre Hospitalier Universitaire Xavier Bichat, Paris, France*

## **Acknowledgement**

We thank the physicians Dr C. Vilain, Dr Damaj and others who have referred new patients for molecular investigation. We thank S. Dumont for technical assistance. This study was partially supported by a grant from French Health Ministry, Rare Diseases Plan.

## **10. References**

[1] Anderson C, Townley R, Freemann J, Johansen P (1961) Unusual causes of steatorrhoea in infancy and childhood. Med J Aust. 48:617-622.

[2] Jones B, Jones EL, Bonney SA, Patel HN, Mensenkamp AR, Eichenbaum-Voline S, et al. (2003) Mutations in a Sar1 GTPase of COPII vesicles are associated with lipid absorption disorders. Nat Genet. 34:29-31.

[3] Silverberg M, Kessler J, Neumann PZ, Wiglesworth FW (1968) An intestinal lipid transport defect. A possible variant of hypo-beta-lipoproteinemia. Gastroenterology. 54:1271.

[4] Polonovski C, Navarro J, Fontaine JL, de Gouyon F, Saudubray JM, Cathelineau L (1970) [Anderson's disease]. Ann Pediatr (Paris). 17:342-354.

[5] Costil J (1976) Maladie d'Anderson. Journées parisiennes de pédiatrie. 229-239.

[6] Scott BB, Miller JP, Losowsky MS (1979) Hypobetalipoproteinaemia--a variant of the Bassen-Kornzweig syndrome. Gut. 20:163-168.

[7] Gauthier S, Sniderman A (1983) Action tremor as a manifestation of chylomicron retention disease. Ann Neurol. 14:591.

[8] Bouma ME, Beucler I, Aggerbeck LP, Infante R, Schmitz J (1986) Hypobetalipoproteinemia with accumulation of an apoprotein B-like protein in intestinal cells. Immunoenzymatic and biochemical characterization of seven cases of Anderson's disease. J Clin Invest. 78:398-410.

[9] Polanco I, Mellado MJ, Lama R, Larrauri J, Zapata A, Redondo E, et al. (1986) [Anderson's disease. Apropos of a new case]. An Esp Pediatr. 24:185-188.

[10] Levy E, Marcel Y, Deckelbaum RJ, Milne R, Lepage G, Seidman E, et al. (1987) Intestinal apoB synthesis, lipids, and lipoproteins in chylomicron retention disease. J Lipid Res. 28:1263-1274.

[11] Roy CC, Levy E, Green PH, Sniderman A, Letarte J, Buts JP, et al. (1987) Malabsorption, hypocholesterolemia, and fat-filled enterocytes with increased intestinal apoprotein B. Chylomicron retention disease. Gastroenterology. 92:390-399.

[12] Lacaille F, Bratos M, Bouma ME, Jos J, Schmitz J, Rey J (1989) [Anderson's disease. Clinical and morphologic study of 7 cases]. Arch Fr Pediatr. 46:491-498.

[13] Pessah M, Benlian P, Beucler I, Loux N, Schmitz J, Junien C, et al. (1991) Anderson's disease: genetic exclusion of the apolipoprotein-B gene in two families. J Clin Invest. 87:367-370.

[14] Strich D, Goldstein R, Phillips A, Shemer R, Goldberg Y, Razin A, et al. (1993) Anderson's disease: no linkage to the apo B locus. J Pediatr Gastroenterol Nutr. 16:257-264.

[15] Patel S, Pessah M, Beucler I, Navarro J, Infante R (1994) Chylomicron retention disease: exclusion of apolipoprotein B gene defects and detection of mRNA editing in an affected family. Atherosclerosis. 108:201-207.

[16] Nemeth A, Myrdal U, Veress B, Rudling M, Berglund L, Angelin B (1995) Studies on lipoprotein metabolism in a family with jejunal chylomicron retention. Eur J Clin Invest. 25:271-280.

[17] Benavent MO, Chirivella Casanova M, Pereda Pérez A, Ribes Konickx C, Ferrer Calvete J (1997) Enfermedad de Anderson (esteatorrea por retencion de quilomicrones): Criterios diagnosticos. An Esp Pediatr. 47:195-198.

[18] Dannoura AH, Berriot-Varoqueaux N, Amati P, Abadie V, Verthier N, Schmitz J, et al. (1999) Anderson's disease: exclusion of apolipoprotein and intracellular lipid transport genes. Arterioscler Thromb Vasc Biol. 19:2494-2508.

[19] Aguglia U, Annesi G, Pasquinelli G, Spadafora P, Gambardella A, Annesi F, et al. (2000) Vitamin E deficiency due to chylomicron retention disease in Marinesco-Sjogren syndrome. Ann Neurol. 47:260-264.

[20] Boldrini R, Biselli R, Bosman C (2001) Chylomicron retention disease--the role of ultrastructural examination in differential diagnosis. Pathol Res Pract. 197:753-757.

[21] Mignard S, Calon E, Hespel JP, Le Treut A (2004) [A severely disturbed lipid profile]. Ann Biol Clin (Paris). 62:330-333.

[22] Charcosset M, Sassolas A, Peretti N, Roy CC, Deslandres C, Sinnett D, et al. (2008) Anderson or chylomicron retention disease: molecular impact of five mutations in the SAR1B gene on the structure and the functionality of Sar1b protein. Mol Genet Metab. 93:74-84.

[23] Silvain M, Bligny D, Aparicio T, Laforet P, Grodet A, Peretti N, et al. (2008) Anderson's disease (chylomicron retention disease): a new mutation in the SARA2 gene associated with muscular and cardiac abnormalities. Clin Genet. 74:546-552.

[24] Treepongkaruna S, Chongviriyaphan N, Suthutvoravut U, Charoenpipop D, Choubtum L, Wattanasirichaigoon D (2009) Novel missense mutations of SAR1B gene in an infant with chylomicron retention disease. J Pediatr Gastroenterol Nutr. 48:370-373.

[25] Cefalu AB, Calvo PL, Noto D, Baldi M, Valenti V, Lerro P, et al. (2010) Variable phenotypic expression of chylomicron retention disease in a kindred carrying a mutation of the Sara2 gene. Metabolism. 59:463-467.


[26] Fancello T, Najah M, Magnolo AL, Jelassi A, Di Leo E, Slimene N, et al. (2011) Novel mutations in SAR1B and MTP genes in chylomicron retention disease and abetalipoproteinemia. 74th European Atherosclerosis Society Congress. Gothenburg.

[27] Georges A, Bonneau J, Bonnefont-Rousselot D, Champigneulle J, Rabes JP, Abifadel M, et al. (2011) Molecular analysis and intestinal expression of SAR1 genes and proteins in Anderson's disease (Chylomicron retention disease). Orphanet J Rare Dis. 6:1.

[28] Peretti N, Roy CC, Sassolas A, Deslandres C, Drouin E, Rasquin A, et al. (2009) Chylomicron retention disease: a long term study of two cohorts. Mol Genet Metab. 97:136-142.

[29] Peretti N, Sassolas A, Roy CC, Deslandres C, Charcosset M, Castagnetti J, et al. (2010) Guidelines for the diagnosis and management of chylomicron retention disease based on a review of the literature and the experience of two centers. Orphanet J Rare Dis. Available: http://www.ojrd.com/content/5/1/24. Accessed 2012 Mar 23.

[30] Sakamoto O, Abukawa D, Takeyama J, Arai N, Nagano M, Hattori H, et al. (2006) An atypical case of abetalipoproteinaemia with severe fatty liver in the absence of steatorrhoea or acanthocytosis. Eur J Pediatr. 165:68-70.

[31] Di Filippo M, Crehalet H, Samson-Bouma ME, Bonnet V, Aggerbeck LP, Rabes JP, et al. (2012) Molecular and functional analysis of two new MTTP gene mutations in an atypical case of abetalipoproteinemia. J Lipid Res. 53:548-555.

[32] Dannoura AH, Berriot-Varoqueaux N, Amati P, Abadie V, Verthier N, Schmitz J, et al. (1999) Anderson's disease: exclusion of apolipoprotein and intracellular lipid transport genes. Arterioscler Thromb Vasc Biol. 19:2494-2508.

[33] Okada T, Miyashita M, Fukuhara J, Sugitani M, Ueno T, Samson-Bouma ME, et al. (2011) Anderson's disease/chylomicron retention disease in a Japanese patient with uniparental disomy 7 and a normal SAR1B gene protein coding sequence. Orphanet J Rare Dis. Available: http://www.ojrd.com/content/6/1/78. Accessed 2012 Mar 23.

[34] Nakano A, Muramatsu M (1989) A novel GTP-binding protein, Sar1p, is involved in transport from the endoplasmic reticulum to the Golgi apparatus. J Cell Biol. 109:2677-2691.

[35] Barlowe C, Schekman R (1993) SEC12 encodes a guanine-nucleotide-exchange factor essential for transport vesicle budding from the ER. Nature. 365:347-349.

[36] Bi X, Corpina RA, Goldberg J (2002) Structure of the Sec23/24-Sar1 pre-budding complex of the COPII vesicle coat. Nature. 419:271-277.

[37] Rao Y, Bian C, Yuan C, Li Y, Chen L, Ye X, et al. (2006) An open conformation of switch I revealed by Sar1-GDP crystal structure at low Mg2+. Biochem Biophys Res Commun. 348:908-915.

[38] Bielli A, Haney CJ, Gabreski G, Watkins SC, Bannykh SI, Aridor M (2005) Regulation of Sar1 NH2 terminus by GTP binding and hydrolysis promotes membrane deformation to control COPII vesicle fission. J Cell Biol. 171:919-924.

[39] Mossessova E, Bickford LC, Goldberg J (2003) SNARE selectivity of the COPII coat. Cell. 114:483-495.

[40] Futai E, Hamamoto S, Orci L, Schekman R (2004) GTP/GDP exchange by Sec12p enables COPII vesicle bud formation on synthetic liposomes. Embo J. 23:4146-4155.

[41] Jensen D, Schekman R (2011) COPII-mediated vesicle formation at a glance. J Cell Sci. 124:1-4.

[42] Zanetti G, Pahuja KB, Studer S, Shim S, Schekman R (2011) COPII and the regulation of protein sorting in mammals. Nat Cell Biol. 14:20-28.

[43] Kodera C, Yorimitsu T, Nakano A, Sato K (2011) Sed4p stimulates Sar1p GTP hydrolysis and promotes limited coat disassembly. Traffic. 12:591-599.

[44] Mansbach CM, Siddiqi SA (2010) The biogenesis of chylomicrons. Annu Rev Physiol. 72:315-333.

[45] Siddiqi SA, Gorelick FS, Mahan JT, Mansbach CM, 2nd (2003) COPII proteins are required for Golgi fusion but not for endoplasmic reticulum budding of the pre-chylomicron transport vesicle. J Cell Sci. 116:415-427.

[46] Siddiqi S, Saleem U, Abumrad NA, Davidson NO, Storch J, Siddiqi SA, et al. (2010) A novel multiprotein complex is required to generate the prechylomicron transport vesicle from intestinal ER. J Lipid Res. 51:1918-1928.

[47] Siddiqi S, Mansbach CM (2012) Phosphorylation of Sar1b releases the Liver Fatty Acid Binding Protein from a Multiprotein Complex in intestinal cytosol enabling it to bind the Pre-Chylomicron Transport Vesicle. J Biol Chem. Paper in Press.

[48] Jin L, Pahuja KB, Wickliffe KE, Gorur A, Baumgartel C, Schekman R, et al. (2012) Ubiquitin-dependent regulation of COPII coat size and function. Nature. 482:495-500.

[49] Kim SD, Pahuja KB, Ravazzola M, Yoon J, Boyadjiev SA, Hammamoto S, et al. (2012) The SEC23-SEC31 interface plays a critical role for the export of procollagen from the endoplasmic reticulum. J Biol Chem. Paper in Press.

[50] Bourne HR, Sanders DA, McCormick F (1990) The GTPase superfamily: a conserved switch for diverse cell functions. Nature. 348:125-132.

[51] Kuge O, Dascher C, Orci L, Rowe T, Amherdt M, Plutner H, et al. (1994) Sar1 promotes vesicle budding from the endoplasmic reticulum but not Golgi compartments. J Cell Biol. 125:51-65.

[52] Shoulders CC, Stephens DJ, Jones B (2004) The intracellular transport of chylomicrons requires the small GTPase, Sar1b. Curr Opin Lipidol. 15:191-197.

[53] Huang M, Weissman JT, Beraud-Dufour S, Luan P, Wang C, Chen W, et al. (2001) Crystal structure of Sar1-GDP at 1.7 A resolution and the role of the NH2 terminus in ER export. J Cell Biol. 155:937-948.

[54] Dever TE, Glynias MJ, Merrick WC (1987) GTP-binding domain: three consensus sequence elements with distinct spacing. Proc Natl Acad Sci U S A. 84:1814-1818.

[55] d'Enfert C, Gensse M, Gaillardin C (1992) Fission yeast and a plant have functional homologues of the Sar1 and Sec12 proteins involved in ER to Golgi traffic in budding yeast. EMBO J. 11:4205-4211.

[56] Long KR, Yamamoto Y, Baker AL, Watkins SC, Coyne CB, Conway JF, et al. (2010) Sar1 assembly regulates membrane constriction and ER export. J Cell Biol. 190:115-128.

[57] Guex N, Peitsch MC (1997) SWISS-MODEL and the Swiss-PdbViewer: an environment for comparative protein modeling. Electrophoresis. 18:2714-2723.

[58] Adzhubei IA, Schmidt S, Peshkin L, Ramensky VE, Gerasimova A, Bork P, et al. (2010) A method and server for predicting damaging missense mutations. Nat Methods. 7:248-249.

[59] Ng PC, Henikoff S (2001) Predicting deleterious amino acid substitutions. Genome Res. 11:863-874.

[60] Ng PC, Henikoff S (2002) Accounting for human polymorphisms predicted to affect protein function. Genome Res. 12:436-446.

[61] Ng PC, Henikoff S (2003) SIFT: Predicting amino acid changes that affect protein function. Nucleic Acids Res. 31:3812-3814.

[62] Ng PC, Henikoff S (2006) Predicting the effects of amino acid substitutions on protein function. Annu Rev Genomics Hum Genet. 7:61-80.

[63] Kumar P, Henikoff S, Ng PC (2009) Predicting the effects of coding non-synonymous variants on protein function using the SIFT algorithm. Nat Protoc. 4:1073-1081.

[64] Espenshade PJ, Li WP, Yabe D (2002) Sterols block binding of COPII proteins to SCAP, thereby controlling SCAP sorting in ER. Proc Natl Acad Sci U S A. 99:11694-11699.

[65] Ridsdale A, Denis M, Gougeon PY, Ngsee JK, Presley JF, Zha X (2006) Cholesterol is required for efficient endoplasmic reticulum-to-Golgi transport of secretory membrane proteins. Mol Biol Cell. 17:1593-1605.

[66] Yue P, Averna M, Lin X, Schonfeld G (2006) The c.43\_44insCTG variation in PCSK9 is associated with low plasma LDL-cholesterol in a Caucasian population. Hum Mutat. 27:460-466.

**Chapter 14**

## **Activating Mutations and Targeted Therapy in Cancer**

Musaffe Tuna and Christopher I. Amos

Additional information is available at the end of the chapter

http://dx.doi.org/10.5772/48701

© 2012 Tuna and Amos, licensee InTech. This is an open access chapter distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

## **1. Introduction**

Neoplasia, the accumulation of abnormal cells, occurs because tumor cells often lose control of proliferative signaling, escape growth suppression, can become invasive and metastasize and grow in abnormal environments, induce angiogenesis, withstand cell death, deregulate cellular energetic constraints, avoid immune destruction, promote inflammation and enhance genome instability and mutation (Hanahan and Weinberg 2011). Understanding the mechanisms underlying both the sensitivity and the resistance of tumor cells to anticancer agents first requires understanding the global view of the cancer genome (genetic, genomic, and epigenetic alterations) to identify driver events that decisively influence the viability and clinical behavior of a given tumor. This knowledge, together with an understanding of the mechanism of action of drugs, will lead to the identification of novel targets and the development of targeted therapeutics in the appropriate patient subpopulation.

By 1982, mutations and chromosomal translocations had been established as key genetic mechanisms that are capable of driving cancer. Then, the *MYC* proto-oncogene was found to be activated by translocation as well as amplification, and amplification thus became recognized as an additional cardinal mechanism of cancer gene deregulation (Collins and Groudine 1982; Taub, Kirsch et al. 1982; Vennstrom, Sheiness et al. 1982; Alitalo, Schwab et al. 1983). Epigenetic modifications of genomic DNA or histones by methylation or acetylation also became recognized as key mediators of the cancer phenotype (Esteller 2007).

One of the first pivotal discoveries of activating mutations was within *BRAF* (Figure 1), which encodes a serine/threonine kinase oncogene that transmits proliferative and survival signals downstream of RAS in the mitogen-activated protein (MAP) kinase cascade (Davies, Bignell et al. 2002). This was after the discovery of *HRAS* mutations (Reddy, Reynolds et al. 1982; Tabin, Bradley et al. 1982) and similar mutations within *KRAS* (Capon, Seeburg et al. 1983; Shimizu, Birnbaum et al. 1983), *NRAS* (Bos, Toksoz et al. 1985), and other genes. Some of the driver mutations were found to be targets for therapy, whereas others play crucial roles in resistance to therapy. Here, we focus on activating mutations, small molecules that have been used to target mutated genes, and mutations that play crucial roles in resistance to certain therapeutic agents.

© 2012 Tuna and Amos, licensee InTech. This is an open access chapter distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Activating Mutations and Targeted Therapy in Cancer 275

274 Mutations in Human Genetic Disease

[Figure 1: timeline graphic, not reproduced. Year labels: 1960, 1973, 1982-1985, 1985-1987, 2002, 2004, 2005, 2006, 2007, 2009. Event labels: discovery of *BCR-ABL* translocation; mechanism of action: fusion of the *ABL* (label truncated); identification of *MYC* amplification and translocation and of *HRAS*, *KRAS*, and *NRAS* mutation; *ERBB2* cloning and amplification; identification of *BRAF* mutation; identification of *PIK3CA* in colon cancer and *EGFR* mutation in lung cancer; identification of *ETS-ETV4* translocation (2005); identification of *ABL* mutations (2006); identification of *EML4-ALK* translocation; identification of *IDH1* mutation.]

**Figure 1.** The historical timelines for discovery of driver translocation, mutation and amplification.

## **2. Types of mutations**

Oncogenesis results from mutations or alterations of genes that regulate cell functions such as proliferation, growth, invasion, angiogenesis, metastasis, death, energy metabolism, genome stability, and replication. Simple mutations can be induced in DNA by exposure to a variety of mutagens, such as radiation and chemicals, or by spontaneous errors in DNA replication and repair. Genes with mutations that cause cancer can be grouped into two classes: oncogenes and tumor suppressor genes.

Oncogenes are the mutant form of proto-oncogenes, a class of normal cellular protein-coding genes that promote the growth and survival of cells. Oncogenes encode proteins such as:

a. Growth factors (e.g., *PDGF* and *IGF1*);

b. Growth factor receptors (e.g., *ERBB2*, *EGFR*, and *MET*);

c. Intracellular signal transduction factors (e.g., *RAS* and *RAF*);

d. Cell cycle factors (e.g., *CDK4*);

e. Transcription factors that control the expression of growth promoting genes (e.g., *FOS*, *JUN*, and *MYC*); and

f. Inhibitors of programmed cell death machinery (e.g., *BCL2*).

Tumor suppressor genes, which control cell growth, can be grouped into two classes: gatekeeper and caretaker tumor suppressor genes. Gatekeeper tumor suppressor genes (e.g., *RB1* and *TP53*) block tumor development by controlling cell division and survival, and caretaker tumor suppressor genes (e.g., *MSH2* and *MLH1*) protect the integrity of the genome.

Activation of proto-oncogenes (activating mutations) can occur either by large-scale alterations, such as gain/amplification, insertion, or chromosome translocation, or by small-scale mutations, such as point mutation. Inactivation of tumor suppressor genes (inactivating mutations) can occur either by small-scale mutation or by large-scale alterations, such as loss of a region of a tumor suppressor gene or of a whole chromosome.

Small-scale mutations can be grouped into the following classes on the basis of the effect of the mutation on the DNA sequence:

a. **Base substitution mutation** is the replacement (exchange) of a single nucleotide by another. Base substitutions can be either a **transition**, the substitution of a pyrimidine by a pyrimidine (C↔T) or of a purine by a purine (A↔G), or a **transversion**, the substitution of a pyrimidine by a purine or vice versa (A↔C, A↔T, G↔C, G↔T). A single-nucleotide mutation can lead to qualitative rather than quantitative changes in the function of a protein. The biological activity can be retained, but characteristics such as optimum pH and stability may differ. Mutations that occur in coding DNA can be grouped into two classes:

i. **Synonymous (silent) mutations.** In this type of mutation, the sequence changes but the encoded amino acid does not, because the genetic code is degenerate. Such mutations can still have an effect if they alter splicing by activating a cryptic splice site or by altering an exonic splice enhancer sequence. Because silent mutations usually confer no advantage or disadvantage to the organism in which they arise, they are also called neutral mutations.

ii. **Non-synonymous mutations.** In this type of mutation, the altered sequence changes the gene product, which can be a polypeptide or a functional non-coding RNA. Non-synonymous mutations may have a harmful effect, no effect, or a beneficial effect in the organism. Non-synonymous mutations can be grouped into **nonsense** mutations, where the codon for an amino acid is replaced by a stop codon, which results in premature termination and is likely to cause loss of function or expression because of degradation of mRNA, and **missense** mutations, where the altered codon specifies a different amino acid, which may affect protein function or stability. **Splice site mutations** are likely to cause aberrant splicing, such as exon skipping or intron retention, and mutations in **promoter** sequences can result in altered gene expression. Finally, some mutations alter the normal stop codon, which terminates translation, so that a longer or shorter polypeptide than normal is translated.

b. **Deletions.** In this type of mutation, one or more nucleotides are lost from a sequence.

i. Deletion of one or more codons (a multiple of three bases) may affect protein function or stability.

ii. A frameshift mutation, the deletion of a number of bases that is not a multiple of three, is likely to result in premature termination with loss of function.

iii. A large deletion (partial- or whole-gene deletion) is likely to result in premature termination with loss of function or expression.

c. **Insertions.** In this type of mutation, one or more nucleotides are added into a sequence.

i. Insertion of 3 nucleotides (a codon) or of multiple codons may affect protein function or stability.

ii. A frameshift mutation, which occurs when the number of inserted nucleotides is not a multiple of three, is likely to result in premature termination with loss of function.

iii. A large insertion, which is partial-gene duplication, is likely to result in premature termination with loss of function. Whole-gene duplication may have an effect because of increased gene dosage.


20, 21, 27 V617F8, 19, 20

<sup>27</sup> K539L19

*IDH2*

H538DK539LI540S19

*EGFR BRAF KRAS PIK3CA c-KIT BCR-ABL IDH1 JAK2* 

G719S12 K439Q12 G12S5, 12 E545Q4 W557R13 G250E10 R132S2, 9, 26 N542-E543del19 T790M12 K439T12 G12A5, 9, 12 E545A4 V559A13 Q252H10 R132G2, 9, 26 F537-K539delinsK19 L858R4, 12 T440P12 G12D5, 12 E545G4, 5, 29 V560D22 Y253F10 R132L2, 9, 26 H538-K539delinsL19 L858Q12 V459L12 G12V5, 12 E545V4 D816H13, 22 Y253H10 R132V2, 9 F537-I546dupF547L19 L858L21 G469A14 G13C5, 12 Q546K4, 6 F504L13 E255K10 R132G2 E543del19 D761Y12 R462I5 G13R12 Q546E4 S502-Y503insFA13 E255V10 V71I14, 24 H538QK539L19 L747S12 I463S5 G13S12 Q546P4 K550N13 D276G10 G123R24 I540-E543delinsMK19 T854A12 G464E5, 11, 17 G13A12 Q546R4, 6 Y553N13 E279K10 G97D9, 25 F547V19

F788L21 G464R5, 11, 17 Q61K10, 12, 13 D549N4 K558N13 F311L10 F537-F547dup19 R748K21, 22 G466A6, 12, 13 Q61L5, 10, 12 H1047L4, 6, 29 G565V13 T315I10 I540-N542delinsS19 L747–S752del G466E12, 13 Q61R12, 13 H1047R4, 5, 6, 9, 28, 29 N566D13 T315A10 V294M13 V536-F547dup19 E746-A750del4, 17, 28 G466R12, 13 Q61H5, 10, 12, 14 Q1064R6 V569G13 F317L10 R172K2, 9 V536-I546dup19

T710A *FLT3* 25 G469A3, 5, 10, 11, 12, 13,16 L19F14 G12-R19del6 N655K13 M351T10 R172S2

E749K25 G469R3, 5, 10, 11, 12, 13,16 E63K14 R88Q4, 5, 6, 9 D816V2, 13 F359V10 R140Q2, 9, 19, 20 Y592A2

A767T25 G469V3, 5, 10, 11, 12, 13,16 G12F10, 12 E109del6 D820Y13 V379I10 R140L2, 26 Y599F2 K745R28 K475E13 V344G6 N822I13 L384M10 F691L2

T263P9 D587A5 M1043V6, 9 A829P13 H396P10 D835N2 A289V9 D594G5, 13, 16, 23 M1043I6 I841V13 F486S10 D835Y2 G598V9 D594K5, 13, 16, 23 E81K6 S864F13 E459K D835A2 L861Q9 D594V5, 13, 16, 23 H1048R6 V120F29 D835E2 R680G9 F595L5, 11 G1049R6 V560D22 D835H2 G136A9 G596R5 E418K6 Y503-F504insAY22 D835V2 G136C9 L597Q11, 12, 13, 17 C420R6 Y570-L576del22 D835F2 G323A9 L597R11, 12, 13, 17 H701P6 A599T15 I836F2 A787C9 L597S11, 12, 13, 17 LWGIHLM10del9 V833L10, 30 I836S2 C866A9 L597V11, 12, 13, 17 P18del9 P577S10, 30 M837P2 G865A9 T599I5 N345K9 V825A10, 30 Y842H2 C866T9 T599-ins(T-T)13 C420R9 L576P30 Y842D2

G719A12 M117R13 G12C5, 12 E542K4, 5, 6, 28, 29 K642E13, 22 M244V10 R132C2, 5, 8, 9, 19,

G719C12 I326T4 G12R12 E545K4, 5, 6, 9, 12, 28, 29 L576P13 L248V10 R132H1, 2, 9, 21, 26,

P782L21 G464V5, 11, 17 G13D5, 12, 14 Q546L4 556insL13 V299L10

G735S24 N581S5 E309NfsX106 N822K2, 13 L387M10 R108K9 E586K17 E453K4, 6 Y823D13 H396R10

13, 17, 22, 23, 24 E562K 30

13, 17, 22, 23, 24 N564S30

13, 17, 22, 23, 24 D816I10

S492R25 V600R D816G10 F712S25 V600-K605 ins13 D816F10 T725T25 K601E5, 13, 22 V825A2 V742V21, 25 K601N5, 13, 22 D816Y2, 17, 26 F795S25 R682Q6 R634R30 G796S25 A728V1 D820G19 G796V21 V825I19 T751I21 E839K19 R748K21 I957T19 R836R25 P31L19 T847I4 R956Q19 Q820R21 T22A19 E804G21 G961S19 L828M21 K642E13 F856Y21 V559D22 F856L21 W557R22

13, 17, 22, 23, 24 D816V2, 10, 17

13, 17, 18, 19, 23 D816H2, 10, 17, 26

G971T9 V600D3, 4, 5, 7, 9, 11, 12,

G988A9 V600E3, 4, 5, 7, 9, 11, 12,

D1006Y14 V600G3, 4, 5, 7, 9, 11, 12,

M178I28 V600K3, 4, 5, 7, 9, 11, 12,

I475V28 V600M3, 4, 5, 8, 9, 11, 12,

S752-I759del4 G466V 12, 13 A146T5, 14 A1066V6 R634W13 F317V10 R172M2, 9 L707S25 F468C5 Y64D14 Y1021C6 V654A13 F317C10 R172G2, 9

E711V25 G469E3, 5, 10, 11, 12, 13,16 K117N14 R38H6 D816H13 E355G10 R172W2, 9

E762G25 G469S3, 5, 10, 11, 12, 13,16 K147N28 G106A6 D820V13 F359C10 R140W2

iv. A dynamic mutation, which is the expansion of a dinucleotide or a trinucleotide repeat, may alter gene expression or may alter protein stability or function.
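The small-scale classes described in this section can be expressed as a few rules. The following is a minimal Python sketch, not part of the original chapter: the function names (`substitution_class`, `indel_class`, `codon_effect`) are hypothetical, and the codon table is the standard genetic code written in the conventional T, C, A, G order. It ignores splicing, promoter, and other context-dependent effects.

```python
# Illustrative classification of small-scale mutations (names are hypothetical).
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}

# Standard genetic code, codons enumerated in T, C, A, G order; "*" marks a stop codon.
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: _AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(_BASES)
    for j, b in enumerate(_BASES)
    for k, c in enumerate(_BASES)
}


def substitution_class(ref: str, alt: str) -> str:
    """Return 'transition' or 'transversion' for a single-base substitution."""
    ref, alt = ref.upper(), alt.upper()
    if ref == alt or {ref, alt} - (PURINES | PYRIMIDINES):
        raise ValueError("expected two different bases from A, C, G, T")
    same_ring_type = {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES
    return "transition" if same_ring_type else "transversion"


def indel_class(n_bases: int) -> str:
    """Classify an insertion or deletion by whether it preserves the reading frame."""
    if n_bases <= 0:
        raise ValueError("length must be positive")
    return "in-frame" if n_bases % 3 == 0 else "frameshift"


def codon_effect(ref_codon: str, alt_codon: str) -> str:
    """Classify a codon change as synonymous, nonsense, stop-loss, or missense."""
    ref_aa = CODON_TABLE[ref_codon.upper()]
    alt_aa = CODON_TABLE[alt_codon.upper()]
    if ref_aa == alt_aa:
        return "synonymous"
    if alt_aa == "*":
        return "nonsense"
    if ref_aa == "*":
        return "stop-loss"
    return "missense"
```

For example, `codon_effect("GTG", "GAG")` classifies a valine-to-glutamic-acid codon change as missense, and `substitution_class("T", "A")` reports the underlying base change as a transversion.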

Whereas mutations in coding DNA can have a phenotypic effect, mutations in non-coding DNA are less likely to have one, except when the mutation occurs in a regulatory sequence, such as a promoter, or in an miRNA. Mutations exert their phenotypic effect through either gain of function or loss of function. Loss-of-function mutations result in either reduced activity or complete loss of the gene product. Gain-of-function mutations can result in either an increased level of expression or the development of a new function of the gene product.

Important progress has been made in developing new technologies for identifying mutations. One of these is next-generation sequencing. This technology enables the identification of copy number changes, chromosomal alterations such as translocations and inversions, and point mutations.

## **3. Activating mutations and targeted therapies**

Recent advances in molecular oncology and discoveries in genetic alterations have yielded new treatment strategies that target specific molecules and pathways in the cancer cell and thereby shed light on personalized therapy. In the past, treatment decisions were based on pathologic results. Now, diagnostic or therapeutic decisions are often also based on genetic/genomic alterations. Currently, the genomic view effectively guides cancer treatment decisions and predicts therapeutic response. Early clinical success was achieved with all-*trans* retinoic acid therapy in patients with acute promyelocytic leukemia (characterized by chromosomal translocations involving retinoic acid receptor α, the target of all-*trans* retinoic acid) (Huang, Ye et al. 1988; Castaigne, Chomienne et al. 1990) and with Herceptin (trastuzumab, a monoclonal antibody) in patients with breast cancer in which *ERBB2* is amplified and/or overexpressed (Baselga, Tripathy et al. 1999; Slamon, Leyland-Jones et al. 2001; Vogel, Cobleigh et al. 2002). Also, imatinib mesylate and, subsequently, nilotinib (a selective ABL tyrosine kinase inhibitor [TKI]) have proved effective in patients with the *BCR-ABL* fusion gene, which constitutively activates the ABL tyrosine kinase and is present in most individuals (95%) with chronic myeloid leukemia (CML) (Mauro, O'Dwyer et al. 2002). These successes motivated the discovery of new targets and selective inhibitors for those targets. Currently, targeted therapeutics are used to target receptor tyrosine kinases (*EGFR*, *ERBB2*, *FGFR1*, *FGFR2*, *FGFR3*, *PDGFRA*, *PDGFRB*, *ALK*, *c-MET*, *IGF1R*, *c-KIT*, *FLT3*, and *RET*), non-receptor tyrosine kinases (*ABL*, *JAK2*, and *SRC*), serine-threonine-lipid kinases (*BRAF*, Aurora A and B kinases, *mTOR*, and *PIK3*), and DNA damage and repair genes (*BRCA1* and *BRCA2*); however, not all therapeutics are selective inhibitors.
Here, we focus on activating mutations that are targeted by selective inhibitors designed to inhibit only the mutated genes: *EGFR, ALK, c-KIT, BCR-ABL, JAK2, BRAF, IDH1, IDH2*, *FLT3* and *PIK3CA* (Table 1).


1 acute lymphocytic leukemia; 2 acute myeloid leukemia; 3 Barrett's adenocarcinoma; 4 breast carcinoma; 5 colon carcinoma; 6 endometrial carcinoma; 7 ependymoma; 8 essential thrombocythemia; 9 glioma; 10 leukemia; 11 hepatocellular carcinoma; 12 lung cancer; 13 melanoma; 14 multiple myeloma; 15 neuroblastoma; 16 non-Hodgkin's lymphoma; 17 ovarian carcinoma; 18 pancreas cancer; 19 polycythemia vera; 20 primary myelofibrosis; 21 prostate cancer; 22 sarcoma; 23 stomach cancer; 24 thyroid cancer; 25 colorectal cancer; 26 myelodysplastic syndromes; 27 paraganglioma; 28 H&N (head and neck) cancer; 29 esophageal cancer; 30 lymphoma.

**Table 1.** Mutations reported at *EGFR, BRAF, KRAS, PIK3CA, c-KIT, ABL, IDH1, IDH2* and *JAK2* in a variety of cancers (Garnett and Marais 2004; Lee, Vivanco et al. 2006; Loeffler-Ragg, Witsch-Baumgartner et al. 2006; Thomas, Baker et al. 2007; Balss, Meyer et al. 2008; The Cancer Genome Atlas Network 2008; Bleeker, Lamba et al. 2009; Hayes, Douglas et al. 2009; MacConaill, Campbell et al. 2009; Yan, Parsons et al. 2009; de Muga, Hernandez et al. 2010; Gravendeel, Kloosterhof et al. 2010; Green and Beer 2010; Reitman and Yan 2010; Yen, Bittinger et al. 2010; Chapman, Lawrence et al. 2011; Konopka, Janiec-Jankowska et al. 2011; Metzger, Chambeau et al. 2011; Murugan, Dong et al. 2011; Passamonti, Elena et al. 2011; Peraldo-Neia, Migliardi et al. 2011; Stransky, Egloff et al. 2011; Tanaka, Terai et al. 2011; The Cancer Genome Atlas Network 2011; Teng, Tan et al. 2011; Montagut, Dalmases et al. 2012; Weisberg, Sattler et al. 2010; *Catalog of Somatic Mutations in Cancer*: www.sanger.ac.uk/genetics/CGP/cosmic/)

#### **3.1. Activating mutations at** *BCR-ABL*

In a normal cell, ABL protein is located in the nucleus, but in cancer cells the BCR-ABL fusion protein is found in the cytoplasm and is constitutively active (Goldman and Melo 2008). Studies have shown that *BCR-ABL* is oncogenic in hematopoietic cells, promoting leukemic cell proliferation and inhibiting apoptosis (Lugo, Pendergast et al. 1990; Stoklosa, Poplawski et al. 2008). Notably, *BCR-ABL* activity has also been found to stimulate the generation of mutagenic reactive oxygen species and to inhibit DNA repair mechanisms (Koptyra, Falinski et al. 2006; Fernandes, Reddy et al. 2009).

The discovery of this oncogenic fusion protein led to the development of imatinib mesylate. Imatinib, an ABL kinase inhibitor, was the first therapeutically successful treatment for CML and gained U.S. Food and Drug Administration approval in 2001. However, a substantial proportion of patients with CML developed resistance to imatinib because of mutations in the *BCR-ABL* fusion gene (>90 mutations that affect >55 amino acid residues in *BCR-ABL*) (Table 1) (Branford 2007). Interestingly, *BCR-ABL* mutations were found in 57% of patients with acquired resistance to imatinib compared with 30% of patients with primary resistance (Soverini, Colarossi et al. 2006). Point mutations in the *BCR-ABL* kinase domain result in resistance to imatinib by reducing the flexibility of the kinase domain and its binding to imatinib, and by inhibiting the activity of the kinase (Burgess, Skaggs et al. 2005; O'Hare, Walters et al. 2005).

T315I is the most common imatinib-resistant mutation in *BCR-ABL*; among the other highly imatinib-resistant mutations are L248V, Y253F/H, E255K/V, H396P/R, and F486S (Hochhaus, La Rosee et al. 2011). These discoveries were followed by the development of second-generation TKIs to inhibit BCR-ABL: dasatinib and nilotinib. The response rate to these second-generation BCR-ABL inhibitors in patients harboring imatinib-resistant mutations varies with the mutation: L248V (40%), G250E (33%), E255K (38%), and E255V (36%), but response rates are low in those harboring F317L (7%) or Q252H (17%) (Muller, Cortes et al. 2009). The following imatinib-resistant mutations are sensitive to nilotinib: M351T, G250E, M244V, H396R, F317L, E355G, E459K, F486S, L248V, D276G, E279K, and V299L. The following are sensitive to dasatinib: M351T, G250E, F359V, M244V, Y253H, H396R, E355G, E459K, F486S, L248V, D276G, E279K, Y253F, F359C, and F359I. The following mutations are resistant to dasatinib: V299L, T315A, and F317I/L. The following are resistant to nilotinib: Y253F/H, E255K/V, and F359C/V (Hochhaus, La Rosee et al. 2011). All three of these inhibitors inhibit the catalytic activity of BCR-ABL by binding to the ATP-binding pocket of the ABL kinase domain.
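The sensitivity and resistance lists above amount to a lookup from mutation to reported inhibitor status. The following minimal Python sketch encodes only the lists given in this paragraph (attributed in the text to Hochhaus, La Rosee et al. 2011); the names `REPORTED` and `reported_status` are hypothetical, and a mutation absent from a list is reported as "not listed" rather than assumed sensitive or resistant. It is an illustration of the text, not a clinical decision tool.

```python
# Reported status of imatinib-resistant BCR-ABL mutations, transcribed from the text.
REPORTED = {
    "nilotinib": {
        "sensitive": {"M351T", "G250E", "M244V", "H396R", "F317L", "E355G",
                      "E459K", "F486S", "L248V", "D276G", "E279K", "V299L"},
        "resistant": {"Y253F", "Y253H", "E255K", "E255V", "F359C", "F359V"},
    },
    "dasatinib": {
        "sensitive": {"M351T", "G250E", "F359V", "M244V", "Y253H", "H396R",
                      "E355G", "E459K", "F486S", "L248V", "D276G", "E279K",
                      "Y253F", "F359C", "F359I"},
        "resistant": {"V299L", "T315A", "F317I", "F317L"},
    },
}


def reported_status(mutation: str) -> dict:
    """Map each second-generation TKI to 'sensitive', 'resistant', or 'not listed'."""
    result = {}
    for drug, lists in REPORTED.items():
        if mutation in lists["resistant"]:
            result[drug] = "resistant"
        elif mutation in lists["sensitive"]:
            result[drug] = "sensitive"
        else:
            result[drug] = "not listed"
    return result
```

For instance, `reported_status("V299L")` returns nilotinib as sensitive and dasatinib as resistant, while `reported_status("F359V")` returns the opposite pattern, which illustrates why the choice of second-generation inhibitor depends on the specific mutation.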

#### **3.2. Activating mutations at** *BRAF*


I789I21 H870N21 V834A21 T725M17 L858R17 R832C17 A868D17 T852M17 T725A17 L703P17 S720F17 N700S17 R836S17 G721S17 L703P17 K708G17 P772-H773insV12 R108K9 L62R9 V651M9 R222C9 T263P9 A289T9 A289V9 A597P9 G598V9 C620Y9 S703F9

A839V21 V559G22 G863D12, 21 V559D13, 22 V851I12, 21 V540L2 I821T21 M541L2


*BRAF* mutations are among the discovered mutations that affect cancer prognosis. *BRAF* has been found to be the most commonly mutated oncogene in melanoma (50–60%) (Davies, Bignell et al. 2002), papillary thyroid carcinoma (36–53%) (Yeang, McCormick et al. 2008), colon carcinoma (57%), serous ovarian carcinoma (~30%) (Yeang, McCormick et al. 2008), and hairy cell leukemia (100%) (Tiacci, Trifonov et al. 2011). To date, >60 distinct mutations in the *BRAF* gene have been identified (Table 1) (Garnett and Marais 2004; *Catalog of Somatic Mutations in Cancer*: www.sanger.ac.uk/genetics/CGP/cosmic/). The most prevalent is a missense mutation that results in the substitution of valine by glutamic acid at codon 600 (BRAFV600E) and accounts for 90% of all *BRAF* mutations (Garnett and Marais 2004). *BRAF* encodes BRAF, a member of the RAF family of cytoplasmic serine/threonine protein kinases. BRAF phosphorylates MEK protein and activates ERK signaling, downstream of RAS, which regulates multiple key cellular processes that are required for cell proliferation, differentiation, apoptosis, and survival. The *RAF* family (*A-RAF, B-RAF, C-RAF*) members are components of a signal transduction pathway downstream of the membrane-bound small G-protein RAS, which is activated by growth factors, hormones, and cytokines (Robinson and Cobb 1997).
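Labels such as V600E follow the single-letter convention used throughout this chapter: the reference amino acid, the codon position, and the replacement amino acid. The following minimal Python sketch parses such labels; the names `Substitution`, `parse_substitution`, and `three_letter` are hypothetical, and the parser deliberately handles only simple substitutions, rejecting deletions, insertions, and duplications such as E746-A750del.

```python
import re
from typing import NamedTuple

# Single-letter to three-letter codes for the 20 standard amino acids.
AA3 = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
}


class Substitution(NamedTuple):
    ref: str       # reference (wild-type) amino acid
    position: int  # codon position
    alt: str       # replacement amino acid


_PATTERN = re.compile(r"([A-Z])(\d+)([A-Z])")


def parse_substitution(label: str) -> Substitution:
    """Parse a label like 'V600E' into (ref, position, alt)."""
    match = _PATTERN.fullmatch(label.strip())
    if match is None or match.group(1) not in AA3 or match.group(3) not in AA3:
        raise ValueError(f"not a simple amino acid substitution: {label!r}")
    return Substitution(match.group(1), int(match.group(2)), match.group(3))


def three_letter(label: str) -> str:
    """Rewrite a single-letter label in three-letter form, e.g. 'Val600Glu'."""
    sub = parse_substitution(label)
    return f"{AA3[sub.ref]}{sub.position}{AA3[sub.alt]}"
```

Read this way, `three_letter("V600E")` gives `Val600Glu`, that is, valine at codon 600 replaced by glutamic acid.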


MEK inhibitors suppress ERK signaling in all normal and tumor cells. In contrast, the RAF inhibitor vemurafenib inhibits the ERK pathway and cell proliferation only in tumor cells with mutant *BRAF*. Targeted therapy and selective inhibitors for certain altered genes are crucial to enable targeting of tumor cells but not normal cells.

Mutation of *BRAF* activates and deregulates the kinase activity of BRAF. The recently developed BRAF inhibitor vemurafenib (PLX4032) selectively inhibits RAF activation only in cells carrying the *BRAF* V600E mutation. Clinically, vemurafenib has an 80% response rate in metastatic melanoma patients harboring the *BRAF* V600E mutation, but 18% of patients treated with vemurafenib develop at least one squamous-cell carcinoma of the skin or keratoacanthoma as an adverse event (Chapman, Hauschild et al. 2010). The remaining 20% of patients who harbor the *BRAF* V600E mutation, and also patients who do not harbor it, are resistant to vemurafenib. Other mechanisms that cause vemurafenib resistance are *NRAS* mutations and *c-KIT* alterations. *c-KIT* alterations (mutations and/or amplifications) are found more frequently (28–39%) in melanomas from acral, mucosal, and chronically sun-damaged sites (Curtin, Busam et al. 2006), whereas uveal melanomas uniquely harbor activating mutations in *GNAQ* and *GNA11*, which encode α-subunits of G proteins of the Gq family (Van Raamsdonk, Bezrookove et al. 2009; Van Raamsdonk, Griewank et al. 2010). *NRAS* mutations are observed in 15–30% of cutaneous melanomas and are mutually exclusive of *BRAF* mutations; the most common changes occur at G12 or Q61 (Brose, Volpe et al. 2002). Currently, no selective inhibitor for those mutations exists. *BRAF* mutations are also found in colon cancer (8%) (Hutchins, Southward et al. 2011), papillary thyroid cancer (44%), and anaplastic thyroid cancer (24%) (Xing, Westra et al. 2005), but few studies in these tumors have been reported to date. Vemurafenib has limited therapeutic effect in *BRAF* (V600E)-mutant colon cancers because inhibition of BRAF (V600E) causes rapid feedback activation of EGFR, which sustains proliferation in the BRAF (V600E)-inhibited cells. Blocking EGFR with gefitinib, erlotinib, or cetuximab is therefore strongly synergistic with inhibition of BRAF (V600E) by vemurafenib in colon tumor cells *in vivo* and *in vitro* (Prahallad, Sun et al. 2012). Because evidence is lacking, it remains to be determined whether the same selective BRAF inhibitor can be effective in other tumor types.

#### **3.3. Activating mutations at** *PIK3CA*

Shortly after *BRAF* mutations were found and selective inhibitors of mutant BRAF were developed, activating point mutations were found in *PIK3CA* (Samuels, Wang et al. 2004) in a variety of cancers, including breast (20–30%) (Bachman, Argani et al. 2004; Campbell, Russell et al. 2004), colorectal (Parsons, Wang et al. 2005), endometrial (Samuels and Ericson 2006), ovarian, and hepatocellular cancers and medulloblastoma (Broderick, Di et al. 2004), among others (Kang, Bader et al. 2005; Lee, Soung et al. 2005). *PIK3CA* encodes the p110α catalytic subunit of phosphatidylinositol 3-kinase (PI3K), a lipid kinase that drives AKT signaling to govern cell growth and survival. PI3Ks are heterodimers composed of a catalytic subunit (p110α; PI3Kα) and a regulatory subunit (p85). The catalytic subunit comprises the ABD, RBD, C2, helical, and kinase domains, whereas the regulatory subunit comprises the SH3, GAP, nSH2, iSH2, and cSH2 domains. Mutations mostly cluster between the kinase domain and other domains within the catalytic subunit (Huang, Mandelker et al. 2007). The receptor tyrosine kinase (RTK) family, together with the MAP kinase and PI3K cascades, forms part of the growth factor signaling pathway governing tumor cell growth and survival (Samuels, Diaz et al. 2005). Because PI3K signaling is complex and can be activated in diverse ways in human cancers, including activating mutations or amplification of *PIK3CA*, activation of upstream RTKs, loss of *PTEN*, or activating mutations of *RAS* (Courtney, Corcoran et al. 2010), developing effective therapeutic agents against PIK3CA may be more challenging (Zhao and Vogt 2008). Agents against PIK3CA, used either alone or in combination with other therapeutic agents, are under development (Courtney, Corcoran et al. 2010).

#### **3.4. Activating mutations at** *EGFR*


This finding was followed by the identification of activating point mutations and small insertions/deletions in *EGFR*, an oncogene encoding a receptor tyrosine kinase. These mutations are present more frequently in East Asian individuals with non–small-cell lung cancer (NSCLC) (25%) than in Caucasian individuals (10–15%) and occur most frequently in lung adenocarcinomas (Lynch, Bell et al. 2004; Paez, Janne et al. 2004; Pao, Miller et al. 2004). Activating mutations were initially identified in three kinase domain exons (18, 19, and 21), including G719S and G719C in exon 18 and L861Q in exon 21; the most common mutations are small in-frame deletions in exon 19 and the leucine-to-arginine substitution L858R. The L858R mutation causes constitutive activation of the EGFR tyrosine kinase. Oncogenic mutation of *EGFR* activates signaling pathways downstream of EGFR that are implicated in tumor cell growth, proliferation, and survival. This discovery led to the development of the selective EGFR tyrosine kinase inhibitors (TKIs) erlotinib and gefitinib. These inhibitors block the tyrosine kinase activity of EGFR and hence the activation of the downstream cellular pathways. Individuals with lung adenocarcinoma harboring the G719S or L858R mutation are sensitive to gefitinib or erlotinib. Although patients harboring these mutations have a high response rate to gefitinib and erlotinib, the response is not durable, and patients relapse after about a year of treatment (Pao and Chmielecki 2010).

One of the mechanisms by which resistance to erlotinib or gefitinib develops, seen in 50% of relapsed patients, is acquisition of a resistance mutation in exon 20 of *EGFR* (T790M) (Kobayashi, Boggon et al. 2005; Pao, Miller et al. 2005); another is an activating mutation in *KRAS* (Pao, Wang et al. 2005). The T790M mutation is also found, rarely, in the germline, where it is associated with an inherited susceptibility to lung cancer (Bell, Gore et al. 2005; Vikis, Sato et al. 2007). This mutation has been shown to decrease the affinity of EGFR for gefitinib in the L858R mutant by increasing the affinity of EGFR for ATP (Yun, Mengwasser et al. 2008). This resistant mutant led to the development of promising new agents, the second-generation EGFR inhibitors (Li, Shimamura et al. 2007; Li, Ambrogio et al. 2008; Zhou, Ercan et al. 2009). Another mechanism by which resistance to erlotinib or gefitinib develops is amplification (20%) or mutation (Y1230H) of *MET*, an oncogene encoding a receptor tyrosine kinase (Bean, Brennan et al. 2007; Engelman, Zejnullahu et al. 2007). Overexpression of HGF, a specific ligand of MET, is a further mechanism by which resistance to EGFR inhibitors develops (Yano, Wang et al. 2008).
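The sensitivity and resistance statements above can be collected into a small lookup. This is only a sketch built from the associations named in this section; the function name, argument names, and return strings are hypothetical, and it is not a clinical decision tool.

```python
# EGFR substitutions described above as sensitive to gefitinib/erlotinib.
SENSITIZING = {"G719S", "L858R"}
# Secondary EGFR mutation in exon 20 described above as conferring resistance.
RESISTANCE = {"T790M"}


def first_generation_tki_call(egfr_mutations, kras_mutant=False, met_amplified=False):
    """Summarize what this section reports for a given genotype.

    Any resistance mechanism named in the text takes precedence over a
    sensitizing EGFR mutation.
    """
    found = set(egfr_mutations)
    if found & RESISTANCE or kras_mutant or met_amplified:
        return "resistance reported"
    if found & SENSITIZING:
        return "sensitivity reported"
    return "not covered here"


print(first_generation_tki_call(["L858R"]))
print(first_generation_tki_call(["L858R", "T790M"]))
```

Giving resistance precedence mirrors the clinical picture described above, in which a tumor that initially responds because of L858R relapses once T790M is acquired.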


Gefitinib and erlotinib are first-generation, reversible EGFR inhibitors. Second-generation, irreversible EGFR inhibitors, which inhibit EGFR kinase activity even when the T790M mutation is present, are currently being developed. Neratinib (HKI-272) (Li, Shimamura et al. 2007; Wong, Fracasso et al. 2009; Sequist, Besse et al. 2010) and afatinib (BIBW 2992) (Eskens, Mom et al. 2008; Li, Ambrogio et al. 2008; Yap, Vidal et al. 2010) are dual inhibitors of EGFR and HER2, and PF-00299804 is a multi-target inhibitor of EGFR, ERBB2, and ERBB4 (Engelman, Zejnullahu et al. 2007). For *MET* gene amplification, the MET inhibitor PHA-665752 has been developed (Engelman, Zejnullahu et al. 2007). Recently, new EGFR inhibitors (WZ4002, WZ3146, and WZ8040) have been reported that suppress the growth of *EGFR* T790M-containing cell lines by inhibiting phosphorylation (Zhou, Ercan et al. 2009). Erlotinib has a statistically significantly higher response rate than chemotherapy (83% vs 36%) (Friedrich 2011). Some activating mutations, like those of *KRAS*, may not be drug targets themselves but may instead govern resistance to selective inhibitors of EGFR (Allegra, Jessup et al. 2009). Activating mutations of *EGFR* are also present in glioma and in breast, endometrial, and colorectal carcinomas. *KRAS* mutations at G12 and G13 are associated with resistance to erlotinib or gefitinib in patients with *EGFR*-mutated lung adenocarcinoma (Pao, Wang et al. 2005) and with metastatic colorectal carcinoma (Allegra, Jessup et al. 2009).

Shortly after the discovery of *EGFR* mutations, somatic activating mutations of *ERBB2* were found in 2–4% of patients with lung adenocarcinoma. ERBB2 is a receptor tyrosine kinase and the only member of the ERBB family that does not bind any known ligand; instead, it activates downstream signaling pathways by homo- or heterodimerization with other ERBB family members, namely EGFR, ERBB3, and ERBB4 (Hynes and Lane 2005). Small in-frame insertion mutations span exon 20 of the kinase domain of *ERBB2*, and these are analogous to the mutations in the paralogous exon 20 of the *EGFR* gene that confer resistance to erlotinib or gefitinib.

#### **3.5. Activating mutations at** *JAK2*

The discovery of the somatic gain-of-function mutation V617F in Janus kinase 2 (*JAK2*) in >90% of individuals with polycythemia vera, 50% of individuals with primary myelofibrosis, and 60% of those with essential thrombocythemia (Levine, Wadleigh et al. 2005), all of which are Philadelphia chromosome-negative myeloproliferative neoplasms, generated interest in developing JAK2 inhibitors. The JAK kinases (JAK1, JAK2, JAK3, and TYK2) were first identified in 1989 (Wilks 1989). Structurally, all members of the JAK family contain seven distinct domains: JAK homology (JH) domains 1 to 7 (JH1–7). The tyrosine kinase domain (JH1) is located at the C-terminus of the protein and is responsible for the kinase activity. The pseudokinase domain (JH2) has no kinase activity, but deletion of the JH2 domain leads to increased kinase activity. JH3 and JH4 are similar to the SH2 domain, and their roles are still unclear (Wilks, Harpur et al. 1991; Lindauer, Loerting et al. 2001; Giordanetto and Kroemer 2002; Saharinen and Silvennoinen 2002). JH5, JH6, and JH7 are located at the amino-terminus of the protein and play a role in binding the JAK molecule to the cytokine receptor and in maintaining receptor expression at the cell surface (Huang, Constantinescu et al. 2001). JAK2 is a nonreceptor tyrosine kinase that mediates signals between cytokine receptors and downstream targets.

An activating mutation of *JAK2*, a valine-to-phenylalanine substitution at position 617 (V617F) (Scott, Tong et al. 2007), leads to constitutive activation of STAT5. The JAK inhibitors INCB018424, TG101348, and lestaurtinib (CEP701), which inhibit JAK1 and JAK2, result in a marked reduction (>50%) in massive splenomegaly (Verstovsek, Kantarjian et al. 2010).
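Labels such as V617F, L858R, and T790M all follow the same pattern: reference residue, position, replacement residue. The sketch below decodes that shorthand; the function name and the partial one-letter to three-letter table are illustrative assumptions, covering only residues that appear in the mutations discussed in this chapter.

```python
import re

# Partial one-letter to three-letter amino acid table (illustrative subset).
ONE_TO_THREE = {
    "C": "Cys", "D": "Asp", "E": "Glu", "F": "Phe", "G": "Gly",
    "H": "His", "L": "Leu", "M": "Met", "Q": "Gln", "R": "Arg",
    "S": "Ser", "T": "Thr", "V": "Val", "Y": "Tyr",
}

# Reference residue, numeric position, replacement residue.
SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_substitution(label: str):
    """Split a label such as 'V617F' into (reference, position, replacement)."""
    match = SUBSTITUTION.match(label)
    if match is None:
        raise ValueError(f"not a simple substitution label: {label!r}")
    ref, pos, alt = match.groups()
    if ref not in ONE_TO_THREE or alt not in ONE_TO_THREE:
        raise ValueError(f"residue not in the illustrative table: {label!r}")
    return ONE_TO_THREE[ref], int(pos), ONE_TO_THREE[alt]


print(parse_substitution("V617F"))  # ('Val', 617, 'Phe')
```

Labels that name only a position, such as R132 in the *IDH1* discussion, do not match this pattern and raise a `ValueError`.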

#### **3.6. Activating mutations at** *c-KIT*


Other kinase-activating mutations have been found in the oncogene *c-KIT* in gastrointestinal stromal tumors (GIST), acral or mucosal melanoma, endometrial carcinoma, germ cell tumors, myeloproliferative diseases, and leukemias; these mutations cause constitutive activation of c-KIT (Malaise, Steinbach et al. 2009). c-KIT is a transmembrane cytokine receptor tyrosine kinase that is expressed on the surface of hematopoietic stem cells. Most GIST patients who harbor *c-KIT* mutations respond to imatinib mesilate (80%). This raises the question of whether the TKIs imatinib or nilotinib may elicit clinical responses in *KIT*-mutant melanoma or endometrial carcinoma or in other cancers that harbor *KIT* mutations. In GIST, acquired resistance to imatinib commonly occurs via secondary mutations in the *c-KIT* kinase domain. For example, the V560G mutation in *KIT* is sensitive to imatinib, whereas the D816V mutation is resistant to it (Mahadevan, Cooke et al. 2007).

#### **3.7. Mutations at** *IDH1 and IDH2*

*IDH1* encodes a nicotinamide adenine dinucleotide phosphate (NADP+)-dependent enzyme that converts isocitrate to α-ketoglutarate (2-ketoglutarate) in the cytoplasm. Somatic mutations in *IDH1* and *IDH2* were found in 88% of individuals with secondary glioblastomas, 68% of those with grade II glioma (lower grade diffuse astrocytomas), 78% of those with grade III anaplastic astrocytomas, and 69% of those with grade III anaplastic oligodendrogliomas (Dang, Jin et al. 2010; Dang, White et al. 2010), as well as in 31% of patients with myeloproliferative neoplasm (Green and Beer 2010) and 10% of those with acute myeloid leukemia (AML) (Dang, Jin et al. 2010; Yen, Bittinger et al. 2010). Mutations in *IDH* were first reported to be inactivating mutations, but subsequent studies of mutations at arginine R132 (in *IDH1*) and at R140 or R172 (in *IDH2*) showed that the mutant enzyme gains a new function: the ability to convert α-ketoglutarate to 2-hydroxyglutarate (Dang, White et al. 2009). Mutations that have been reported in *IDH1* and *IDH2* are summarized in Table 1. Mutations in these metabolic enzymes uncover novel avenues for the development of anticancer therapeutics, but specific inhibitors are needed for the mutated forms R132, R140, or R172. It is not clear what role these mutations play in cancer or whether they are crucial for tumorigenesis, although the 2-hydroxyglutarate metabolite is a biomarker that can be measured in whole blood and used to select targeted therapy (Yen, Bittinger et al. 2010).


Drugs targeting some of these mutations are now either undergoing clinical testing or have protocols in the approval process. The discovery of base mutations through systematic DNA sequencing has provided decisive genetic evidence that these same pathways play crucial roles in tumorigenesis and maintenance and has also opened up new avenues for the deployment of targeted therapeutics. We are just starting to understand the genetic mechanisms that lead to the development of cancer and play a role in treatment. Hence, we are still at the beginning of the road map to targeted therapy. We still need to discover all activating mutations or other chromosomal rearrangements, inactivating mutations, and epigenetic alterations in the genome that drive cells to tumorigenesis for each type and subtype of cancer, and we need to identify resistant and sensitive mutations to find the correct targets for the development of new selective therapeutic agents, and use

Abu-Duhier, F. M., A. C. Goodeve, et al. (2001). Genomic structure of human FLT3:

Alitalo, K., M. Schwab, et al. (1983). Homogeneously staining chromosomal regions contain amplified copies of an abundantly expressed cellular oncogene (c-myc) in malignant neuroendocrine cells from a human colon carcinoma. *Proc Natl Acad Sci U S A* 80(6):

Allegra, C. J., J. M. Jessup, et al. (2009). American Society of Clinical Oncology provisional clinical opinion: testing for KRAS gene mutations in patients with metastatic colorectal carcinoma to predict response to anti-epidermal growth factor receptor monoclonal

Bachman, K. E., P. Argani, et al. (2004). The PIK3CA gene is mutated with high frequency in

Balss, J., J. Meyer, et al. (2008). Analysis of the IDH1 codon 132 mutation in brain tumors.

AC220 (Zarrinkar, Gunawardane et al. 2009).

combination of selective therapeutic agents.

and Christopher I. Amos

antibody therapy. *J Clin Oncol* 27(12): 2091-6.

*Acta Neuropathol* 116(6): 597-602.

human breast cancers. *Cancer Biol Ther* 3(8): 772-5.

*The University of Texas MD Anderson Cancer Center, Houston, Texas, USA* 

implications for mutational analysis. *Br J Haematol* 113(4): 1076-7.

**4. Future directions** 

**Author details** 

*Department of Genetics,* 

Musaffe Tuna\*

**5. References** 

1707-11.

Corresponding Author

 \*

#### **3.8. Fusion genes**

Another recent breakthrough was the discovery of translocations or other chromosomal rearrangements involving ETS transcription factor genes (*ERG*, *ETV1*, and *ETV4*) in >40% of prostate cancers (Tomlins, Rhodes et al. 2005; Tomlins, Laxman et al. 2007; Berger, Lawrence et al. 2011) and the fusion of the anaplastic lymphoma kinase gene (*ALK*) with other genes in NSCLC (Soda, Choi et al. 2007; Choi, Soda et al. 2010). In NSCLC, echinoderm microtubule-associated protein-like 4 (*EML4*) is fused to *ALK*, joining the N-terminus of EML4 to the C-terminus of ALK to form a chimeric tyrosine kinase oncoprotein that is found in 3–5% of NSCLC tumors (Soda, Choi et al. 2007; Choi, Soda et al. 2010). An inversion on chromosome 2p [inv(2)(p21p23)] produces the *EML4-ALK* fusion oncogene. The breakpoint of the inversion varies, and multiple *EML4-ALK* variants have been reported; all retain the intracellular tyrosine kinase domain of *ALK* (exon 20) but differ in where *EML4* is truncated (exon 2, 6, 13, 14, 15, 17, 18, or 20). *TFG* and *KIF5B* have also been reported as *ALK* fusion partners. The most common variant involves exon 13 of *EML4* (Hernandez, Pinyol et al. 1999; Choi, Takeuchi et al. 2008; Takeuchi, Choi et al. 2009). The amino-terminal coiled-coil domain within EML4 is necessary and sufficient for the transforming activity of *EML4-ALK* (Soda, Choi et al. 2007). This fusion tyrosine kinase may activate downstream signaling pathways of ALK, such as RAS/RAF. The discovery of rearrangements between *ALK* and the aforementioned genes has led to the development of another targeted agent, crizotinib (PF-02341066), for the treatment of NSCLC. Crizotinib, a TKI that was initially designed as an inhibitor of MET, is currently used to inhibit both tyrosine kinases, MET and ALK, in NSCLC.

*ALK* rearrangement has been found mostly in lung adenocarcinomas from younger patients who are more likely to be never or light smokers, and it is more frequent in the Asian population than in the American or European population (Sasaki, Rodig et al. 2010). Patients who developed resistance to ALK inhibitors were found to harbor the C1156Y (46.6%) and L1196M (15.1%) mutations in the *ALK* gene (Choi, Soda et al. 2010) and also the F1174L mutation (Sasaki, Okuda et al. 2010).

#### **3.9. Activating mutations at** *FLT3*

*FLT3* encodes a receptor tyrosine kinase that is involved in stem cell development and differentiation, stem and/or progenitor cell survival, and the development of B-progenitor cells, dendritic cells, and natural killer cells in the bone marrow (Small, Levenstein et al. 1994). Two common types of mutation have been found in AML: in-frame internal tandem duplications (ITDs) of 3–400 base pairs in the juxtamembrane region, and point mutations at D835 in the tyrosine kinase domain (TKD) (7%). Both ITD and TKD mutations lead to constitutive activation of the tyrosine kinase (Abu-Duhier, Goodeve et al. 2001), and this finding led to the design of the first-generation FLT3 inhibitors lestaurtinib (CEP701) (Smith, Levis et al. 2004), midostaurin (PKC412A) (Stone, DeAngelo et al. 2005), sunitinib (SU11248) (O'Farrell, Foran et al. 2003), sorafenib (BAY43-9006), and tandutinib (MLN518), followed by the second-generation FLT3 inhibitors KW2449 (Pratz, Cortes et al. 2009) and AC220 (Zarrinkar, Gunawardane et al. 2009).

#### **4. Future directions**


therapeutics, but specific inhibitors are needed for the mutated forms R132, R140, or R172. It is not clear what the role of this mutation is in cancer and whether it is crucial for tumorigenesis, although the 2-hydroxyglutarate metabolite is a biomarker that can be measured in whole blood and used to select targeted therapy (Yen, Bittinger et al. 2010).


Drugs targeting some of these mutations are now either undergoing clinical testing or have protocols in the approval process. The discovery of base mutations through systematic DNA sequencing has provided decisive genetic evidence that these same pathways play crucial roles in tumorigenesis and tumor maintenance, and it has opened up new avenues for the deployment of targeted therapeutics. We are just starting to understand the genetic mechanisms that lead to the development of cancer and that influence its treatment; hence, we are still at the beginning of the road to targeted therapy. For each type and subtype of cancer, we still need to discover all of the activating mutations, chromosomal rearrangements, inactivating mutations, and epigenetic alterations in the genome that drive cells to tumorigenesis. We also need to identify the mutations that confer resistance or sensitivity, so that the correct targets can be chosen for the development of new selective therapeutic agents and these agents can be used in combination.

#### **Author details**

Musaffe Tuna\* and Christopher I. Amos

*Department of Genetics, The University of Texas MD Anderson Cancer Center, Houston, Texas, USA*

<sup>\*</sup> Corresponding Author

#### **5. References**

Baselga, J., D. Tripathy, et al. (1999). Phase II study of weekly intravenous trastuzumab (Herceptin) in patients with HER2/neu-overexpressing metastatic breast cancer. *Semin Oncol* 26(4 Suppl 12): 78-83.

Bean, J., C. Brennan, et al. (2007). MET amplification occurs with or without T790M mutations in EGFR mutant lung tumors with acquired resistance to gefitinib or erlotinib. *Proc Natl Acad Sci U S A* 104(52): 20932-7.

Bell, D. W., I. Gore, et al. (2005). Inherited susceptibility to lung cancer may be associated with the T790M drug resistance mutation in EGFR. *Nat Genet* 37(12): 1315-6.

Berger, M. F., M. S. Lawrence, et al. (2011). The genomic complexity of primary human prostate cancer. *Nature* 470(7333): 214-20.

Bleeker, F. E., S. Lamba, et al. (2009). IDH1 mutations at residue p.R132 (IDH1(R132)) occur frequently in high-grade gliomas but not in other solid tumors. *Hum Mutat* 30(1): 7-11.

Bos, J. L., D. Toksoz, et al. (1985). Amino-acid substitutions at codon 13 of the N-ras oncogene in human acute myeloid leukaemia. *Nature* 315(6022): 726-30.

Branford, S. (2007). Chronic myeloid leukemia: molecular monitoring in clinical practice. *Hematology Am Soc Hematol Educ Program*: 376-83.

Broderick, D. K., C. Di, et al. (2004). Mutations of PIK3CA in anaplastic oligodendrogliomas, high-grade astrocytomas, and medulloblastomas. *Cancer Res* 64(15): 5048-50.

Brose, M. S., P. Volpe, et al. (2002). BRAF and RAS mutations in human lung cancer and melanoma. *Cancer Res* 62(23): 6997-7000.

Burgess, M. R., B. J. Skaggs, et al. (2005). Comparative analysis of two clinically active BCR-ABL kinase inhibitors reveals the role of conformation-specific binding in resistance. *Proc Natl Acad Sci U S A* 102(9): 3395-400.

Campbell, I. G., S. E. Russell, et al. (2004). Mutation of the PIK3CA gene in ovarian and breast cancer. *Cancer Res* 64(21): 7678-81.

Capon, D. J., P. H. Seeburg, et al. (1983). Activation of Ki-ras2 gene in human colon and lung carcinomas by two different point mutations. *Nature* 304(5926): 507-13.

Castaigne, S., C. Chomienne, et al. (1990). All-trans retinoic acid as a differentiation therapy for acute promyelocytic leukemia. I. Clinical results. *Blood* 76(9): 1704-9.

Chapman, M. A., M. S. Lawrence, et al. (2011). Initial genome sequencing and analysis of multiple myeloma. *Nature* 471(7339): 467-72.

Chapman, P. B., A. Hauschild, et al. (2010). Improved survival with vemurafenib in melanoma with BRAF V600E mutation. *N Engl J Med* 364(26): 2507-16.

Choi, Y. L., M. Soda, et al. (2010). EML4-ALK mutations in lung cancer that confer resistance to ALK inhibitors. *N Engl J Med* 363(18): 1734-9.

Choi, Y. L., K. Takeuchi, et al. (2008). Identification of novel isoforms of the EML4-ALK transforming gene in non-small cell lung cancer. *Cancer Res* 68(13): 4971-6.

Collins, S. and M. Groudine (1982). Amplification of endogenous myc-related DNA sequences in a human myeloid leukaemia cell line. *Nature* 298(5875): 679-81.

Courtney, K. D., R. B. Corcoran, et al. (2010). The PI3K pathway as drug target in human cancer. *J Clin Oncol* 28(6): 1075-83.

Curtin, J. A., K. Busam, et al. (2006). Somatic activation of KIT in distinct subtypes of melanoma. *J Clin Oncol* 24(26): 4340-6.

Dang, L., S. Jin, et al. (2010). IDH mutations in glioma and acute myeloid leukemia. *Trends Mol Med* 16(9): 387-97.

Dang, L., D. W. White, et al. (2009). Cancer-associated IDH1 mutations produce 2-hydroxyglutarate. *Nature* 465(7300): 966.

Davies, H., G. R. Bignell, et al. (2002). Mutations of the BRAF gene in human cancer. *Nature* 417(6892): 949-54.

de Muga, S., S. Hernandez, et al. (2010). Molecular alterations of EGFR and PTEN in prostate cancer: association with high-grade and advanced-stage carcinomas. *Mod Pathol* 23(5): 703-12.

Engelman, J. A., K. Zejnullahu, et al. (2007). PF00299804, an irreversible pan-ERBB inhibitor, is effective in lung cancer models with EGFR and ERBB2 mutations that are resistant to gefitinib. *Cancer Res* 67(24): 11924-32.

Engelman, J. A., K. Zejnullahu, et al. (2007). MET amplification leads to gefitinib resistance in lung cancer by activating ERBB3 signaling. *Science* 316(5827): 1039-43.

Eskens, F. A., C. H. Mom, et al. (2008). A phase I dose escalation study of BIBW 2992, an irreversible dual inhibitor of epidermal growth factor receptor 1 (EGFR) and 2 (HER2) tyrosine kinase in a 2-week on, 2-week off schedule in patients with advanced solid tumours. *Br J Cancer* 98(1): 80-5.

Esteller, M. (2007). Cancer epigenomics: DNA methylomes and histone-modification maps. *Nat Rev Genet* 8(4): 286-98.

Fernandes, M. S., M. M. Reddy, et al. (2009). BCR-ABL promotes the frequency of mutagenic single-strand annealing DNA repair. *Blood* 114(9): 1813-9.

Friedrich, M. J. (2011). NSCLC drug targets acquire new visibility. *J Natl Cancer Inst* 103(5): 366-7.

Garnett, M. J. and R. Marais (2004). Guilty as charged: B-RAF is a human oncogene. *Cancer Cell* 6(4): 313-9.

Giordanetto, F. and R. T. Kroemer (2002). Prediction of the structure of human Janus kinase 2 (JAK2) comprising JAK homology domains 1 through 7. *Protein Eng* 15(9): 727-37.

Goldman, J. M. and J. V. Melo (2008). BCR-ABL in chronic myelogenous leukemia--how does it work? *Acta Haematol* 119(4): 212-7.

Gravendeel, L. A., N. K. Kloosterhof, et al. (2010). Segregation of non-p.R132H mutations in IDH1 in distinct molecular subtypes of glioma. *Hum Mutat* 31(3): E1186-99.

Green, A. and P. Beer (2010). Somatic mutations of IDH1 and IDH2 in the leukemic transformation of myeloproliferative neoplasms. *N Engl J Med* 362(4): 369-70.

Hanahan, D. and R. A. Weinberg (2011). Hallmarks of cancer: the next generation. *Cell* 144(5): 646-74.

Hayes, M. P., W. Douglas, et al. (2009). Molecular alterations of EGFR and PIK3CA in uterine serous carcinoma. *Gynecol Oncol* 113(3): 370-3.

Hernandez, L., M. Pinyol, et al. (1999). TRK-fused gene (TFG) is a new partner of ALK in anaplastic large cell lymphoma producing two structurally different TFG-ALK translocations. *Blood* 94(9): 3265-8.

Hochhaus, A., P. La Rosee, et al. (2011). Impact of BCR-ABL mutations on patients with chronic myeloid leukemia. *Cell Cycle* 10(2): 250-60.

Huang, C. H., D. Mandelker, et al. (2007). The structure of a human p110alpha/p85alpha complex elucidates the effects of oncogenic PI3Kalpha mutations. *Science* 318(5857): 1744-8.

Huang, L. J., S. N. Constantinescu, et al. (2001). The N-terminal domain of Janus kinase 2 is required for Golgi processing and cell surface expression of erythropoietin receptor. *Mol Cell* 8(6): 1327-38.

Huang, M. E., Y. C. Ye, et al. (1988). Use of all-trans retinoic acid in the treatment of acute promyelocytic leukemia. *Blood* 72(2): 567-72.

Hutchins, G., K. Southward, et al. (2011). Value of mismatch repair, KRAS, and BRAF mutations in predicting recurrence and benefits from chemotherapy in colorectal cancer. *J Clin Oncol* 29(10): 1261-70.

Hynes, N. E. and H. A. Lane (2005). ERBB receptors and cancer: The complexity of targeted inhibitors. *Nat Rev Cancer* 5(5): 341-54.

Kang, S., A. G. Bader, et al. (2005). Phosphatidylinositol 3-kinase mutations identified in human cancer are oncogenic. *Proc Natl Acad Sci U S A* 102(3): 802-7.

Kobayashi, S., T. J. Boggon, et al. (2005). EGFR mutation and resistance of non-small-cell lung cancer to gefitinib. *N Engl J Med* 352(8): 786-92.

Konopka, B., A. Janiec-Jankowska, et al. (2011). PIK3CA mutations and amplification in endometrioid endometrial carcinomas: relation to other genetic defects and clinicopathologic status of the tumors. *Hum Pathol* 42(11): 1710-9.

Koptyra, M., R. Falinski, et al. (2006). BCR/ABL kinase induces self-mutagenesis via reactive oxygen species to encode imatinib resistance. *Blood* 108(1): 319-27.

Lee, J. C., I. Vivanco, et al. (2006). Epidermal growth factor receptor activation in glioblastoma through novel missense mutations in the extracellular domain. *PLoS Med* 3(12): e485.

Lee, J. W., Y. H. Soung, et al. (2005). PIK3CA gene is frequently mutated in breast carcinomas and hepatocellular carcinomas. *Oncogene* 24(8): 1477-80.

Levine, R. L., M. Wadleigh, et al. (2005). Activating mutation in the tyrosine kinase JAK2 in polycythemia vera, essential thrombocythemia, and myeloid metaplasia with myelofibrosis. *Cancer Cell* 7(4): 387-97.

Li, D., L. Ambrogio, et al. (2008). BIBW2992, an irreversible EGFR/HER2 inhibitor highly effective in preclinical lung cancer models. *Oncogene* 27(34): 4702-11.

Li, D., T. Shimamura, et al. (2007). Bronchial and peripheral murine lung carcinomas induced by T790M-L858R mutant EGFR respond to HKI-272 and rapamycin combination therapy. *Cancer Cell* 12(1): 81-93.

Lindauer, K., T. Loerting, et al. (2001). Prediction of the structure of human Janus kinase 2 (JAK2) comprising the two carboxy-terminal domains reveals a mechanism for autoregulation. *Protein Eng* 14(1): 27-37.

Loeffler-Ragg, J., M. Witsch-Baumgartner, et al. (2006). Low incidence of mutations in EGFR kinase domain in Caucasian patients with head and neck squamous cell carcinoma. *Eur J Cancer* 42(1): 109-11.

Lugo, T. G., A. M. Pendergast, et al. (1990). Tyrosine kinase activity and transformation potency of bcr-abl oncogene products. *Science* 247(4946): 1079-82.

Lynch, T. J., D. W. Bell, et al. (2004). Activating mutations in the epidermal growth factor receptor underlying responsiveness of non-small-cell lung cancer to gefitinib. *N Engl J Med* 350(21): 2129-39.

MacConaill, L. E., C. D. Campbell, et al. (2009). Profiling critical cancer gene mutations in clinical tumor samples. *PLoS One* 4(11): e7887.

Mahadevan, D., L. Cooke, et al. (2007). A novel tyrosine kinase switch is a mechanism of imatinib resistance in gastrointestinal stromal tumors. *Oncogene* 26(27): 3909-19.

Malaise, M., D. Steinbach, et al. (2009). Clinical implications of c-Kit mutations in acute myelogenous leukemia. *Curr Hematol Malig Rep* 4(2): 77-82.

Mauro, M. J., M. O'Dwyer, et al. (2002). STI571: a paradigm of new agents for cancer therapeutics. *J Clin Oncol* 20(1): 325-34.

Metzger, B., L. Chambeau, et al. (2011). The human epidermal growth factor receptor (EGFR) gene in European patients with advanced colorectal cancer harbors infrequent mutations in its tyrosine kinase domain. *BMC Med Genet* 12: 144.

Montagut, C., A. Dalmases, et al. (2012). Identification of a mutation in the extracellular domain of the Epidermal Growth Factor Receptor conferring cetuximab resistance in colorectal cancer. *Nat Med* 18(2): 221-3.

Muller, M. C., J. E. Cortes, et al. (2009). Dasatinib treatment of chronic-phase chronic myeloid leukemia: analysis of responses according to preexisting BCR-ABL mutations. *Blood* 114(24): 4944-53.

Murugan, A. K., J. Dong, et al. (2011). Uncommon GNAQ, MMP8, AKT3, EGFR, and PIK3R1 mutations in thyroid cancers. *Endocr Pathol* 22(2): 97-102.

O'Farrell, A. M., J. M. Foran, et al. (2003). An innovative phase I clinical study demonstrates inhibition of FLT3 phosphorylation by SU11248 in acute myeloid leukemia patients. *Clin Cancer Res* 9(15): 5465-76.

O'Hare, T., D. K. Walters, et al. (2005). In vitro activity of Bcr-Abl inhibitors AMN107 and BMS-354825 against clinically relevant imatinib-resistant Abl kinase domain mutants. *Cancer Res* 65(11): 4500-5.

Paez, J. G., P. A. Janne, et al. (2004). EGFR mutations in lung cancer: correlation with clinical response to gefitinib therapy. *Science* 304(5676): 1497-500.

Pao, W. and J. Chmielecki (2010). Rational, biologically based treatment of EGFR-mutant non-small-cell lung cancer. *Nat Rev Cancer* 10(11): 760-74.

Pao, W., V. Miller, et al. (2004). EGF receptor gene mutations are common in lung cancers from "never smokers" and are associated with sensitivity of tumors to gefitinib and erlotinib. *Proc Natl Acad Sci U S A* 101(36): 13306-11.

Pao, W., V. A. Miller, et al. (2005). Acquired resistance of lung adenocarcinomas to gefitinib or erlotinib is associated with a second mutation in the EGFR kinase domain. *PLoS Med* 2(3): e73.

Pao, W., T. Y. Wang, et al. (2005). KRAS mutations and primary resistance of lung adenocarcinomas to gefitinib or erlotinib. *PLoS Med* 2(1): e17.

Parsons, D. W., T. L. Wang, et al. (2005). Colorectal cancer: mutations in a signalling pathway. *Nature* 436(7052): 792.

Passamonti, F., C. Elena, et al. (2011). Molecular and clinical features of the myeloproliferative neoplasm associated with JAK2 exon 12 mutations. *Blood* 117(10): 2813-6.

Peraldo-Neia, C., G. Migliardi, et al. (2011). Epidermal Growth Factor Receptor (EGFR) mutation analysis, gene expression profiling and EGFR protein expression in primary prostate cancer. *BMC Cancer* 11: 31.

Prahallad, A., C. Sun, et al. (2012). Unresponsiveness of colon cancer to BRAF(V600E) inhibition through feedback activation of EGFR. *Nature* 483(7387): 100-3.

Pratz, K. W., J. Cortes, et al. (2009). A pharmacodynamic study of the FLT3 inhibitor KW-2449 yields insight into the basis for clinical response. *Blood* 113(17): 3938-46.

Reddy, E. P., R. K. Reynolds, et al. (1982). A point mutation is responsible for the acquisition of transforming properties by the T24 human bladder carcinoma oncogene. *Nature* 300(5888): 149-52.

Reitman, Z. J. and H. Yan (2010). Isocitrate dehydrogenase 1 and 2 mutations in cancer: alterations at a crossroads of cellular metabolism. *J Natl Cancer Inst* 102(13): 932-41.

Robinson, M. J. and M. H. Cobb (1997). Mitogen-activated protein kinase pathways. *Curr Opin Cell Biol* 9(2): 180-6.

Saharinen, P. and O. Silvennoinen (2002). The pseudokinase domain is required for suppression of basal activity of Jak2 and Jak3 tyrosine kinases and for cytokine-inducible activation of signal transduction. *J Biol Chem* 277(49): 47954-63.

Samuels, Y., L. A. Diaz, Jr., et al. (2005). Mutant PIK3CA promotes cell growth and invasion of human cancer cells. *Cancer Cell* 7(6): 561-73.

Samuels, Y. and K. Ericson (2006). Oncogenic PI3K and its role in cancer. *Curr Opin Oncol* 18(1): 77-82.

Samuels, Y., Z. Wang, et al. (2004). High frequency of mutations of the PIK3CA gene in human cancers. *Science* 304(5670): 554.

Sasaki, T., K. Okuda, et al. (2010). The neuroblastoma-associated F1174L ALK mutation causes resistance to an ALK kinase inhibitor in ALK-translocated cancers. *Cancer Res* 70(24): 10038-43.

Sasaki, T., S. J. Rodig, et al. (2010). The biology and treatment of EML4-ALK non-small cell lung cancer. *Eur J Cancer* 46(10): 1773-80.

Scott, L. M., W. Tong, et al. (2007). JAK2 exon 12 mutations in polycythemia vera and idiopathic erythrocytosis. *N Engl J Med* 356(5): 459-68.

Sequist, L. V., B. Besse, et al. (2010). Neratinib, an irreversible pan-ErbB receptor tyrosine kinase inhibitor: results of a phase II trial in patients with advanced non-small-cell lung cancer. *J Clin Oncol* 28(18): 3076-83.

Shimizu, K., D. Birnbaum, et al. (1983). Structure of the Ki-ras gene of the human lung carcinoma cell line Calu-1. *Nature* 304(5926): 497-500.

Slamon, D. J., B. Leyland-Jones, et al. (2001). Use of chemotherapy plus a monoclonal antibody against HER2 for metastatic breast cancer that overexpresses HER2. *N Engl J Med* 344(11): 783-92.

Small, D., M. Levenstein, et al. (1994). STK-1, the human homolog of Flk-2/Flt-3, is selectively expressed in CD34+ human bone marrow cells and is involved in the proliferation of early progenitor/stem cells. *Proc Natl Acad Sci U S A* 91(2): 459-63.

Smith, B. D., M. Levis, et al. (2004). Single-agent CEP-701, a novel FLT3 inhibitor, shows biologic and clinical activity in patients with relapsed or refractory acute myeloid leukemia. *Blood* 103(10): 3669-76.

Soda, M., Y. L. Choi, et al. (2007). Identification of the transforming EML4-ALK fusion gene in non-small-cell lung cancer. *Nature* 448(7153): 561-6.

Soverini, S., S. Colarossi, et al. (2006). Contribution of ABL kinase domain mutations to imatinib resistance in different subsets of Philadelphia-positive patients: by the GIMEMA Working Party on Chronic Myeloid Leukemia. *Clin Cancer Res* 12(24): 7374-9.

Stoklosa, T., T. Poplawski, et al. (2008). BCR/ABL inhibits mismatch repair to protect from apoptosis and induce point mutations. *Cancer Res* 68(8): 2576-80.

Stone, R. M., D. J. DeAngelo, et al. (2005). Patients with acute myeloid leukemia and an activating mutation in FLT3 respond to a small-molecule FLT3 tyrosine kinase inhibitor, PKC412. *Blood* 105(1): 54-60.

Stransky, N., A. M. Egloff, et al. (2011). The mutational landscape of head and neck squamous cell carcinoma. *Science* 333(6046): 1157-60.

Tabin, C. J., S. M. Bradley, et al. (1982). Mechanism of activation of a human oncogene. *Nature* 300(5888): 143-9.

Takeuchi, K., Y. L. Choi, et al. (2009). KIF5B-ALK, a novel fusion oncokinase identified by an immunohistochemistry-based diagnostic system for ALK-positive lung cancer. *Clin Cancer Res* 15(9): 3143-9.

Tanaka, Y., Y. Terai, et al. (2011). Prognostic effect of epidermal growth factor receptor gene mutations and the aberrant phosphorylation of Akt and ERK in ovarian cancer. *Cancer Biol Ther* 11(1): 50-7.

Taub, R., I. Kirsch, et al. (1982). Translocation of the c-myc gene into the immunoglobulin heavy chain locus in human Burkitt lymphoma and murine plasmacytoma cells. *Proc Natl Acad Sci U S A* 79(24): 7837-41.

The Cancer Genome Atlas Network (2008). Comprehensive genomic characterization defines human glioblastoma genes and core pathways. *Nature* 455(7216): 1061-8.

The Cancer Genome Atlas Network (2011). Integrated genomic analyses of ovarian carcinoma. *Nature* 474(7353): 609-15.

Teng, Y. H., W. J. Tan, et al. (2011). Mutations in the epidermal growth factor receptor (EGFR) gene in triple negative breast cancer: possible implications for targeted therapy. *Breast Cancer Res* 13(2): R35.

Thomas, R. K., A. C. Baker, et al. (2007). High-throughput oncogene mutation profiling in human cancer. *Nat Genet* 39(3): 347-51.

Tiacci, E., V. Trifonov, et al. (2011). BRAF mutations in hairy-cell leukemia. *N Engl J Med* 364(24): 2305-15.

Tomlins, S. A., B. Laxman, et al. (2007). Distinct classes of chromosomal rearrangements create oncogenic ETS gene fusions in prostate cancer. *Nature* 448(7153): 595-9.

Tomlins, S. A., D. R. Rhodes, et al. (2005). Recurrent fusion of TMPRSS2 and ETS transcription factor genes in prostate cancer. *Science* 310(5748): 644-8.

Van Raamsdonk, C. D., V. Bezrookove, et al. (2009). Frequent somatic mutations of GNAQ in uveal melanoma and blue naevi. *Nature* 457(7229): 599-602.

Van Raamsdonk, C. D., K. G. Griewank, et al. (2010). Mutations in GNA11 in uveal melanoma. *N Engl J Med* 363(23): 2191-9.

Vennstrom, B., D. Sheiness, et al. (1982). Isolation and characterization of c-myc, a cellular homolog of the oncogene (v-myc) of avian myelocytomatosis virus strain 29. *J Virol* 42(3): 773-9.

Verstovsek, S., H. Kantarjian, et al. (2010). Safety and efficacy of INCB018424, a JAK1 and JAK2 inhibitor, in myelofibrosis. *N Engl J Med* 363(12): 1117-27.

Vikis, H., M. Sato, et al. (2007). EGFR-T790M is a rare lung cancer susceptibility allele with enhanced kinase activity. *Cancer Res* 67(10): 4665-70.

Vogel, C. L., M. A. Cobleigh, et al. (2002). Efficacy and safety of trastuzumab as a single agent in first-line treatment of HER2-overexpressing metastatic breast cancer. *J Clin Oncol* 20(3): 719-26.

Weisberg, E., M. Sattler, et al. (2010). Drug resistance in mutant FLT3-positive AML. *Oncogene* 29(37): 5120-34.

Wellcome Trust Sanger Institute Catalog of Somatic Mutations in Cancer [online], http://www.sanger.ac.uk/perl/genetics/CGP/cosmic?action=gene&ln=BRAF

Wilks, A. F. (1989). Two putative protein-tyrosine kinases identified by application of the polymerase chain reaction. *Proc Natl Acad Sci U S A* 86(5): 1603-7.

Wilks, A. F., A. G. Harpur, et al. (1991). Two novel protein-tyrosine kinases, each with a second phosphotransferase-related catalytic domain, define a new class of protein kinase. *Mol Cell Biol* 11(4): 2057-65.

Wong, K. K., P. M. Fracasso, et al. (2009). A phase I study with neratinib (HKI-272), an irreversible pan ErbB receptor tyrosine kinase inhibitor, in patients with solid tumors. *Clin Cancer Res* 15(7): 2552-8.

Xing, M., W. H. Westra, et al. (2005). BRAF mutation predicts a poorer clinical prognosis for papillary thyroid cancer. *J Clin Endocrinol Metab* 90(12): 6373-9.

Yan, H., D. W. Parsons, et al. (2009). IDH1 and IDH2 mutations in gliomas. *N Engl J Med* 360(8): 765-73.

Yano, S., W. Wang, et al. (2008). Hepatocyte growth factor induces gefitinib resistance of lung adenocarcinoma with epidermal growth factor receptor-activating mutations. *Cancer Res* 68(22): 9479-87.

Yap, T. A., L. Vidal, et al. (2010). Phase I trial of the irreversible EGFR and HER2 kinase inhibitor BIBW 2992 in patients with advanced solid tumors. *J Clin Oncol* 28(25): 3965-72.

Yeang, C. H., F. McCormick, et al. (2008). Combinatorial patterns of somatic gene mutations in cancer. *FASEB J* 22(8): 2605-22.

Yen, K. E., M. A. Bittinger, et al. (2010). Cancer-associated IDH mutations: biomarker and therapeutic opportunities. *Oncogene* 29(49): 6409-17.

Yun, C. H., K. E. Mengwasser, et al. (2008). The T790M mutation in EGFR kinase causes drug resistance by increasing the affinity for ATP. *Proc Natl Acad Sci U S A* 105(6): 2070-5.

Zarrinkar, P. P., R. N. Gunawardane, et al. (2009). AC220 is a uniquely potent and selective inhibitor of FLT3 for the treatment of acute myeloid leukemia (AML). *Blood* 114(14): 2984-92.

Zhao, L. and P. K. Vogt (2008). Helical domain and kinase domain mutations in p110alpha of phosphatidylinositol 3-kinase induce gain of function by different mechanisms. *Proc Natl Acad Sci U S A* 105(7): 2652-7.

Zhou, W., D. Ercan, et al. (2009). Novel mutant-selective EGFR kinase inhibitors against EGFR T790M. *Nature* 462(7276): 1070-4.

## *Edited by David N. Cooper and Jian-Min Chen*

Different types of mutation can vary in size, from structural variants to single basepair substitutions, but what they all have in common is that their nature, size and location are often determined either by specific characteristics of the local DNA sequence environment or by higher order features of the genomic architecture. The genomes of higher organisms are now known to contain "pervasive architectural flaws" in that certain DNA sequences are inherently mutation prone by virtue of their base composition, sequence repetitivity and/or epigenetic modification. In this volume, a number of different authors from diverse backgrounds describe how the nature, location and frequency of different types of mutation causing inherited disease are shaped in large part, and often in remarkably predictable ways, by the local DNA sequence environment.

Mutations in Human Genetic Disease


Photo by Svisio / iStock