Next generation sequencing, and specifically the quantitative sequencing of all RNA molecules, has recently become an alternative, though expensive, method for measuring levels of gene expression. Nevertheless, it has the potential to add further knowledge and value to the application of gene expression profiling to disease.

**3. Uses of gene expression profiling**

Quantification of the transcriptome has been a useful approach for both discovering and defining mechanisms of pathogenesis in ALS (Cox et al 2010; Ferraiuolo et al 2007; Kirby et al 2005; Kirby et al 2011). In particular, lists of differentially expressed genes can be usefully converted to functional 'themes' by an enrichment analysis (Hosack et al 2003). Various categorisations exist, including the Gene Ontology (GO) and the Kyoto Encyclopedia of Genes and Genomes (KEGG), which classify genes according to molecular function, biological process, cellular component or a known biological pathway.

A frequent application of gene expression profiling has been the development of putative biomarkers via a supervised classification approach (Booij et al 2011; Nagasaka et al 2005; Scherzer et al 2007). The large number of targets quantified simultaneously by gene expression profiling is essential for biomarker discovery, as it allows an unbiased survey of the most informative RNA transcripts. A reliable biomarker(s) for defining pathogenesis and prognosis in ALS has yet to be established, though gene expression profiling is one of the methodologies currently being used to establish an ALS biomarker(s).

**4. Sources of material for studying ALS**

In ALS, the vulnerable cells are the motor neurones, located in the motor cortex, brainstem and spinal cord. Since motor neurones cannot be sampled during life, model systems, such as neuronal-like cells in culture and animals carrying mutant transgenes, have been used to study the neurodegenerative process. In addition to sampling post-mortem material from patients, peripheral tissue from living patients (and neurologically normal controls) has been used as source material for applying gene expression profiling to ALS.

### **4.1 Cellular models of ALS**

Neuronal cell lines, both human (e.g. SH-SY5Y) and rodent (e.g. PC12), have been used as a model to investigate mechanisms of neurodegeneration. However, one of the most widely used cellular models for examining the molecular pathophysiology underpinning the neurodegenerative disease process in ALS is the mouse spinal cord/neuroblastoma cell line NSC-34 (Cashman et al 1992). Immortalized NSC-34 cells recapitulate many of the characteristics of motor neurones whilst maintaining their ability to proliferate in culture, thus providing a continual resource of motor neurone-like cells (Cashman et al 1992; Durham et al 1993). In particular, they have proved a robust model of mutant copper zinc superoxide dismutase-1 (*SOD1*) associated familial ALS (FALS), as they can be transfected with vectors carrying normal or mutant forms of the human *SOD1* gene (Durham et al 1997; Menzies et al 2002a; Menzies et al 2002b). Cellular models of TDP-43-related ALS, caused by mutation of the TAR DNA binding protein gene (*TARDBP*), are being generated (Duan et al 2010; Igaz et al 2009), though sustained over-expression is proving problematic owing to the toxicity of both the wild-type and mutant over-expressed proteins and the tight auto-regulation of TDP-43 (Budini & Buratti 2011).

Gene expression profiling of cell lines transfected with ALS associated genes provides a genetically homogeneous cell population uncontaminated by non-neuronal astrocytes and other types of glial cells, which are present in the central nervous system (CNS). Furthermore, environmental conditions can easily be manipulated and tightly controlled *in vitro* so as to reduce the impact of external confounding factors on gene expression. The limitations of this type of model system include the fact that NSC-34 cell lines (or other neuronal cell lines) are continually dividing cells, rather than post-mitotic cells, and they are unable to mirror the effects of cellular interactions that occur between the different cell populations *in situ* (Kirby et al 2011).

Primary neuronal and astrocytic cells can be isolated from embryonic mice and short-term cultures generated for microarray analysis. These cells more closely mirror those present in the CNS, though the primary neuronal cultures, as they are post-mitotic cells, have a limited lifespan of 7-10 days. In contrast, cultured primary astrocytes are able to proliferate in culture. Whilst co-cultures or separated co-cultures allow a degree of interaction between the two cell types, these types of mixed cultures have not yet been used for microarray analysis in ALS.

### **4.2 Animal models of ALS**

Transgenic mice expressing mutant forms of ALS-related genes provide a source of RNA for microarray analysis. For investigating the mechanisms of *SOD1*-related neurodegeneration, mice over-expressing the human p.G93A or mouse p.G86R mutant forms of the SOD1 protein (SOD1G93A or SOD1G86R) have been used as they develop an age-dependent neuromuscular condition; the motor function symptoms and histopathological features have been extensively characterised and resemble those observed in both *SOD1*-related ALS and classical ALS patients (Gurney et al 1994; Ripps et al 1995). In contrast, over-expression of wild-type human SOD1 (SOD1WT) does not produce an overt motor phenotype, supporting a toxic gain of function by the mutant SOD1 protein as the mechanism by which the mutant proteins cause cell death. By comparison, mouse models over-expressing either wild-type or mutant TDP-43 show a neurodegenerative phenotype (Igaz et al 2011; Stallings et al 2010).

One of the major advantages of using animal models for microarray analysis is the ability to examine animals at different ages in order to investigate the progression of disease, an approach that is unattainable in human post-mortem tissue. Valuable insights regarding onset of disease can be gained at pre-symptomatic and early symptomatic disease stages, since these represent time points at which the identification of key novel targets for therapeutic intervention could be best placed to rescue vulnerable neuronal cell populations before the development of irreversible neuronal injury (Ferraiuolo et al 2007). In addition, sampling of specific cell types from the CNS allows gene expression changes to be identified which include the effects of interactions with neighbouring cells.

Backcrossing of the SOD1WT and SOD1G93A mice with C57BL/6 mice has produced SOD1WT and SOD1G93A mice on a homogeneous background (Ferraiuolo et al 2007). The use of these mice for microarray analysis, with non-transgenic littermates as controls, has proven effective in reducing inter-individual genetic variability and so in generating consistent and reliable gene expression data.

### **4.3 Human post-mortem material**

Human post-mortem brain and spinal cord specimens derived from clinically and pathologically confirmed cases of ALS can be used in comparisons with age-, gender- and ethnicity-matched neurologically normal controls and are a pivotal source of RNA for gene expression profiling in ALS. Brain and spinal cord represent the tissues that are most susceptible to the underlying neurodegenerative disease process. However, there are a number of limitations that should be taken into account when using this tissue. First and foremost, the transcriptional changes that are detected reflect the terminal stage of disease, with the majority of vulnerable neuronal cells having already undergone cell death (Sharp et al 2006). Therefore, it can be difficult to distinguish whether the changes detected are due to the survival response in the remaining cells, initiation of a cell death pathway (if the cells are beginning to die), or whether the changes are present in response to a pathogenic trigger (Lederer et al 2007). Secondly, samples, and particularly control material, are difficult to obtain and often in short supply. This restricts the sample sizes that can be used, which has a negative impact upon the statistical power of such studies. Considerable heterogeneity exists between individuals in disease and control groups and gene expression changes may also be influenced by post-mortem interval, variability in brain pH, degree of neuroinflammation and sample type including white versus grey matter or cortical motor neurones versus spinal motor neurones. Furthermore, *ex vivo* RNA degradation, particularly during sample preparation, should also be taken into consideration (Maes et al 2007).
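The effect of a restricted sample size on statistical power can be illustrated with a rough calculation. The sketch below is not taken from any of the studies cited; it assumes a two-group comparison of a single transcript, a standardised effect size (difference in group means divided by the standard deviation) and a normal approximation to the two-sample test. The function name `approx_power` and the values used are illustrative only.

```python
from math import sqrt
from statistics import NormalDist


def approx_power(effect_size, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-group comparison.

    Normal approximation; the negligible contribution of the
    opposite tail is ignored. All inputs are illustrative assumptions.
    """
    z = NormalDist()                      # standard normal distribution
    z_crit = z.inv_cdf(1 - alpha / 2)     # two-sided critical value
    shift = effect_size * sqrt(n_per_group / 2)
    return z.cdf(shift - z_crit)


# Power for a large effect (one standard deviation) at several group sizes
for n in (5, 10, 20, 40):
    print(n, round(approx_power(1.0, n), 2))
```

In a microarray experiment many thousands of transcripts are tested at once, so the significance threshold after correction for multiple testing is far stricter than 0.05 and the power at a given sample size is lower still.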

RNA extraction protocols are faster and more straightforward for whole tissue homogenates. However, the inclusion of a heterogeneous mixture of cell populations may in effect blur the distinctive profile emanating from the neuronal cells that are most affected in ALS (Kirby et al 2011). In comparing the gene expression profiles generated from diseased tissue with those of healthy controls, there is the added issue of a shift in the proportion of different cell populations within the sample, as neurodegeneration is characterised by the loss of motor neurones and the active proliferation of non-neuronal microglia and macrophages (Dangond et al 2004). Laser capture microdissection (LCM) is a more expensive and laborious technique, but it is beneficial in studying a neurone-enriched cell population (Jiang et al 2005). It is also noteworthy that, in comparison to primary cultures, these cells have had the benefit of being embedded within their natural environment, where physiologically relevant cross talk has taken place with neighbouring cells and tissues (Ferraiuolo et al 2007).
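The confounding effect of a shift in cell-type proportions can be made concrete with a toy mixture calculation. The sketch below assumes that the signal measured in a whole tissue homogenate is the proportion-weighted sum of the expression in each cell type. The expression values, the proportions and the names `bulk_expression` and `log2_fold_change` are invented for illustration and are not drawn from any ALS data set.

```python
from math import log2


def bulk_expression(per_cell_type, proportions):
    """Proportion-weighted expression of one gene in a tissue homogenate."""
    if abs(sum(proportions.values()) - 1.0) > 1e-9:
        raise ValueError("cell-type proportions must sum to 1")
    return sum(per_cell_type[cell] * proportions[cell] for cell in proportions)


def log2_fold_change(case, control):
    """Log2 ratio of case signal to control signal."""
    return log2(case / control)


# Hypothetical neurone-enriched transcript: per-cell expression is
# identical in control and diseased tissue.
expression = {"motor_neurone": 100.0, "glia": 10.0}

# Hypothetical tissue composition: neurones are lost and glia expand.
control_mix = {"motor_neurone": 0.30, "glia": 0.70}
disease_mix = {"motor_neurone": 0.10, "glia": 0.90}

control_signal = bulk_expression(expression, control_mix)
disease_signal = bulk_expression(expression, disease_mix)

print(control_signal, disease_signal,
      round(log2_fold_change(disease_signal, control_signal), 2))
```

Under these assumed values the transcript appears to be roughly halved in the diseased homogenate even though no individual cell has altered its expression, which is the kind of distortion that enrichment of a single cell population by LCM is intended to avoid.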

### **4.4 Human peripheral tissue samples**

Peripheral tissues such as whole venous blood, cultured skin fibroblasts and muscle biopsy material offer an attractive and readily accessible resource for microarray analysis in ALS. Samples can be collected longitudinally, particularly as the collection of blood is relatively non-invasive, and sampling techniques can be standardized across research centres (Highley et al 2011; Saris et al 2009; Shtilbans et al 2011; Tsuang et al 2005).


Blood is classified as a fluid connective tissue composed of plasma (55%), erythrocytes (43%), leukocytes (0.5%) and platelets (1.5%), which continuously permeates and interacts with every other tissue and organ of the mammalian body. It is in a permanent state of renewal and is known to play a pivotal role in physiological homeostasis, cellular immunity and inflammation (Mohr & Liew 2007). Since 80% of the genes routinely expressed within the CNS have also been detected in circulating blood cells, it is anticipated that there are quantifiable changes in the levels of these gene transcripts which have the potential to act as a sentinel of disease (Liew et al 2006). Evidence suggests that the sheer abundance of endogenous alpha and beta globin messenger RNA (mRNA), which constitutes up to 70% of erythrocytic mRNA transcripts, masks the expression of less abundant genes of potential biological significance as a result of its high signal intensity on the microarray (Wright et al 2008). Thus, strategies have been developed and evaluated to remove these transcripts from RNA samples prepared from whole blood (Liu et al 2006). Alternatively, fractionation of peripheral blood mononuclear cells (PBMCs) can be used, though the additional processing steps involved in isolating the PBMCs can introduce spurious alterations in gene expression that are not attributable to the disease (Whitney et al 2003).

Fibroblasts are not known to have any direct involvement in ALS, though they provide a model with the genetic background of the individual and have the added value of being a source for the generation of induced pluripotent stem cells (iPS) or for the direct manipulation into motor neuronal cells (Dimos et al 2008; Son et al 2011). In addition, fibroblasts have been shown to reflect changes in patients with neurodegenerative disease (Aguirre et al 1998; Hoepken et al 2008; Mortiboys et al 2008) and their gene expression profiles distinguish pre-symptomatic individuals from controls (Nagasaka et al 2005).

In contrast to fibroblasts, skeletal muscle is severely affected by the disease. Whilst muscle biopsies are the most invasive sample to collect, they are a useful source for gene expression profiling, as they provide a window into the mechanisms involved in the neuromuscular degeneration that occurs in ALS during life (Dadon-Nachum et al 2011; Dupuis & Loeffler 2009).

### **5. Results from use of cellular models of ALS**

Gene expression profiling of the widely used NSC34 cellular model of *SOD1*-related FALS identified a marked degree of transcriptional repression in the presence of the SOD1G93A mutation (Kirby et al 2005). These repressed genes included a group of antioxidant response element (ARE) genes or "programmed cell life" genes that are regulated by the Nrf2 transcription factor (Figure 1). Reduced expression of *Nrf2* and selected downstream targets was seen at both the RNA and protein levels in the cellular model, and *NRF2* dysregulation was also demonstrated in isolated motor neurones from *SOD1*-related ALS cases. Subsequent work by Sarlette and colleagues has shown that *NRF2* transcription and translation are also decreased in SALS cases (Sarlette et al 2008). Most recently, it has been demonstrated that activation of *Nrf2* improves neuronal survival in NSC34 *SOD1*-related ALS cell models, in primary motor neurone and astrocyte co-cultures and in the SOD1G93A mouse model (Neymotin et al 2011; Vargas et al 2008a).

Primary astrocyte cultures prepared from mutant SOD1 rodent models have been shown to increase motor neuronal death in co-cultures (Nagai et al 2007; Vargas et al 2006). To investigate this effect, gene expression profiling was performed on RNA isolated from astrocyte cultures generated from SOD1G93A rats (and littermate controls). Perhaps surprisingly, there were limited differences in the transcriptional profiles of transgenic and non-transgenic astrocytes (Vargas et al 2008b). However, the two most dysregulated genes, regulator of differentiation (*Rod1*) and decorin (*Dcn*), both showed consistent changes in asymptomatic and early symptomatic rats, implicating components of RNA processing and the extracellular matrix as contributors early in the disease process (Figure 1).

Primary neuronal cultures originating from SOD1G93A mice and subjected to oxidative stress (H2O2) or excitotoxicity (NMDA) demonstrated a greater level of cell death compared to non-transgenic cultures (Boutahar et al 2011). Microarray analysis detected cytoskeletal remodelling and vesicular transport related genes as increased in response to oxidative stress, whilst genes involved in the ubiquitin-proteasome system and cytokines were increased following excitotoxicity (Figure 1). Several of these pathways have already been implicated as playing a pathogenic role in ALS and add further support to the idea that the proposed disease mechanisms are mutually compatible.

Fig. 1. Summary of prominent pathways arising from GEP of Cellular Models. Important changes in the transcriptome have been highlighted by green labels; yellow stars indicate up-regulation, red stars indicate down-regulation. Blue squares outline functional consequences of changes in the transcriptome. Further details are discussed in the text.

express human TDP-43 without the nuclear localization signal (hTDP43-delNLS) developed signs of motor spasticity, neurone loss in forebrain regions and corticospinal tract degeneration (Igaz et al 2011). Microarray analysis of hTDP43-delNLS expression in the cortex of mutant mice, following 2 weeks induction of the mutant protein, detected dramatic changes in gene expression, with the most enriched pathway being chromatin assembly (Figure 2). Interestingly, after only 2 weeks of hTDP43-delNLS induction, markers of inflammation and neuronal loss were unchanged.

Fig. 2. Summary of prominent pathways arising from GEP of Animal Models. Important changes in the transcriptome have been highlighted by green labels; yellow stars indicate up-regulation, red stars indicate down-regulation. Blue squares outline functional consequences of changes in the transcriptome. Further details are discussed in the text.

Although these studies have greatly contributed to present knowledge on the transcriptional changes occurring in ALS, the analysis of a mixed cell population within the CNS has several disadvantages. This kind of approach does not identify which cell population is responsible for the transcriptional changes observed and only detects those transcripts most highly differentially expressed, with subtle but potentially pivotal gene expression changes masked, as are changes in genes that are differentially expressed in one cell type but not in others.

### **6.2 Gene expression profiling of laser capture microdissection isolated cell types**

In order to overcome the limitations of using mixed cell population samples, dissection of single cells from complex tissues using LCM has been applied to identify the contribution of different cell types to the degenerative process occurring in ALS.
