**2. Genetic screening technologies for the evaluation of ASD cases**


Genetic Evaluation of Individuals with Autism Spectrum Disorders

http://dx.doi.org/10.5772/53900

Recent Advances in Autism Spectrum Disorders - Volume I

Autism spectrum disorders (ASD) are among the most highly heritable neurodevelopmental disorders, and extensive research has focused on identifying their underlying genetic basis. It has become apparent that ASD is genetically heterogeneous, with hundreds of genes and chromosomal rearrangements identified that confer varying degrees of risk for disease. Initially, susceptibility genes and genomic loci were identified by costly, low-throughput techniques, such as automated Sanger sequencing and conventional cytogenetic techniques. The need for lower-cost, higher-throughput genetic screening technologies capable of identifying genome-wide variation in individuals with genetically complex diseases, such as ASD, has driven improvements in pre-existing techniques and the development of new technologies. The genetic screening technologies presently available to clinical geneticists and researchers provide lower-cost, high-throughput genetic data that have significantly expanded our knowledge of genetic variation, both in the general population and in individuals with ASD in particular. The first major high-throughput studies aimed at identifying CNVs and SNVs in ASD cohorts were published in 2010 and 2011, respectively [1, 2]. Here we describe in greater detail two of these genetic screening technologies that have become widely used in the genetic evaluation of ASD cases: next generation sequencing (NGS) and chromosomal microarray (CMA).

#### **2.1. Next generation sequencing**

Next-generation sequencing (NGS) is a term used to describe a collection of high-throughput sequencing technologies that have enabled clinicians to screen larger amounts of genetic material at lower cost than traditional sequencing technologies, such as automated Sanger sequencing [3]. NGS is typically used to identify single nucleotide variants (SNVs), as well as small insertions or deletions in candidate genes. However, NGS can also be used to identify copy number variants (CNVs), as was recently demonstrated in a report detailing whole exome sequencing in a cohort of ASD cases [4], as well as balanced chromosomal rearrangements, which are typically not detected by genome-wide microarrays [5, 6]. Since 2011, six research articles have been published that have identified rare variants in both existing and novel ASD susceptibility genes using NGS techniques [2, 4, 7-10], a fact that illustrates how extensively these techniques have been adopted by the ASD research community. As a result of these studies, the number of potential ASD-linked genes has increased dramatically (Table 1). Furthermore, the minimal overlap of candidate genes across these studies further illustrates the genetic heterogeneity of ASD.

NGS techniques are typically divided into three categories, each with its own advantages and disadvantages (summarized in Table 2). These techniques vary in terms of genetic coverage (the size of the sequenced target, which can range from one or a few genes to the entire genome) and genetic resolution (sensitivity in detection of variants per sequencing target). In general, the smaller the genetic coverage, the higher the genetic resolution. It should be noted that, as the size of the target for sequencing increases, so does the number of both false-positive and false-negative variants [11]; this is one of the many considerations that must be taken into account in deciding which NGS technique to utilize.

| Exome report | Total # of genes | # of unique genes | # of overlapping genes |
| --- | --- | --- | --- |
| O'Roak 2011 | 21 | 21 | 0 |
| Sanders 2012 | 170 | 166 | 4 |
| O'Roak 2012 | 240 | 227 | 13 |
| Neale 2012 | 173 | 168 | 5 |
| Chahrour 2012 | 53 | 53 | 0 |
| Iossifov 2012 | 363 | 338 | 25 |
| Total | 1020 | 973 | 47 |

**Table 1.** Rare genetic variants identified in ASD cases by next generation sequencing (NGS) in six recent scientific articles have led to a dramatic increase in the number of potential ASD candidate genes and further illustrate the genetic heterogeneity of ASD. Overlapping genes are genes in which a rare variant was identified in more than one exome report and are used as a measure of genetic heterogeneity.

| NGS technique | Availability | Cost | Advantages | Disadvantages |
| --- | --- | --- | --- | --- |
| Targeted gene panels | Commercially available | ~\$5000 | Highest resolution of NGS approaches; […] characterized ASD susceptibility genes (syndromic and non-syndromic) | […] |
| Whole exome sequencing | Available only in research settings | ~\$1000 | Estimated to detect the majority of disease-causing variants | […] |
| Whole genome sequencing | Available only in research settings | ~\$4000-\$5000 | Greatest coverage of the genome (both coding and non-coding regions) | […] |

**Table 2.** A summary of the benefits and drawbacks of the three types of next generation sequencing (NGS) techniques in the genetic evaluation of ASD cases. Cost estimates for whole-exome and whole-genome sequencing are from [14].
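As a rough cross-check of the counts reported in Table 1, the short Python sketch below recomputes the column totals and the share of gene entries that recur across exome reports. The per-study numbers are copied from the table; the names `TABLE_1` and `summarize` are illustrative and are not part of the original chapter.

```python
# Counts from Table 1: (total genes, unique genes, overlapping genes).
TABLE_1 = {
    "O'Roak 2011": (21, 21, 0),
    "Sanders 2012": (170, 166, 4),
    "O'Roak 2012": (240, 227, 13),
    "Neale 2012": (173, 168, 5),
    "Chahrour 2012": (53, 53, 0),
    "Iossifov 2012": (363, 338, 25),
}


def summarize(table):
    """Return column totals and the fraction of gene entries that overlap."""
    # Each study's total should split into unique + overlapping genes.
    for study, (total, unique, overlapping) in table.items():
        if total != unique + overlapping:
            raise ValueError(f"inconsistent row: {study}")
    total = sum(row[0] for row in table.values())
    unique = sum(row[1] for row in table.values())
    overlapping = sum(row[2] for row in table.values())
    return total, unique, overlapping, overlapping / total


if __name__ == "__main__":
    total, unique, overlapping, share = summarize(TABLE_1)
    print(f"total={total} unique={unique} overlapping={overlapping}")
    print(f"overlap share = {share:.1%}")
```

With the Table 1 values, the overlapping entries make up fewer than 5% of all reported gene entries (47 of 1020), which is the minimal overlap the text refers to.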

#### *2.1.1. Targeted gene panels*

Targeted gene panels generally test for 50-100 genes that have been demonstrated to be strongly associated with a particular disease. Such gene panels are already extensively used to screen individuals for a wide range of cancers and inherited diseases for which causative genes have been identified. A number of commercially available ASD gene panels have recently been designed to target both genes strongly associated with non-syndromic ASD and syndromic genes (genes that cause syndromes in which a subset of affected individuals also develop ASD, such as *FMR1*, *MECP2*, and *CACNA1C*, which cause Fragile X, Rett, and Timothy syndromes, respectively). For example, the Greenwood Genetic Center offers a 62-gene syndromic gene panel that covers the coding regions and flanking intronic boundaries of 62 ASD-linked genes for \$5500 (http://www.ggc.org/images/pdfs/syndromicautism62-genengspanel.pdf). While targeted gene panels offer the smallest coverage of the human genome of the three NGS approaches, they offer the highest resolution. One of the major drawbacks of targeted gene panels for a genetically heterogeneous disorder such as ASD is the inability to detect mutations in genes outside of those included in the panel.

#### *2.1.2. Whole exome sequencing*

Whole exome sequencing, which is also known as targeted exome capture, is designed specifically to identify variants in protein-coding regions of the human genome. Although these protein-coding regions, called exons, constitute a very small percentage of the human genome, it is estimated that they contain up to 85% of disease-causing mutations [12, 13]. However, whole exome sequencing will fail to detect any potentially pathogenic variants in non-coding regions of the human genome. This NGS method also provides lower resolution than targeted gene panels. Nonetheless, whole exome sequencing is increasingly being used to identify potentially pathogenic rare single gene variants in individuals with ASD [2, 4, 7-10].

#### *2.1.3. Whole genome sequencing*

In contrast to whole exome sequencing, which only covers protein-coding regions of the human genome, whole genome sequencing provides coverage of the entire genome, allowing for the sequencing of both coding and non-coding genomic regions. As such, single nucleotide changes and small insertions/deletions within non-coding regions of the genome can be detected by this method. While whole genome sequencing covers the largest amount of the human genome of all NGS techniques, it offers the lowest resolution of the three NGS technologies. Whole genome sequencing is also more costly than whole exome sequencing, although the difference in cost between these two techniques has fallen from 10- to 20-fold [13] to 4- to 5-fold [14].
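The trade-off between coverage and resolution can be illustrated with a minimal sketch. It assumes, purely for illustration, that false calls occur at one fixed rate per megabase regardless of technique, and it uses round-number target sizes that are not taken from the chapter: a hypothetical 0.2 Mb gene panel, an exome of roughly 30 Mb, and a genome of roughly 3100 Mb. The rate `FALSE_CALLS_PER_MB` is likewise hypothetical.

```python
# Illustrative target sizes in base pairs. These are round-number
# assumptions for this sketch, not values reported in the chapter.
TARGET_SIZE_BP = {
    "targeted gene panel": 200_000,       # hypothetical panel footprint
    "whole exome": 30_000_000,            # assumed ~1% of the genome
    "whole genome": 3_100_000_000,        # assumed approximate genome size
}

# Hypothetical false-call rate, held constant across all three techniques.
FALSE_CALLS_PER_MB = 1.0


def expected_false_calls(target_bp, rate_per_mb=FALSE_CALLS_PER_MB):
    """Expected number of false calls for a target of the given size."""
    if target_bp < 0 or rate_per_mb < 0:
        raise ValueError("target size and rate must be non-negative")
    return target_bp / 1_000_000 * rate_per_mb


if __name__ == "__main__":
    for technique, size_bp in TARGET_SIZE_BP.items():
        print(f"{technique}: {expected_false_calls(size_bp):g} expected false calls")
```

Under these assumptions the expected number of false calls grows in direct proportion to target size, so the genome-scale figure is roughly 100 times the exome figure. Real error rates also depend on read depth, platform, and variant-calling method, none of which this sketch models.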

#### **2.2. Chromosomal microarray**



Microscopically visible chromosomal rearrangements have long been implicated in the onset and pathogenesis of neurodevelopmental disorders, including ASD. Indeed, many of the most strongly ASD-linked chromosomal deletions and duplications, collectively referred to as copy number variants (CNVs), were discovered through the use of conventional cytogenetic techniques such as G-banded karyotyping, fluorescence in situ hybridization (FISH), and microsatellite analysis. For example, duplications of chromosome 15q11-q13 were first implicated in ASD in the mid-1990s by these methods [15-17]. Likewise, these methods identified chromosomal rearrangements on the long arm of chromosome 22 in ASD cases [18, 19]. However, conventional cytogenetic techniques are impractical for identifying copy number variation throughout the human genome in large case cohorts. While G-banded karyotyping is capable of detecting large chromosomal deletions and duplications (~1 Mb and larger), it lacks the sensitivity to detect smaller CNVs. Techniques such as FISH, in turn, are generally limited to screening a particular chromosomal region; while they are useful for examining copy number variation at a genomic locus of interest in larger case populations, they are impractical for identifying deletions and duplications throughout the genome.

In the last decade, technological and computational advances have allowed clinical geneticists and researchers to detect, throughout the human genome and in large case cohorts, submicroscopic chromosomal deletions and duplications that would not be detected by traditional cytogenetic techniques. Chromosomal microarray (CMA) is a term frequently used to include all types of array-based whole genome copy number analyses, with the two most widely used being array comparative genomic hybridization (aCGH) and single nucleotide polymorphism (SNP) arrays. CMA has been demonstrated to provide a higher diagnostic yield than G-banded karyotyping (15-20% compared to ~3%) due to its ability to detect submicroscopic deletions and duplications, and it has been proposed that CMA should replace conventional cytogenetic techniques as a first-tier diagnostic tool for individuals with congenital abnormalities and developmental disorders, including ASD [20]. High-throughput genome-wide aCGH and SNP arrays are now regularly used in the detection of CNVs in large ASD cohorts [1, 21-24].

aCGH and SNP arrays employ similar methodologies in the detection of CNVs (Figure 1). The first step involves labeling the DNA of the ASD patient with a fluorophore, thereby creating a test sample. The test sample is then mixed with an equal amount of DNA from a normal reference sample that has been labeled with a different fluorophore. This mixed DNA sample is added to a glass slide containing thousands of oligonucleotide probes corresponding to different chromosomal regions that cover the human genome; in the case of SNP arrays, the oligonucleotide probes are specific for common polymorphisms found in the general population. The sensitivity of CMA has been greatly increased in recent years by the development of arrays employing a larger number of smaller oligonucleotide probes; as a result, clinical geneticists and researchers are able to detect even smaller copy number changes than before without compromising genomic coverage. The test and reference DNA samples hybridize with the probes on the slide, and the fluorescence intensities of the test and reference DNA can then be measured. Following analysis with software that is typically specific for the platform being used, one or more algorithms are used to call the CNV. The ratio between the two fluorescence intensities is used to identify copy number changes. For example, if the test-to-reference ratio is 1 (yellow in the example below), then there is no change in copy number at the chromosomal region corresponding to a given probe. If the test-to-reference fluorescence ratio is > 1 for a particular probe (green in the example below), then the ASD patient carries a duplication in the chromosomal region corresponding to that probe. If the test-to-reference ratio is < 1 (red in the example below), then the patient carries a deletion at that site of the genome.
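The ratio rule described above can be sketched for a single probe as follows. This is a minimal illustration rather than the algorithm of any particular CMA platform: working on a log2 scale and the `tolerance` threshold are assumptions made here to absorb measurement noise around a ratio of exactly 1, and the function name `call_copy_number` is hypothetical.

```python
import math


def call_copy_number(test_intensity, reference_intensity, tolerance=0.25):
    """Classify one probe from its test-to-reference fluorescence ratio.

    `tolerance` is a hypothetical noise threshold on |log2(ratio)|;
    real platforms use their own calling algorithms and thresholds.
    """
    if test_intensity <= 0 or reference_intensity <= 0:
        raise ValueError("fluorescence intensities must be positive")
    log2_ratio = math.log2(test_intensity / reference_intensity)
    if log2_ratio > tolerance:
        return "duplication"   # test-to-reference ratio > 1
    if log2_ratio < -tolerance:
        return "deletion"      # test-to-reference ratio < 1
    return "no change"         # ratio close to 1


if __name__ == "__main__":
    for test, reference in [(1000, 1000), (1500, 1000), (500, 1000)]:
        print(test, reference, call_copy_number(test, reference))
```

The three example pairs correspond to ratios of 1, 1.5, and 0.5, the values expected in principle for two, three, and one copies of a region measured against a two-copy reference.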

**Figure 1.** Chromosomal microarray (CMA) analysis involves hybridization of differently labeled test and reference DNA samples with oligonucleotide probes, followed by computerized analysis and identification of copy number variants (CNVs) based on changes in fluorescence intensity ratio.

Despite the recommended use of CMA as a first-tier genetic evaluation tool in place of conventional cytogenetic techniques, it should be noted that aCGH is unable to detect balanced chromosomal rearrangements and other chromosomal abnormalities that have traditionally been detected by karyotype analysis [25]. In addition to their traditional use in the detection of risk-conferring common polymorphisms, SNP arrays have the added advantage of being able to detect copy number neutral genetic variation, such as uniparental disomy and long contiguous stretches of homozygosity (LCSH), that cannot be detected by aCGH [25, 26].

the Human Gene Module of the autism genetic database AutDB [27] has increased from 284 genes in September 2011 to 369 genes in June 2012. A large number of newer susceptibility genes have been annotated from reports employing whole exome sequencing of ASD cases [2, 4, 7-10], illustrating the increasing usage of NGS techniques in the study of genetic variation in ASD. In addition to the identification of novel ASD susceptibility genes, NGS techniques have identified novel rare variants in previously identified ASD susceptibility genes. The number of ASD-associated CNV loci has also increased significantly, with the CNV module of AutDB expanding from 1034 CNV loci in September 2011 to 1173 loci in June 2012 (Figure 2). In this section we describe the genetic categories into which ASD susceptibility genes have been classified, as well as recent studies that have yielded invaluable insight into the functional profiles of ASD-associated genes and CNV loci.

**Figure 2.** The number of genes and CNV loci associated with ASD in the genetic database AutDB has increased over the last four quarterly release dates.

#### **3.1. Genetic categories of ASD susceptibility genes**

The earliest ASD susceptibility genes were identified from rare single gene variants in genes associated with syndromes such as Fragile X syndrome and Rett syndrome. The discovery of single gene mutations/disruptions in two neuroligin genes, *NLGN3* and *NLGN4*, in ASD siblings [28] initiated the search for additional ASD susceptibility genes in non-syndromic ASD cases. The continued identification of rare genetic variants associated with both syndromic and non-syndromic ASD, as well as of risk-conferring polymorphisms enriched in ASD populations
