**1. Introduction**

#### **1.1 What is genomics?**

The central dogma of molecular biology states that DNA is transcribed to RNA, which is translated into proteins (Crick, 1970), the molecules that facilitate all biological functions. A genome comprises all of an organism's DNA, or hereditary information. The field of genomics is the study of whole genomes. Genomics has been defined as "a branch of biotechnology concerned with applying the techniques of genetics and molecular biology to the genetic mapping and DNA sequencing of sets of genes or the complete genomes of selected organisms" (Merriam-Webster Dictionary). Specifically, whereas genetics is the study of a single gene, or a few genes in isolation, genomics examines all of the genes, as well as the non-coding elements (i.e., regions that do not encode proteins or RNA components of the cell), within the DNA of a genome. Although the term genomics was first used in 1987, the field is relatively new and is growing rapidly as new technologies for exploring genomes emerge. The applications of genomics are vast, spanning realms such as medicine, industry and ecology, with implications for global issues including cancer diagnosis, prevention and treatment, alternative energy sources, agriculture, conservation and sustainable development. Furthermore, given that the fundamental basis of genomics, DNA, is common across all living organisms, genomics stands to bridge the gaps between these fields of study, with synergistic results as multi-disciplinary approaches to answering biological questions are developed.

#### **1.2 Genomics in aquaculture: Phenotypic vs. genotypic selection**

In traditional selective breeding practices, individuals showing desirable phenotypic, or visible, characteristics are bred to produce new strains of plants or animals that exhibit features such as faster growth, greater size, increased overall robustness, or improved aesthetic appeal. Farmers, through thousands of years of following the mantra "breed the best to the best and hope for the best" (quotation attributed to American Thoroughbred and Standardbred breeder, John E. Madden), have drastically changed natural populations, as exemplified by the now hundreds of breeds of the domestic dog, or differing cattle strains for dairy or beef production. However, these phenotype-based breeding tactics come with inherent drawbacks. Specifically, they consume considerable resources and time: they often rely on trial and error, they require vast numbers of individuals exhibiting extensive phenotypic variation, and numerous generations of repetitive selective breeding are often required to see an effect, which is especially difficult for species with long generation times and for which extensive parental investment is needed. Furthermore, animals often need to be sacrificed to detect or measure morphological characteristics such as flesh color and tissue quality, and these animals, as well as disease-challenged animals (i.e., those used to determine immune function or disease resistance), cannot be bred. Finally, most characteristics of interest to aquaculture are complex traits – i.e., those that are governed by numerous genes and complex interactions between multiple pathways, or those for which morphological variation is based on environmental cues – and it is very difficult to select for multiple complex traits together based on phenotypic information alone. Thus, the amount of time and the costs associated with traditional, phenotypic approaches to selective breeding are large.

In contrast to phenotype-based selection, which directly selects for a given heritable trait, in genotype-based selection, or marker-assisted selection (MAS), traits are indirectly selected for based on variability at the DNA level. More specifically, the genomes of breeding populations are screened for multiple markers, or DNA tags, that are associated with genes of interest that work together to produce a desirable phenotype for a complex trait. This genetic screening can be accomplished using high-throughput genomics systems that incorporate tools such as polymerase chain reaction (PCR) or DNA sequence-based screening.
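As a rough illustration of the screening idea described above, the following Python sketch ranks breeding candidates by how many favorable alleles they carry across a small marker panel. It is a minimal sketch under stated assumptions, not a real breeding program: the marker names, favorable alleles, effect weights and fish genotypes are all hypothetical, and a simple weighted allele count stands in for the statistical models that would be used in practice.

```python
from dataclasses import dataclass

# Hypothetical marker panel (an assumption for illustration only):
# marker name -> (favorable allele, relative effect weight).
FAVORABLE = {
    "marker_A": ("G", 1.0),
    "marker_B": ("T", 0.5),
    "marker_C": ("C", 2.0),
}


@dataclass
class Candidate:
    name: str
    genotypes: dict  # marker name -> diploid genotype as two letters, e.g. "GT"


def marker_score(candidate, panel=FAVORABLE):
    """Sum weight * (copies of the favorable allele: 0, 1 or 2) over all markers."""
    score = 0.0
    for marker, (allele, weight) in panel.items():
        genotype = candidate.genotypes.get(marker, "")  # missing marker scores 0
        score += weight * genotype.count(allele)
    return score


def select_broodstock(candidates, n):
    """Return the n candidates with the highest marker scores."""
    return sorted(candidates, key=marker_score, reverse=True)[:n]


# Made-up genotypes for three candidate fish.
candidates = [
    Candidate("fish_1", {"marker_A": "GG", "marker_B": "TA", "marker_C": "CC"}),
    Candidate("fish_2", {"marker_A": "GA", "marker_B": "AA", "marker_C": "CT"}),
    Candidate("fish_3", {"marker_A": "AA", "marker_B": "TT", "marker_C": "TT"}),
]

for fish in select_broodstock(candidates, 2):
    print(fish.name, marker_score(fish))
```

The point of the sketch is that selection operates on genotype data alone, so candidates can be ranked without being sacrificed or disease-challenged, and several markers contributing to one complex trait can be combined into a single score.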

MAS has been used extensively for the genetic improvement of plant cultivars, such as wheat (Gupta et al., 2008), corn (Tuberosa et al., 2002) and soy (Kim et al., 2010), as well as cultured stocks (broodstocks) of agricultural species such as swine (Liu et al., 2007) and cattle (Veerkamp & Beerda, 2007). As the availability of genomics resources has increased for aquatic species, more research is being done to improve broodstocks for many key aquaculture species worldwide. Specifically, genomics, and the use of genomic tools, enables one to examine the differences and similarities among organisms at the genotypic (DNA) level, as opposed to more traditional broad-scaled phenotype-based (appearance-based) approaches. This very fine-scaled perspective, looking at differences in genes (alleles) as well as variability at non-coding loci, or positions within the genome, means that complex traits that are driven by more than one gene or pathway can be broken down into their components. That is, differences at the individual, family, population, species and even the organism level can be assessed in very fine detail. Such insight into the genetic factors that drive complex traits can facilitate the development of effective and efficient breeding methods, which have far-reaching implications for the aquaculture industry.

#### **2. Atlantic salmon: A model aquaculture species for MAS**

The holy grail of any genomics program for a species is a whole genome sequence that is well assembled and annotated. The advantages of this are many, and are discussed in further detail in subsequent sections, but are mainly centered around two things: 1) the wealth of data that is produced, including the full gene repertoire with additional information such as gene location and copy number, and 2) the ability of the sequenced genome to act as a reference genome, both for the sequenced species itself (i.e., such that the genomes of additional individuals can be easily re-sequenced using the original as a reference for assembly), and for other, closely related species. Currently, however, even with the great advances in sequencing technology that have emerged in recent years, obtaining a whole genome sequence remains an extremely difficult, costly and time-consuming undertaking. This is particularly true for fish species simply due to the evolutionary age of fish and the more than 20,000 extant species (Nelson, 2006), factors that make fish genomes diverse and complex and complicate sequencing. Indeed, only five fish genomes have been reported to date (medaka, *Oryzias latipes*; tiger pufferfish, *Takifugu rubripes*; green spotted pufferfish, *Tetraodon nigroviridis*; zebrafish, *Danio rerio*; and stickleback, *Gasterosteus aculeatus*), although more are underway. Their sequences, as well as those of other sequenced organisms, are publicly available within the Ensembl database (www.ensembl.org). These fish species were chosen for their abilities to act as model sequences for genetics research, rather than for their utility for aquaculture. Specifically, the medaka and zebrafish genomes were sequenced to provide model organisms for studying developmental biology, while the stickleback genome serves as a model for studying adaptive evolution, and the two pufferfish represent the smallest known vertebrate genomes. Figure 1 illustrates the phylogenetic relationships among these species as well as some key aquaculture species, and shows that the full spectrum of teleosts is not represented by the genome sequences that are currently available.

When a whole genome sequence is not available, there are numerous genomics resources and tools that can be developed which, particularly when used in combination, can provide extensive insight into a genome and can be used for applications such as MAS, both for the species of interest and for other closely related species. In the following sections, we will use the example of Atlantic salmon (*Salmo salar*) and its genomics program to describe these tools, their development and their utility for aquaculture. Atlantic salmon is a particularly good model fish species for genomics because it is a major aquaculture species, with approximately 1.5 million tonnes produced worldwide in 2009 (Food and Agriculture Organization [FAO] Fishery Statistic, 2009), and, like other salmonids, is of substantial environmental, economic and social importance. In addition, there are extensive genomics resources for Atlantic salmon, which were developed using standard methods and approaches that are applicable to other genomes. Furthermore, a strong argument has been made for obtaining the full genome sequence for Atlantic salmon, a project that is currently in progress (Davidson et al., 2010). Specifically, aside from the merits of the species itself, no salmonid species has yet been sequenced, and thus Atlantic salmon can serve as a reference salmonid genome, providing extensive opportunities for cross-referencing, or comparative synteny analyses, with other salmonids, particularly the Pacific salmon (*Oncorhynchus* spp.), rainbow trout (*Oncorhynchus mykiss*) and Arctic charr (*Salvelinus alpinus*).

Atlantic salmon also provides an example of some of the challenges that face fish genomics in general. The common ancestor of salmonids underwent a whole genome duplication event between 20 and 120 million years ago (Allendorf & Thorgaard, 1984; Ohno, 1970). Thus, whereas there are usually two copies of each gene within a genome, Atlantic salmon have four, and the duplicate copies are evolving into genes with new functions or into non-coding DNA. The genome duplication also increased the size and the repetitiveness of the genome. These characteristics, combined with the lack of a closely related guide sequence, mean that sequencing and assembling the Atlantic salmon genome are extremely challenging.

Fig. 1. Schematic representation of the phylogenetic relationships among fish species. Species in bold have publicly available full genome sequences, while those that are underlined are currently being sequenced. Note that the full spectrum of teleosts is not represented by the genome sequences that are currently available. Species are listed by their common names with orders in parentheses. Gar is used as an outgroup.
