258 Biological Resources of Water

The identification of cryptic species is an important genetic application for the ecology and conservation of Neotropical freshwater fish. This taxonomic challenge has been overcome due to the advent and availability of rapid DNA sequencing for detecting and differentiating morphologically similar species [22]. The destruction and disturbance of river basins, especially those caused by human interference, have led to the threat of complete extinction of several fish species [96]. However, many species exposed to these threats are still undescribed, and efforts to catalogue and identify these fish are increasingly important. Most species have been described by morphological and typological characteristics [97]. However, speciation is not always accompanied by differences in morphology, and due to the difficulty of identification, the actual number of existing fish species is greater than previously described [22].

DNA sequencing has introduced a new method of species discovery known as DNA barcodes [98]. DNA barcodes are short, standardised sequences from a part of the mitochondrial genome that can be used to distinguish different species. This differentiation can easily be determined when genetic variation between species exceeds that within species [99]. The barcode sequence from each unknown specimen is then compared with a library of reference barcode sequences derived from individuals of known identity. Research has been carried out to evaluate the effectiveness of this technique in identifying cryptic species in insects [100], birds [101] and plants [102]. The Neotropical freshwater ichthyofauna is the richest in the world and makes up around 25% of the total freshwater fish fauna on Earth [5]. However, the lack of knowledge of its diversity makes taxonomic identification a great challenge.

Genetic methods facilitate the identification of cryptic species and species with few identifiable phenotypic characteristics. The presumed neutrality of some molecular markers, in conjunction with phylogenetic methods, provides a new perspective on species identification, especially in hierarchical relatedness and relative rates of evolution. The increased frequency with which cryptic species can be discovered with DNA sequence data, and often subsequently confirmed with morphological and/or ecological data, suggests that molecular data should be routinely incorporated into taxonomic research.

Another major problem for the natural populations of Neotropical fish (that can be reduced or controlled using genetic resources) is the accidental or deliberate release of non-native fish species [103]. Hybridisation is the mating of genetically differentiated individuals and may involve individuals within a species or between species [104]. Conventional approaches to detect interspecific hybridisation include morphometric and molecular analyses. In recent years, DNA polymorphisms have been used for investigating fish hybridisation [105]. Nuclear genetic markers, in particular, allow hybrid species identification because contributions to the hybrid genome of both the father and mother can be identified [106].

**5. Molecular markers**

There are a number of molecular markers in Neotropical fish, such as allozyme markers, restriction fragment length polymorphisms in regions of DNA (RFLP), randomly amplified polymorphic DNA (RAPD), amplified fragment length polymorphisms (AFLP), microsatellite markers, high genome coverage markers [single nucleotide polymorphisms (SNPs)] and maternal inheritance markers (mtDNA).

#### **5.1. Allozymes**

Allozymes were considered the first molecular marker, discovered in the 1960s in enzymes. When DNA sequences of two or more alleles in the same locus are divergent, and the corresponding RNA encodes different amino acids, multiple variants of the same protein are created. However, not every mutation in a DNA sequence results in changes to the amino acid sequence, and this is one of the disadvantages of using an allozyme as a molecular marker [107]. Other disadvantages include heterozygote deficiencies due to null alleles and the amount and quality of tissue samples required [108]. The limitations and disadvantages of these markers led to the development of DNA-based genetic markers.

In the 1980s, the first DNA-based molecular markers were developed. They can be classified into dominant and codominant markers. It is not possible to identify heterozygotes in dominant markers, whereas in codominant markers, this differentiation can be determined, and it is possible to estimate allele frequencies. Molecular markers can also be classified into those with known function (type I markers) or with anonymous regions (type II markers) [108].
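The point about codominant markers can be made concrete with a small calculation. The sketch below is illustrative only: the function name `allele_frequencies` and the five genotypes are hypothetical, not data from any study cited here. It assumes a diploid organism scored at a single codominant locus, so that both alleles of every individual are observed and frequencies can be obtained by direct counting.

```python
from collections import Counter


def allele_frequencies(genotypes):
    """Estimate allele frequencies at one codominant locus by direct counting.

    `genotypes` is a list of (allele, allele) tuples, one per diploid individual.
    """
    # Each individual contributes two allele copies to the count.
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    total = sum(counts.values())
    return {allele: n / total for allele, n in counts.items()}


# Hypothetical sample of five fish; heterozygotes ("A", "B") are directly visible.
sample = [("A", "A"), ("A", "B"), ("A", "B"), ("B", "B"), ("A", "A")]
frequencies = allele_frequencies(sample)
print(frequencies)
```

With a dominant marker the two `("A", "B")` individuals could not be told apart from a homozygote carrying the scored allele, so this direct count would not be possible.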

#### **5.2. Restriction fragment length polymorphisms (RFLP)**

RFLP markers were the first markers discovered that were based on DNA sequences [109]. They are considered codominant markers and can be type I or type II. They are based on bacterial enzymes (restriction enzymes) that recognise specific DNA sequences and cut the DNA into fragments where these sequences are found. The digestion of DNA by restriction enzymes results in fragments that vary between individuals, populations and species. The fragments can also be analysed using the polymerase chain reaction (PCR), in which case the PCR products are digested by restriction enzymes. RFLP markers have low potential for determining genetic variation compared with more recently developed molecular markers, mainly due to their low level of polymorphism. In addition, sequence information for the specimen is required, making it difficult to develop markers in species without molecular information. However, one advantage of these markers is that they are codominant [108].
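How a single point mutation changes a fragment pattern can be sketched as an in-silico digest. This is a minimal illustration, not a laboratory protocol: the function `digest` and both short sequences are hypothetical. The only external fact used is that EcoRI recognises GAATTC and cuts after the first G.

```python
def digest(sequence, site, cut_offset):
    """Return fragment lengths after cutting `sequence` at every `site`.

    `cut_offset` is the position inside the recognition site where the
    enzyme cuts (1 for EcoRI, which cuts G^AATTC).
    """
    sequence = sequence.upper()
    fragments = []
    previous_cut = 0
    start = sequence.find(site)
    while start != -1:
        cut = start + cut_offset
        fragments.append(cut - previous_cut)
        previous_cut = cut
        start = sequence.find(site, start + 1)
    # The remainder after the last cut is the final fragment.
    fragments.append(len(sequence) - previous_cut)
    return fragments


# Hypothetical alleles: allele_2 has a point mutation (A -> G) in the second site.
allele_1 = "ATGC" + "GAATTC" + "TTAGGCC" + "GAATTC" + "AAT"
allele_2 = "ATGC" + "GAATTC" + "TTAGGCC" + "GAGTTC" + "AAT"

print(digest(allele_1, "GAATTC", 1))  # three fragments
print(digest(allele_2, "GAATTC", 1))  # two fragments: one site was lost
```

A heterozygote would show the fragments of both alleles, which is why the marker is codominant.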

#### **5.3. Randomly amplified polymorphic DNA (RAPD)**

RAPD techniques use PCR amplification of random anonymous segments of genomic DNA with identical pairs of primers 8–10 bp in length. Unlike RFLP markers, RAPD does not require any knowledge of the DNA sequences of the organism. Nearly all RAPD markers are dominant, so it is not possible to distinguish whether a DNA segment is amplified from a heterozygous or homozygous locus [110]. The primers used are short and anneal at low temperatures, amplifying multiple products from different loci. Because most of the nuclear genome is non-coding, most amplified loci are neutral. Genetic variation is assessed by treating each band as a bi-allelic locus, scored as the presence or absence of the amplified PCR product. One disadvantage of this technique is the variation in intensity that can occur between bands, which can make it difficult to determine whether bands represent different loci or alternative alleles of a locus. The markers also have low reproducibility due to the low annealing temperature used in PCR amplification, and thus have limited application in fisheries science. Despite these disadvantages, the level of polymorphism detected is considered high [108, 111].
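Because band presence hides the difference between heterozygotes and band-carrying homozygotes, allele frequencies at a dominant locus can only be estimated indirectly. One common approach, which is an assumption added here and not a method described in the text, is to assume Hardy-Weinberg equilibrium, so that the proportion of individuals lacking the band equals q², the squared frequency of the null (band-absent) allele. The function name and the scores below are hypothetical.

```python
import math


def dominant_allele_frequencies(band_scores):
    """Estimate (p, q) at one dominant bi-allelic locus.

    `band_scores` holds 1 (band present) or 0 (band absent) per individual.
    Assumes Hardy-Weinberg equilibrium: frequency of band-absent
    individuals = q ** 2, where q is the null-allele frequency.
    """
    n = len(band_scores)
    absent = sum(1 for score in band_scores if score == 0)
    q = math.sqrt(absent / n)
    return 1.0 - q, q


# Hypothetical scoring of 16 fish: 12 show the band, 4 do not.
scores = [1] * 12 + [0] * 4
p, q = dominant_allele_frequencies(scores)
print(p, q)
```

If the equilibrium assumption does not hold, for example in a structured or inbred population, this estimate is biased, which is one practical cost of dominant markers.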


Genetic Applications in the Conservation of Neotropical Freshwater Fish

http://dx.doi.org/10.5772/intechopen.73207


#### **5.4. Amplified fragment length polymorphism (AFLP)**

AFLP is a combination of the RFLP and RAPD techniques, using PCR to amplify anonymous fragments of nuclear DNA (type II marker). The technique involves digestion of DNA using a restriction enzyme, as in RFLP analysis, producing a high number of dominant fragments that, depending on their concentration, are not detected by electrophoresis. The DNA is digested with different types of endonucleases, generating fragments of different sizes. The following steps are similar to the principles of RAPD, where small, known DNA sequences (adapters) are coupled to the ends of the fragments and are annealed with specific primers during PCR [112]. A unique feature of this technique is the addition of known sequence adapters to DNA fragments generated by complete genomic DNA digestion. This allows subsequent PCR amplification of the many fragments generated, which are then separated by denaturing polyacrylamide gel electrophoresis [108]. The AFLP technique has some advantages, such as detection of greater numbers of loci generating a higher number of polymorphisms, broad coverage of the genome with high reproducibility (due to high PCR annealing temperatures) and low cost [113]. Like RAPDs, they are considered dominant markers and, although there are packages for codominant scoring of AFLP bands, applying them in population studies is difficult. The major disadvantage of the technique is the need for automated sequencers for electrophoretic analysis of fluorescent labels, although traditional electrophoretic methods can also be employed using radioactive labels or silver staining techniques [108].

#### **5.5. mtDNA markers**

Mitochondrial DNA (mtDNA) markers were the first widely used DNA markers and are one of the most popular markers for molecular diversity studies in fish [114]. This part of the genome consists of a small, circular DNA molecule that is abundant and easy to amplify, as there are multiple copies in the cell. Moreover, the mitochondrial gene content is strongly conserved across species, with little duplication, no intronic regions and very short intergenic regions [115]. Studies of vertebrate species have shown a mutation rate several times higher than that of nuclear DNA, which may be due to a lack of repair mechanisms during replication [116]. Complete mtDNA genomes have been sequenced to facilitate analyses of molecular markers in many economically important Neotropical fish species, such as *S. brasiliensis* [117], *P. mesopotamicus* [118] and *L. elongatus* [119].

The DNA of cytoplasmic organelles has a non-Mendelian inheritance, and the mtDNA must be considered a single locus in genetic investigations [94]. Inheritance occurs via the mitochondria of the oocyte from which an animal develops [120]. This maternal transmission gives information on maternal lineages of fish stocks and provides a more sensitive tool for detecting population subdivision, making it an efficient marker when compared to typical nuclear markers such as microsatellites and SNPs [121].

Many studies of mtDNA have focused on the major non-coding region, often called the control region, because of its rapid rate of evolution. The control region includes transcriptional promoters in both strands and the D-loop region. In these non-coding D-loop regions, the evolution rates are higher than the rest of the molecule. These changes lead to the formation of multiple alleles, called haplotypes that can be phylogenetically ordered within the same population and confirm intrapopulation phylogenetic relationships in population studies [94].
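The variation among control-region haplotypes within a population is often summarised as haplotype diversity. The sketch below uses Nei's estimator, h = n/(n − 1) · (1 − Σ pᵢ²), which is a standard formula brought in here for illustration and not one given in the text; the function name and the five short sequences are hypothetical.

```python
from collections import Counter


def haplotype_diversity(haplotypes):
    """Nei's haplotype diversity: h = n / (n - 1) * (1 - sum(p_i ** 2)).

    `haplotypes` is a list with one haplotype (e.g. a sequence string)
    per sampled individual; identical strings count as the same haplotype.
    """
    n = len(haplotypes)
    if n < 2:
        raise ValueError("at least two individuals are needed")
    counts = Counter(haplotypes)
    sum_squared = sum((count / n) ** 2 for count in counts.values())
    return n / (n - 1) * (1.0 - sum_squared)


# Hypothetical D-loop fragments from five fish: three distinct haplotypes.
sampled = ["ACGT", "ACGT", "ACGA", "ACGA", "TCGA"]
h = haplotype_diversity(sampled)
print(round(h, 3))
```

Because mtDNA behaves as a single locus, each individual contributes exactly one haplotype, so no genotype phasing is needed before counting.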

#### **5.6. Microsatellites**


Microsatellites or simple sequence repeats (SSRs) have been a popular marker in genetic fish research due to their abundance in all regions of the chromosome. They consist of a small number to a few hundred copies of tandem repeat sequences of mono-, di-, tri- and tetranucleotide motifs. They are codominant and mostly type II markers, and are abundant in all species of fish, with an estimated occurrence of one in every 10 kb in coding genes, intronic regions and regulatory sequences [122, 123].

These markers are useful in evaluating structure and genetic diversity between different populations due to high polymorphisms that give a high power in analyses of population genetics [114]. The polymorphisms are identified by size differences, resulting in varying numbers of repeat units in alleles of a single locus [108]. Mutation rates have been detected as high as 10⁻² per generation [124].
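The link between repeat number and allele size can be illustrated with a short repeat finder. This is a simplified sketch, assuming perfect (uninterrupted) repeats of 1 to 4 bp motifs; the function name, the threshold `min_repeats` and both allele sequences are hypothetical.

```python
import re


def find_microsatellites(sequence, min_repeats=4):
    """Find perfect tandem repeats of mono- to tetranucleotide motifs.

    Returns a list of (start, motif, repeat_count) tuples. The lazy
    quantifier makes the regex prefer the shortest motif that repeats.
    """
    pattern = re.compile(r"([ACGT]{1,4}?)\1{%d,}" % (min_repeats - 1))
    results = []
    for match in pattern.finditer(sequence.upper()):
        motif = match.group(1)
        repeats = len(match.group(0)) // len(motif)
        results.append((match.start(), motif, repeats))
    return results


# Two hypothetical alleles of one locus: same flanks, different (CA)n length.
allele_a = "AGTG" + "CA" * 8 + "TTGA"
allele_b = "AGTG" + "CA" * 11 + "TTGA"

print(find_microsatellites(allele_a))
print(find_microsatellites(allele_b))
print(len(allele_b) - len(allele_a))  # size difference seen on a gel, in bp
```

In practice the two alleles are distinguished by the length of the PCR product amplified with primers in the flanking regions, not by sequencing each allele.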

There have been many studies of wild fish stocks using microsatellites that allowed the analysis of historical population structures, colonisation histories and connectivity between populations [125]. These population characteristics are generally controlled by environmental effects [126–128] or by anthropogenic intervention [129–131] that can induce the structuring of fish populations with a reduction in gene flow exchange and genetic variability.

However, the use of microsatellite markers has some drawbacks. They require a large investment of time and laboratory effort due to the genotyping step [108]. Moreover, they require species-specific markers, and there is a high potential for null alleles, imperfect repeats due to polymerase slippage during replication, and genotyping errors that impact population studies by providing unreliable genetic information for conservation biology, molecular ecology and population genetic research [132].

#### **5.7. Single nucleotide polymorphisms (SNPs)**

SNPs are type I or type II polymorphisms caused by point mutations that generate different alleles for a given nucleotide belonging to a specific locus. These molecular markers are unique nucleotide substitutions of a sequence at a single site and have been well characterised since the beginning of DNA sequencing [108]. SNPs are the main focus in molecular marker development as they constitute the most abundant polymorphism in any organism's genome, with a frequency estimated at approximately 1 SNP per 200–500 bp [133]. This marker is adaptable to the automation of genotyping and reveals hidden polymorphisms that are not detected by other markers and methods [108]. Moreover, they can be efficiently identified in any organism without the need for genomic information.

Theoretically, an SNP at a particular locus can have up to four alleles (A, T, C and G). In practice, however, most SNPs are limited to two alleles (often the two pyrimidines C/T or the two purines A/G) with codominant inheritance [108]. The level of polymorphism is not as high as in microsatellite markers (multi-allelic), but this disadvantage is counterbalanced by their abundance in the genome [133]. To be considered an SNP, the least frequent allele must have a frequency of 1% or higher [134].
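The 1% criterion can be expressed as a simple filter on the minor allele frequency. The sketch below assumes a bi-allelic site and a flat list of observed allele copies (two per diploid individual); the function names and the two example sites are hypothetical.

```python
from collections import Counter


def minor_allele_frequency(alleles):
    """Frequency of the least frequent allele at one bi-allelic site.

    `alleles` is any iterable of observed allele copies, e.g. "CCTC...".
    A monomorphic site has no minor allele and returns 0.0.
    """
    counts = Counter(alleles)
    if len(counts) < 2:
        return 0.0
    return min(counts.values()) / sum(counts.values())


def is_snp(alleles, threshold=0.01):
    """Apply the 1% minor allele frequency criterion."""
    return minor_allele_frequency(alleles) >= threshold


# Hypothetical sites scored in 100 diploid fish (200 allele copies each).
site_1 = "C" * 195 + "T" * 5   # minor allele T at 2.5%
site_2 = "A" * 199 + "G" * 1   # minor allele G at 0.5%

print(is_snp(site_1), is_snp(site_2))
```

With small samples a rare variant may be missed or a sequencing error mistaken for one, so the threshold is only meaningful when enough individuals are genotyped.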


With respect to SNP identification, few studies have been carried out in relation to the conservation of Neotropical freshwater species. Researchers have focused on valuable species such as the tambaqui (*C. macropomum*) and the pacu (*P. mesopotamicus*) of the Serrasalmidae family


These characteristics demonstrate that this marker is ideal for several biological studies because they allow complex genomic analyses with high yield and coverage. This marker has been revolutionary in fish population research. The SNP markers have already been used in comparative studies of evolutionary genomics, population genomics, identification of interspecific hybrids, identification of sex-related sequences, genomic selection, mapping of genes by linkage maps and detection of alleles associated with economically important characteristics in aquaculture [135–139].

For the routine use of SNPs, genotyping platforms for analysing a large number of markers and samples, in a fast and economical manner, are fundamental. For low-throughput SNP genotyping, candidate loci can be tested using different methodologies. In summary, each platform uses a specific detection chemistry, which generates differences in the cost of genotyping, price of equipment, number of markers, expertise for use, sample volume analysis and automation [140].

One of the greatest barriers to the routine use of SNPs is the characterisation and discovery of these markers. Historically, numerous approaches to SNP discovery have been described, primarily from the comparison of specific locus sequences. Direct sequencing (Sanger) of candidate genes was considered the simplest, though expensive, strategy for SNP discovery. On a larger scale, the comparison of sequences of cloned fragments, particularly expressed sequence tag (EST) designs using different types of tissues, is the best alternative [134]. However, in addition to the high costs, a considerable amount of laboratory work, time and expertise is required for this type of analysis.

#### **5.8. Next generation sequencing (NGS) in molecular marker discovery**

Next generation sequencing (NGS) allowed researchers to generate a large amount of sequencing data at relatively low cost as compared with other methods such as Sanger sequencing. To identify a greater number of gene-associated markers, a greater yield of sequence readings is required. Next generation sequencers are particularly adapted to produce high precision sequence coverage [141, 142]. Furthermore, NGS provides an enormous number of reads, which allows entire genomes to be sequenced at a fraction of the cost for Sanger sequencing [143] and is inclusive of non-model organisms [144]. Therefore, NGS technologies have become useful for *de novo* sequencing (sequencing without a reference genome) of eukaryotic genomes [145]. When using NGS technologies, the absence of a reference genome is one of the greatest barriers to discovery of molecular markers in non-model organisms. In these cases, from a sequencing project, individual reads can be assembled into consensus sequences called *contigs* that may serve as a pseudo-reference genome [146]. There are two alternative techniques for genome reduction to acquire sets of redundant *contigs*: transcriptome sequencing (RNA-seq) and restriction site-associated DNA (RAD-seq) [144].


Transcriptome sequencing is one of the most common analytical approaches. Complementary DNA (cDNA) is produced from the mRNA of a specific tissue or life stage. Thus, whole mRNA sequences (cDNA library) from a specific tissue or set of tissues can be aligned to a reference genome (or reference transcripts) or assembled *de novo* [147]. This approach allows data to be obtained for a single nucleotide variation profile, as well as transcriptome characteristics and gene expression levels, in a cost-effective way [148]. Additionally, transcriptome sequencing allows gene-associated SNP studies, depending on the exact genomic location and functional role of the region in which the SNPs are located [149].

RAD-seq is an important genome reduction method for identifying and genotyping SNPs in non-model fish and, unlike RNA-seq, uses genomic DNA as a template. The technique applies the principles of RFLP, reducing the complexity of the genome by subsampling it at sites defined by restriction enzymes [150]. Genomic DNA is digested with restriction enzymes and then mechanically fragmented to reduce the fragments to a size suitable for sequencing. The digested fragments are then ligated to adapters carrying a unique barcode for each individual so that they can be multiplexed in a pool of samples. The regions adjacent to the restriction sites of multiple individuals are thus sequenced simultaneously in a single run [151]. There are numerous variations of the RAD-seq technique, with single restriction enzyme cut sites (original RAD, 2bRAD) or with two restriction enzyme cut sites (GBS, CRoPS, RRL, ddRAD), that promise to increase the number of loci assayed at low cost and effort in ecological and evolutionary studies [152].
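The digestion and barcoding steps described above can be sketched *in silico*. The following minimal Python example is an illustration only: the recognition sequence (`CCTGCAGG`, that of the enzyme SbfI), the toy sequences, the barcodes and all function names are assumptions introduced here and are not part of any published RAD-seq pipeline. Real protocols also deal with the reverse strand, sequencing errors in barcodes and fragment size selection.

```python
def find_cut_sites(genome: str, site: str) -> list[int]:
    """Return the 0-based start position of every occurrence of `site`."""
    positions = []
    start = genome.find(site)
    while start != -1:
        positions.append(start)
        start = genome.find(site, start + 1)
    return positions


def rad_tags(genome: str, site: str, tag_len: int) -> list[tuple[int, str, str]]:
    """For each restriction site, return (position, upstream tag, downstream tag).

    Only the regions flanking the sites are kept, which is how the
    technique subsamples (reduces) the genome.
    """
    tags = []
    for pos in find_cut_sites(genome, site):
        upstream = genome[max(0, pos - tag_len):pos]
        end = pos + len(site)
        downstream = genome[end:end + tag_len]
        tags.append((pos, upstream, downstream))
    return tags


def demultiplex(reads: list[str], barcodes: dict[str, str]) -> dict[str, list[str]]:
    """Assign pooled reads to individuals by their leading barcode.

    The barcode is trimmed from each read; reads with no known
    barcode are discarded.
    """
    by_sample = {sample: [] for sample in barcodes.values()}
    for read in reads:
        for barcode, sample in barcodes.items():
            if read.startswith(barcode):
                by_sample[sample].append(read[len(barcode):])
                break
    return by_sample


# Toy data (hypothetical): two SbfI sites in a 30 bp sequence.
toy_genome = "AAAACCTGCAGGTTTTGGGGCCTGCAGGCC"
print(rad_tags(toy_genome, "CCTGCAGG", tag_len=4))

# Hypothetical 4 bp barcodes for two individuals.
pooled_reads = ["ACGTTTTT", "TGCAGGGG", "NNNNAAAA"]
print(demultiplex(pooled_reads, {"ACGT": "fish_1", "TGCA": "fish_2"}))
```

Keeping only fixed-length tags next to each site is what makes the same loci comparable across individuals once the pooled reads have been separated by barcode.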

Identifying SNPs with the RAD-seq method has the advantage of avoiding the unequal gene expression problems that may impair SNP discovery by transcriptome sequencing [144]. Another advantage of the RAD-seq technique is that DNA barcodes can be assigned to individual samples or pools of samples during the preparation of DNA libraries, thus reducing costs [153]. However, as with transcriptome analysis, the ability to identify true SNPs is hampered by the errors introduced by high-throughput sequencing. To mitigate this problem, a sufficient sequence read depth is necessary for both techniques [154].
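The role of read depth can be illustrated with a simple binomial calculation. The sketch below assumes that sequencing errors are independent, that every error produces the same alternative base (a conservative simplification) and that the per-base error rate is 1%. These values, the thresholds and the function names are hypothetical and are not taken from the cited studies.

```python
from math import comb


def error_tail_probability(depth: int, alt_reads: int, error_rate: float) -> float:
    """Probability that at least `alt_reads` of `depth` reads show the
    alternative base through sequencing error alone (binomial upper tail)."""
    return sum(
        comb(depth, k) * error_rate**k * (1.0 - error_rate) ** (depth - k)
        for k in range(alt_reads, depth + 1)
    )


def is_candidate_snp(
    depth: int,
    alt_reads: int,
    min_depth: int = 10,      # assumed minimum coverage
    error_rate: float = 0.01,  # assumed per-base error rate
    alpha: float = 1e-3,       # assumed significance threshold
) -> bool:
    """Keep a site only if it is deep enough and the alternative allele
    is unlikely to be explained by sequencing error."""
    if depth < min_depth:
        return False
    return error_tail_probability(depth, alt_reads, error_rate) < alpha


# Low coverage: rejected regardless of the allele counts.
print(is_candidate_snp(depth=4, alt_reads=2))
# Adequate coverage, one alternative read: compatible with error.
print(is_candidate_snp(depth=20, alt_reads=1))
# Adequate coverage, many alternative reads: retained as a candidate.
print(is_candidate_snp(depth=20, alt_reads=8))
```

With shallow coverage a single erroneous read is indistinguishable from a true heterozygote, which is why a depth threshold is applied before any allele counts are interpreted.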

RNA-seq and RAD-seq techniques have allowed the detection of many microsatellite markers [155, 156] and SNP markers [154, 157, 158] in model and non-model fish species around the world. Although microsatellites have been increasingly used in the aquaculture industry for Neotropical fish, they have also been identified and characterised for research in biology and conservation [159–162]. Previous studies have identified microsatellite loci in closely related species, including members of the Anostomidae [163], Characidae [164, 165], Cichlidae [166], Pimelodidae [167], Prochilodontidae [168–170] and Serrasalmidae [171] families.

With respect to SNP identification, few studies have addressed the conservation of Neotropical freshwater species. Researchers have focused on valuable species such as the tambaqui (*C. macropomum*) and the pacu (*P. mesopotamicus*) of the Serrasalmidae family, which are among the most captured species in the Neotropical region because of their high commercial value and potential in aquaculture [172, 173]. Other studies on SNPs (identified by the Pool-seq technique) refer to the evolutionary adaptation of species such as *Poecilia mexicana* in waters with high hydrogen sulphide (H<sub>2</sub>S) concentrations in Mexico [174], and to the identification of SNPs in the sex chromosomes of *Characidium gomesi* by the RAD-seq technique with the aim of differentiating males from females [175].

most of the wild populations tend to have high levels of genetic diversity [182]. This is largely due to the formation of these groups by migratory fish, representing panmictic populations, since high gene flow and the size of the population reduce the effects of genetic drift [183].

Several factors that fragment populations or restrict their migratory potential may cause a population bottleneck and decrease genetic variability. Bottlenecks reduce population size, making individuals subject to genetic drift and inbreeding and thereby reducing the species' evolutionary potential [184].

Several studies carried out in the Paraná River Basin have already demonstrated a decline and genetic homogenisation among fish populations in this basin [185–188]. These studies indicate that the fragmentation of the basin by the large hydroelectric dams installed there, mainly in the Upper Paraná region, is one of the major factors affecting these populations.

Brazil is the third largest producer of hydroelectric power, accounting for up to 10% of total world production. The conversion of free-flowing tropical rivers into the regulated systems associated with hydroelectric dams is one of the major concerns for the conservation of freshwater Neotropical fish. In addition to their impact on water velocity and temperature, hydroelectric dams block the natural river flow, which affects freshwater fish populations through habitat fragmentation, increasing the risk of population isolation and the consequent disruption of gene flow. This has already been reported using microsatellite markers for *Prochilodus argenteus* in the São Francisco River [168] and *Brycon insignis* in south-eastern Brazil [189].

In order to mitigate the damage caused by hydroelectric dams, programmes to reintroduce affected species are a potential solution. However, lack of knowledge about the genetics of local species can have the opposite effect. Analyses of restocking programmes for *P. argenteus* indicate differences between stock populations and wild populations, and this differentiation poses a risk and may disrupt local genetic diversity [190].

In addition to hydroelectric dams and inappropriate programmes for genetic restocking, the inadequate management of cultivated populations may also interfere with the genetic variability of species. Fish escaping from aquaculture facilities may influence the level of genetic diversity in natural populations living in the vicinity of fish farms. The introduction of cultivated individuals to wild populations may result in a mixture of populations with different genetic characteristics that reduces the average observed heterozygosity (Wahlund effect), as has already been observed in many fish populations [191, 192].

**6.3. Genetic structure**

As mentioned previously, Neotropical ichthyofauna is subject to many environmental factors that may affect its rate of retention in the environment of origin, including the destruction of habitat and the consequent fragmentation of populations. The effects on the spatial distribution of fish populations may result in genetic processes that affect gene frequency, including dispersive processes, genetic drift and founder effects. These genetic processes intensify systematic migration, mutation and selection. Due to the high levels of polymorphisms and abundance throughout the genome, molecular markers are useful for genetic
