# Phylogenetics, Reticulation and Evolution

Milton H. Gallardo


Additional information is available at the end of the chapter

http://dx.doi.org/10.5772/intechopen.68564

#### **Abstract**

Incongruence between phylogenetic trees constructed from different gene sequences has bothered practitioners for decades. Paraphyletic or polyphyletic clustering has traditionally been treated as noise that distorts the underlying genealogical signal. Nevertheless, recent genomic data have provided a first indication that horizontal gene transfer (HGT) in microbes and interspecific hybridization (or polyploidization) in eukaryotes challenge the doctrine of common descent. Due to promiscuous recombination, the initial stages of life would not have had a genealogical history but a common physical one, whose graphic representation is known as evolutionary reticulation. Reticulate evolution in plants has long been recognized, and recent genomic evidence from animals also indicates its widespread occurrence. As evidence for hybridization and polyploidy in eukaryotic taxa mounts, it is essential to have methods to infer reticulate evolutionary histories. Considering the different forms of trans-specific gene transfer and introgression across the tree of life, the origin of a given species may not coincide with the origin of its genes. Accordingly, molecular mutation rates might be erroneous if based on strict genealogical thinking. Given abundant new data, it is time to move forward because a major shift in our understanding of species, speciation and phylogenetics is taking place.

**Keywords:** convergence, gene trees, phylogenetic incongruence, polyphyly, reticulation

## **1. Introduction**

Since Darwin's seminal work, it has been claimed that organic diversity could be represented by a unique branching pattern of inclusive hierarchies depicting genealogical relationships among organisms [1]. This tree of life, based on shared homologies, was considered to reflect nature's genuine attributes, exclusively represented by descent with modification.

Nevertheless, there is neither *a priori* independent evidence nor a rigorous test to ensure that nature's biodiversity has such a nested organization due to common descent [2]. In fact, it was proposed almost 20 years ago that the initial stages of life, including the origin of the last cellular ancestor, were dominated by lateral gene transfer [3]. This breakthrough has challenged the doctrine of common descent by indicating that the ancestral state would not have been an individual but a community of entities with a common physical history, though not a genealogical one. Apparently, the three domains of life emerged independently through a sorting process from a pool of entities involved in promiscuous recombination. These processes of gene recombination in prokaryotes, leading to reticulate evolution, are mimicked by repeated intercrossing (hybridization) between metazoan populations or lineages. Consequently, their evolutionary histories cannot be adequately represented as bifurcating phylogenetic trees. As a result of these deviations, a network of relationships emerges that remains difficult to deal with, despite the numerous reconstruction methods proposed recently [4].

Traditional phylogenetic analysis applied to animal and plant phyla has stumbled over gross, irreconcilable discrepancies since its onset. Molecular phylogenomics has corrected some of these paradoxes, but what gets clarified at one end gets muddled at another. A paradigmatic example is the recent synthesis of animal phylogeny and taxonomy [5], which is plagued with conflicts near the base of *Eukaryota* and *Metazoa*. Likewise, the phylogenomic approach to animal evolution by Telford et al. [6] resolved the most derived branches but is contentious with regard to the placement of *Eumetazoa*, *Bilateria*, *Protostomia*, *Deuterostomia* and *Lophotrochozoa*. The phylogenetic origin of major plant taxa is similarly unclear. For example, the placement of the Celastrales-Oxalidales-Malpighiales clade within Rosidae remains one of the most confounding phylogenetic questions in angiosperms, with previous analyses placing it with either Fabidae or Malvidae [7].

Theoretically, species correspond to independent, reproductively isolated populations, although Darwin recognized interspecific hybridization as a merging process involving two ancestors. The graphical representation of this phenomenon, which departs from an otherwise diverging pattern, is known as reticulate evolution or network evolution; it describes the origin of a lineage through the partial merging of two ancestral lineages. Hybridization has played an important role in genome diversification and in adapting organisms to their environment. Nevertheless, methods for reconstructing reticulate relationships are still in their infancy and have limited applicability. Reticulate evolution in plants has long been recognized, but recent genomic evidence from animals indicates that this phenomenon is much more common than anticipated. As evidence of hybridization in eukaryotic taxa mounts, it is essential to have methods to infer reticulate evolutionary histories. Given abundant new data, it is time to move forward because a major shift in our understanding of species, speciation and phylogenetics is taking place.

Many groups of closely related species, including insects, vertebrates, microbes and plants, have reticulate phylogenies. In microbes, lateral gene transfer is the dominant process that distorts strictly genealogical, tree-like phylogenies. In multicellular eukaryotes, hybridization and introgression among related species are of prime importance. Introgression and reticulation can thereby affect all parts of the tree of life, not just the crown species. Accordingly, conceptual issues regarding adaptive evolution, speciation, phylogenetics and comparative genomics must be modified to fit these recent findings. Reticulation is produced by phenomena such as lateral gene transfer, introgressive hybridization and polyploidization. In fact, certain alleles in gene trees may appear more closely related to alleles from a different species than to other conspecific alleles, thus giving rise to instances of paraphyly or polyphyly. The occurrence of such anomalous clustering in the evolutionary history of species poses serious challenges to practitioners of phylogenetic analysis because it results in genomic regions with locally incongruent genealogies relative to the speciation pattern. Thus, phylogenetic analyses should account for the reticulate component of evolution, especially now that whole genome sequencing provides unprecedented phylogenetic information across the web of life [8]. Here, we present genetic and genomic evidence indicating the evolutionary importance of reticulation in multicellular eukaryotes and summarize relevant issues of reticulation and their bearing on phylogenetic practice.

## **2. Horizontal gene transfer (HGT)**


HGT, the transfer of genetic material that takes place mainly among prokaryotes, can occur via bacterial transformation, conjugation or transduction. It does not involve mitosis or meiosis and does not require immediate ancestry. Bacterial genomes have revealed a complex evolutionary history, which for most genes cannot be represented by a single strictly bifurcating tree. Comparative analysis of sequenced genomes indicates that lineage-specific gene loss has been common in evolution, thus complicating the notion of a species tree and of a last universal common ancestor and, because these organisms are asexual, the delimitation of their taxonomic units.

HGT in eukaryotes has been reported in phagotrophic protists and is limited largely to the ancient acquisition of bacterial genes. Nevertheless, standard mitochondrial genes, encoding ribosomal and respiratory proteins, are subject to evolutionarily frequent horizontal transfer between distantly related flowering plants. These transfers have created a variety of genomic outcomes, including gene duplication, recapture of genes lost through transfer to the nucleus and chimeric, half-monocot, half-dicot genes [9].

Intergenomic comparisons indicate that HGT is a dominant process in generating innovations and complex adaptations, such as the acquisition of shade-dwelling habits in ferns. Molecular evidence indicates that the chimeric photoreceptor, neochrome, was acquired from hornworts, thereby optimizing phototropic responses [10]. HGT involves not only individual genes but also whole chromosomes and even nuclear genomes transferred by asexual means. In the fungal genus *Fusarium*, HGT was responsible for the acquisition of chromosomes that greatly increased the organism's pathogenicity [11].

The horizontal transfer of a complete genome, giving rise to a new *Nicotiana* species, was achieved by grafting somatic tissues of two transgenic, 48-chromosome *Nicotiana tabacum* × *Nicotiana glauca*. The resulting octoploid species, *Nicotiana tabauca* (2n = 96), has double the genome size, and its fertile F1 shows phenotypic traits intermediate between both parental species [12]. In *Amborella trichopoda* (the sister group to all other angiosperms), whole mitochondrial transfer and subsequent fusion with the recipient genome have been reported. The plant's huge mitochondrial (mt) genome (3.9 Mb) comes from six different genomic sources and from the mtDNA of three types of green algae, a fungus and other angiosperms. These findings emphasize the role of trans-specific genomic compatibilities, fusions and syngamy in forming more complex wholes [13].

Overall, genome-wide quantification of reticulate evolution reveals that ecological specialization somehow restricts intra- and interspecific recombination [14]. Nevertheless, the genomic architecture and the content of transposable elements are also central to HGT and to recombination potential. In addition, genomic regions differ in their levels of potential HGT and reticulate evolution, from single genes to whole genomes. Genetic distances, genomic rearrangements and genome synteny all show evidence of HGT and network-like evolution at both whole and core genome scales. Moreover, proteomic core genes have experienced reticulate evolution of complex traits and have played a major causal role in the radiation and adaptation of life on earth.

## **3. Interspecific hybridization**

One potential cause of gene tree/species tree discordance and concomitant polyphyly is the occasional mating (hybridization) between otherwise distinct species. The resulting transfer of parental alleles to hybrid offspring (introgression) introduces variation at rates much higher than mutation.

Thus, significant levels of genomic replacement may accrue over long periods, even at low hybridization rates. This has recently been demonstrated in extant *Anopheles* mosquitoes [15] and in some *Heliconius* butterfly species, and past hybridization events have been detected using ancient DNA [16]. These instances force us to accept an *ad hoc* species definition applicable only to terminal taxa, rather than to the original bifurcating ancestors. Indeed, the branches of the tree can change species identity: the accumulation of introgressed regions flips the topology supported by the majority of genes to another one [4].

Hybridization is increasingly being recognized as a widespread process between ecologically and behaviorally divergent animal species. Determining phylogenetic relationships in the presence of hybridization remains a major challenge for evolutionary biologists. If hybridization has occurred among the species of a given taxon, cladistic analysis fails to account for the process involved since the relationships are not genealogical but reticulate. Since hybridization results in incongruent intersecting data that obscure the underlying hierarchy, the results are always plagued with convergences and parallelisms of no biological relevance [17].

Recombination is a form of reticulation that mimics the problems derived from hybridization, except that it occurs at the gene level. Recombination can be diagnosed by examining the compatibility of the phylogenetic partitions supported by the polymorphic sites along the sequence. One strategy consists of looking for changes in the most parsimonious topology along the sequence, while others use a maximum chi-square test or a maximum-likelihood approach to detect specific incongruent evolutionary patterns. Unfortunately, no general method exists to place a putative hybrid in the appropriate clade.
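The idea of checking whether polymorphic sites support compatible partitions can be illustrated with the four-gamete test, a standard pairwise compatibility check that is used here for illustration only and is not named in the text above. Under the assumption that each site mutated only once, two biallelic sites that show all four allele combinations cannot fit the same tree, which points to recombination (or recurrent mutation) between them. The haplotypes and function names below are hypothetical.

```python
from itertools import combinations


def four_gamete_incompatible(site_a, site_b):
    """Return True if two biallelic sites show all four allele combinations."""
    return len(set(zip(site_a, site_b))) == 4


def incompatible_site_pairs(sequences):
    """List index pairs of biallelic sites that fail the four-gamete test."""
    columns = list(zip(*sequences))  # one tuple of alleles per alignment column
    biallelic = [i for i, col in enumerate(columns) if len(set(col)) == 2]
    return [
        (i, j)
        for i, j in combinations(biallelic, 2)
        if four_gamete_incompatible(columns[i], columns[j])
    ]


# Hypothetical aligned haplotypes (illustrative data, not from any study).
haplotypes = [
    "AAGT",
    "ACGT",
    "CAGA",
    "CCGA",
]

pairs = incompatible_site_pairs(haplotypes)
print(pairs)  # site pairs whose partitions cannot share a single tree
```

In this toy alignment, sites 0 and 3 define the same split of the four haplotypes and are therefore compatible, whereas site 1 conflicts with both. The test only flags conflict; it does not distinguish recombination from hybridization or recurrent mutation.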

Introgression (also known as introgressive hybridization or interspecific gene flow) occurs when alleles from one species penetrate the gene pool of another through interspecific mating and the subsequent backcrossing of hybrids into parental populations. When hybridization is symmetrical, the resulting hybrid species might be polyphyletic, as might both parental species. Because hybrid speciation is often associated with whole genome duplication (polyploidy), knowledge of such traits may strengthen the suspicion of polyphyly derived from hybrid speciation [4]. However, in several cases of putative hybrid speciation, alternative explanations have been difficult to rule out. Because mitochondrial alleles introgress more easily than nuclear ones, their heterospecific origin will be detected more frequently. Consequently, mitochondrial gene trees could be particularly susceptible to the effects of introgression and be especially misleading in cases where introgressed haplotype lineages become fixed, leaving no hint that they are of heterospecific origin.

The discovery of cytoplasmic introgression and the disparity between rDNA and cpDNA phylogenies in several plant groups reflect past hybridization and subsequent introgression. If an analysis includes hybrids, a cladistic method produces only divergently branching phylogenetic patterns no matter where the hybrids are placed. It can therefore never retrieve the correct phylogeny, and we end up with confusing and conflicting results.

## **4. Polyploidy**


Polyploidy, in its allopolyploid form, results from interspecific hybridization followed by whole genome duplication (WGD). As the most drastic modification that a cell can experience, it involves rapid and profound nonrandom changes in chromatin composition, segregation patterns and copy number variation of dispersed repetitive DNA [18, 19]. Polyploidy is also instrumental in introgressing alien DNA into breeding lines, enabling the introduction of novel characters, as demonstrated by FISH, GISH and genetic mapping [20]. Its evolutionary role has motivated intense study because duplicated gene pathways provide new opportunities for increased body-plan complexity, organismal differentiation and adaptation through the recruitment of genes to new roles [21, 22]. Polyploidy has played a significant role in the hybrid speciation and adaptive radiation of flowering plants but has been considered irrelevant to mammalian speciation due to severe disruptions of the sex-determination system and the dosage compensation mechanism [23, 24]. Recent comparative genomic data have further demonstrated the evolutionary importance of polyploidy by reporting three rounds of WGD (3R hypothesis) in vertebrate evolution [25] and five rounds in flowering plants [26].

The convergence of distinct lineages through interspecific hybridization (allopolyploidy), followed by endoreduplication that increases the ploidy level, is a driving force in the origin of most flowering plant species. Likewise, the grass tribe *Triticeae* (Hordeeae) is characterized by its evolutionary complexity, as indicated by numerous events of auto- and allopolyploidization. Introgression involving diploid and polyploid ancestors is the major factor contributing to its complex history [27]. Moreover, several analyses of multi-gene data sets have demonstrated conflict between the chloroplast data set and both the nuclear and mitochondrial data sets. Nevertheless, synthetic polyploids are able to stabilize their genome within a few generations of their origin. Several strategies have been advanced to explain the distribution of conflicting patterns in a phylogeny [7].

Following WGD, genes show two types of homology: paralogy and orthology. Paralogy refers to genes that are related through a duplication event, whereas orthology is the result of speciation. Consequently, a gene tree based on multigene families in polyploid species is problematic if these two forms of homology are confounded. Due to this limitation, mitochondrial single-copy genes have been regarded as a more reliable source of allele orthology than nuclear genes. A gene tree that includes paralogous alleles may depict polyphyletic species because its topology reflects gene duplication as well as speciation. The cause of this polyphyly may be misinterpreted if the orthology of alleles is assumed. Because mitochondrial loci are single-copy genes rather than members of multigene families, it was long considered safe to assume allele orthology when using mitochondrial primers. This is a serious phylogenetic challenge considering that most angiosperms are polyploid. If the 3R and 5R hypotheses are scientifically valid, their implication makes the search for common ancestry irrelevant to science.

To celebrate the 150 years of Darwin's *Origin of Species*, the prestigious journal *Heredity* published an issue on speciation whose editorial introduction says: "many questions concerning the causes of speciation remain open and speciation continues to be one of the most actively studied topics in modern evolutionary biology" [28]. The end result is that we have a comprehensive understanding neither of speciation nor of the reality of species, and the origin of species by natural selection continues to be debated. One wonders whether the scientific community is heading in the wrong direction by studying patterns instead of the process itself [1, 3]. In this vein, Lynn Margulis claimed that "…neodarwinism will ultimately be viewed as only a minor twentieth-century religious sect within the sprawling religious persuasion of Anglo-Saxon biology" [29].

In short, gene duplication following polyploidy can give rise to multigene families, that is, groups of locally distributed, tandemly oriented redundant genes that can subsequently be involved in non-allelic homologous recombination. Duplicated genes can follow three different outcomes. First, both copies can persist, keeping their sequence identity while maintaining a high level of gene expression. A second possibility, known as nonfunctionalization, occurs when one gene copy is silenced (by physical elimination or methylation). Silenced copies may form pseudogenes, nonfunctional genetic sequences that retain similarity to one or more paralogs and thereby confound phylogenetic analyses. The third outcome is functional divergence: either neofunctionalization, in which a copy diversifies to a new role, or subfunctionalization, in which the copies specialize in parts of a previous function. Clearly, these processes of gene evolution, consisting of both gene births and deaths after duplication, interfere with the general assumptions of phylogenetic analysis and blur the end results.
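The difference between orthology and paralogy in a gene tree can be sketched with the species-overlap heuristic, a simple rule that is not taken from the text above: an internal node whose two child clades share at least one species is read as a duplication, otherwise as a speciation. The sketch assumes a strictly binary gene tree written as nested tuples and leaf names of the hypothetical form `species_copy`.

```python
def leaf_species(tree):
    """Return the set of species found below a node of a nested-tuple tree."""
    if isinstance(tree, str):
        return {tree.split("_")[0]}  # assumed leaf format: "species_copy"
    left, right = tree
    return leaf_species(left) | leaf_species(right)


def label_nodes(tree, labels=None):
    """Label internal nodes (preorder) as 'duplication' or 'speciation'."""
    if labels is None:
        labels = []
    if isinstance(tree, str):
        return labels
    left, right = tree
    shared = leaf_species(left) & leaf_species(right)
    kind = "duplication" if shared else "speciation"
    labels.append((kind, sorted(leaf_species(tree))))
    label_nodes(left, labels)
    label_nodes(right, labels)
    return labels


# Hypothetical gene family: two copies (1 and 2) in each of two species.
gene_tree = (("spA_1", "spB_1"), ("spA_2", "spB_2"))

labels = label_nodes(gene_tree)
for kind, species in labels:
    print(kind, species)
```

Here the root is labeled a duplication because both child clades contain spA and spB, so `spA_1` and `spA_2` are paralogs, while `spA_1` and `spB_1` are orthologs. A tree built from `spA_1` and `spB_2` alone would hide the duplication, which is the confusion described above.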

## **5. Incomplete lineage sorting**

Incomplete lineage sorting occurs when polymorphisms persist between speciation events, so that the true genealogical relationship of a gene or genome region differs from the species branching pattern. Incomplete lineage sorting and introgression are two main causes of discordance between gene trees and species trees of eukaryotic coding sequences. For instance, around 15% of human genes are more closely related to homologs in gorillas than to those of the chimpanzee sister lineage. This anomaly probably derives from a large ancestral effective population size (*Ne*) and the short time span between the speciation events that separated humans, chimpanzees and gorillas. Recent findings of shared polymorphisms among them include the MHC and ABO blood group loci. In the species complex of *Anopheles gambiae*, a very large chromosomal inversion encompassing 8.5% of the genome is maintained by balancing selection [15]. Unlike lateral transfer and introgression, incomplete lineage sorting does not result in phylogenetic reticulation at the species level. Nevertheless, it confounds molecular phylogenetic analysis by making two different clades appear closer than they really are. A pattern produced by chance events is thus taken as if it were genealogical.
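How persistence of polymorphism between two speciation events translates into discordant gene trees can be sketched with the standard three-taxon result from coalescent theory, which is not derived in this chapter: if the internal branch lasts `t` generations in a diploid ancestral population of effective size `Ne`, the probability that a gene tree differs from the species tree is (2/3)·exp(−t/(2·Ne)). The parameter values below are hypothetical and serve only to show the trend.

```python
import math


def discordance_probability(generations, effective_size):
    """P(gene tree differs from species tree) for three taxa, neutral coalescent.

    The internal branch length is expressed in coalescent units of
    2 * Ne generations (diploid population, no gene flow assumed).
    """
    t_coalescent = generations / (2.0 * effective_size)
    return (2.0 / 3.0) * math.exp(-t_coalescent)


# Hypothetical values: a fixed interval between speciation events and
# three ancestral population sizes.
interval = 100_000
for ne in (10_000, 50_000, 100_000):
    p = discordance_probability(interval, ne)
    print(f"Ne = {ne:>7}: P(discordant gene tree) = {p:.3f}")
```

The probability approaches 2/3 as the interval shrinks to zero and falls off exponentially as the interval grows relative to `Ne`. Each of the two discordant topologies is expected at half of this value, and that symmetry is what tests for introgression exploit.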


Several analytical methods assume that reticulation events are the sole cause of incongruence among gene trees and seek phylogenetic networks that explain all incongruences. However, these methods overestimate the degree of reticulation if other causes of incongruence are at play. Indeed, recent studies of the human genome [30, 31], *Mus* [32] and butterflies [33] have shown that detecting hybridization in practice is complicated by incomplete lineage sorting.
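To give a sense of how much gene-tree incongruence incomplete lineage sorting alone can produce, the sketch below uses the standard three-taxon result of the multispecies coalescent, which is not derived in this chapter: with an internal branch of length T in coalescent units (generations divided by twice the effective population size), a gene tree disagrees with the species tree with probability (2/3)·exp(−T). The function name and the example branch lengths are illustrative assumptions, and the model assumes no gene flow.

```python
import math


def discordance_probability(t_coalescent_units: float) -> float:
    """P(gene tree topology != species tree) for three taxa under ILS alone.

    t_coalescent_units: internal branch length in units of 2*Ne generations.
    Assumes the standard multispecies coalescent with no hybridization.
    """
    if t_coalescent_units < 0:
        raise ValueError("branch length must be non-negative")
    return (2.0 / 3.0) * math.exp(-t_coalescent_units)


# Hypothetical internal branch lengths, chosen only for illustration.
for t in (0.1, 0.5, 1.0, 3.0):
    print(f"T = {t}: P(discordant) = {discordance_probability(t):.3f}")
```

Short internal branches give discordance close to two thirds without any reticulation, which is why methods that attribute every incongruence to hybridization tend to overestimate it.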

Some authors claim that significant steps have been taken to put phylogenetic networks on a par with phylogenetic trees as a model for capturing evolutionary relationships. Nevertheless, despite progress in phylogenetic network inference, methods for inferring reticulate evolutionary histories while accounting for ILS remain poorly understood. Their inapplicability stems mainly from two major issues: the lack of a phylogenetic network inference method and the lack of a method to assess the degree of confidence associated with an inference that travels through phylogenetic network space. Nonetheless, methods for assessing the complexity of a network, and the use of the bootstrap method for measuring branch support of inferred networks, have been developed [33].

## **6. Identifying complex patterns of genetic diversity through networks**

Branching diagrams dominate phylogenetic thinking. Nevertheless, bacterial genome evolution gives rise to complex genetic patterns that cannot be accommodated by a tree [34]. The complexity and deep relationships among the three domains of life defy traditional methods. For example, the construction of a web of genetic similarity comprising proteomic data from 14 eukaryotes, 104 prokaryotes, 2389 viruses and 1044 plasmids clearly showed the chimeric origin of eukaryotes. These fusion events between *Archaebacteria* and *Eubacteria* would not have been detected by conventional phylogenetic algorithms and trees. Moreover, the analysis indicated that eukaryote genes connecting to a given domain of prokaryotes tend to connect to other entities of the same domain [35]. Genes derived from *Archaea* or *Bacteria* tend to carry out different functions and act in distinct cell compartments. This complex interweaving in the web suggests an early integration of their respective genetic repertoires. Thus, web analysis is well suited to the study of deep evolutionary events.
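The idea of a similarity web can be sketched as a graph in which sequences are nodes and an edge joins two sequences whose similarity exceeds a cutoff; groups of related entities then appear as connected components. This is a minimal illustration, not the pipeline of Ref. [35]: the sequence names, similarity scores and the 0.3 cutoff are all hypothetical.

```python
from collections import defaultdict, deque


def similarity_network(scored_pairs, threshold):
    """Adjacency map keeping only pairs whose similarity meets the threshold."""
    graph = defaultdict(set)
    for a, b, score in scored_pairs:
        if score >= threshold:
            graph[a].add(b)
            graph[b].add(a)
    return graph


def connected_components(graph):
    """Return the connected components of the network (breadth-first search)."""
    seen, components = set(), []
    for start in sorted(graph):
        if start in seen:
            continue
        seen.add(start)
        queue, component = deque([start]), set()
        while queue:
            node = queue.popleft()
            component.add(node)
            for neighbour in graph[node] - seen:
                seen.add(neighbour)
                queue.append(neighbour)
        components.append(component)
    return components


# Hypothetical pairwise similarities (0 to 1); names are placeholders.
pairs = [
    ("euk_A", "arch_1", 0.62),
    ("euk_A", "arch_2", 0.55),
    ("euk_B", "bact_1", 0.71),
    ("euk_B", "bact_2", 0.48),
    ("arch_1", "bact_1", 0.12),  # below the cutoff, so no edge is drawn
]
components = connected_components(similarity_network(pairs, threshold=0.3))
for component in components:
    print(sorted(component))
```

In this toy data set each eukaryote gene ends up in a component with prokaryote genes of a single domain, mirroring the pattern described above.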

Reticulate patterns can also stem from inadequate analysis or data processing, misspecification of the model, or incorrect use of data or sequence alignments. Even though network analysis drastically reduces data misinterpretation, it is most important to be aware that genomic hybridization is a more probable explanation for the differences among gene trees [36].

## **7. Conclusions**

Interspecific gene exchanges are much more common than previously appreciated. They include not only hybridizing sister species undergoing genomic introgression but also whole groups that exchange adaptive and nonadaptive genomic regions, as exemplified in *Anopheles*, *Heliconius* and hominids. Considering that hybridization between sister species may or may not affect the species tree, estimates of introgression rates derived solely from species tree topologies can underestimate the overall level of gene flow. Thus, the origin of traits and the genes behind them can have very different histories from that of the species tree.

The only literature survey dealing with the frequency, causes and consequences of species-level paraphyly and polyphyly indicates that their incidence is taxonomically widespread [37]. Interestingly, almost 25% of the scientific literature surveyed does not offer an explanation for polyphyletic gene trees. Polyphyly was observed in 15% of species across the cnidarians, mollusks, insects, crustaceans, arachnids and echinoderms, whereas half of the citations dealing with these deviations attribute them to faulty taxonomy. Both introgressive hybridization and incomplete lineage sorting were also invoked in one third of the 2319 species analysed. Inadequate phylogenetic information is invoked in few papers [37]. Consequently, species-level monophyly cannot be assumed as an *a priori* axiom.

To explain polyphyly above the species level, traditional phylogenetics resorts to Lamarckian thinking: the environment triggers evolutionary innovations, while organisms passively adapt to the new environmental demands. Natural selection is conceived as the source and driving force that shapes life as we see it. Phylogenetic thinking is advised to keep its distance and objectivity from any particular (i.e. Darwinian) evolutionary view. The search for evolutionary relationships does not require alignment with a particular world view to discover the pattern that connects [38]. Otherwise, any data set that does not fit the model is labelled as convergence or parallelism, descriptive concepts with no informational or explanatory value. The morphophysiological discrepancies observed among animal or plant phyla [5–7] are incontrovertible evidence that traditional phylogenetic thinking cannot explain the origin of body plans. Distorted presumptions about nature and inadequate or faulty methodologies conspire to maintain the present phylogenetic incongruences.

Bearing in mind that HGT occurs all across the tree of life, the time of origin of a given species will not coincide with the origin of its genes. They could have evolved in other genetic backgrounds and been horizontally transferred across reproductive barriers. Accordingly, molecular mutation rates might be erroneous if based on genealogical thinking. Polyphyly might thus derive not from faulty taxonomy but from unforeseen non-Mendelian mechanisms.

## **Acknowledgements**

The assistance of E. Suárez-Villota is highly appreciated. This contribution was supported with private funds.

## **Author details**

Milton H. Gallardo

Address all correspondence to: mgallard@uach.cl

Faculty of Sciences, Institute of Marine and Limnological Sciences, Austral University of Chile, Valdivia, Chile

## **References**


[9] Bergthorsson U, Adams KL, Thomason B, Palmer JD. Widespread horizontal transfer of mitochondrial genes in flowering plants. Nature. 2003;**424**:197-201. DOI: 10.1038/nature01743

[10] Li FW, Villarreal JC, Kelly S, Rothfels CJ, Melkonian M, Frangedakis E, Ruhsam M, Sigel EM, Der JP, Pittermann J, et al. Horizontal transfer of an adaptive chimeric photoreceptor from bryophytes to ferns. Proceedings of the National Academy of Sciences of the United States of America. 2014;**111**:6672-6677. DOI: 10.1073/pnas.1319929111

[11] Mehrabi R, Bahkali AH, Abd-Elsalam KA, Moslem M, Ben M'barek S, Gohari AM, Jashni MK, Stergiopoulos I, Kema GH, de Wit PJ. Horizontal gene and chromosome transfer in plant pathogenic fungi affecting host range. FEMS Microbiology Reviews. 2011;**35**:542-554. DOI: 10.1111/j.1574-6976.2010.00263.x

[12] Fuentes I, Stegemann S, Golczyk H, Karcher D, Bock R. Horizontal genome transfer as an asexual path to the formation of new species. Nature. 2014;**511**:232-235. DOI: 10.1038/nature13291

[13] Rice DW, Alverson AJ, Richardson AO, Young GJ, Sanchez-Puerta MV, Munzinger J, Barry K, Boore JL, Zhang Y, dePamphilis CW, et al. Horizontal transfer of entire genomes via mitochondrial fusion in the angiosperm *Amborella*. Science. 2013;**342**:1468-1473. DOI: 10.1126/science.1246275

[14] Hernández-López A, Chabrol O, Royer-Carenzi M, Merhej V, Pontarotti P, Raoult D. To tree or not to tree? Genome-wide quantification of recombination and reticulate evolution during the diversification of strict intracellular bacteria. Genome Biology and Evolution. 2013;**5**:2305-2317. DOI: 10.1093/gbe/evt178

[15] Wen D, Yu Y, Hahn MW, Nakhleh L. Reticulate evolutionary history and extensive introgression in mosquito species revealed by phylogenetic network analysis. Molecular Ecology. 2016;**25**:2361-2372. DOI: 10.1111/mec.13544

[16] Schaefer NK, Shapiro B, Green RE. Detecting hybridization using ancient DNA. Molecular Ecology. 2016;**25**:2398-2412. DOI: 10.1111/mec.13556

[17] Funk VA. Phylogenetic patterns and hybridization. Annals of the Missouri Botanical Garden. 1985;**66**(3):521-527. DOI: 10.2307/2399220

[18] Chen ZJ. Genetic and epigenetic mechanisms for gene expression and phenotypic variation in plant polyploids. Annual Review of Plant Biology. 2007;**58**:377-406. DOI: 10.1146/annurev.arplant.58.032806.103835

[19] Lim KY, Soltis DE, Soltis PS, Tate J, Matyasek R, Srubarova H, Kovarik A, Pires JC, Xiong Z, Leitch AR. Rapid chromosome evolution in recently formed polyploids in *Tragopogon* (Asteraceae). PLoS One. 2008;**3**:e3353. DOI: 10.1371/journal.pone.0003353

[20] Chester M, Leitch AR, Soltis PS, Soltis DE. Review of the application of modern cytogenetic methods (FISH/GISH) to the study of reticulation (polyploidy/hybridisation). Genes. 2010;**1**:166-192. DOI: 10.3390/genes1020166

[21] Otto SP. The evolutionary consequences of polyploidy. Cell. 2007;**131**:452-462. DOI: 10.1016/j.cell.2007.10.022


[34] Jachiet PA, Colson P, Lopez P, Bapteste E. Extensive gene remodeling in the viral world: New evidence for nongradual evolution in the mobilome network. Genome Biology and Evolution. 2014;**6**:2195-2205. DOI: 10.1093/gbe/evu168

[35] Alvarez-Ponce D, Lopez P, Bapteste E, McInerney JO. Gene similarity networks provide tools for understanding eukaryote origins and evolution. Proceedings of the National Academy of Sciences of the United States of America. 2013;**110**:E1594-E1603. DOI: 10.1073/pnas.1211371110

[36] Bapteste E, van Iersel L, Janke A, Kelchner S, Kelk S, McInerney JO, Morrison DA, Nakhleh L, Steel M, Stougie L, Whitfield J. Networks: Expanding evolutionary thinking. Trends in Genetics. 2013;**29**:439-441. DOI: 10.1016/j.tig.2013.05.007

[37] Funk DJ, Omland KE. Species-level paraphyly and polyphyly: Frequency, causes, and consequences, with insights from animal mitochondrial DNA. Annual Review of Ecology, Evolution, and Systematics. 2003;**34**:397-423. DOI: 10.1146/annurev.ecolsys.34.011802.132421

[38] Bateson G. Mind and Nature. A Necessary Unity. New York: Bantam Books; 1979. p. 259

**Cytogenetics and Genomics Approaches for Phylogenetics**

## **Applying Cytogenetics in Phylogenetic Studies**

Ming Chen, Wen‐Hsiang Lin, Dong‐Jay Lee, Shun‐Ping Chang, Tze‐Ho Chen and Gwo‐Chin Ma

Additional information is available at the end of the chapter

http://dx.doi.org/10.5772/intechopen.68566

#### **Abstract**

Cytogenetics, with its fundamental role in genetic investigation, continues to be an indispensable tool for studying phylogenetics, even though molecular evolutionary analyses are now more commonly used. Studies of chromosomal evolution indicate that genomic evolution occurs at the level of chromosomal segments, namely genomic blocks of Mb‐level size. The recombination of homologous blocks, through insertion, translocation, inversion, and breakage, has been shown to be a major mechanism of speciation and subspecies differentiation. Meanwhile, molecular cytogenetics (fluorescence *in situ* hybridization‐based methodologies) has already been widely applied in plant genetics, since polyploidy is common in plant evolution and speciation. It is now recognized that comparative cytogenetic studies can be used to explore the plausible phylogenetic relationships of extant mammalian species by reconstructing the ancestral karyotypes of certain lineages. Therefore, cytogenetics remains a feasible tool in comparative genomics, even in the current era of next generation sequencing (NGS).

**Keywords:** cytogenetics, comparative cytogenetics, fluorescence *in situ* hybridization, genomic *in situ* hybridization, zoo‐CGH

## **1. Introduction: chromosomal evolution of mammals**

According to fossil records, the radiation of mammals occurred after the K‐T boundary (approximately 65 Mya, between the Cretaceous and Tertiary periods, when most of the dinosaurs became extinct). Three hypotheses try to explain such findings: (1) Explosive hypothesis: supported by most paleobiologists, it states that the genesis and diversification of many phyletic groups ("Orders") occurred after the Cretaceous‐Tertiary (K‐T) boundary; (2) Long Fuse hypothesis: it holds that Order diversification occurred after the K‐T boundary but that genesis occurred in the Cretaceous period, i.e., before the K‐T boundary; and (3) Short Fuse hypothesis: it holds that both the genesis and the diversification of Orders occurred before the K‐T boundary (**Figure 1**) [1].

**Figure 1.** Three hypotheses of mammalian interordinal divergences, modified from Ref. [1].

© 2017 The Author(s). Licensee InTech. This chapter is distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Molecular data indicate that mammalian diversification began in the Cretaceous period, which supports the Long Fuse (2) and Short Fuse (3) hypotheses. However, these data have limitations, including the availability of a single temporal calibration point and the variable evolutionary rates of different phyletic groups. Inadequate taxon sampling, which makes the samples unrepresentative, restricts their use to some, but not all, placental mammals, and it makes the negative correlation between evolutionary rate and body size difficult to explain. William Murphy and Stephen O'Brien's team made a successful attempt at answering these questions with zoo‐fluorescence *in situ* hybridization (zoo‐FISH). Currently, the Long Fuse hypothesis seems to be a better match for the evolution of most phyletic groups, with the exception of the orders *Rodentia* and *Eulipotyphla*, which better suit the Short Fuse hypothesis [1].

**Figure 2** presents the phylogenetic tree of placental mammals derived from 16,379 nucleotide sequences (including 19 nuclear genes and 3 mitochondrial genes published by the study team), inferred by the maximum likelihood method with opossum as the outgroup; placental mammals are considered to have appeared at 105 Mya. With the K‐T boundary labeled with red dashes, we find that "Order" genesis and diversification are events that occurred before the boundary.

**Figure 2.** Phylogenetic tree of placental mammals derived from 16,379 nucleotide sequences, modified from Ref. [1].

By comparing the chromosomal break points of multiple species, including the chromosomal rearrangements of loci discovered via comparative genomics and some genetic sequences from fully sequenced species, we can clearly find that (1) approximately 20% of chromosomal break points are repeatedly involved in the evolutionary process of mammals; (2) these repeatedly involved break points are primarily located at the centromere and telomere; (3) the number of genes within and near the break point blocks involved in chromosomal evolution is higher than the mean of the overall genome; and (4) the break points unique to Primates are located in repeated segment regions, and their ends are flanked by inverted sequences. **Figure 3** shows the rate of chromosomal breakage estimated from the chromosomal break points involved in the evolution of mammals.

The results show that the chromosomal rearrangement rate before the K‐T boundary was 0.11–0.43/My, and that this rate was two to four times higher in Primates and five times higher in Rodentia [2, 3].
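As a worked arithmetic check, the sketch below applies the stated fold changes to the pre‐boundary range. Scaling the lower bound by the smallest factor and the upper bound by the largest is an assumption about how the ranges combine; the text gives only the factors, and the function name is illustrative.

```python
# Baseline range before the K-T boundary, from the text (rearrangements per My).
BASELINE = (0.11, 0.43)


def scale_range(rate_range, low_factor, high_factor):
    """Scale the lower bound by low_factor and the upper bound by high_factor."""
    low, high = rate_range
    return (round(low * low_factor, 2), round(high * high_factor, 2))


primates = scale_range(BASELINE, 2, 4)   # "doubled to quadrupled"
rodentia = scale_range(BASELINE, 5, 5)   # "fivefold"
print(f"Primates: {primates[0]}-{primates[1]} per My")
print(f"Rodentia: {rodentia[0]}-{rodentia[1]} per My")
```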

**Figure 3.** Rate of chromosomal breaking using chromosomal break points involved in the evolution of mammals, modified from Ref. [3].

## **2. How to apply molecular genomics in the study of evolution and parental relationships**

#### **2.1. Zoo‐FISH**

Comparative mapping is a method for comparing the locations of homologous genes in different species to explore the evolution of genomes; zoo‐FISH is an extension of this technology. The method assesses the overall chromosomal similarity among all mammalian orders and has become a powerful tool for studying genomic evolution. The possible mechanisms and factors related to mammalian genomic evolution can be understood through studies of Metatheria and Eutheria.

When conducting zoo‐FISH, partial or whole chromosomes are obtained through the sorting of fluorescence‐labeled cells or by microscopic extraction. DNA extracted from the specific chromosomal block is subjected to degenerate oligonucleotide‐primed PCR (DOP‐PCR), labeled with fluorescence to produce probes, and hybridized with the chromosomes of the species of interest. Because the resolution of zoo‐FISH is approximately 10 Mbp (megabase pairs), this method may underestimate the real number of rearrangement events on the chromosome. However, zoo‐FISH has revealed some interesting facts: many chromosome blocks of different species are rather conserved, and similar chromosome blocks inherited from a common ancestor are called synteny blocks. For example, one autosome of the gray‐headed flying fox (*Pteropus poliocephalus*) possesses synteny blocks that are also found in *Homo sapiens* (HSA) chromosome 3 and HSA 21. These HSA3+21 synteny blocks form the primary synteny blocks of placental mammals, i.e., they are a characteristic that was present in a common ancestor and in all Eutheria members studied [4].
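The effect of the resolution limit can be illustrated with a small sketch: conserved segments shorter than roughly 10 Mbp are invisible to zoo‐FISH, so the rearrangements that created them go uncounted. The segment lengths below are hypothetical and the function name is illustrative; only the 10 Mbp figure comes from the text.

```python
ZOO_FISH_RESOLUTION_MBP = 10.0  # approximate resolution stated in the text


def visible_segments(segment_lengths_mbp, resolution_mbp=ZOO_FISH_RESOLUTION_MBP):
    """Return the conserved segments long enough to be detected."""
    return [length for length in segment_lengths_mbp if length >= resolution_mbp]


# Hypothetical conserved segment lengths (Mbp) on one chromosome.
segments = [85.0, 42.5, 12.0, 7.5, 3.2, 0.8]
seen = visible_segments(segments)
missed = len(segments) - len(seen)
print(f"{len(seen)} of {len(segments)} segments detected; {missed} fall below the resolution")
```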

One of the most important applications of zoo‐FISH in the study of genomic evolution is to estimate the speed of chromosomal rearrangement [5]. Using a phylogenetic tree based on fossil evidence, we can estimate the rate of movement and rearrangement of synteny blocks in the chromosomes of two species. When bi‐directional zoo‐FISH is difficult, monodirectional zoo‐FISH can provide key information or a new understanding. By comparing the chromosomal synteny blocks of indicator mammals and Aves, the occurrence rate of chromosomal rearrangement was found to be fixed at approximately 1–10/Mya [6]. The chromosomal rearrangement rate is shown in **Figure 4**, and the rate may differ between lineages and at different evolutionary stages.

**Figure 4.** Three phases of chromosomal rearrangement rate. The numbers in the circles are the times (Mya) of divergence of common ancestors, and the numbers in brackets indicate the rates of chromosomal rearrangement per Myr (ps = prosimians; nw = New World monkeys; ow = Old World monkeys; la = lesser apes; ga = great apes). Modified from Ref. [6].

Three important stages of chromosomal rearrangement are found (**Figure 4**): in the first stage (1–3 Mya) the rate was < 0.2/My; in the second stage it increased to 1.1/My; and in the third stage the rearrangement rate varied greatly among nonrodents. For example, humans, *Carnivora* and *Soricidae* show low rearrangement rates (< 0.1/My), swine, cattle, horses and dolphins show moderate rates (0.1–0.3/My), and large apes are relatively fast (1.5–2.3/My). Chromosomal evolution in Rodentia is the fastest, and the possible explanations include (1) population size (a larger population provides more genetic modification); (2) different genetic composition (more than 50% of the mammalian genome is repeated sequence, whereas only 15% is repeated sequence in birds); and (3) different generation times (a short generation time indicates more mitotic events). From chromosomal evolutionary evidence, scientists believe that the evolution of mammalian genomes did not proceed at a uniform rate. Evolution was faster for Rodentia, bears, canines, cattle and a few big apes, whereas it was relatively slow for cats, ferrets, badgers, dolphins and humans. In addition, it is worth noting that zoo‐FISH, like other FISH‐based methods, cannot identify intrachromosomal rearrangements (such as inversions). It was believed that interchromosomal rearrangement events occur more often than intrachromosomal events, but sequence comparison revealed that the opposite is true for felines and cattle. In zoo‐FISH experiments using human DNA as the probe, some recombination events were lineage‐specific. For example, "15 + 19" (indicating synteny blocks similar to HSA15 and 19) is Cetartiodactyla‐ and Perissodactyla‐specific, "3 + 19" is Carnivora‐specific, and "14 + 15" is widely seen in Aves and placental mammals other than Rodentia (**Figure 5**).
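The rate classes quoted above for nonrodents can be written as a small lookup. This is only a restatement of the stated ranges: the text leaves the interval between 0.3 and 1.5/My unclassified, so the function reports such values as falling outside the given ranges. The function name and the example rates are hypothetical.

```python
def rate_class(rate_per_my: float) -> str:
    """Classify a rearrangement rate using the ranges quoted in the text."""
    if rate_per_my < 0:
        raise ValueError("rate must be non-negative")
    if rate_per_my < 0.1:
        return "low"
    if rate_per_my <= 0.3:
        return "moderate"
    if 1.5 <= rate_per_my <= 2.3:
        return "fast"
    return "outside the ranges given in the text"


# Hypothetical rates (rearrangements per My), for illustration only.
for rate in (0.05, 0.2, 1.8, 0.8):
    print(f"{rate}/My -> {rate_class(rate)}")
```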

Another application of zoo‐FISH is the reconstruction of ancestral karyotypes. **Figure 6** shows the estimated karyotypes of the ancestral placental mammal (2*n* = 50), primate (2*n* = 50), and Carnivora (2*n* = 42), together with the relationship of each chromosome to human syntenic‐associated chromosomes.

It is worth noting that the study estimates the karyotype of primitive placental mammals as 2*n* = 50, and Svartman et al. [7] found that Hoffmann's two‐toed sloth (*Choloepus hoffmanni*), a Xenarthra member, possesses a karyotype close to the primitive one. This result suggests that the most primitive placental mammals may be Xenarthra, not Afrotheria. Both groups originated in the southern hemisphere, and this result does not contradict Murphy's hypothesis on the origin of mammals. That is, when the part of the supercontinent Gondwana in the southern hemisphere had not yet separated to form Africa and South America, placental mammals diverged and Xenarthra and Afrotheria appeared; later, the ancestors of Laurasiatheria and Euarchontoglires diverged and migrated to the northern hemisphere.

The karyotype of Hoffmann's two‐toed sloth: The blocks that are syntenic to HSA are labeled on the left of each chromosome. For example, Chromosome 1 is syntenic to HSA1, but it is not syntenic to other HSA chromosomes, while Chromosome 6 contains synteny blocks that are similar to those found in HSA3 and HSA21 [7]. These karyotypes are presented in **Figure 7**.


**Figure 5.** Human synteny block associations observed in other placental mammals by zoo‐FISH. Positive results are indicated by solid circles. Modified from Ref. [4].

**Figure 6.** Assumed ancestral karyotypes. Numbers at the left side of the ideogram indicate the regions homologous to human karyotype segments, modified from Ref. [4].

#### **2.2. How is chromosomal recombination fixed in evolution?**

Theoretically, chromosomal rearrangement may lead to meiotic errors and reduced fertility. It is fundamentally a harmful genetic variation, and most rearrangements are difficult to pass on in a population. However, (1) genetic drift, (2) Muller's ratchet or (3) genetic hitchhiking make it possible for some chromosomal rearrangements to be retained (beneficial mutations may be eliminated due to selection at other loci, whereas harmful mutations may be preserved due to selection for other beneficial loci).
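How drift alone can fix a new variant in a small population can be sketched with a Wright–Fisher simulation. This is a deliberately simplified illustration and not a model from the source: selection here acts on single chromosome copies with a constant disadvantage `s`, whereas real rearrangements typically reduce the fitness of heterozygotes. The population size, `s`, replicate counts and seeds are arbitrary assumptions.

```python
import random


def fixation_fraction(copies, s, replicates, seed):
    """Fraction of replicates in which a single new variant reaches fixation.

    copies: number of chromosome copies in the population (2N for diploids).
    s: selective disadvantage of the variant (0 means neutral).
    """
    rng = random.Random(seed)
    fixed = 0
    for _ in range(replicates):
        count = 1  # one newly arisen rearranged chromosome
        while 0 < count < copies:
            p = count / copies
            # Expected frequency after selection against the variant.
            p_selected = p * (1 - s) / (1 - s * p)
            # Binomial sampling of the next generation (genetic drift).
            count = sum(rng.random() < p_selected for _ in range(copies))
        fixed += count == copies
    return fixed / replicates


neutral = fixation_fraction(copies=20, s=0.0, replicates=4000, seed=1)
harmful = fixation_fraction(copies=20, s=0.05, replicates=4000, seed=2)
print(f"neutral variant fixed in {neutral:.3f} of runs (expectation 1/20 = 0.050)")
print(f"mildly harmful variant fixed in {harmful:.3f} of runs")
```

Even the mildly harmful variant is occasionally fixed in so small a population, which is the sense in which drift can retain rearrangements that selection would otherwise remove.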

**Figure 7.** The karyotype of Hoffmann's two‐toed sloth, arranged from left to right in order of chromosome number. The number in each column refers to the HSA chromosome to which it is syntenic, and the diagram below shows the karyotype of ancestral placental mammals in terms of the synteny blocks in sloth chromosomes. The two are quite similar, differing by two fissions and one fusion. Modified from Ref. [7].

#### **2.3. The importance of studying "the weird mammals"**

The genome of most mammals contains approximately 3 billion nucleotides (3 × 10^9 bp), but the number of chromosomes varies greatly. Among placental mammals, the Indian muntjac possesses as few as 2*n* = 6, while South American rodents possess as many as 2*n* = 92; among marsupials, the swamp wallaby possesses as few as 2*n* = 10, while the rufous rat kangaroo possesses 2*n* = 32. Long, conserved synteny blocks are found in placental mammals. For example, mice and humans share 116 synteny blocks, and it is estimated that approximately 94 rearrangement events have occurred.

The infraclasses Eutheria (placental mammals) and Metatheria (marsupials) diverged at approximately 130 Mya, and the subclasses Theria and Prototheria (i.e., monotremes) diverged at approximately 170 Mya. Fossil studies show that the radiation of placental mammals (20 orders, including more than 4600 species) occurred in the Cretaceous period (approximately 60–80 Mya). By comparing the genomes of various animal groups, especially those that play specific roles in evolutionary history (Jennifer Graves, an Australian scholar, called them "the weird mammals"), such as monotremes, marsupials and fast‐evolving rodents, we can learn more about the evolutionary progress of mammals.

## **3. The innovative application of zoological comparative genomic hybridization (CGH) in phylogenetics**

Placental mammals comprise four major lineages: (1) Afrotheria, which includes the orders Sirenia, Hyracoidea, Proboscidea, Tubulidentata, Macroscelidea and Afrosoricida; (2) Laurasiatheria, which includes the orders Eulipotyphla, Carnivora, Pholidota, Perissodactyla, Cetartiodactyla and Chiroptera; (3) Euarchontoglires, which includes Rodentia, Lagomorpha, Primates, Scandentia and Dermoptera; and (4) Xenarthra [8]. The phylogenetic relationships and origins of the orders within these four lineages remain disputed and uncertain. We attempted to define the phylogenetic relationships among Pholidota, Carnivora and Xenarthra using genomic *in situ* hybridization, which has been used to determine such relationships in plants. A similar technology, "DNA-DNA hybridization," was developed by Sibley and Ahlquist [9]. Its basic premise is that single strands are obtained from the double-helical DNA of each species and hybridized with each other; the more closely related the two species, the more strongly their strands bind and the higher the melting temperature of the hybrid duplex. Radioisotope labeling is used to distinguish reformed double helices of a single species from hybrid duplexes formed by strands of the two compared species. This technology was applied to determine phylogenetic relationships within Primates and within Aves. It revealed that, among hominoids, humans are closer to chimpanzees than to gorillas or orangutans (**Figure 8**).

**Figure 8.** Phylogenetic relationship between Primates determined by DNA‐DNA hybridization, modified from Ref. [9].

In DNA-DNA hybridization, the DNA of the two species was cut into fragments of 600–800 bp before mixing. Unfortunately, this technology could not prevent errors caused by the hybridization of paralogous rather than orthologous sequences. Like zoo-GISH, it reveals overall trends but was not designed for accuracy. On the other hand, analyses that focus on one or a few genes capture the evolutionary history of only a few loci and lack a bridge to connect them. We were looking for a tool that can reveal trends not only for the whole genome and for individual genes but also for larger genomic blocks, and that can even localize them on chromosomes. Therefore, the author chose to apply a mature technology from the study of human neoplasms, "metaphase comparative genomic hybridization (CGH)," to the study of phylogenetic history.

## **4. The history and prior applications of CGH**

In 1992, Dan Pinkel's laboratory at UC San Francisco published an innovative technology named CGH [10]. In this method, tumor and normal cellular DNA probes are labeled with green and red fluorescence, respectively. They are then hybridized to normal metaphase cells, where they compete with each other for binding to the normal chromosomes. Yellow is observed where green and red fluorescence are present in equal amounts. A block in which the tumor genome is over-represented relative to the normal reference, i.e., a duplication, appears green, whereas a deletion appears red. This genome-wide technology not only localizes a change but also shows whether it is a gain or a loss, making it a powerful tool in the search for tumor suppressor genes (in regions where tumor DNA is less abundant than the normal reference) and oncogenes (in regions where tumor DNA is more abundant than the normal reference), with a resolution of 5–10 Mbp. However, the technology is difficult to perform and requires specialized imaging equipment and image-processing software to calculate the ratio of green to red fluorescence. More recently, gene chips have largely replaced it. Gene chips, formally known as array CGH (the original CGH was renamed metaphase CGH), carry defined probes that are fixed onto a chip [11]. Array CGH probes are derived from the known sequences of the target organism, and array CGH involves neither chromosome preparation nor microscopic interpretation. Metaphase CGH, in contrast, is genome-wide with chromosome-level resolution, and it remains a useful tool when the full genome sequence is unknown. The technology is not limited to tumor research; it is also valuable for studying human genetic diseases associated with duplicated or deleted blocks, especially those caused by copy number variation [12].

The captured images and their final interpretation are presented in **Figure 9**, where (A) fluorescein (FITC) provides the green signal, (B) rhodamine provides the red signal, and (C) shows the merged CGH result from one normal sample.

The green-to-red fluorescence ratio was analyzed with software.

We also applied this technology to report a rare case of a human 13q31 deletion without clinical symptoms [13]. In **Figure 10**, the 13q31 block shows relatively more red fluorescence in the region marked by a vertical red line (a region is considered a gain when the green-red ratio is greater than 1.2 and a loss when the ratio is less than 0.8). The label *n* = 18 indicates that 18 copies of chromosome 13 were analyzed. Therefore, 13q31 is possibly a large polymorphic block in the human genome, and this finding is important for clinical genetic counseling.
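The cut-offs quoted above (gain above 1.2, loss below 0.8) can be turned into a simple calling rule. The following is a minimal sketch, not the analysis software used in the study: the function names and the example profile are hypothetical, and the ratios are assumed to be green-to-red values sampled at successive positions along one chromosome.

```python
from itertools import groupby


def classify_ratio(ratio, loss_cutoff=0.8, gain_cutoff=1.2):
    """Label one green-to-red ratio using the cut-offs quoted in the text."""
    if ratio < loss_cutoff:
        return "loss"      # red dominates: test DNA under-represented
    if ratio > gain_cutoff:
        return "gain"      # green dominates: test DNA over-represented
    return "balanced"      # roughly equal signal, seen as yellow


def call_regions(ratios):
    """Collapse a ratio profile into runs of identical calls.

    Returns (first_index, last_index, call) tuples, indices inclusive.
    """
    regions = []
    start = 0
    for call, run in groupby(ratios, key=classify_ratio):
        length = len(list(run))
        regions.append((start, start + length - 1, call))
        start += length
    return regions


# Hypothetical profile sampled along one chromosome (not measured data).
profile = [1.00, 0.95, 0.70, 0.65, 0.72, 1.05, 1.00]
print(call_regions(profile))
```

In this made-up profile, positions 2 to 4 fall below 0.8 and are reported as a single interstitial loss flanked by balanced regions.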

**Figure 9.** The result of metaphase CGH. (A) The signal of FITC‐labeled probes. (B) The signal of rhodamine‐labeled probes. (C) The merged CGH image of FITC and rhodamine.

**Figure 10.** Metaphase CGH profiles of the 13q31 deletion case. (A) An interstitial deletion at band 13q31 was found (denoted as a red vertical bar beside chromosome 13). (B) An amplified ideogram of chromosome 13 with the deleted region marked by a red vertical bar on the right.

Based on the experience with metaphase CGH in human medicine, the author considered the feasibility of applying this technology across species to characterize the evolutionary relationships among extant eutherian taxonomic groups (orders and supraordinal clades). That is, to determine the genomic similarity of species A and B, whose sequences are unknown, with respect to species C, the DNA of species A and B is labeled with different fluorescent dyes. The ratio of the two fluorescence intensities along each chromosome of species C should then reflect regions of greater sequence similarity to species A or to species B. This is a brand-new application, which the author named "zoo-CGH" (**Figure 11**).

**Figure 11.** Schematic diagram of zoo‐CGH. After calibration for genome size, equal amounts of genomic DNA from Species A (SpA) and Species B (SpB), labeled with a green and red fluorophore, respectively, were competitively hybridized to metaphase spreads of Species C (SpC).

## **5. Applying CGH in exploring the relationship between Pholidota, Carnivora, and Xenarthra**

Myrmecophagy is a feeding behavior characterized by eating mainly or exclusively ants, termites, or both. This feeding specialization occurs in few eutherian mammals. Myrmecophagous eutherians are found in the orders Pholidota (e.g., pangolins, *Manis* spp., Manidae), Tubulidentata (e.g., aardvark, *Orycteropus afer*, Orycteropodidae) and Carnivora (e.g., aardwolf, *Proteles cristata*, Hyaenidae), and in the superorder Xenarthra (e.g., anteaters, *Myrmecophaga* spp., Myrmecophagidae; armadillos, *Dasypus* spp., Dasypodidae) [14, 15]. These species share similar adaptations for this feeding specialization, including short teeth and jaws, a long sticky tongue, powerful forelimbs with strong claws, a rounded skull, and a low metabolic rate. Among these groups, the taxonomic status of Pholidota is controversial. Morphological cladistics proposes a close relationship between Pholidota and Xenarthra, whereas molecular evidence from mitochondrial and nuclear genes indicates that Pholidota is the sister taxon of Carnivora. However, it was recently noted that Pholidota lacks one of the lineage-specific karyotypic signatures of Carnivora. Zoo-CGH provides a genome-wide perspective on the relationships among Pholidota, Xenarthra, and Carnivora, even though the genomes of these animals have not been fully sequenced. In the following example, DNA of the domestic dog (*Canis lupus familiaris*; Carnivora) and the two-toed sloth (*Choloepus didactylus*; Xenarthra) was labeled with different fluorophores and then hybridized to metaphase chromosome spreads of the Taiwanese pangolin (*Manis pentadactyla pentadactyla*; Pholidota).

#### **5.1. Method and procedures**

#### *5.1.1. Determine nuclear genome size*

The genome sizes of the two-toed sloth and the domestic dog were determined to ensure that approximately equal numbers of genome copies from each species were used in the zoo-CGH analyses. Genome sizes were obtained by flow cytometric analysis of propidium iodide (PI)-stained nuclei from the target organisms.
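The reason for measuring genome size can be made concrete with a short calculation: for a fixed number of genome copies, the mass of DNA required scales with the genome size. The sketch below illustrates this under stated assumptions; the function names are hypothetical and the two genome sizes are placeholders, not the values measured for the sloth or the dog.

```python
def genome_copies(dna_mass_ng, genome_size_pg):
    """Number of genome copies contained in a DNA sample."""
    return dna_mass_ng * 1000.0 / genome_size_pg  # 1 ng = 1000 pg


def mass_for_equal_copies(mass_a_ng, genome_size_a_pg, genome_size_b_pg):
    """Mass of species B DNA with the same copy number as mass_a_ng of species A."""
    return mass_a_ng * genome_size_b_pg / genome_size_a_pg


# Hypothetical genome sizes in pg per genome copy (placeholders, not measured values).
SIZE_A_PG = 3.0
SIZE_B_PG = 4.5

mass_b = mass_for_equal_copies(1000.0, SIZE_A_PG, SIZE_B_PG)
print(mass_b)                                   # ng of species B DNA
print(genome_copies(1000.0, SIZE_A_PG))         # copies of species A
print(genome_copies(mass_b, SIZE_B_PG))         # copies of species B
```

With these placeholder sizes, the species with the larger genome needs proportionally more DNA by mass to contribute the same number of genome copies to the probe mixture.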

#### *5.1.2. Extract DNA from the two‐toed sloth and domestic dog*

Genomic DNA was isolated from leukocytes with a commercial kit (Gentra Puregene DNA Purification Kit, Qiagen, Hilden, Germany), used in accordance with the manufacturer's instructions.

#### *5.1.3. Prepare the mitotic metaphase slides of Taiwanese pangolin*

Fibroblast cell lines were established from lung tissue of the Taiwanese pangolin, and metaphase cells were harvested following a 2-hour incubation with colcemid (at a concentration of 0.1 μg/ml).

#### *5.1.4. Produce two‐toed sloth and domestic dog DNA probes*

The two-toed sloth DNA and the domestic dog DNA were labeled by nick translation with biotin and digoxigenin (DIG), respectively.

#### *5.1.5. Prepare pangolin C<sub>0</sub>t‐1 DNA*


*C*<sub>0</sub>*t*-1 DNA takes its name from the method used to isolate it, *C*<sub>0</sub>*t* analysis (*C*<sub>0</sub> denotes the initial DNA concentration and *t* denotes the reassociation time). Repetitive nucleotide sequences, which constitute most of the *C*<sub>0</sub>*t*-1 DNA, are abundant in most mammalian genomes. Blocking repetitive sequences with *C*<sub>0</sub>*t*-1 DNA suppresses nonspecific hybridization in FISH and CGH assays and is therefore a common step in such analyses. The genomic DNA of the Taiwanese pangolin was sonicated into fragments of approximately 500 bp, and the fragmented DNA was purified by ethanol precipitation. The purified DNA was dissolved to 500 ng/ml in TB buffer, denatured at 95°C for 10 minutes, and then chilled on ice for 10 minutes. A 1/10 volume of 12× SSC was then added, and the DNA was reannealed at 60°C for 10 minutes. S1 nuclease was then used to digest the nonannealed (single-stranded) DNA at 42°C for 1 hour. Thereafter, the DNA was precipitated with ethanol and resuspended in TE buffer. Lastly, the *C*<sub>0</sub>*t*-1 DNA obtained was quantified by spectrometry.
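Because *C*<sub>0</sub>*t* is simply the product of the initial nucleotide concentration (mol/L) and the reassociation time (s), the relationship between the two can be worked out directly. The sketch below is illustrative only: the function names are hypothetical, the average nucleotide mass of about 330 g/mol is an approximation, the numbers are not taken from the protocol, and the effects of salt and temperature on the reassociation rate are ignored.

```python
NUCLEOTIDE_MOLAR_MASS = 330.0  # g/mol; approximate average, an assumption


def cot_value(dna_g_per_l, seconds):
    """C0t (mol nucleotides * s / L) for a DNA concentration and a reassociation time."""
    return dna_g_per_l / NUCLEOTIDE_MOLAR_MASS * seconds


def concentration_for_cot(target_cot, seconds):
    """DNA concentration (g/L) that reaches target_cot after the given time."""
    return target_cot * NUCLEOTIDE_MOLAR_MASS / seconds


ten_minutes = 10 * 60  # seconds
needed = concentration_for_cot(1.0, ten_minutes)
print(round(needed, 3))                          # concentration in g/L
print(round(cot_value(needed, ten_minutes), 3))  # recovers the target C0t
```

The same two functions can be used in the other direction, to check which *C*<sub>0</sub>*t* value a planned concentration and reannealing time would reach.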

#### *5.1.6. Perform zoo‐CGH*

Chromosome spreads from a male Taiwanese pangolin were prepared on a slide and denatured at 73°C for 5 minutes in 70% formamide and 2 × SSC, pH 7.0, followed by dehydration in a graded ethanol series. Next, equal genome copy numbers of biotin-labeled two-toed sloth DNA and DIG-labeled domestic dog DNA were coprecipitated with a 50-fold excess of Taiwanese pangolin *C*<sub>0</sub>*t*-1 DNA and then redissolved in 10 μl of hybridization buffer (50% formamide, 10% dextran sulfate, and 2 × SSC) to form the hybridization probe. Before hybridization, the probe was denatured at 80°C for 7 minutes and then incubated at 37°C for 1 hour to preanneal the repetitive DNA. The denatured probe was applied to the slide carrying the denatured and dehydrated metaphase spreads, covered with a cover slip, sealed, and incubated at 37°C for 72 hours. After hybridization, the slide was washed three times with 50% formamide and 2 × SSC at 40°C for 5 minutes and then twice with 2 × SSC at 40°C for 5 minutes. The slide was kept undisturbed in 0.1% Tween 20 in 4 × SSC for 5 minutes, and the hybridization signal was detected with fluorescein-conjugated avidin (green fluorescence; for the biotin-labeled probe) and rhodamine-conjugated anti-DIG antibody (red fluorescence; for the DIG-labeled probe). Pangolin chromosomes were counterstained with 4′,6-diamidino-2-phenylindole (DAPI), and fluorescence signals were visualized under a Leica DMLB microscope equipped with a cooled CCD camera. The profile of the fluorescein versus rhodamine fluorescence intensity ratio (F/R ratio) was estimated with CGHView image analysis software.

#### *5.1.7. Analyze image*

By comparing the fluorescence ratio along the longitudinal axis of each pangolin metaphase chromosome, we estimated inter-species differences in gene copy number and DNA sequence similarity. The mean F/R ratio from the heterologous hybridization, in which DNA from two different species, labeled with different fluorophores, binds competitively to the metaphase chromosomes of a third species, was calculated for each pangolin autosome. Pangolin chromosomal segments with F/R ratios of < 0.8 (red fluorescence more intense) or > 1.2 (green fluorescence more intense) were considered to differ significantly in hybridization strength. When the F/R ratio was between 0.8 and 1.2 (yellow fluorescence), the DNA sequence similarity or copy number of the two probe species was considered roughly equivalent. Mean ratios were also calculated for a dye-swap experiment.
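The decision rule described above, including the dye-swap check, can be sketched in a few lines. This is an illustration under stated assumptions rather than the CGHView procedure: the function names are hypothetical, the per-chromosome ratios are invented for the example, and an arithmetic mean is assumed.

```python
from statistics import mean


def closer_species(ratios, green_species, red_species, low=0.8, high=1.2):
    """Return (mean F/R ratio, species favoured by the signal or None)."""
    average = mean(ratios)
    if average < low:
        return average, red_species    # red more intense
    if average > high:
        return average, green_species  # green more intense
    return average, None               # yellow: roughly equivalent


def summarize(profiles, green_species, red_species):
    """Apply the rule to every chromosome in a {name: [ratios]} mapping."""
    return {
        chrom: closer_species(ratios, green_species, red_species)
        for chrom, ratios in profiles.items()
    }


def dye_swap_agrees(forward, swapped):
    """True if both labeling schemes favour the same species on shared chromosomes."""
    shared = forward.keys() & swapped.keys()
    return all(forward[c][1] == swapped[c][1] for c in shared)


# Hypothetical F/R ratios for two chromosomes (not measured data).
forward = summarize({"14": [0.70, 0.75, 0.72], "15": [0.68, 0.74]},
                    green_species="sloth", red_species="dog")
swapped = summarize({"14": [1.35, 1.30, 1.40], "15": [1.28, 1.33]},
                    green_species="dog", red_species="sloth")
print(forward["14"][1], swapped["14"][1], dye_swap_agrees(forward, swapped))
```

The dye swap matters because a fluorophore-specific bias would push both experiments toward the same color, whereas a genuine similarity signal follows the species and therefore changes color when the labels are exchanged.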

#### **5.2. Results**


In **Figure 12**, red, green and yellow blocks can be seen on different parts of the chromosomes. The overall homology between the pangolin and dog genomes was higher than that between the pangolin and sloth genomes. Analysis of pangolin chromosomes 14 and 15, the largest and most easily identifiable, showed that red fluorescence was dominant in euchromatin, i.e., these regions were more similar to the domestic dog (**Figure 12E**). When the dyes were swapped, i.e., green fluorescence for the domestic dog and red fluorescence for the two-toed sloth, consistent results were obtained (**Figure 12F**).

**Figure 12** shows the zoo-CGH results for the domestic dog, two-toed sloth, and Taiwanese pangolin. In panel (A), genomic DNA from the dog (labeled with DIG and detected with the red fluorophore rhodamine) and the sloth (labeled with biotin and detected with the green fluorophore fluorescein) was mixed in equal quantities and competitively hybridized to metaphase spreads from pangolin lymphocytes. Panel (B) presents the analysis of the fluorescence ratio in (A) for individual chromosomes; blue lines denote the F/R ratio at each position along the pangolin chromosomes, and numbers in brackets give the number of chromosomes analyzed. A red or green vertical bar between a chromosome and its ideogram indicates an F/R ratio of <0.8 or >1.2, respectively. Overall, all chromosomes except Y appeared red. Panels (C) and (D) show the dye swap of (A) and (B), respectively, in which all chromosomes except Y appeared green.

From the results above, we found that all autosomes of *Manis pentadactyla* are more similar to those of the domestic dog (Carnivora) than to those of the two-toed sloth (Xenarthra), providing evidence that Pholidota is more closely related to Carnivora than to Xenarthra. For the Y chromosome, which showed the opposite result, we had to rule out a deletion in the Y chromosome of the domestic dog. We analyzed the karyotype of this individual and found no such deletion. It is therefore possible that the Y chromosome of *Manis pentadactyla* has a different evolutionary history from that of the autosomes [16]. The Y chromosome result may also be attributable to the size difference between the Y chromosomes of the domestic dog and the two-toed sloth. The large genomic blocks of the autosomes lack structural rearrangements during evolution, so that sequence "richness" rather than the more desirable "similarity" may dominate the signal. We performed a molecular evolutionary analysis of the *Sry* gene, which is located on the Y chromosome, and combined the results with those from zoo-CGH; that is, two markers with different evolutionary histories were used to answer the question. The answer is unambiguous: in terms of extant mammalian taxonomy, Pholidota is more closely related to Carnivora than to Xenarthra. The new method we developed can serve as a powerful tool for clarifying the phylogenetic relationships among orders of the class Mammalia and can help answer long-disputed taxonomic questions, for example, whether Chiroptera belongs to Laurasiatheria or to Euarchontoglires. Zoo-CGH reveals similarity trends not only for the whole genome but also for individual genomic blocks, making it the CGH technology with the highest resolution available before the complete genome of each species is sequenced; combined with cross-species whole-chromosome painting FISH (zoo-FISH), it opens a new era of comparative genomics [17].

**Figure 12.** Cross-species CGH for the domestic dog (*Canis lupus*), two-toed sloth (*Choloepus didactylus*), and Taiwanese pangolin (*Manis pentadactyla pentadactyla*). (A) Competitive hybridization of dog (rhodamine) and sloth (FITC) DNA to metaphase spreads from pangolin lymphocytes. (B) Individual chromosome analysis of the fluorescence ratio in (A). (C) and (D) Dye swap of (A) and (B), respectively. (E) and (F) Enlarged pangolin chromosomes 14 and 15 from (B) and (D), respectively.

## **6. Discussion**

In earlier times, comparative genomic studies of closely related species could be done only by comparing karyotypes, and the techniques available were primitive: Giemsa staining alone and G-banding. These provide only the diploid number (2N), the fundamental number (FN, the number of chromosomal arms), and the classification of chromosomes as metacentric, submetacentric, acrocentric, or telocentric according to their arm ratios. In addition, special stains such as C-banding, which reveals constitutive heterochromatin, and Ag-nucleolus organizer region (NOR) staining, which reveals the sites of secondary constrictions and actively transcribed ribosomal DNA genes, can help to find subtler differences between species that may carry evolutionary significance [18, 19]. The advent of fluorescence *in situ* hybridization (FISH) technology, however, greatly expanded the role of cytogenetics in studying karyotypic evolution, not only in mammals but also in plants [5, 7, 20]. The authors therefore propose that a complete cytogenetic study of karyotypic evolution should include conventional karyotyping, special stains, and FISH-based technologies: genomic *in situ* hybridization (which is specific to plants); chromosome painting to study the movement and shuffling of large, Mb-scale genomic blocks (in mammals); telomere (TTAGGG)*n* FISH to demarcate chromosome ends or to demonstrate insertional translocations between species (in all vertebrates); mapping of genes of special interest with FISH probes made from cloned gene segments (in both animals and plants); and the innovative zoo-CGH described in the previous section (in mammals), as our previous studies have demonstrated [17–19].
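The conventional descriptors mentioned above (2N, FN, and the arm-ratio classes) can be derived from arm-length measurements. The sketch below is a hedged illustration: the chapter does not state its cut-offs, so the thresholds follow the widely used convention of Levan and colleagues (1964) as an assumption, with their subtelocentric class mapped onto "acrocentric"; counting acrocentric and telocentric chromosomes as one-armed for FN is likewise an assumed convention, and the measurements are invented.

```python
from collections import Counter


def classify_chromosome(long_arm, short_arm):
    """Classify a chromosome by its long-to-short arm ratio (assumed cut-offs)."""
    if long_arm <= 0 or short_arm < 0 or short_arm > long_arm:
        raise ValueError("expected long_arm >= short_arm >= 0 and long_arm > 0")
    if short_arm == 0:
        return "telocentric"
    ratio = long_arm / short_arm
    if ratio <= 1.7:
        return "metacentric"
    if ratio <= 3.0:
        return "submetacentric"
    if ratio <= 7.0:
        return "acrocentric"
    return "telocentric"


def karyotype_summary(chromosomes):
    """Summarize a diploid set given (long_arm, short_arm) for every chromosome."""
    classes = [classify_chromosome(q, p) for q, p in chromosomes]
    two_armed = {"metacentric", "submetacentric"}
    arms = sum(2 if c in two_armed else 1 for c in classes)
    return {"2n": len(chromosomes), "FN": arms, "classes": Counter(classes)}


# Hypothetical arm lengths (arbitrary units) for a diploid set of six chromosomes.
measurements = [(5.0, 4.8), (5.0, 4.8), (6.0, 2.5), (6.0, 2.5), (7.0, 0.5), (7.0, 0.5)]
summary = karyotype_summary(measurements)
print(summary["2n"], summary["FN"], dict(summary["classes"]))
```

Two karyotypes with the same 2N but different FN, or the same FN but different 2N, point to different kinds of rearrangement, which is why these simple counts were informative before banding and FISH became available.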

## **7. Conclusion**

Molecular evolution, in which homologous DNA sequences are analyzed with different evolutionary models to reconstruct phylogeny, is now the mainstream of comparative genomics [1–4], especially since powerful next-generation sequencing (NGS) technology has made sequencing the whole genome of each species more feasible [21]. Nevertheless, cytogenetics remains an indispensable tool for studying karyotypic evolution, which is one of the major mechanisms of speciation and subspecies differentiation and is thus as important as molecular evolution to these processes. Conventional karyotyping, special stains that delineate heterochromatin and the sites of actively transcribed ribosomal DNA genes, and molecular cytogenetics (namely, FISH-based methodologies) can still provide insightful clues to questions that molecular evolution-based analyses cannot easily answer. This is because, in addition to point mutations and small insertions/deletions (indels), the movement of large, Mb-scale genomic segments, which is very difficult to analyze by molecular methods, is also important in the evolution of the genetic complements of species that derive from a common ancestor in a specific lineage. The authors therefore propose a more balanced approach to phylogenetics, one that considers both cytogenetic and molecular analyses as major research tools. Evolutionary genetics will not be complete if the valuable insights obtained through cytogenetics are ignored or omitted in this NGS-dominated molecular era.

## **Author details**

**6. Discussion**

In the early days, comparative genomic studies of closely related species could only be done by comparing karyotypes, and the techniques available were primitive: Giemsa staining alone and G-banding. Only the diploid number (2N), the fundamental number (FN, the number of chromosome arms), and the classification of the chromosomes as metacentric, submetacentric, acrocentric, or telocentric according to their arm ratios could be obtained. In addition, special stains such as C-banding and Ag-nucleolus organizer region (NOR) staining can reveal the constitutive heterochromatin (C-banding) and the sites of secondary constrictions and actively transcribed ribosomal DNA genes (Ag-NOR staining), which helps to find subtler differences between species that may carry evolutionary significance [18, 19]. The advent of fluorescence *in situ* hybridization (FISH) technology, however, greatly expanded the role of cytogenetics in studying karyotypic evolution, not only in mammals but also in plants [5, 7, 20]. The authors therefore propose that a complete cytogenetic study of karyotypic evolution should include conventional karyotyping, special stains, and FISH-based technologies such as genomic *in situ* hybridization (which is specific to plants), chromosome painting to study the movement and shuffling of large, Mb-scale genomic blocks (in mammals), the telomere (TTAGGG)*n* FISH to demarcate the

**Figure 12.** Cross-species CGH for the domestic dog (*Canis lupus*), two-toed sloth (*Choloepus didactylus*), and Taiwanese pangolin (*Manis pentadactyla pentadactyla*). (A) Competitive hybridization of dog (rhodamine) and sloth (FITC) DNA to metaphase spreads from pangolin lymphocytes. (B) Individual chromosome analysis of the fluorescence ratio in (A). (C) and (D) Dye swap of (A) and (B), respectively. (E) and (F) Enlarged pangolin chromosomes 14 and 15 of (B) and (D), respectively.
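The conventional karyotype descriptors named in the Discussion (2N, FN as the number of chromosome arms, and the arm-ratio classes) can be illustrated with a minimal sketch. The cut-off values, the `Chromosome` type, and the rule that metacentric and submetacentric chromosomes count as two-armed while acrocentric and telocentric ones count as one-armed are assumptions made for illustration; they are not taken from this chapter, and conventions differ between laboratories.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Assumed cut-offs on the arm ratio r = long arm / short arm.
# Illustrative values only; adjust to the nomenclature in use.
METACENTRIC_MAX = 1.7
SUBMETACENTRIC_MAX = 3.0


@dataclass
class Chromosome:
    long_arm: float   # length of the long arm, any consistent unit
    short_arm: float  # length of the short arm; 0 means no visible short arm


def classify(chrom: Chromosome) -> str:
    """Assign a centromere-position class from the arm ratio."""
    if chrom.short_arm <= 0:
        return "telocentric"
    ratio = chrom.long_arm / chrom.short_arm
    if ratio <= METACENTRIC_MAX:
        return "metacentric"
    if ratio <= SUBMETACENTRIC_MAX:
        return "submetacentric"
    return "acrocentric"


def summarize(diploid_set: List[Chromosome]) -> Tuple[int, int]:
    """Return (2N, FN) for a list holding every chromosome of the diploid set.

    Assumption: metacentric and submetacentric chromosomes contribute two
    arms to FN; acrocentric and telocentric chromosomes contribute one.
    """
    two_armed = {"metacentric", "submetacentric"}
    diploid_number = len(diploid_set)
    arm_count = sum(2 if classify(c) in two_armed else 1 for c in diploid_set)
    return diploid_number, arm_count
```

Two species with the same 2N but different FN values under such a scheme would point to rearrangements that change arm number, such as pericentric inversions, which is the kind of difference conventional karyotyping can detect.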

> Ming Chen1,2,3,4\*, Wen‐Hsiang Lin1,2, Dong‐Jay Lee1,2, Shun‐Ping Chang1,2, Tze‐Ho Chen4 and Gwo‐Chin Ma1,2

\*Address all correspondence to: mchen\_cch@yahoo.com

1 Department of Genomic Medicine, and Core Lab for System Biology, Changhua Christian Hospital, Changhua, Taiwan

2 Department of Genomic Science and Technology, Changhua Christian Hospital, Changhua, Taiwan

3 Department of Obstetrics and Gynecology, College of Medicine, National Taiwan University, Taipei, Taiwan

4 Department of Life Science, Tunghai University, Taichung, Taiwan

## **References**

[1] Springer MS, Murphy WJ, Eizirik E, O'Brien SJ. Placental mammal diversification and the Cretaceous-Tertiary boundary. Proceedings of the National Academy of Sciences of the United States of America. 2003;**100**:1056-1061. DOI: 10.1073/pnas.0334222100

[2] Murphy WJ, Eizirik E, Johnson WE, Zhang YP, Ryder OA, O'Brien SJ. Molecular phylogenetics and the origins of the placental mammals. Nature. 2001;**409**:614-618. DOI: 10.1038/35054550

[3] Murphy WJ, Larkin DM, Wind AE, Bourque G, Tesler G, Auvil L, Beever JE, Chowdhary BP, Galibert F, Gatzke L, Hitte C, Meyers SN, Milan D, Ostrander EA, Pape G, Parker HG, Raudsepp T, Rogatcheva MB, Schook LB, Skow LC, Welge M, Womack JE, O'Brien SJ, Pevzner PA, Lewin HA. Dynamics of mammalian chromosome evolution inferred from multispecies comparative maps. Science. 2005;**309**:613-617. DOI: 10.1126/science.1111387

[4] Murphy WJ, Stanyon R, O'Brien SJ. Evolution of mammalian genome organization inferred from comparative gene mapping. Genome Biology. 2001;**2**(6):reviews0005.1-reviews0005.8. DOI: 10.1186/gb-2001-2-6-reviews0005

[5] Ferguson-Smith MA, Yang F, Rens W, O'Brien PC. The impact of chromosome sorting and painting on the comparative analysis of primate genomes. Cytogenetic and Genome Research. 2005;**108**(1-3):112-121. DOI: 10.1159/000080809

[6] Burt DW, Bruley C, Dunn IC, Jones CT, Ramage A, Law AS, Morrice DR, Paton IR, Smith J, Windsor D, Sazanov A, Fries R, Waddington D. The dynamics of chromosome evolution in birds and mammals. Nature. 1999;**402**:411-413. DOI: 10.1038/46555

[7] Svartman M, Stone G, Stanyon R. The ancestral eutherian karyotype is present in Xenarthra. PLoS Genetics. 2006;**2**(7):e109. DOI: 10.1371/journal.pgen.0020109

[8] Amrine-Madsen H, Koepfli KP, Wayne RK, Springer MS. A new phylogenetic marker, Apolipoprotein B, provides compelling evidence for eutherian relationships. Molecular Phylogenetics and Evolution. 2003;**28**(2):225-240. DOI: 10.1016/S1055-7903(03)00118-0

[9] Sibley CG, Ahlquist JE. The phylogeny of the hominoid primates, as indicated by DNA-DNA hybridization. Journal of Molecular Evolution. 1984;**20**(1):2-15. DOI: 10.1007/BF02101980

[10] Kallioniemi A, Kallioniemi OP, Sudar D, Rutovitz D, Gray JW, Waldman F, Pinkel D. Comparative genomic hybridization for molecular cytogenetic analysis of solid tumors. Science. 1992;**258**(5083):818-821. DOI: 10.1126/science.1359641

[11] Kallioniemi A. CGH microarrays and cancer. Current Opinion in Biotechnology. 2008;**19**(1):36-40. DOI: 10.1016/j.copbio.2007.11.004

[12] Lee C, Iafrate AJ, Brothman AR. Copy number variations and clinical cytogenetic diagnosis of constitutional disorders. Nature Genetics. 2007;**39**(7 Suppl):S48-S54. DOI: 10.1038/ng2092

[13] Ke YY, Lee DJ, Ma GC, Lee MH, Wang BT, Chen M. Interstitial deletion 13q31 associated with normal phenotype: cytogenetic study of a family with concomitant segregation of reciprocal translocation and interstitial deletion. Journal of the Formosan Medical Association. 2007;**106**(7):582-588. DOI: 10.1016/S0929-6646(07)60010-2

