form of duplicated copies ranges from 10 to 50% and often correlates with the time elapsed since duplication (Scannell et al., 2006).

WGD is widespread in plants (Vision et al., 2000; Adams & Wendel, 2005). Estimates of the incidence of polyploidy in angiosperms vary from 30 to 80%, and about 3% of speciation events are explained by genome duplications (Otto & Whitton, 2000). Many, if not all, species of plants may thus have at least one polyploid ancestor. Most eudicots are assumed to have an ancient hexaploid ancestor, with subsequent tetraploidization in some taxa (Jaillon et al., 2007).

Duplication of the entire genome in the yeast *Saccharomyces cerevisiae* led to an initial increase in the number of genes from 5000 to 10 000, but the subsequent loss of paralogs has led to the preservation in modern *Saccharomyces* of about 5500 protein-coding genes, of which 1102 form 551 paralogous pairs (Byrne & Wolfe, 2005). A special term, ohnologs, dedicated to S. Ohno, was proposed for paralogs resulting from WGD (Wolfe, 2000).

Detection of natural polyploidy is a difficult task, especially for ancient events. Recent duplications can be detected by comparing closely related species, one of which underwent polyploidization and therefore contains twice as many chromosomes as species that did not undergo WGD. For example, a comparison of the genomes of *Ashbya gossypii* and *S. cerevisiae* revealed that both species evolved from a single ancestor that had seven or eight chromosomes (Dietrich et al., 2004). Changes in chromosome number due to mutations (in particular translocations) led to the ancestors of *A. gossypii* and *S. cerevisiae*. WGD in *S. cerevisiae* has provided this species with new opportunities for functional divergence absent in *A. gossypii*. A similar comparative analysis was also carried out for *S. cerevisiae* and its closest non-WGD relative, *Kluyveromyces waltii* (Kellis et al., 2004).

The older the duplication, the harder the analysis, because a period of diploidization often follows polyploidization, which "transforms" the polyploid genome to the diploid state. Diploidization is achieved by an intensive loss of genes, rearrangements of the genome and the divergence of duplicated genes. Recent analyses have also shown that the duplication of individual genes in evolution has occurred much more frequently than was previously thought (Lynch & Conery, 2000; Lynch et al., 2001). Diploidization has been studied in many genomes including those of plants (Chapman et al., 2006; Jaillon et al., 2007; Tuskan et al., 2006), bony fishes (Brunet et al., 2006), yeasts (Piskur, 2001; Kellis et al., 2004; Scannell et al., 2006; Scannell et al., 2007), *Paramecium* (Aury et al., 2006) and vertebrates (Blomme et al., 2006).

Plants have repeatedly undergone polyploidization during evolution, presumably aided by their ability to propagate vegetatively and by the existence of specific regulatory mechanisms in plant cells. In particular, model polyploids have been characterized by a rapid loss of some genes and the specific inactivation of others by methylation (Kashkush et al., 2002; Comai et al., 2000; Lee & Chen, 2001). Epigenetic silencing may protect the duplicated copies from pseudogenization, thus facilitating the acquisition of new functions (Rodin & Riggs, 2003).

Vertebrate genomes contain many families of genes that are not found in invertebrates, and many gene duplications apparently occurred early in the evolution of the chordates (Taylor & Raes, 2004). Ohno suggested that the complex genome of vertebrates arose as a result of two rounds (2R) of WGD (Ohno, 1970). This view was once supported by the belief that the human genome contained about 100 000 genes, which was four times more than the estimated number of genes in the genomes of invertebrates. Sequencing of the human genome has since reduced the estimate of the number of genes to 20 000-25 000 but has not

**2.2 Smaller scale duplication**

Ohno (1970) argued that duplication of the whole genome is more important for evolution than duplication of its individual parts, because partial duplications can lead to regulatory imbalances. Nevertheless, partial and complete duplications of genes also play very important roles in evolution. WGDs have occurred several times during the evolutionary history of organisms, while small-scale duplications (SSDs) arise continuously through multiple mechanisms.

Several mechanisms have been suggested for the improvement in function of existing proteins and for the creation of new functions. One such mechanism is the internal (partial) duplication of genes, which is important for increasing the functional complexity of genes in evolution (Li, 1997). Such duplications are believed to have played a key role in the emergence of complex genes. Many proteins of modern organisms contain internal repeats of amino acids, and these repeats often correspond to functional or structural domains of proteins. These data suggest that the genes encoding these proteins were formed by internal duplications (Lavorgna et al., 2001). Internal duplication can improve protein function by increasing the number of active sites. It can also lead to the acquisition of new functions through the modification of duplicated regions or the reorganization of modules.

Extensive data on the role of intragenic duplications in the early stages of protein evolution have been obtained by comparative analyses of sequenced genomes (Marcotte et al., 1999; Lavorgna et al., 2001; Conant & Wagner, 2005; Chen et al., 2007). Duplicated regions can accumulate mutations that contribute to the divergence of the repeated fragments, and these mutations can then become fixed. Often, only traces of duplications in the form of imperfect repeats can be detected in contemporary amino acid sequences (Li, 1997). Eukaryotic proteins have more repeats than do prokaryotic proteins (Marcotte et al., 1999; Chen et al., 2007).
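The idea that internal duplications survive only as imperfect repeats can be illustrated with a small self-comparison of a protein sequence. The sketch below is a minimal illustration, not the method used in the studies cited above: it slides a fixed-length window along the sequence and reports pairs of non-overlapping windows whose residues are mostly identical. The function names, the window length, the identity threshold and the toy sequence are all assumptions chosen for illustration; real analyses generally rely on alignment with substitution matrices and statistical significance tests.

```python
from typing import List, Tuple


def identity(a: str, b: str) -> float:
    """Fraction of positions at which two equal-length strings match."""
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return matches / len(a)


def find_internal_repeats(
    seq: str, window: int = 10, min_identity: float = 0.7
) -> List[Tuple[int, int, float]]:
    """Return (start_i, start_j, identity) for every pair of
    non-overlapping windows whose identity reaches min_identity.

    window and min_identity are illustrative defaults (assumptions),
    not values taken from the literature.
    """
    hits = []
    last_start = len(seq) - window
    for i in range(last_start - window + 1):
        # j starts one full window after i so the two windows never overlap
        for j in range(i + window, last_start + 1):
            score = identity(seq[i:i + window], seq[j:j + window])
            if score >= min_identity:
                hits.append((i, j, score))
    return hits


# Hypothetical toy protein: a 10-residue unit, a short linker, and a
# diverged copy of the unit carrying one substitution (I -> L).
unit = "MKTAYIAKQR"
diverged_copy = "MKTAYLAKQR"
toy_protein = unit + "GGSGG" + diverged_copy

hits = find_internal_repeats(toy_protein)
for start_i, start_j, score in hits:
    print(start_i, start_j, round(score, 2))
```

In this toy case the original unit and its diverged copy are recovered as an imperfect repeat. Lowering `min_identity` mimics the search for older duplications, at the cost of more chance matches, which is why ancient internal duplications are hard to distinguish from background similarity.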
