in orphan proteins will evolve more slowly than the conserved protein regions; rather, the contrary would seem more logical. In a previous study we showed that the nonsynonymous to synonymous nucleotide substitution rates of primate-specific genes, measured for human and macaque orthologues, were, on average, twice as high as those of mammalian-specific genes and five times higher than those of deeply conserved eukaryotic proteins (Toll-Riera et al., 2009a). The differences in amino acid substitution rates between orphan and parental genes described here reinforce the idea that the evolution of a new gene is strongly associated with very rapid sequence change.

**4. Concluding remarks and future research**

We have examined the evolutionary dynamics of a group of novel primate-specific genes (orphan genes) that have arisen by gene duplication. These genes typically form new structures in which only part of the protein sequence is shared with the parental copy, presumably because of partial gene duplication, and the rest of the protein sequence is unique. The orphan proteins accumulate a much larger number of amino acid substitutions per site than the parental proteins, denoting rapid functional diversification. The parental gene copies appear to act as "donors" of sequence but do not show any obvious alteration in their sequence evolution, so they probably preserve their ancestral functions. Future research in this area, using computational as well as experimental studies, should help clarify how frequent partial gene duplication is relative to complete gene duplication, how gene copy survival differs between the two cases, and how partial and complete gene duplication contribute to the generation of evolutionary novelties.

**5. Acknowledgments**

We received financial support from Ministerio de Educación, Gobierno de España (FPU to M.T-R, BIO2009-08160), Generalitat de Catalunya (FI to S.L.), Fundació Javier Lamas Miralles (Ajut Predoctoral Javier Lamas Miralles to N.R-T) and Institució Catalana de Recerca i Estudis Avançats (M.M.A).

**6. References**

Alba, M.M. & Castresana, J. 2005. Inverse relationship between evolutionary rate and age of mammalian genes. *Mol Biol Evol* 22(3): 598-606.

Altschul, S.F., Madden, T.L., Schaffer, A.A., Zhang, J., Zhang, Z., Miller, W. & Lipman, D.J. 1997. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. *Nucleic Acids Res* 25(17): 3389-3402.

Arguello, J.R., Chen, Y., Yang, S., Wang, W. & Long, M. 2006. Origination of an X-linked testes chimeric gene by illegitimate recombination in Drosophila. *PLoS Genet* 2(5): e77.

Bailey, J.A., Liu, G. & Eichler, E.E. 2003. An Alu transposition model for the origin and expansion of human segmental duplications. *Am J Hum Genet* 73(4): 823-834.

Cai, J., Zhao, R., Jiang, H. & Wang, W. 2008. De novo origination of a new protein-coding gene in Saccharomyces cerevisiae. *Genetics* 179(1): 487-496.

Capra, J.A., Pollard, K.S. & Singh, M. 2010. Novel genes exhibit distinct patterns of function acquisition and network integration. *Genome Biol* 11(12): R127.



Johnson, M.E., Viggiano, L., Bailey, J.A., Abdul-Rauf, M., Goodwin, G., Rocchi, M. & Eichler, E.E. 2001. Positive selection of a gene family during the emergence of humans and African apes. *Nature* 413(6855): 514-519.

Katju, V. & Lynch, M. 2003. The structure and early evolution of recently arisen gene duplicates in the Caenorhabditis elegans genome. *Genetics* 165(4): 1793-1803.

Katju, V. & Lynch, M. 2006. On the formation of novel genes by duplication in the Caenorhabditis elegans genome. *Mol Biol Evol* 23(5): 1056-1067.

Khalturin, K., Hemmrich, G., Fraune, S., Augustin, R. & Bosch, T.C. 2009. More than just orphans: are taxonomically-restricted genes important in evolution? *Trends Genet* 25(9): 404-413.

Knowles, D.G. & McLysaght, A. 2009. Recent de novo origin of human protein-coding genes. *Genome Res* 19(10): 1752-1759.

Kondrashov, F.A. & Koonin, E.V. 2004. A common framework for understanding the origin of genetic dominance and evolutionary fates of gene duplications. *Trends Genet* 20(7): 287-290.

Kondrashov, F.A., Rogozin, I.B., Wolf, Y.I. & Koonin, E.V. 2002. Selection in the evolution of gene duplications. *Genome Biol* 3(2): RESEARCH0008.

Kuo, C.H. & Kissinger, J.C. 2008. Consistent and contrasting properties of lineage-specific genes in the apicomplexan parasites Plasmodium and Theileria. *BMC Evol Biol* 8: 108.

Levine, M.T., Jones, C.D., Kern, A.D., Lindfors, H.A. & Begun, D.J. 2006. Novel genes derived from noncoding DNA in Drosophila melanogaster are frequently X-linked and exhibit testis-biased expression. *Proc Natl Acad Sci U S A* 103(26): 9935-9939.

Long, M., Betran, E., Thornton, K. & Wang, W. 2003. The origin of new genes: glimpses from the young and old. *Nat Rev Genet* 4(11): 865-875.

Lynch, M. & Conery, J.S. 2000. The evolutionary fate and consequences of duplicate genes. *Science* 290(5494): 1151-1155.

Ma, P., Wang, N., McKown, R.L., Raab, R.W. & Laurie, G.W. 2008. Focus on molecules: lacritin. *Exp Eye Res* 86(3): 457-458.

Makova, K.D. & Li, W.H. 2003. Divergence in the spatial pattern of gene expression between human duplicate genes. *Genome Res* 13(7): 1638-1645.

Marques, A.C., Dupanloup, I., Vinckenbosch, N., Reymond, A. & Kaessmann, H. 2005. Emergence of young human genes after a burst of retroposition in primates. *PLoS Biol* 3(11): e357.

Martinez-Garay, I., Jablonka, S., Sutajova, M., Steuernagel, P., Gal, A. & Kutsche, K. 2002. A new gene family (FAM9) of low-copy repeats in Xp22.3 expressed exclusively in testis: implications for recombinations in this region. *Genomics* 80(3): 259-267.

Mouse Genome Sequencing Consortium. 2002. Initial sequencing and comparative analysis of the mouse genome. *Nature* 420(6915): 520-562.

Muller, H.J. 1935. The origination of chromatin deficiencies as minute deletions subject to insertion elsewhere. *Genetica* 17: 237-252.

Nathans, J., Thomas, D. & Hogness, D.S. 1986. Molecular genetics of human color vision: the genes encoding blue, green, and red pigments. *Science* 232(4747): 193-202.

Nekrutenko, A. & Li, W.H. 2001. Transposable elements are found in a large number of human protein-coding genes. *Trends Genet* 17(11): 619-621.

Nembaware, V., Crum, K., Kelso, J. & Seoighe, C. 2002. Impact of the presence of paralogs on sequence divergence in a set of mouse-human orthologs. *Genome Res* 12(9): 1370-1376.

Notredame, C., Higgins, D.G. & Heringa, J. 2000. T-Coffee: A novel method for fast and accurate multiple sequence alignment. *J Mol Biol* 302(1): 205-217.

Ochs, R.L., Stein, T.W., Jr., Chan, E.K., Ruutu, M. & Tan, E.M. 1996. cDNA cloning and characterization of a novel nucleolar protein. *Mol Biol Cell* 7(7): 1015-1024.

Ohno, S. 1970. *Evolution by gene duplication*. New York: Springer-Verlag.

Parra, G., Reymond, A., Dabbouseh, N., Dermitzakis, E.T., Castelo, R., Thomson, T.M., Antonarakis, S.E. & Guigo, R. 2006. Tandem chimerism as a means to increase protein complexity in the human genome. *Genome Res* 16(1): 37-44.

Patthy, L. 1999. Genome evolution and the evolution of exon-shuffling--a review. *Gene* 238(1): 103-114.

Porter, D., Weremowicz, S., Chin, K., Seth, P., Keshaviah, A., Lahti-Domenici, J., Bae, Y.K., Monitto, C.L., Merlos-Suarez, A., Chan, J., Hulette, C.M., Richardson, A., Morton, C.C., Marks, J., Duyao, M., Hruban, R., Gabrielson, E., Gelman, R. & Polyak, K. 2003. A neural survival factor is a candidate oncogene in breast cancer. *Proc Natl Acad Sci U S A* 100(19): 10931-10936.

Rosenberg, H.F. & Dyer, K.D. 1995. Eosinophil cationic protein and eosinophil-derived neurotoxin. Evolution of novel function in a primate ribonuclease gene family. *J Biol Chem* 270(37): 21539-21544.

Salichs, E., Ledda, A., Mularoni, L., Alba, M.M. & de la Luna, S. 2009. Genome-wide analysis of histidine repeats reveals their role in the localization of human proteins to the nuclear speckles compartment. *PLoS Genet* 5(3): e1000397.

Scannell, D.R. & Wolfe, K.H. 2008. A burst of protein sequence evolution and a prolonged period of asymmetric evolution follow gene duplication in yeast. *Genome Res* 18(1): 137-147.

Schittek, B., Hipfel, R., Sauer, B., Bauer, J., Kalbacher, H., Stevanovic, S., Schirle, M., Schroeder, K., Blin, N., Meier, F., Rassner, G. & Garbe, C. 2001. Dermcidin: a novel human antibiotic peptide secreted by sweat glands. *Nat Immunol* 2(12): 1133-1137.

Shu-Nu, C., Lin, C.H. & Lin, A. 2000. An acidic amino acid cluster regulates the nucleolar localization and ribosome assembly of human ribosomal protein L22. *FEBS Lett* 484(1): 22-28.

Siepel, A. 2009. Darwinian alchemy: Human genes from noncoding DNA. *Genome Res* 19(10): 1693-1695.

Siew, N. & Fischer, D. 2003. Analysis of singleton ORFans in fully sequenced microbial genomes. *Proteins* 53(2): 241-251.

Toll-Riera, M., Bosch, N., Bellora, N., Castelo, R., Armengol, L., Estivill, X. & Alba, M.M. 2009a. Origin of primate orphan genes: a comparative genomics approach. *Mol Biol Evol* 26(3): 603-612.

Toll-Riera, M., Castelo, R., Bellora, N. & Alba, M.M. 2009b. Evolution of primate orphan proteins. *Biochem Soc Trans* 37(Pt 4): 778-782.

Ueki, N., Kondo, M., Seki, N., Yano, K., Oda, T., Masuho, Y. & Muramatsu, M. 1998. NOLP: identification of a novel human nucleolar protein and determination of sequence requirements for its nucleolar localization. *Biochem Biophys Res Commun* 252(1): 97-102.


Van de Peer, Y., Taylor, J.S., Braasch, I. & Meyer, A. 2001. The ghost of selection past: rates of evolution and functional divergence of anciently duplicated genes. *J Mol Evol* 53(4-5): 436-446.

Wang, J., Wang, N., Xie, J., Walton, S.C., McKown, R.L., Raab, R.W., Ma, P., Beck, S.L., Coffman, G.L., Hussaini, I.M. & Laurie, G.W. 2006. Restricted epithelial proliferation by lacritin via PKCalpha-dependent NFAT and mTOR pathways. *J Cell Biol* 174(5): 689-700.

Wootton, J.C. & Federhen, S. 1996. Analysis of compositionally biased regions in sequence databases. *Methods Enzymol* 266: 554-571.

Yang, S., Arguello, J.R., Li, X., Ding, Y., Zhou, Q., Chen, Y., Zhang, Y., Zhao, R., Brunet, F., Peng, L., Long, M. & Wang, W. 2008. Repetitive element-mediated recombination as a mechanism for new gene origination in Drosophila. *PLoS Genet* 4(1): e3.

Zendman, A.J., Van Kraats, A.A., Weidle, U.H., Ruiter, D.J. & Van Muijen, G.N. 2002. The XAGE family of cancer/testis-associated genes: alignment and expression profile in normal tissues, melanoma lesions and Ewing's sarcoma. *Int J Cancer* 99(3): 361-369.

Zhang, J., Rosenberg, H.F. & Nei, M. 1998. Positive Darwinian selection after gene duplication in primate ribonuclease genes. *Proc Natl Acad Sci U S A* 95(7): 3708-3713.

Zhang, J., Zhang, Y.P. & Rosenberg, H.F. 2002. Adaptive evolution of a duplicated pancreatic ribonuclease gene in a leaf-eating monkey. *Nat Genet* 30(4): 411-415.

Zhou, Q. & Wang, W. 2008. On the origin and evolution of new genes--a genomic and experimental perspective. *J Genet Genomics* 35(11): 639-648.

Zhou, Q., Zhang, G., Zhang, Y., Xu, S., Zhao, R., Zhan, Z., Li, X., Ding, Y., Yang, S. & Wang, W. 2008. On the origin of new genes in Drosophila. *Genome Res* 18(9): 1446-1455.

**Part 2**

**A Look at Some Gene Families**

**7**

**Immunoglobulin Polygeny: An Evolutionary Perspective**

J. E. Butler, Xiu-Zhu Sun and Nancy Wertz

*Department of Microbiology & Interdisciplinary Immunology Program, Carver College of Medicine, University of Iowa, Iowa City, USA*

**1. Introduction**

The immune system of vertebrates is characterized by genes of the Ig-superfamily (IGSF). These include the immunoglobulin (Ig) genes, the genes that encode the T cell receptor (TCR), a portion of the structure of the genes encoding the major histocompatibility molecules (MHC), and the genes for Ig cell surface and transport receptors, some families of cytokines and chemokines, as well as numerous other proteins important to the immune system. IGSF genes also encode proteins in sponges, coelenterates and flatworms (Blumbach et al., 1998; Miller & Steele, 2000; Ogawa et al., 1998). While not a topic for this chapter, we acknowledge that the IGSF genes are not the only family of genes used to generate an antibody repertoire in vertebrates. The VLR-based receptors of jawless fishes, which belong to the LRR family of receptors, have had a parallel evolution (Herrin & Cooper, 2010).

Figure 1 illustrates the signature features of proteins encoded by the IGSF genes. Highly diagnostic is the so-called "β-barrel" or "Ig fold". Anti-parallel β-pleated sheets form the staves of the barrel, which are joined at each end by flexible polypeptide chains. These flexible polypeptides on the face of a heavy chain variable region domain (VH; Fig. 1A) contain three complementarity determining regions (CDRs). The variable light chain domain (VL; not shown) also contributes three CDRs. CDRs from both VH and VL domains coalesce to form the antibody binding site (Fig. 1B). A striking feature of IGSF genes that encode the variable region domain of Igs is the degree of polygeny, such that duplicated VH genes alone can occupy more than three megabases (Matsuda et al., 1990). There are three such variable region loci in mammals: VH, Vκ and Vλ. The first encodes the variable heavy chain domain (Fig. 1A) while Vκ and Vλ encode the light chain variable region domains. All three loci are independent (non-linked), although a few orphan human VH genes can be found in other linkage groups (Matsuda et al., 1990). Popular textbooks suggest that this polygeny explains why antibodies can recognize >10¹⁰ different antigens. It is argued that if each specific antibody required a completely separate gene, more DNA would be needed than exists in the mammalian genome. To reduce the need for so many different germline encoded antibody binding sites, a system of somatic gene segment recombinations and later, somatic hypermutation (SHM) or somatic gene conversion (SCG), evolved.

The complete antibody molecule (and other proteins encoded by IGSF genes) is often composed of a tandem series of β-barrel domains as illustrated in Fig. 1B. Each domain in such multi-domain molecules differs slightly in structure and correspondingly, in function.
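The textbook argument quoted in this introduction, that one germline gene per antibody would need more DNA than the genome contains, can be checked with simple arithmetic. The sketch below is illustrative only: the repertoire size of 10¹⁰ is the textbook figure cited above, while the variable domain length (about 110 amino acids) and the haploid genome size (about 3 × 10⁹ base pairs) are assumed round numbers that do not come from this chapter, and the function name is hypothetical.

```python
# Back-of-envelope check of the "one gene per antibody" argument.
# The repertoire size is the textbook figure quoted in the text; every other
# number below is an assumed round value, not data from this chapter.

REPERTOIRE_SIZE = 10**10   # distinct antigen specificities (textbook figure)
V_DOMAIN_CODONS = 110      # assumed length of one variable domain, in codons
BP_PER_CODON = 3           # base pairs per codon
DOMAINS_PER_SITE = 2       # one VH and one VL domain form each binding site
GENOME_BP = 3 * 10**9      # assumed haploid mammalian genome size, in bp


def germline_dna_required(n_antibodies: int) -> int:
    """Base pairs needed if every binding site had its own VH and VL gene."""
    bp_per_site = V_DOMAIN_CODONS * BP_PER_CODON * DOMAINS_PER_SITE
    return n_antibodies * bp_per_site


required_bp = germline_dna_required(REPERTOIRE_SIZE)
fold_excess = required_bp / GENOME_BP
print(f"required: {required_bp:.2e} bp, about {fold_excess:.0f}x the genome")
```

Under these assumptions the coding sequence alone would exceed the whole genome by roughly three orders of magnitude, which is the gap that somatic recombination and diversification are argued to close. The result scales linearly with each assumed input, so changing the domain length or genome size within reasonable bounds does not alter the conclusion.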
