**1. Introduction**

76 Gene Duplication

Thousands of different multi-domain proteins, each the product of a separate specific gene and each exhibiting one or multiple transmembrane (TM) domains, are expressed in Arabidopsis as well as in Homo sapiens (Sallman-Almen et al., 2009). These molecules may carry one or multiple TM domains very close to the amino or carboxyl terminus, or dispersed throughout the molecule. Thus one may conclude that the TM domain existed prior to the branching of plants and metazoa. In both organisms, a very small percentage of the TM domain-containing proteins additionally possess multiple leucine-rich repeat (LRR) domains. These domains seem to transmit ligand perception, so one should presume that such domains also existed prior to the branching of these two forms of life. (A caveat, however, is in order: many of the combinations of domains found in plants are not present in the same combination in animals, and vice versa.) As demonstrated in this paper, on several occasions during subsequent evolution the genes for these multi-domain proteins duplicated, and in the process they were often altered slightly, allowing the generation of proteins that could provide new specific functions (i.e., they underwent neofunctionalization).

Unlike the "adaptive immune" system, which exists in animals but not in plants, an "innate immune" system is present in all multicellular organisms (animals and plants). This latter system operates by way of receptors: the Toll-like receptors (TLRs), first identified in Drosophila, which bind lipopolysaccharides (endotoxins). In Drosophila, these receptors not only activate innate immunity; they also act in dorsal-ventral specification. When one compares these molecules as they are encoded, e.g., in Arabidopsis (Dangl & Jones, 2001) with those present in Homo sapiens, one finds that in animals, but not in plants, these receptor proteins possess a cysteine-rich domain just prior to entering the membrane and also a signaling domain (which is not a protein kinase but which acts as a docking site) on the far side of the TM domain. TLRs target "pathogen associated molecular patterns" (PAMPs) by way of the LRR domains.

Besides the TLRs, however, many other multi-domain proteins containing both TM and LRR domains are encoded in Arabidopsis and in Homo sapiens. Many exist in both species, but some are present in only one or the other. Below we list more than a hundred of these proteins (each expressed from an individual gene) present in Arabidopsis. The TM domain is about 22 amino acid (AA) residues long, and the LRR domain is about 20-29 residues long (containing about 6 Leu). Both domains are present in proteins that participate in protein-protein interactions but which have different functions and cellular locations. In each case, the protein is presented below, in an order guided by a Clustal X arrangement (http://www.clustal.org/), and is labeled by its NCBI (http://www.ncbi.nlm.nih.gov/protein) protein identification number, followed by its chromosomal locus tag, in diagram form as given by SMART (http://smart.embl-heidelberg.de).

The LRR and TM Containing Multi-Domain Proteins in Arabidopsis

NP\_178125; AT1G80080; TMM (Too Many Mouths); Protein Binding/Receptor. (Note: This protein promotes cell fate progression in stomatal development of stems (Bhave et al., 2009).)

NP\_176717; AT1G65380; Clavata 2; Protein Binding/Receptor Signaling Protein. (This protein forms a distinct CLE binding receptor complex regulating stem cell specification (Guo et al., 2010).)

NP\_188941; AT3G23010; Receptor Like Protein 36 (Disease Resistance Protein)

NP\_187188; AT3G05370; Receptor Like Protein 31; Protein Binding

NP\_177296; AT1G71400; Receptor Like Protein 12; Protein Binding

NP\_567412; AT4G13920; Receptor Like Protein 50; Protein Binding

NP\_176115; AT1G58190; Receptor Like Protein 9; Protein Binding
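The labeling convention used for these entries (an NCBI protein identification number followed by a chromosomal locus tag, then a name and annotation) lends itself to simple machine parsing. The following is a minimal sketch, not part of the original work. It assumes that accessions take the form `NP_` plus digits and that locus tags follow the pattern `AT`, a chromosome code, `G`, and five digits. The names `Entry` and `parse_entry` are hypothetical and introduced only for illustration.

```python
import re
from typing import NamedTuple, Optional


class Entry(NamedTuple):
    accession: str    # NCBI protein identification number, e.g. NP_187188
    locus: str        # chromosomal locus tag, e.g. AT3G05370
    description: str  # remaining free text (name and annotation)


# Assumed patterns (not stated in the text): "NP_" followed by digits,
# tolerating a markdown-escaped underscore ("NP\_"), and an Arabidopsis
# locus tag such as AT1G80080.
ACCESSION_RE = re.compile(r"NP\\?_\d+")
LOCUS_RE = re.compile(r"AT[1-5CM]G\d{5}")


def parse_entry(line: str) -> Optional[Entry]:
    """Split one list entry into accession, locus tag, and description.

    Returns None when either identifier is missing from the line.
    """
    acc = ACCESSION_RE.search(line)
    loc = LOCUS_RE.search(line)
    if acc is None or loc is None:
        return None
    # Everything after the locus tag is treated as the description;
    # leading separators (";", ",", spaces) are dropped.
    description = line[loc.end():].lstrip(" ;,").strip()
    return Entry(acc.group().replace("\\", ""), loc.group(), description)
```

Searching for each identifier separately, rather than splitting on a fixed delimiter, keeps the sketch tolerant of the varying separators (semicolons, commas, colons) that appear between fields in such lists.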
