**3. Duplication and diversification of Ig genes**

### **3.1 VH genes display evidence of duplication and genomic gene conversion**

Genomic gene conversion was originally described in yeast (Meselson & Radding, 1975; Szostak et al., 1983) and is a form of non-reciprocal recombination in which the end result

Immunoglobulin Polygeny: An Evolutionary Perspective 119

is that a segment of one gene is "translocated" to another gene. When this process is combined with gene duplication, an array of modified duplicons results. Figure 4 shows the VH gene sequences for swine and Figure 5 the VH3 genes of the little brown bat (*Myotis lucifugus*). We have used color-coding to show that in swine there are only four different FR1 sequences among the 24 known VH genes and only two FR2 sequences, but a larger number of different FR3 sequences, although five are often shared (Fig. 4). CDRs are also often shared. Assuming these genes are the result of a combination of duplication and genomic gene conversion, VHT could be derived from VHE with CDR2 and FR3 translocated from VHF. A similar pattern of shared gene segments is seen in the VH3 germline gene repertoire of the little brown bat (Fig. 5). As shown, many of these genes share common FR1 sequences, a smaller number share CDR1, FR2 and CDR2, and the greatest diversity is seen in FR3 (Bratsch et al., 2011). This pattern of similarity among duplicated Ig genes is also seen in human and mouse, suggesting that after duplication and genomic gene conversion, the 3' segment was subjected to a higher rate of germline mutation and selection. We believe these examples support the hypothesis that the polygeny of VH arose by a combination of gene duplication and genomic gene conversion. It is well documented that within a sublocus, intralocus duplication of segments containing several genes also occurs. This is illustrated for the C-region sublocus of humans and rabbits (Fig. 2B). The same phenomenon occurs in the VH sublocus of mice (Retter et al., 2007; Johnston et al., 2006), humans (Matsuda et al., 1990) and swine (Eguchi-Ogawa et al., 2010). For example, the genomic segment in swine that contains VHA, VHB and VHE has been duplicated to yield VHA\*, VHB\* and VHF.

| GenBank # | Group | FR1 | CDR1 | FR2 | CDR2 | FR3 |
|---|---|---|---|---|---|---|
| GQ923685 | 3-2 | LVESGGGLVQPGGSLRLSCAASGFTFS | DYSMN | WVRQAPGKGLEWVA | YTSYGSGNPI | YYADSVKGRFTISRDNAKNTLYLQMSSLRAEDTAVYYCAR |
| GQ923622 | 3-2 | LVESGGGLVQPGGSLRLSCAASGFTFS | EYGMN | WVRQAPGKGPEWVS | YISSPSGSNI | YYAASVKGRFTISRENAKNSLYLQMSSLRAEDTAVYYCAR |
| GQ923683 | 3-2 | LVESGGGLVQPGGSLRLSCAASGFTFS | EYGMN | WVRQAPGKGLEWVS | YISGDSSSDI | YYAASVKGRFTISRDNAKNMVYLQMNSLRAEDTALCYCVR |
| GQ923616 | 3-2 | LVESGGGLVQPGGSLRLSCAASGFTFS | EYGMN | WVRQAPGKGLEWVS | YISSPSGSNI | YYAASVKGRFTISRENAKNSLYLQMSSLRAEDTAVYYCAR |
| GQ923644 |  | LVESGGGLVQPGGSLRLSCAASGFTFS | NYDMH | WIRQAPGKELEWVA | HIWTDGSQK | YYAESVKGRFTISRDNTKNMAYLQMNSLRVKDTALYYCAR |
| GQ923675 | 3-2 | LVESGGGLVQPGGSLRLSCAASGFTFS | NYGMN | WVRQAPGKGLEWIA | YTSSGDGNPI | YYADSVKGRFTISRDNAKNTLYLQMSSLRAEDTAVYYCAR |
| GQ923628 |  | LVESGGGLVQPGGSLRLSCAASGFTFS | NYYMN | WVRQAPGKGLEWVA | SISDGSSYI | YYGEAVKGRFTISRDNTKNMLYLQMNSLRAEDSAVYYCAR |
| GQ923647 | 3-2 | LVESGGGLVQPGGSLRLSCAASGFTFS | NYYMN | WVRQAPGKGLEWVA | YISSDGSSYI | NYANAVKGRFTISRDNAKNMVYLQMSSLRAEDTAMYYCAR |
| GQ923618 | 3-3 | LVESGGGLVQPGGSLRLSCAASGFTFS | SNSMN | WVRQAPGKGLEWVA | LISSGGGST | YYAASVKGRFTISRDNAKNSLYLQMNSLRAEDMALYYCAR |
| GQ923650 |  | LVESGGGLVQPGGSLRLSCAASGFTFS | SNWMS | WVRQAPGKGLEWVG | IISTDGGTT | NYADSVKGRFTISRDNAKNTLYLQMNSLTAKDTAVYYCAK |
| GQ923681 | 3-1 | LVESGGGLVQPGGSLRLSCAASGFTFS | SSWMV | GVRQAPGKGLEWVS | LINPDGSIT | NYANSVKGRFTISSDNAKNMLYLQMNSLRAEETAMYYCAR |
| GQ923662 | 3-3 | LVESGGGLVQPGGSLRLSCAASGFTFS | SYDMN | WVRQAPGKGLEWVA | LISTDGGST | YYANSVKGRFTISRDNAKNSLYLQMNSLRAEDTAVYFCAR |
| GQ923669 | 3-3 | LVESGGGLVQPGGSLRLSCAASGFTFS | SYDMN | WVRQAPGKGLEWVS | LISPSGGST | YYADSVKGRFTISRDNAKNMVYLQMSSLKAEDKAVYFCAR |
| GQ923625 | 3-3 | LVESGGGLVQPGGSLRLSCAASGFTFS | SYDMN | WVRQAPGKGLEWVS | LISPSGGST | YYADSVKGRFTISRDNAKNMVSLQMSSLRAEDTAVYYCAR |
| GQ923612 | 3-3 | LVESGGGLVQPGGSLRLSCAASGFTFS | SYDMN | WVRQAPGKGLEWVA | YISSASNTI | YYANSVKGRFTISIDNAKNTLYLQMSSLRAEDTAVYYCAR |
| GQ923621 | 3-3 | LVESGGGLVQPGGSLRLSCAASGFTFS | SYDMS | WVRQAPGKGLEWVS | AISNGGGST | YYAASVKGRFTISRDNAKNSLYLQMNSLRAEDTAVYYCAR |
| GQ923679 |  | LVESGGGLVQPGGSLRLSCAASGFTFS | SYGMH | WVRQAPGKGLEWVA | YQYISSDGRNYI | NYAASVKGRFTISRDNAKNTAYLQMNSLKAEDTAVYYCAR |
| GQ923617 |  | LVESGGGLVQPGGSLRLSCAASGFTFS | SYGMH | WVRQAPGKGLEWVA | YQYISSDGRNYI | NYAASVKGRFTISRDNAKNMLYLQMSSLRAEDMAVYYCAR |
| GQ923658 | 3-2 | LVESGGGLVQPGGSLRLSCAASGFTFS | SYGMH | WIRQAPGKGLEWVS | RIGSDGRSYI | HYADSVKSRFTISRDNAKNMLYLQMSSLRSEDTALYYCAR |
| GQ923672 | 3-4 | LVESGGGLVQPGGSLRLSCAASGFTFS | SYGMN | WVHQVLGKGLECVS | GVSSIGGTT | YYADSVKSRFTVSRDNTTSMLYLQMNSLRTEDMAVYYCAR |
| GQ923613 | 3-2 | LVESGGGLVQPGGSLRLSCAASGFTFS | SYGMY | WVGQVPGKGLEWVS | LISSDGSSTI | YYANSVKGRFTISRDNAKNTLYLQMNSLRAEDTAVYYCAR |
| GQ923657 | 3-3 | LVESGGGLVQPGGSLRLSCAASGFTFS | SYHMN | WVRQAPGKGLEWVA | FISNGGSTI | YYAASVKGRFTISRDNAKNMLYLQMNSLRAEDTALYYCAR |
| GQ923638 |  | LVESGGGLVQPGGSLRLSCAASGFTFS | SYQMH | WVRQAPGKGLEWVE | LISSSGGTI | YYADSVKGRFTISRDNAKNTLFLQMSSLRADDTAMYYCAR |
| GQ923629 | 3-3 | LVESGGGLVQPGGSLRLSCAASGFTFS | SYSMD | WVRQAPGKGLEWVA | YISSASSTI | YYANSVKGRFTISRDNAKNTLYLQMSSLRAEDTAMYYCAR |
| GQ923642 | 3-3 | LVESGGGLVQPGGSLRLSCAASGFTFS | SYSMN | WVRQAPGKGLEWVA | VISSSGGTI | YYADSVKGRFTISRDNAKNTLYLQMNSLRAEDTAVYYCAR |
| GQ923684 | 3-1 | LVESGGGLVQPGGSLRLSCAASGFTFS | SYWMD | WVRQAPGKGLEWLC | RMNPDGSTT | HYANSVKGRFTISRDNAKNMLYLQMNSLRAEETAMYYCAR |
| GQ923619 | 3-3 | LVESGGGLVQPGGSLRLSCAASGFTFS | SYWMH | WVRQAPGKGLEWVS | RISSSGSTI | SYAASVKGRFTISRDNTKNTLYLQMNSLRAEDTAVYYCAR |
| GQ923632 | 3-2 | LVESGGGLVQPGGSLRLSCAASGFTFS | SYWMH | WVRQAPGKGLEWVS | LISSDGSSTI | YYANSVKGRFTISRDNAKNTLYLQMSSLRAEDTAVYYCAR |
| GQ923668 | 3-2 | LVESGGGLVQPGGSLRLSCAASGFTFS | SYWMH | WVRQAPGKGLEWVS | LISSDGSSYI | YYAASVKGRFTISRDNAKNTLYLQMSSLKAEDTAVYYCAR |
| GQ923640 |  | LVESGGGLVQPGGSLRLSCAASGFTFS | SYWMN | WVRQAPGKRLEWVS | AISSSGGST | YYADSVKGRFTISRDNAKNTLFLQMSSLRVEDTAVYYCAK |
| GQ923627 |  | LVESGGGLVQPGGSLRLSCAASGFTFS | SYWMS | WVRQAPGKGLEWVA | HINSGGST | YYADSVKGRFTISRHNAKNSLYLQMSSLRAEDTVGYYCVR |
| GQ923663 | 3-1 | LVESGGGLVQPGGSLRLSCAASGFTFS | SYWMY | WVRQAPGKGLEWLC | RMNPDGSTT | NYANSVKGRFTISRDNAKNTLYLQMNSLSTQDTAMYYCST |
| GQ923623 | 3-2 | LVESGGGLVQPGGSLRLSCAASGFTFS | SYYMN | WVRQAPGEGLEWVA | SISSGGGSYI | YYAASVKGRFTISRDNTKNTLYLQMNSLRAEDTAVYYCAR |
| GQ923680 |  | LVESGGGLVLPGGSLRLSCAASGFTFS | DYWVH | WVCQAPWKGLEWVS | DFRGDGGTT | YYADSVKGRSTLSRDNVKNSLYLQMSSLRAKDTAIYYCAR |
| GQ923635 |  | LVESGGGLVLPGGSLRLSCAASGFTFS | GYWIS | WARQAPGKKLEWVS | DISGDSSIT | YYAASVKGRFTISRDNAKNTLYLQMNSLRAEDTALYYCAR |
| GQ923636 |  | LVESGGGLVLPGGSLRLSCAASGFTFS | NYDMH | WIRQAPGKGLEWVA | HIWTDGSQK | YYAESVKGRFTISRDNAKNSLYLQMNSLKAEDSALYYCAR |
| GQ923673 | 3-1 | LVESGGGLVLPGGSLRLSCAASGFTFS | SNWMH | WVRQAPGKGLEWLC | RMNPDGSTT | NYANSVKGRFTISRDSAKNMLYLQMNSLRAEETAMYYCTT |
| GQ923655 | 3-1 | LVESGGGLVLPGGSLRLSCAASGFTFS | SYWMH | WVRQAPGKGPEWLC | QINSDGNTI | YYANSVKGRFTISRDNAKNMLHLQMNSLRAEESALYYCAR |
| GQ923615 | 3-1 | LVESGGGLVLPGGSLRLSCAASGFTFS | SYWMH | WVRQAPGKGLEWLC | QINSDGNTI | YYANSVKGRFTISRDNAKNMLHLQMNSLRAEESALYYCAR |
| GQ923676 | 3-3 | LVESGGGLVQPGGSLRISCAASGFTFS | SYWMS | WVRQAPGKGPEWVS | HISDGGGST | YYANSVKGRFTNSRDNAKNSLSLQMNSLKPEDTALYYCAR |
| GQ923631 |  | LVESGGGLVQPGGSLRLSCAASGFIFS | SYGMS | WVRQAPGNGLEWVS | GVSSIGGTTG | YYADSVKGRFTVSRDNGKNMLFLQMNSLRAKDTAVYYCAR |
| GQ923611 | 3-2 | LVESGGGLVQPGGSLRLSCAASGFSFS | SYWMG | WVRQAPGKGLEWVA | LISTGGGGNT | YYATSVKGRFTISRDNAKNSLYLQMSSLRAEDTAVYYCAR |
| GQ923667 | 3-3 | LVESGGGLVQPGGSLRLSCAASGFSFS | IYGMN | WVRQAPGKGLEWVS | GISTGGGST | YYAASVKGRFTISRDNAKNSLYLQMNSLKAEDTAVYYCAR |
| GQ923654 | 3-2 | LVESGGGLVQPGGSLRLSCAATGFTFS | SYWMH | WVRQAPGKGLEWVS | LISSDGSSYI | YYAASVKGRFTISRDNAKNTLYLQMSSLRAEDTAVYYCAR |
| GQ923645 | 3-3 | LVESGGGLVQPGGSLRLSCSGSGFTFS | SYSMD | WVRQAPGKGLEWVA | YISSASNTI | YYANSVKGRFTISIDNAKNTLYLQMSSLRAEDTAVYYCAR |
| GQ923660 | 3-3 | LVESGGGLVQPGGSLRLSCSGSGFTFS | SYSMD | WVRQTPGKGLEWVA | YISSASNTI | YYANSVKGRFTISIDNAKNTLYLQMSSLRAEDTAVYYCAR |
| GQ923677 | 3-6 | LVESGGGVVPPGGSLSLSCKASGFTFT | NYSMD | WVSQAPGKGLQWVT | RVSKPKGMTQ | WYAPAVQGRFTIFRDNPMSTASLEITKLTSEDMAMYYCAR |
| GQ923648 | 3-6 | LVESGGGVVPPGGSLSLSCKASGFTFT | NYSMD | WVSQAPGKGLQWVT | RVSKPKGMTQ | WYAPAVQGRFTIFRDNPMSTVSLEITKLTSEDMAMYYCTR |
| GQ923671 | 3-6 | LVESGGGVVPPGGSLSLSCKASGFTFT | NYSMD | WVRQAPGKGLQWVA | RVSQPKGTTQ | WYAPAVRGRFTISRDNPTSTVSLKMTKLTSEDTAVYYCTR |
| GQ923626 |  | LVESGGDLVQPGGSLRLSCAASGFTFS | SYWMH | WVRQAPGKGLEWVS | LVNPAGSST | YYANSVKGRFTISRDNAKNTLYLQMSSLRAEDTGVYYCAR |
| GQ923653 |  | LVESGGDLVQPGGSLRLSCAASGFTFS | SYWMH | WVRQAPGEGLEWVS | LINPAGSST | YYANSVKGRFTISRDNAKNMLYLQMNSLRAEDTALYYCSR |
| GQ923674 | 3-1 | LVESGGGFVQPGGSLRLSCAASGFTFS | SSSMD | WVRQAPGKGLEWLC | RINPDGSTT | YYANSVKGKFTTSRDNAKNLLYLQMNSLRAQDMAVYYCVT |
| GQ923652 |  | LVESGGGLGKPEGSLRLSCAASGFASS | SYYMN | WIRQTPGKGLEWMA | VISYNGNNT | YYADSVKGRFTISRDNAKNMVYLQMSSLRAEDTALYYCVR |
| GQ923661 | 3-5 | LVESGGGLGKPEGSLRLSCAASGYASS | SYYMN | WVRQTPGKGLEWIC | AITANGDST | YYADSVKGRFTISRDNAKNTIYLQMSSLKSEDTAVYYCST |
| GQ923624 | 3-5 | LVESGGGLGKPEGTLRLSCAASGYASS | SYYMN | WVRQTPGKGLEWIC | AITGNSDST | YYADSVKGRFTISRDNAKNTIYLQMNSLRAEDMTVYYCAT |
| GQ923633 | 3-5 | LVESGGGLGKPEGTLRLSCAASGYASS | SYYMN | WVRQTPGKGLEWIC | AITGNSDST | YYADSVKGRFTISRDNAKNTIYLQMNSLRAEDTGVYYCAK |
| GQ923637 | 3-3 | LVESGGGLVKPGGSLRLSCAASGFTFS | SYSMD | WVCQAPGKGLEWVA | YISSASSTI | YYANSVKGRFTISIDNAKNTLYLQMSSLRAEDTAVYYCAR |
| GQ923614 | 3-2 | LVESGGGLVKPGGSLRLSCAASGFTFS | EYGMN | WVRQAPGKGLEWVS | YISGDSSSDI | YYAASVKGRFTISRDNAKNMVYLQMNSLRAEDTAVYYCAR |
| GQ923651 |  | LVESGGGLVPHGGSLRLSCAASGFTFN | DYEMN | WVRQAPGKGLEWVS | RITSTGGST | HYAASVKGRFTISRDNAKNTLFLQMNSLRAEDQAMYYCTA |
| GQ923659 | 3-1 | LVESGGGLVPPGGSLRLSCAASGFTFS | SYWNT | WVRQAPGKGLEWLG | EINPDGSTT | NYANAVKGRFTISRDNAKNTLYLQMNSLSAQDMAVYYCVR |
| GQ923646 |  | LVESGGGLVPPGGSLRLSCAASGFTFS | SYWMG | WVRQAPGKGLEWVS | FVTDYGGSI | YYADSVKGRFSISRDNAKNTLYLQMTSLRATDTAVYYCAR |
| GQ923630 |  | LVESGGGLVPPGGSLRLSCATSGFTFS | GYWIS | WARQAPGKKLEWVS | DINGDSSTT | YYAASVKGRFTTSRDNAKNMLFLQMSSLRAEDTAVYYCAR |
| GQ923682 |  | LVESGGGLVPPGGSLRLSCATSGFTFS | GYWIS | WARQAPGKKLEWVS | DINGDSSTT | YYAASVKGRFTISRDNAKNTLYLQMNSLRAEDTALYYCAR |
| GQ923649 | 3-2 | LVESGGGLVQPGGSLRLSCAASGFSFW | SYPMN | WVRQAPGKGLEWVA | LISSGRDGNT | YYATSVKGRFTISRDNAKNSLYLQMSSLRSDDTALYYCAR |
| GQ923634 |  | LVESGGGLVQPGGSLRLSCAASGFTFD | DYYMH | WVRQAPGKGLEWVT | SISEGGSYI | YYANAVKGRFTISRDNAKNSLYLQMNSLRAEDTAVYYCAR |
| GQ923641 | 3-2 | LVESGGGLVQPGGSLRLSCAASGFTFG | SYWMH | WVRQAPGKGLEWVS | RIGSDGSSYI | YYADSVKGHFTISRDNAKNMLYLQMSSLRSEDTAVYYCAR |
| GQ923665 | 3-3 | LVESGGGLVQPGGSLRLSCAASGSTFS | SYSMD | WVRQAPGKGLEWVA | YISSASSTI | YYANSVKGRFTISRDNAKNSLYLQMSSLRAEDTAVYYCAR |
| GQ923670 | 3-3 | LVESGGGLVQPGGSLRLSCTGSGFSFS | SYSMD | WVRQAPGKGLEWVA | YISSASNTI | YYANSVKGRFTISIDNAKNTLYLQMSSLRAEDTAVYYCAR |
| GQ923639 | 3-4 | LVESGGGLVQPGGSLRLSFAASGFTFS | SYGMN | WAHQFPGKGLEWVS | GVSSIGGTT | YYADSVKGRFTVSRDNAKNTLYLQMNSLRTEDMAVYYCAR |
| GQ923656 | 3-4 | LVESGGGLVQPGGSLRLSFAASGFTFS | SYGMN | WAHQFPGKGLEWVS | GVSSIGGTT | YYADSVKGRFTVSRDNAKNTLYLQMNSLRTEDMAVYYCAR |
| GQ923620 |  | LVESGGGLVQPWGSLRLSCKGSGFTFS | DYYMN | WIRQTPGKGLEWVG | LIRNKANGHTT | EYAASVKGRVAISRDDSKSTADLQMSSLIAEDTAMYNYAR |
| GQ923664 |  | LVESGGGLVQTGGSLRLSCAASGFTFS | SYWMN | WVRQAPGKGLEWVS | TITHTDGST | YYPDSVKGRFTISRDNTKNMLYLQMNSLRSEDTAVYYCAR |
| GQ923643 | 3-6 | LVESGGGMVPPGGSLSLSCKASGFTFT | NYSMY | WVRQAPGKGLQWVA | GVSKPTGKNQ | WYAPAVQGRFMISRDNPTSTVSLEITKLTSEDTALYYCSR |
| GQ923678 |  | LVESGGGLVQPGGSLRLSCAASGYTFS | SYWMG | WVCQAPCKGLEWVS | LITGNGGST | YYANTAKVRFTISRDNTKNMLYLQMNSLTAEDTAVYYCAR |
| GQ923666 | 3-5 | LVESGGSLVQPEGSLRLTCAASGFTSN | SYYMN | WFHQTPGKGLEWIC | AITANGDST | YYSDSVKGSFTISRDTAKNPIYLQMSSLRSKDTAVYYCAR |

Fig. 5. The deduced amino acid sequences for the framework (FR) and complementarity-determining regions (CDR) for 75 members of the VH3 family from the little brown bat (*M. lucifugus*). Shared sequences are color-coded as in Fig. 4. From Bratsch et al., 2011.
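The segment sharing that Figs. 4 and 5 highlight by color can also be tallied programmatically. The following is a minimal sketch, not the authors' method: it groups genes by identical FR or CDR sequence, counts distinct sequences per region, and lists which other genes carry the same segment as a chosen gene. The five entries are transcribed from Fig. 5; the names `GENES`, `distinct_counts`, `shared_segments` and `donor_candidates` are hypothetical and introduced only for illustration.

```python
from collections import defaultdict

REGIONS = ("FR1", "CDR1", "FR2", "CDR2", "FR3")

# Five bat VH3 entries transcribed from Fig. 5 (accession -> FR1, CDR1, FR2, CDR2, FR3).
GENES = {
    "GQ923685": ("LVESGGGLVQPGGSLRLSCAASGFTFS", "DYSMN", "WVRQAPGKGLEWVA",
                 "YTSYGSGNPI", "YYADSVKGRFTISRDNAKNTLYLQMSSLRAEDTAVYYCAR"),
    "GQ923622": ("LVESGGGLVQPGGSLRLSCAASGFTFS", "EYGMN", "WVRQAPGKGPEWVS",
                 "YISSPSGSNI", "YYAASVKGRFTISRENAKNSLYLQMSSLRAEDTAVYYCAR"),
    "GQ923616": ("LVESGGGLVQPGGSLRLSCAASGFTFS", "EYGMN", "WVRQAPGKGLEWVS",
                 "YISSPSGSNI", "YYAASVKGRFTISRENAKNSLYLQMSSLRAEDTAVYYCAR"),
    "GQ923662": ("LVESGGGLVQPGGSLRLSCAASGFTFS", "SYDMN", "WVRQAPGKGLEWVA",
                 "LISTDGGST", "YYANSVKGRFTISRDNAKNSLYLQMNSLRAEDTAVYFCAR"),
    "GQ923677": ("LVESGGGVVPPGGSLSLSCKASGFTFT", "NYSMD", "WVSQAPGKGLQWVT",
                 "RVSKPKGMTQ", "WYAPAVQGRFTIFRDNPMSTASLEITKLTSEDMAMYYCAR"),
}


def distinct_counts(genes):
    """Number of different sequences observed in each region."""
    return {region: len({segs[i] for segs in genes.values()})
            for i, region in enumerate(REGIONS)}


def shared_segments(genes):
    """Per region, map each sequence found in two or more genes to those genes."""
    shared = {}
    for i, region in enumerate(REGIONS):
        groups = defaultdict(list)
        for accession, segs in genes.items():
            groups[segs[i]].append(accession)
        shared[region] = {seq: accs for seq, accs in groups.items() if len(accs) > 1}
    return shared


def donor_candidates(genes, target):
    """Per region, list the other genes whose segment is identical to the target's."""
    return {region: [acc for acc, segs in genes.items()
                     if acc != target and segs[i] == genes[target][i]]
            for i, region in enumerate(REGIONS)}


if __name__ == "__main__":
    print(distinct_counts(GENES))
    print(donor_candidates(GENES, "GQ923616"))
```

Exact string matching mirrors the "shared sequence" coloring only; it does not by itself distinguish gene conversion from common descent, which is an inference drawn in the text.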


| GenBank # | VH gene | FR1 | CDR1 | FR2 | CDR2 | FR3 |
|---|---|---|---|---|---|---|
| AF064686 | VHA | EEKLVESGGGLVQPGGSLRLSCVGSGFTFS | STYIN | WVRQAPGKGLEWLA | AISTSGG | STYYADSVKGRFTISRDNSQNTAYLQMNSLRTEDTARYYCAT |
| AB513624 | VHA\* | EEKLVESGGGLVQPGGSLRLSCVGSGFTFS | STYIN | WVRQAPGKGLEWLA | AISTSGG | STYYADSVKGRFTISRDDSQNTAYLQMNSLRTEDTARYYCAT |
| AF064687 | VHB | EEKLVESGGGLVQPGGSLRLSCVGSGFDFS | DNAFS | WVRQAPGKGLEWVA | AIASSDYDG | STYYADSVKGRFTISRDNSQNTVYLQMNSLRTEDTARYYCAI |
| AB513624 | VHB\* | EEKLVESGGGLVQPGGSLRLSCVGSGFDFS | DYAFS | WVRQAPGKGLEWVA | AIASSDYDG | STYYADSVKGRFTISRDNSQNTVYLQMNSLRTEDTARYYCAI |
| AF064688 | VHC | EEKLVESGGGLVQPGGSLRLSCVGSGFTFS | SYEIS | WVRQAPGKGLEWLA | GIYSSGS | STYYADSVKGRFTISRDNSQNTAYLQMNSLRTEDTARYYCAR |
|  | VHD | EEKLVESGGGLVQPGGSLRLSCVGSGFTFS | SYEIS | WVRQAPGKGLEWVA | DICSGG | STYYADSVKGRFTISRDNSQNTAYLQMNSLRTEDTARYYCAR |
| AF064689 | VHE | EEKLVESGGGLVQPGGSLRLSCVGSGFTFS | SYAVS | WVRQAPGKGLEWLA | GIDSGSYSG | STYYADSVKGRFTISRDNSQNTAYLQMNSLRTEDTARYYCAR |
| AF064690 | VHF | EEKLVESGGGLVQPGGSLRLSCVGSGFDFS | SYGVG | WVRQAPGKGLESLA | SIGSGSYIG | STDYADSVKGRFTISSDNSQNTAYLQMNSLRTEDTARYYCAR |
| DQ886395 | VHG | EEKLVESGGGLVQPGGSLRLSCVGSGYTFS | SYGMS | WVRQAPGKGLEWLA | GIDSGSYSG | STYYADSVKGRFTISRDNSQNTAYLQMNSLRTEDTARYYCAR |
| DQ886392 | VHH | EEKLVESGGGLVQPGGSLRLSCVGSGFTFS | SYPIG | WVRQAPGKGLEWLA | SIGRGRYRG | STYYADYVKGRFTISRDNSQNTAYLQMNSLRTEDTARYYCAR |
| AY911501 | VHJ | EEKLVESGGGLVQPGGSLRLSCVGSGITFS | SYAVE | WVRQAPGKGLEWLA | SIGSGSYIG | STDYADSVKGRFTISSDDSQNTVYLQMNSLRTEDTARYYCAR |
| AF064692 | VHK | EEKLVESGGGLVQPGGSLRLSCVGSGFTFS | SSPIG | WVRQAPGKGLEWLA | SIGSGSYSG | STYYADSVNGRFTISRDNSQNTAYLQMNSLRTEDTARYYCAR |
| AY911500 | VHL | EVKLVESGGGLVQPGGSLRLSCVGSVFDFS | SYAVS | WVRQAPGKGLEWLA | AIYSGGS | STYYADSVKGRFTISKDNSQNTAYLQMNSLRTEDTARYYCAT |
| AF321841 | VHN | EEKLVESGGGLVQPGGSLRLSCVGSGFTFS | SYSMS | WVRQAPGKGLEWLA | GIYSSGS | STYYADSVKGRFTISSDNSQNTAYLQMNSLRTEDTARYYCAR |
| AF321842 | VHO | EEKLVESGGGLVQPGGSLRLSCVGSGYTFS | SYPIG | WVRQAPGKGLEWLA | AISTSGS | STYYADSVKGRFTISRDNSQNTAYLQMNSLRTEDTARYYCAT |
| AF321844 | VHQ | EEKLVESGGGLVQPGGSLRLSCVGSGFTFS | SYEIS | WVRQAPGKGLEWLA | AISTSGA | STVYADSVKDRFIYSRDNSQNTAYLQMNRNTAYLQMTYYCAR |
| AF321845 | VHR | EVKLVESGGGLVQPGGSLRLSCVGSGYTFS | SYPIG | WVRQAPGKGLEWLA | CIYSSGS | STYYADSVKGRFTISKDNSQNTAYLQMNSLRTEDTARYYCAR |
| AF321846 | VHS | EEKLVESGGGLVQPGGSLRLSCVGSGFTFS | SYNMI | WVRQAPGKGLEWLA | YITSSGG | STYYADSVKGRFTISSDNSQNTAYLQMTRNTAYLQMTYYCAR |
| AF321847 | VHT | EEKLVESGGGLVQPGGSLRLSCVGSGITFS | SYAVS | WVRQAPGKGLESLA | SIGSGSYIG | STDYADSVKGRFTISSDDSQNTVYLQMNRNTAYLQMTYYCAR |
| AF321848 | VHU | EEKLVESGGGLVQPGGSLRLSCVGSGFTFS | SYEIS | WVRQAPGKGLEWLA | AIGCGSYSG | STYYADSVKGRFTISSDNSQNTAYLQMTRNTAYLQMTYYCAR |
| AF321849 | VHV | EEKLVESGGGLVQPGGSLRLSCVGSGFTFS | STYIN | WVRQAPGKGLEWVA | AIASSDYDG | STYYADSVKGRFTISSDNSQNTAYLQMTRNTAYLQMTYYCAR |
| AY911502 | VHX | EEKLVESGGGLVQPGGSLRLSCVGSGYTFS | SYGIG | WVRQAPGKGLEWLA | GIYSGG | STYYADSVKGRFTISKDNSQNTAYLQMNSLRTEDTARYYCAR |
| AY911504 | VHZZ | EEKLVESGGGLVQPGGSLRLSCVGSGFTFS | SYSMS | WVRQAPGKGLEWLA | CIYSSGS | STYYADSVKGRFTISRDNSQNTAYLQMNSLRTEDTARYYCAR |
| DQ886393 | VHY | EEKLVESGGGLVQPGGSLRLSCVGSGFTFS | SYEIS | WVRQAPGKGLEWLA | AISTSGG | STYYADSVKGRFTISRDNSQNTAYLQMNSLRTEDTARYYCAR |

Fig. 4. Deduced amino acid sequences for the framework (FR) and complementarity-determining regions (CDR) of germline porcine VH genes. FR and CDR segments that are shared among genes are color-coded. Sequences that are not colored differ by one or a few changes that are not shared by other sequences.
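Fig. 4 leaves uncolored those segments that differ from a shared sequence by "one or a few changes". One simple way to make that criterion explicit is a position-by-position mismatch count between two equal-length segments. The sketch below is illustrative only and assumes ungapped segments of equal length; the function names `mismatches` and `percent_identity` are hypothetical. The two FR3 strings are those listed for VHA and VHA\* in Fig. 4.

```python
def mismatches(a, b):
    """Return (1-based position, residue in a, residue in b) for each difference."""
    if len(a) != len(b):
        raise ValueError("segments must have equal length; gaps are not handled")
    return [(i + 1, x, y) for i, (x, y) in enumerate(zip(a, b)) if x != y]


def percent_identity(a, b):
    """Percentage of aligned positions carrying the same residue."""
    return 100.0 * (len(a) - len(mismatches(a, b))) / len(a)


# FR3 of VHA and VHA* as listed in Fig. 4.
FR3_VHA = "STYYADSVKGRFTISRDNSQNTAYLQMNSLRTEDTARYYCAT"
FR3_VHA_STAR = "STYYADSVKGRFTISRDDSQNTAYLQMNSLRTEDTARYYCAT"

if __name__ == "__main__":
    print(mismatches(FR3_VHA, FR3_VHA_STAR))
    print(round(percent_identity(FR3_VHA, FR3_VHA_STAR), 1))
```

A fixed cutoff on the mismatch count or on percent identity is a choice made by the analyst; the source does not state the threshold used for the color-coding.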




### **3.2 Duplication and deletion of Ig genes in the C-region sublocus**

Figure 2B illustrates that in humans, a major segment of the C-region sublocus has been duplicated, resulting in two IgE genes, two IgA genes and two pairs of IgG genes (Lefranc et al., 1982; Flanagan & Rabbitts, 1982). A Cγ pseudogene separates them. In mammals, the target of duplications in the C-region has been those genes encoding IgG (or IgA in rabbits; Fig. 2B; Table 2). The duplication process in the CH region appears to have occurred together with genomic gene conversion, producing an array of modified Cγ genes (Fig. 6). In the example provided, the IgG1 and IgG2 alleles share a common CH1 domain that is also found in the IgG4 alleles. The allelic variants of IgG1 and IgG4 differ only in their hinge exons. However, IgG1a and IgG4a have the same hinge, as do IgG1b and IgG4b. The difference between IgG1 and IgG4 is in the CH3 domain, which is not shared with any other Cγ subclass gene. Another example is IgG5a and IgG6, which share common CH1 and hinge exons. IgG5a also has the same CH2 exon as IgG6b, but the difference is in CH3. Thus Fig. 6 also shows that apparent genomic gene conversion events involve allelic variants as well as entire genes. The pattern shows that the CH1, hinge and CH2 of IgG1a and IgG4a were derived from the same ancestral Cγ gene, whereas those of IgG1b and IgG4b were derived from another ancestral gene. The reverse effects of genomic gene conversion may explain certain heterozygous C-gene deletions (Migone et al., 1984; Keyeux et al., 1989). Some swine lack certain Cγ genes (Butler et al., 2009a) and deletions in the human C-region sublocus are well documented (Lefranc et al., 1983a; Rabbani et al., 1995).

Fig. 6. Alignment of the constant region domains of the porcine Cγ genes. Regions of >95% homology are designated with the same texture. Superscripts in the gene designation denote allotypic variants. From Butler et al., 2009a.

As we have shown elsewhere, porcine IgG3 has a gene structure that is most similar to the consensus Cγ genes of other mammals and is therefore closest to the ancestral Cγ gene of all mammals (Butler et al., 2009a). IgG3 in humans, mice and swine occupies the same 5' position, immediately downstream of Cδ (Eguchi-Ogawa et al., 2010). Our studies indicate that the remainder of the porcine Cγ genes were derived from an ancestral Cγ gene that diverged early from IgG3 (Butler et al., 2009a). We hypothesize that the duplication/diversification of Cγ subclass genes in mice and humans followed the same pattern.

**4. Mammalian antibody repertoires result from somatic events**

B lymphocytes, named because they form in the **B**one marrow or the chicken **B**ursa of Fabricius, are the cells that synthesize and secrete antibodies. Their development occurs in what are called "primary lymphoid tissues". These include the bone marrow, the chicken bursa, fetal liver, yolk sac and, according to some, certain hindgut lymphoid tissues of artiodactyls. Among lower vertebrates, other tissues such as the "head kidney" (pronephros), epigonal organ and Leydig organ are involved in this process (Solem & Stenvik, 2006; Rumfelt et al., 2002; Dooley & Flajnik, 2006).

### **4.1 Somatic gene segment recombination characterizes B cell lymphogenesis**

Somatic recombination is illustrated in Figure 7A. This process is mediated by the products of the recombination-activating genes (RAGs) as well as a variety of DNA repair and ligation enzymes. In the heavy chain locus this process first involves recombination of one J region gene segment and one D region gene segment. This event also excises the intervening DNA sequence as a circular product known as a signal joint circle (Fig. 7B). Signal joint circles are diagnostic evidence that B cell lymphogenesis has recently occurred, since this nuclear product is rapidly degraded. The process then proceeds to the rearrangement of the DJ unit with a V gene segment and the generation of another signal joint circle (Fig. 7A). The selection of the J, D and V gene segments is poorly understood and will be discussed in Section 5. A similar series of events occurs among segments in the light chain loci, except that no D segments are involved.

Fig. 7. The somatic rearrangement process among the gene segments of the variable heavy chain locus of mammals. A. Sequential rearrangement of D to J, then DJ to V and finally splicing of the primary transcript for VDJ to the exons encoding the C-region of IgM. B. Generation of a signal joint circle during the excision of intervening DNA during recombination of D and J.
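The order of events in heavy chain rearrangement described in Section 4.1 (D to J first, then V to DJ, each step excising the intervening DNA as a signal joint circle) can be sketched as a toy model. This is a deliberately simplified illustration: the locus is a tuple of segment names, the chosen V, D and J are supplied by the caller because segment selection is described as poorly understood, and recombination signal sequences, junctional diversity and inversional rearrangement are ignored. The segment names and the functions `join` and `rearrange_heavy_chain` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rearrangement:
    locus: tuple    # segments remaining on the chromosome, 5' to 3'
    circles: tuple  # excised intervening segments, one tuple per joining event


def join(locus, upstream, downstream):
    """Join two segments, returning the new locus and the excised intervening DNA."""
    i, j = locus.index(upstream), locus.index(downstream)
    if i >= j:
        raise ValueError(f"{upstream} must lie 5' of {downstream}")
    excised = tuple(locus[i + 1:j])
    return locus[:i + 1] + locus[j:], excised


def rearrange_heavy_chain(germline, v, d, j):
    """Toy heavy chain rearrangement: D-to-J joining, then V-to-DJ joining."""
    locus = tuple(germline)
    locus, circle_dj = join(locus, d, j)   # step 1: D to J
    locus, circle_vdj = join(locus, v, d)  # step 2: V to DJ
    return Rearrangement(locus, (circle_dj, circle_vdj))


# Hypothetical miniature germline locus (names are placeholders, not real genes).
GERMLINE = ("V1", "V2", "V3", "D1", "D2", "D3", "J1", "J2", "Cmu")

if __name__ == "__main__":
    result = rearrange_heavy_chain(GERMLINE, v="V2", d="D2", j="J2")
    print("rearranged locus:", result.locus)
    print("excised circles: ", result.circles)
```

For a light chain locus, which has no D segments, the same `join` helper would be called once with a V and a J segment.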
