**4. Mammalian antibody repertoires result from somatic events**

### **4.1 Somatic gene segment recombination characterizes B cell lymphogenesis**

B lymphocytes, so named because they form in the **B**one marrow or the chicken **B**ursa of Fabricius, are the cells that synthesize and secrete antibodies. Their development (B cell lymphogenesis) occurs in what are called "primary lymphoid tissues". These include the bone marrow, the chicken bursa, fetal liver, yolk sac and, according to some, certain hindgut lymphoid tissues of artiodactyls. Among lower vertebrates, other tissues such as the "head kidney" (pronephros), epigonal organ and Leydig organ are involved in this process (Solem & Stenvik, 2006; Rumfelt et al., 2002; Dooley & Flajnik, 2006).

Somatic recombination is illustrated in Figure 7A. This process is mediated by the products of the Recombination Activating Genes (RAGs) together with a variety of DNA repair and ligation enzymes. In the heavy chain locus, the process begins with recombination of one D region gene segment with one J region gene segment. This event also excises the intervening DNA as a circular product known as a signal joint circle (Fig. 7B). Signal joint circles are diagnostic evidence that B cell lymphogenesis has recently occurred, since this nuclear product is rapidly degraded. The DJ unit is then rearranged to a V gene segment, generating another signal joint circle (Fig. 7A). How the J, D and V gene segments are selected is poorly understood and is discussed in Section 5. A similar series of events occurs among segments in the light chain loci, except that no D segments are involved.
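The order of events described above (D to J first, then V to DJ, each step excising the intervening DNA) can be sketched in a few lines of Python. This is a toy model only: the segment names and the `recombine` helper are hypothetical, the locus is reduced to a list of labels, and recombination signal sequences, junctional nucleotide additions and inversional rearrangements are all ignored.

```python
def recombine(locus, upstream, downstream):
    """Join `upstream` to `downstream`, excising everything between them.

    Returns the rearranged locus and the excised segments, which stand in
    for the DNA carried away on a signal joint circle.
    """
    i, j = locus.index(upstream), locus.index(downstream)
    if i >= j:
        raise ValueError("upstream segment must lie 5' of the downstream segment")
    excised = locus[i + 1:j]
    return locus[:i + 1] + locus[j:], excised


# Hypothetical miniature heavy chain locus, listed 5' to 3'.
germline = ["V1", "V2", "V3", "D1", "D2", "J1", "J2"]

# Step 1: D to J rearrangement.
dj_locus, circle_1 = recombine(germline, "D1", "J2")

# Step 2: V to DJ rearrangement.
vdj_locus, circle_2 = recombine(dj_locus, "V2", "D1")

print("After D-J:  ", dj_locus, "excised:", circle_1)
print("After V-DJ: ", vdj_locus, "excised:", circle_2)
```

In this sketch each call yields one excised product, mirroring the two signal joint circles generated during heavy chain rearrangement; a light chain locus would need only a single V to J call.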

Fig. 7. The somatic rearrangement process among the gene segments of the variable heavy chain locus of mammals. A. Sequential rearrangement of D to J, then DJ to V and finally splicing of the primary transcript for VDJ to the exons encoding the C-region of IgM. B. Generation of a signal joint circle during the excision of intervening DNA during recombination of D and J.

Immunoglobulin Polygeny: An Evolutionary Perspective 123


The VDJ (heavy chain) and VJ (light chain) rearrangements are then transcribed, and the primary transcript is spliced to a set of C-region exons in the heavy and light chain loci, respectively. For the heavy chain this is initially IgM in all higher vertebrates (Fig. 7A). The resulting VDJ-C and VJ-C transcripts are then translated into the heavy and light polypeptide chains that combine to form the complete antibody molecule (Fig. 1B). As shown in Fig. 1A and discussed above, the antigen-binding site is located in the peptide loops of the VH and VL domains that coalesce at the end of these V-domains and that contain the CDRs: three from the VDJ and three from the light chain VJ rearrangement. CDR1 and CDR2 of the VH and VL are encoded within the germline genes, whereas the CDR3 region results from the joining of V-D-J segments (heavy chain) or V-J segments (light chains; Fig. 8). The CDR3 region of the heavy chain, hereafter designated HCDR3, is considered most important to the specificity of the binding site (Amit et al., 1986; Padlan, 1996; Xu & Davis, 2000; Mageed et al., 2001). In fact, the same set of V-genes encoding CDR1 and CDR2 can, both in theory and in practice, contribute to the binding site of antibodies of quite different specificity in the context of different HCDR3 regions (Thomson et al., 2011; Ichiyoshi & Casali, 1994), as might be envisioned from a comparison of the CDR3 sequences shown in Fig. 8.
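As a rough illustration of how different HCDR3 regions arise from the same segments, the sketch below extracts the nucleotides lying between the FR3 cysteine codon (TGT) and the invariant FR4 tryptophan codon (TGG) from three junction sequences transcribed from Fig. 8. The helper name `hcdr3`, the choice of `TGGGGCCCA` as an anchor motif for the tryptophan codon, and the convention of delimiting HCDR3 by these two codons are assumptions made for this example, not a definition taken from the text.

```python
CYS_CODON = "TGT"        # near-invariant Cys codon at the end of FR3
TRP_MOTIF = "TGGGGCCCA"  # Trp codon (TGG) plus the six JH nucleotides after it in Fig. 8


def hcdr3(junction):
    """Return the nucleotides between the FR3 Cys codon and the FR4 Trp codon.

    Spaces (segment gaps in the figure) are ignored.
    """
    seq = junction.replace(" ", "").upper()
    if not seq.startswith(CYS_CODON):
        raise ValueError("junction must start with the FR3 Cys codon")
    trp_at = seq.rfind(TRP_MOTIF)
    if trp_at < 0:
        raise ValueError("FR4 Trp motif not found")
    return seq[len(CYS_CODON):trp_at]


# Three rows transcribed from Fig. 8 (spacing as it appears in the figure).
fig8_rows = [
    "TGTGCAA TT GCTATAGCTATGGTGCTAGTTG T TATGGATCTCTGGGGCCCA",
    "TGTGC CCAG GCTATAGCTATGGTGCTAGT CCAGGATG TGGATCTCTGGGGCCCA",
    "TGTGCAAG GTCC AATTGCTATAGC TCCGGTGGTGAGTGCTATGGTTACCCTTGGGGTTATGTTGCTG TGGATCTCTGGGGCCCA",
]

cdr3s = [hcdr3(row) for row in fig8_rows]
for cdr3 in cdr3s:
    print(len(cdr3), cdr3)
```

Although all three rows derive from the same VH, DH and JH segments, the extracted regions differ in both length and sequence, which is the point the figure makes.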


FR3 | 5' N-add | DHA | 3' N-add | JH FR4
TGNGCNAGN GAATTGCTATAGCTATGGTGCTAGTTGCTATATGATGAC ATTACTATGTTATGGATCTCTGGGGCCCA
TGTGCAAGT TGCTATAGCTATGGTGCTAGTTGC TTTTGGACAAGATCA TACTATGCTATGGATCTCTGGGGCCCA
TGTGCAAGA GGCTGTTTTC GCTATAGCTATGGTGCTAGTTGCTATGATGTCG ACTATGCTATGGATCTCTGGGGCCCA
TGTGCAA CAGGCGAT TGCTATAGCTA GGTGCTAGTTGCACCGGGATG GCTATGGATCTCTGGGGCCCA
TGTGCAA TT GCTATAGCTATGGTGCTAGTTG T TATGGATCTCTGGGGCCCA
TGTGCAA CAGAG TGCTATAGCTATGGTGCTAGTTGCTATATGTATGC TATGGATCTCTGGGGCCCA
TGTGCAAGA G ATAGCTATGGTGCTAGTT ACCCCTC TATGGATCTCTGGGGCCCA
TGTGC CCAG GCTATAGCTATGGTGCTAGT CCAGGATG TGGATCTCTGGGGCCCA
TGTGCAA CAGGCATAGCTATGGTGCTAGTTGCTAT GAAGA TGGATCTCTGGGGCCCA
TGTGCAAG GTCC AATTGCTATAGC TCCGGTGGTGAGTGCTATGGTTACCCTTGGGGTTATGTTGCTG TGGATCTCTGGGGCCCA
TGTGCAA TT GCTATAGCTATGGTGCTAGTT AGATC GGATCTCTGGGGCCCA

Fig. 8. The diversity of HCDR3 sequences resulting from the recombination of the same VH, DH and JH segments. Remnants of the DH germline segments are underlined. The 5' and 3' nucleotide additions are indicated. TGG is the codon for the invariant tryptophan found in the JH gene segments of all mammals while TGT is the codon for C that is nearly invariant in the FR3 of all VH3 family genes.

### **4.2 Maturation of the antibody repertoire involves class switch and somatic hypermutation**

All immunologists, immunopathologists and physicians in specialties such as rheumatology know that most Igs are IgG (serum) or IgA (secretions). This means that the rearrangements involved in B cell lymphogenesis, which initially favor the expression of IgM (Fig. 7A), later switch to these isotypes. After environmental exposure, the concentration of the major Igs in serum is elevated 100-300 fold compared to newborn piglets or those reared in germfree isolators (Fig. 9A). The transition from newborn to conventionally-reared young adults favors IgG in serum (Fig. 9A) and IgA in secretions (Butler et al., 2011a). This change involves class switch recombination (CSR), which is mediated by activation-induced cytidine deaminase (AID), a member of the APOBEC family. CSR is a DNA-level recombination that places the rearranged VDJ upstream of the exons encoding IgG or IgA, so that the VDJ is expressed with these C-regions rather than with that of IgM. This maturation process typically occurs in tandem with somatic hypermutation (SHM) of the rearranged VDJs or VJs prior to their transcription. SHM is another mechanism for repertoire diversification and is triggered when the neonate encounters environmental antigen (Fig. 9B). Since CSR and SHM occur simultaneously, it is not surprising that both are mediated by AID and that AID is also correlated with SGC (Withers et al., 2005; Arakawa et al., 1996). These events occur in germinal centers (GCs) of secondary lymphoid tissues after exposure to environmental antigen. GCs are found only in mammals and birds (Yasuda et al., 2003; Vigliano et al., 2006; Du Pasquier et al., 2000). Although *Xenopus* lacks GCs, SHM and CSR do occur in this species (Marr et al., 2007), albeit perhaps less efficiently. However, for these events to occur, the naïve immune system must first or simultaneously be exposed to Pathogen Associated Molecular Patterns (PAMPs) that are recognized by a variety of innate immune system receptors. This dependence was demonstrated using the isolator piglet model (Butler et al., 2002; 2005; 2009b; Butler & Sinkora, 2007).

SHM is not random across the entire rearranged VDJ-C transcript. Rather, it is largely concentrated in the CDR regions of the rearranged VDJ or VJ segments (Fig. 9C). This is generally believed to result from selection of B centrocytes in GCs rather than from specific targeting. Although Fig. 9C shows the accumulation of somatic mutations only in CDR1 and CDR2, the same occurs in CDR3. As discussed previously, the CDRs are those segments of the encoded protein that coalesce to form the antibody binding site (Fig. 1A; Fig. 8). There is little evidence to suggest that SHM proceeds downstream of the segment of transcript that begins with the codon for the invariant tryptophan in FR4 (Fig. 8), or into sequences further downstream in the C-sublocus.
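The mutations-per-kilobase measure plotted in Fig. 9C can be sketched as follows: count the positions at which an observed sequence differs from its germline counterpart within each region, then normalize by region length. The function name, the toy sequences and the region coordinates below are all hypothetical; real FR and CDR boundaries depend on the numbering scheme used and on a proper alignment, neither of which is modeled here.

```python
def mutations_per_kb(germline, observed, regions):
    """Return {region: mutations per 1000 nt} for pre-aligned, equal-length sequences.

    `regions` maps a region name to a (start, end) pair of 0-based,
    end-exclusive coordinates.
    """
    if len(germline) != len(observed):
        raise ValueError("sequences must be aligned to the same length")
    rates = {}
    for name, (start, end) in regions.items():
        diffs = sum(g != o for g, o in zip(germline[start:end], observed[start:end]))
        rates[name] = 1000 * diffs / (end - start)
    return rates


# Toy 30-nt "gene" with three 10-nt regions (hypothetical boundaries).
toy_germline = "A" * 10 + "C" * 10 + "G" * 10
toy_regions = {"FR1": (0, 10), "CDR1": (10, 20), "FR2": (20, 30)}

# Introduce two substitutions, both inside the toy CDR1.
toy_observed = toy_germline[:12] + "T" + toy_germline[13:17] + "A" + toy_germline[18:]

toy_rates = mutations_per_kb(toy_germline, toy_observed, toy_regions)
print(toy_rates)
```

Normalizing by length matters because FR and CDR segments differ considerably in size; raw mutation counts would otherwise understate how strongly mutations concentrate in the short CDRs.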

### **4.3 The association of VH genes and VH-VL pairing in generation of specific antibodies**

Many of the early studies on antibody specificity that appeared when VH or VL polygeny became known attempted to correlate particular responses with the use of certain VH or VL genes. We do not review that literature here but do provide a few examples. Cerato et al. (1997) studied hybridomas and found no correlation between VH usage and specificity, while Mo and Holmdahl (1996) showed that mAbs to different epitopes used the same VH/Vk combinations. Boffey et al. (2004) showed that only 6/15 anti-LPS mAbs used the same VH gene (VH7183.3b). These observations should not be surprising considering the importance of HCDR3 in the specificity of antibodies (see Section 4.3; 6.2). Lavoie et al. (1997) showed that nearly all mAbs to HEL use VH36-60 but differ in affinity because of SHM or HCDR3 differences. The antibody binding site involves CDRs (including CDR3) of both H and L chains (Fig. 1B); this has been shown by separation and reassociation experiments. These experiments show that binding site specificity depends on both H and L chains, even for antibodies specific for the same hapten, since heterologous light chains seldom restore the full binding site (Kranz & Voss, 1981). This mutual dependence is also demonstrated by the non-random pairing found in antibodies of certain specificities, such as those to the capsular polysaccharides of *S. pneumoniae* (Thomson et al., 2011). Further evidence for the effect of H-L pairing comes from studies of autoantibodies in a phenomenon called "receptor editing" (de Wildt et al., 1999). This *in vivo* phenomenon involves reactivation of recombinase activity in lymph nodes, resulting in the replacement of the light chain with a new one. In this way, B cells expressing autoreactive BCRs acquire a new light chain, which alters their specificity, removes or diminishes their autoreactivity and thereby spares them from apoptotic elimination (Tiegs et al., 1993; Gay et al., 1993).
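One simple way to make "non-random pairing" concrete is to compare how often each VH/VL combination is observed with how often it would be expected if heavy and light chains were chosen independently. The sketch below does this for made-up data; the gene labels, the counts and the `pairing_enrichment` helper are hypothetical and are not drawn from any of the studies cited above.

```python
from collections import Counter


def pairing_enrichment(pairs):
    """Return {(vh, vl): observed / expected} for a list of (vh, vl) tuples.

    The expected count assumes VH and VL are chosen independently:
    expected = count(vh) * count(vl) / total.
    """
    total = len(pairs)
    vh_counts = Counter(vh for vh, _ in pairs)
    vl_counts = Counter(vl for _, vl in pairs)
    pair_counts = Counter(pairs)
    return {
        (vh, vl): observed / (vh_counts[vh] * vl_counts[vl] / total)
        for (vh, vl), observed in pair_counts.items()
    }


# Hypothetical panel of mAbs in which each VH is always found with one VL.
restricted_panel = [("VH_a", "VL_x")] * 4 + [("VH_b", "VL_y")] * 4

# Hypothetical panel in which every combination occurs equally often.
random_panel = [("VH_a", "VL_x"), ("VH_a", "VL_y"), ("VH_b", "VL_x"), ("VH_b", "VL_y")]

print(pairing_enrichment(restricted_panel))
print(pairing_enrichment(random_panel))
```

Ratios near 1 are what independent pairing would give, while ratios well above 1 indicate preferred combinations. With real data, small sample sizes would call for a formal statistical test rather than a bare ratio.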

While L-H pairing is important for binding site specificity, there are situations in which light chains are not needed to form an antibody binding site. The best known examples are the naturally occurring single chain antibodies of the camelid group and some sharks (Hamers-Casterman et al., 1993; De Genst et al., 2006; Dooley et al., 2003; Diaz et al., 2002; Nguyen et al., 2002). Based on the convenience of producing single chain antibodies from these species for therapy, and on the evidence that the HCDR3 domain plays the major role in forming the antibody binding site (see Section 6.2), there have also been various attempts to develop synthetic single chain antibodies or "camelized" antibodies (Janssens et al., 2006; Reiter et al., 1999; Davies & Riechmann, 1995). Among the camelids, single chain antibodies use a separate set of VH genes (called VHH) that encode a much larger portion of the binding site than the conventional VH genes, which compensates for the lack of a light chain. This topic has been recently reviewed (Muyldermans et al., 2009). We mention these single chain antibodies here because we believe they further support the role played by HCDR3 in binding antigen and diminish the value of polygeny of conventional VH genes (Section 6.2). They also show that the extensive and universal VH and VL polygeny among mammals (Table 1) is unnecessary.

Fig. 9. The effect of antigen exposure on: A. Serum Ig levels (IgM, IgG and IgA; µg/ml); B. Frequency of SHM (mutations/kilobase) and C. Accumulation of somatic mutations (mutations/kilobase) in various segments (FR1, CDR1, FR2, CDR2, FR3) of the VH genes. Fetal, germfree (GF) and PIC piglets are compared. Germfree piglets are reared in isolators for 5 weeks and their only contact with potentially foreign antigen is food protein. PIC = conventionally-reared young pigs that are heavily antigenized through colonization and also infected with nematodes. The horizontal line (9B) in the scattergram is the mean frequency of SHM. SHM is significantly greater in PIC piglets than in fetal and germfree piglets. In 9C SHM accumulates in CDR regions as opposed to FR regions that encode the β-pleated "staves" of the β-barrel (Fig. 1A).

**5. Patterns of V, D and J gene segment usage**

### **5.1 VH usage is biased to favor certain VH genes**

Table 1 shows that higher vertebrates have many duplicated V-region gene segments available for use in the formation of their antibody repertoire using the recombinatorial process illustrated in Figure 7. Humans have available ~100 VH segments, ~30 DH segments and 9 JH segments (Fig. 2A). By contrast, swine have <30 VH genes belonging to a single family (Fig. 4), only two functional DH segments and, like the chicken (Fig. 3), one functional JH segment (Sun et al., 1994; Butler et al., 1996; Eguchi-Ogawa et al., 2010). While the ancestral VH3 family (Schroeder et al., 1990) dominates the V-region loci of many species, the ~100 VH genes of mice and humans belong to 14 and 7 different families, respectively (Table 1).

Usage of VH genes in rabbit is biased to the most 3' VH gene, which accounts for 90% of VH usage in the pre-immune repertoire although there are >100 VH genes in the rabbit repertoire (Currier et al., 1988; Table 2). In humans there is bias for V3-23, V3-30, V3-33 and V4-34 (Glas et al., 2000). While some suggest that VH usage is random in mice (Dildrop et al., 1985), studies on J558 usage (one-half of the mouse VH repertoire) indicate that usage is unequal and rather scattered across the entire J558 family even in the pre-immune repertoire (Gu et al., 1991) and that usage is not affected by SHM or CSR. Foster et al. (1997) showed that while most Vk genes were used, usage was non-random, and the same was true for J. Sheehan et al. (1993) showed that fetal VH usage can differ from 0.1 to 1.0 but that most 5' VH genes are underrepresented. In swine, VHA (IGHV4) and its near duplicate (IGHV10; see Figs. 4 & 10) account for one-third to one-half of the pre-immune repertoire (Butler et al., 2006; Eguchi-Ogawa et al., 2010; Butler et al., 2011b). Interestingly, the majority of these preferred genes in all these species belong to the ancestral VH3 family (Schroeder et al., 1990; Brezinschek et al., 1997).

### **5.2 Variable region gene segment usage is not position dependent**

Early studies suggested that VH usage was biased during early stages of B cell lymphogenesis to favor the most JH proximal DH segments and the most 3' VH genes (Schroeder et al., 1987; Yancopoulos et al., 1984) but that this pattern became "normalized" in adults (Malynn et al., 1987). This concept gained support when it was found that young rabbits use their 3' most VH gene >90% of the time and then further diversified their repertoire using upstream VH genes and SGC, perhaps a type of "developmental normalization" (Knight, 1992; Becker & Knight, 1990). However, neither additional studies in humans (Matsuda et al., 1993) nor our studies in swine (Eguchi-Ogawa et al., 2010; Fig. 10) substantiated the positional "3' bias". The most 3' functional VH in swine (IGHV2) is almost never used while upstream VH15 (IGHV15) can account for ~13% of VH usage (Fig. 10). Thus, the "position hypothesis" to explain VH usage has not been universally fulfilled.

### **5.3 VH gene usage remains constant in fetal and young pigs**

Vertical studies on VH usage, especially in humans and mice, are difficult because: (a) the V-D-J repertoire of these species is complex (Table 2) and could require up to 56 primer sets to recover all VDJ rearrangements in mouse and 42 sets for human; (b) maternal regulatory factors transmitted *in utero* or via colostrum/milk can influence pre-natal and postnatal development (Wikler et al., 1980; Rodkey & Adler, 1983; Klobasa et al., 1981; Wang & Shlomchik, 1998; Yamaguchi et al., 1983) and (c) control of environmental and maternal

