**3. The BXD family**

The BXD family was among the first RI strains to be produced [24, 42, 43]. This work was started by Benjamin A. Taylor, who crossed female C57BL/6J (B6 or B) and male DBA/2J (D2 or D) strains, hence BXD (**Figure 1A**). The first sets of BXDs were intended for mapping Mendelian loci [42, 44], but the family was also used to map complex traits such as cancer and cardiovascular disease [45–48], variation in CNS structure [49–52], and behavioral and pharmacological differences [53–62]. Twenty-seven of the original 32 BXD strains are still available from The Jackson Laboratory (JAX). In the mid-1990s, Taylor began the production of a second set of BXDs [44] and added nine new strains (BXD33–BXD42). BXD1–BXD42 carry the strain suffix "/TyJ".

We started production of another wave of BXDs at UTHSC in the late 1990s [29]. These new lines were derived from advanced intercross (AI) progeny that had accumulated chromosomal recombination events across 8 to 14 generations [63] (**Figure 1B**). These AI-derived BXDs incorporate roughly twice as many recombinations between parental genomes as do conventional F2-derived BXDs [63–67]. This improves mapping precision nearly two-fold. BXD strains BXD43 and above from UTHSC were donated to JAX once fully inbred, and carry the strain suffix "/RwwJ".

The BXD family has been used to define specific genes and even sequence variants corresponding to 20 or more QTLs. These include two tightly linked genes, *Iigp2* and *Irgb10*, for *Chlamydia* infectivity [68, 69], *Fmn2* as a master controller of tRNA synthetases in neurons [70], *Ubp1* for blood pressure [48], *Hc* for H5N1 influenza resistance [71], *Comt* as a master controller of neuropharmacological traits [72], *Alpl* for hypophosphatasia [73], *Mrps5* for longevity [74], *Bckdhb* for maple syrup urine disease, *Dhtkd1* for diabetes [75], *Hp1bp3* for cognitive aging [76], *Ahr* for locomotor activity [77], *Cacna2d1* for glaucoma [78], and *Gabra2* for behavioral traits [79]. Alleles discovered in the BXD have been successfully translated into medical applications in humans, such as stratified preclinical testing based on glaucoma risk alleles revealed in the BXDs [80, 81].

Two things now set the BXD family apart from all other recombinant inbred populations: the number of strains within the family, and the deep, coherent phenome that has been collected for them.

### **3.1 The largest mammalian recombinant inbred family**

The BXD family is the largest mammalian recombinant inbred population, having expanded during its lifetime, from ~20 [42], to ~35 [44], to ~80 [29], to a total of 198 strains with data on GeneNetwork.org. There are 123 BXD strains currently distributed by The Jackson Laboratory (JAX) and an additional seventeen strains available at UTHSC, soon to be donated to JAX [82]. All 140 of these strains are available under a standard material transfer agreement. This expanded number of easily accessible strains increases the power and precision of linkage studies [82].

As the number of strains increases, so does the number of recombination junctions within the population, and consequently quantitative trait loci (QTLs) can be narrowed down to smaller intervals. Precision is improved still further by the fact that approximately half of the BXD family are derived from advanced intercrosses, each of which carries a larger number of recombinations than its F2-derived cousins. We have demonstrated that when using approximately half of the family (60–80 strains), precision is close to 1 Mb for many traits [82]. This is also partially due to two other features of the family. The first, common to all RIs, is that the effective heritability of the trait can be boosted by resampling the same genome-type [38]. The second is that, because there are only two parents in the population, the two haplotypes are well balanced across the genome (the mean minor allele frequency is ~0.44).

#### *Recombinant Inbred Mice as Models for Experimental Precision Medicine and Biology DOI: http://dx.doi.org/10.5772/intechopen.96173*

When carrying out QTL mapping, the largest gain of power is given by increasing the number of genome-types tested [38, 73], and therefore, as the largest RI family, the BXD have the most power to detect genotype–phenotype linkage. A simple app has been produced to estimate power to detect QTL in the BXD, available at http://power.genenetwork.org [82]. When we examine power in the BXD family, we see a fact that might seem counter-intuitive to some: power is always increased more by increasing the number of strains than by increasing the number of within-strain biological replicates, even when heritability is low. Even at low-to-moderate heritabilities, increasing replicates above 6 within-strain gives very little improvement in power.
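The diminishing return from extra replicates can be illustrated with a minimal sketch. It assumes the textbook expression for the heritability of a strain mean, h²_mean = h² / (h² + (1 − h²)/n), where n is the number of within-strain replicates. This formula and the function names `strain_mean_heritability` and `marginal_gain` are assumptions made for illustration. They are not taken from the power app cited above, and the sketch computes effective heritability only, not mapping power.

```python
def strain_mean_heritability(h2: float, n_reps: int) -> float:
    """Heritability of a strain mean under the assumed textbook model.

    h2     : single-animal heritability, Vg / (Vg + Ve), with 0 < h2 <= 1
    n_reps : number of animals phenotyped per strain (>= 1)
    """
    if not 0.0 < h2 <= 1.0:
        raise ValueError("h2 must lie in (0, 1]")
    if n_reps < 1:
        raise ValueError("n_reps must be at least 1")
    # Averaging n_reps animals divides the environmental variance by n_reps.
    return h2 / (h2 + (1.0 - h2) / n_reps)


def marginal_gain(h2: float, n_reps: int) -> float:
    """Increase in strain-mean heritability from adding one more replicate."""
    return strain_mean_heritability(h2, n_reps + 1) - strain_mean_heritability(h2, n_reps)


if __name__ == "__main__":
    for h2 in (0.1, 0.3, 0.5):
        row = ", ".join(
            f"n={n}: {strain_mean_heritability(h2, n):.2f}" for n in (1, 2, 4, 6, 8, 12)
        )
        print(f"h2={h2}: {row}")
```

Under this assumed model, each added replicate contributes less than the one before it, whereas each added strain contributes a new genome-type to the mapping population.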

We should also note that the effect sizes seen in the BXD family (and other two-parent RIs) appear high. This is expected, as effect size is highly dependent upon the population being studied. Effect sizes measured in families of inbred lines are typically much higher than those measured in an otherwise matched analysis of intercrosses, heterogeneous stock, or diversity outbred stock. Two factors contribute to the higher level of explained variance of loci when using inbred panels. The first is replicability. When effect size is treated as the proportion of total genomic variance explained by the QTL, effect size will increase as environmental effects decrease due to replication. That is, resampling decreases the standard error of the mean, suppressing environmental "noise" [38]. This is in addition to the increase in heritability above (i.e. an increase in total variance explained by the total genomic variance).

The second reason is that nearly all loci in inbred panels are homozygous, so the same number of sampled animals will account for twice as much genetic variance as in an F2 cross, and four times as much variance as in a backcross [38]. When phenotyping with fully homozygous strains we are only examining the extreme ends of the distribution, providing a boost in power to detect additive effects. The downside is obvious: we cannot detect non-additive effects. However, if we add in members of the diallel cross population (DAX), we can now estimate both dominance and parent-of-origin effects. This is a topic we will come to later.
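The two-fold and four-fold figures above can be checked with a standard single-locus calculation. It is a sketch under stated assumptions, not a derivation taken from [38]: one biallelic locus with alleles *B* and *D*, purely additive gene action (no dominance), genotypic values $+a$, $0$ and $-a$ for *BB*, *BD* and *DD*, and equal allele frequencies. The symbol $a$ and the variance labels are introduced here for illustration only.

$$
\begin{aligned}
\text{RI panel } (BB, DD \text{ at } \tfrac{1}{2} \text{ each}): \quad & \sigma^2_{\mathrm{RI}} = \tfrac{1}{2}a^2 + \tfrac{1}{2}a^2 = a^2 \\
\text{F2 } (BB, BD, DD \text{ at } \tfrac{1}{4}, \tfrac{1}{2}, \tfrac{1}{4}): \quad & \sigma^2_{\mathrm{F2}} = \tfrac{1}{4}a^2 + \tfrac{1}{2}\cdot 0 + \tfrac{1}{4}a^2 = \tfrac{1}{2}a^2 \\
\text{Backcross } (BB, BD \text{ at } \tfrac{1}{2} \text{ each}): \quad & \sigma^2_{\mathrm{BC}} = \tfrac{1}{2}\left(a - \tfrac{a}{2}\right)^2 + \tfrac{1}{2}\left(0 - \tfrac{a}{2}\right)^2 = \tfrac{1}{4}a^2
\end{aligned}
$$

Under these assumptions $\sigma^2_{\mathrm{RI}} / \sigma^2_{\mathrm{F2}} = 2$ and $\sigma^2_{\mathrm{RI}} / \sigma^2_{\mathrm{BC}} = 4$, matching the ratios quoted in the text.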

### **3.2 The deepest phenome for any family**

As well as being the largest recombinant inbred family, the BXD are also the most deeply phenotyped. Over 40 years of data is now openly and publicly available at genenetwork.org, providing an unrivaled resource. This dense and well-integrated phenome consists of over 10,000 classical phenotypes [83]. The phenome extends from Taylor's 1973 analysis of cadmium toxicity through to recent quantitative studies of addiction [84–86], behavior [87–90], vision [91], infectious disease [92–94], epigenetics [95, 96], and even indirect genetic effects [97–99]. The BXDs have been used to test specific developmental and evolutionary hypotheses [49, 100, 101]. They have allowed the study of gene-by-environmental interactions, with environmental exposures including alcohol and drugs of abuse [86, 102–105], infectious agents [71, 106–109], dietary modifications [110–115], and stress [116, 117]. The consequences of interventions and treatments as a function of genome, diet, age, and sex have been quantified [90, 96, 115, 118–120], and gene pleiotropy has been identified [121].

Beyond this, there is now extensive omics data for the BXD. Both parents have been fully sequenced [75, 122, 123], and deep linked-read and long-read sequencing of 152 members of the BXD family is underway. Over 100 transcriptome datasets are available (e.g., [124, 125]), as well as more recent miRNA [84, 126], proteome [118, 120, 127], metabolome [75, 118, 125], epigenome [95, 128], and metagenome [93, 129] profiles. Nevertheless, much more is still to be done, as many of these measures have only been taken in the liver or in specific brain regions [118, 120]. However, as each of these new datasets is added, they will be fully coherent with previous datasets, multiplicatively increasing the usefulness of the whole phenome.

Access to this plethora of data is freely available from open-source web services, allowing users to download the data, or to make use of powerful statistical tools designed for global analyses that are integrated into websites (e.g. GeneNetwork.org, bxd.vital-it.ch, and Systems-Genetics.org) [125, 130, 131].

It cannot be overstated how important it is that those using the BXDs gain access to coherent genomes and quantitative phenomes generated under diverse laboratory and environmental conditions [83, 132]. New data can be compared to thousands of publicly available quantitative traits, and with each addition, the number of network connections grows quadratically, enabling powerful multi-systems analysis for all users [73, 111, 112, 118, 125, 133]. Causal pathways can be produced from genome variants, to gene expression, to metabolite levels, to phenotype [73]. Within minutes of finding a gene of interest, a researcher can look for correlations between its expression and thousands of other genes, across dozens of tissues. Enrichment analysis can then be carried out on these 'gene-friends', suggesting pathways and networks that the gene of interest may be associated with. Correlations can be found between the expression of the gene and over 10,000 phenotypes, giving suggestions of the role of the gene at the whole-organism level. Shared QTLs, where both the gene expression and a phenotype of interest are associated with the same locus, provide strong evidence of a genetic link. Using GeneNetwork.org we can build biological networks, moving from genetic variant, to expression difference, to protein expression, to whole-system outcomes, with just a few keystrokes, and without touching a lab bench [134–136]. Entire manuscripts can be written without leaving a web browser [137]. This is a massive step forward that is under-appreciated by many.

The above demonstrates how the BXD can help us achieve our goal of predictive modeling of disease risk and the efficacy of interventions [138]. Indeed, the family has already been used to test specific functional predictions of behavior based on neuroanatomical variation [139]. The BXD family is well placed to address these questions that encompass both high levels of genetic variation and gene-environmental interactions: our many-to-many-to-many problem. This is bolstered by the family's easy extendibility into a massive diallel cross population (DAX).

## **4. Diallel crosses**

The diallel cross is another simple idea that has been with us for over 60 years [140–142]. We now have the major opportunity to take full advantage of this approach using large panels of fully sequenced isogenic strains. A DAX is the set of all possible matings between several genome-types (**Figure 1D**). For the C57BL/6J and DBA/2J there are the two reciprocal F1s, and these have been used to study parent-of-origin effects and to estimate heritability (e.g. [53]). As the number of parental strains increases, the number of potential diallel crosses increases quadratically (n strains give n(n − 1) reciprocal F1s), and tools have been developed to deal with large DAXs [143]. Although we have learnt much about the genetic architecture of traits [53, 143–147], QTL mapping has been more difficult, given the relatively small number of strains used [148]. We can now imagine the full DAX for the BXD family of 140 strains: 19,460 replicable isogenic F1s, all of which have a reproducible, entirely defined genome, and any subset of which can be generated efficiently for *in vitro* and *in vivo* predictive biology and experimental precision medicine. Just as the C57BL/6J and DBA/2J are the parents of the BXDs, the BXD strains are the parents of a potentially huge isogenic DAX.
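The DAX counts quoted above follow from simple counting, sketched below. The function name `dax_counts` and its dictionary keys are hypothetical and chosen for illustration. The only assumption is that a reciprocal F1 is an ordered (dam, sire) pair of two different inbred strains.

```python
def dax_counts(n_strains: int) -> dict:
    """Count the cells of a full diallel cross among n_strains inbred parents.

    matrix_cells   : every ordered (dam, sire) pair, including the inbred diagonal
    reciprocal_f1s : ordered pairs of two different strains (isogenic F1s)
    strain_pairs   : unordered pairs, i.e. reciprocal F1s counted once per pair
    """
    if n_strains < 2:
        raise ValueError("a diallel cross needs at least two parental strains")
    matrix_cells = n_strains * n_strains
    reciprocal_f1s = n_strains * (n_strains - 1)
    return {
        "matrix_cells": matrix_cells,
        "reciprocal_f1s": reciprocal_f1s,
        "strain_pairs": reciprocal_f1s // 2,
    }


if __name__ == "__main__":
    # Two parents (B6 and D2) give the two reciprocal F1s mentioned in the text.
    print(dax_counts(2))
    # The 140 available BXD strains give 19,460 reciprocal F1s.
    print(dax_counts(140))
```

The count grows with the square of the number of parental strains, so doubling the panel roughly quadruples the number of F1 genome-types available.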

At the first level, this has important consequences for power and precision. The number of strains phenotyped can be increased massively, giving power to detect loci with even the weakest of effect sizes [148]. Precision can also be enhanced, as F1s can be produced which segregate for a narrow region of the genome, producing a small QTL interval containing fewer genes. All the data collected in these F1s can be coherently integrated into the phenome already aggregated for the BXD, meaning that every new phenotype measured adds quadratically to the phenome and that any user of this F1 has access to over 40 years of data.

At the next level up, it also allows us to detect, for example, dominance and parent-of-origin effects mentioned above. Small DAXs of mouse strains have been able to identify parent-of-origin effects, epistasis, and dominance, but have been unable to map the loci causing these effects [53, 143–146, 149, 150]. By using reciprocal crosses of inbred strains (e.g. BXD001xBXD002F1 vs. BXD002xBXD001F1), we can produce isogenic litters, the members of which are all genetically identical, and whose only differences are due to parent-of-origin effects [151] (**Figure 1C**). By building a large DAX of reciprocal crosses, the genomic loci causing these dominance, epistatic, and/or parent-of-origin effects can be identified. Mapping of these non-additive effects is a complete dark zone in fully homozygous inbred populations.

Finally, and most importantly, the DAX provides a population for the testing of predictions. Using the BXD family we have enough strains to make associations, whether gene–phenotype, environment–phenotype, or gene–environment–phenotype, with high power. However, using only the inbred BXD lines, we do not have a second population in which to test predicted associations. The BXD DAX provides a matrix of 19,600 isogenic genome-types. If only the 'diagonal' of inbred BXD strains is used to detect associations and make predictions, any of the 19,460 isogenic F1s are available to test these associations and predictions (**Figure 1D**).

We can expand the DAX even further using easily available isogenic strains. There are approximately 200 RI strains from other two-parent mouse populations, including AXB/BXA (29 strains), AKXD (20), BXH (11), BRX58N (7), CXB (19), ILSXISS (60), LGXSM (~18), NXSM (16) and SWXJ (12), plus approximately 55–75 strains from the Collaborative Cross 8-parent RI population [28]. From these inbred parents, there are over 152,100 isogenic F1s that can be produced and replicated. An additional expansion of this design is to cross RI families to genetically engineered disease models.
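The size of this expanded DAX can be tallied from the strain counts listed above. The sketch below is a rough check, not the authors' calculation. The constant and function names are hypothetical, LGXSM is entered as 18 although the text gives it as approximate, and the 140 available BXD strains are assumed to be included among the parents. Under these assumptions, the quoted figure of 152,100 equals 390 × 390, which corresponds to 58 Collaborative Cross strains and counts the inbred diagonal as well as the F1s.

```python
# Strain counts for the other two-parent RI panels, as listed in the text.
OTHER_TWO_PARENT_PANELS = {
    "AXB/BXA": 29, "AKXD": 20, "BXH": 11, "BRX58N": 7, "CXB": 19,
    "ILSXISS": 60, "LGXSM": 18, "NXSM": 16, "SWXJ": 12,
}
BXD_AVAILABLE = 140   # BXD strains available at JAX and UTHSC (from the text)
CC_RANGE = (55, 75)   # approximate Collaborative Cross strain count (from the text)


def total_parents(cc_strains: int) -> int:
    """Total inbred parents: BXD + other two-parent RI panels + CC strains."""
    return BXD_AVAILABLE + sum(OTHER_TWO_PARENT_PANELS.values()) + cc_strains


def matrix_cells(n_parents: int) -> int:
    """All ordered (dam, sire) pairs, including the inbred diagonal."""
    return n_parents * n_parents


def isogenic_f1s(n_parents: int) -> int:
    """Ordered pairs of two different parents, i.e. reciprocal F1s only."""
    return n_parents * (n_parents - 1)


if __name__ == "__main__":
    for cc in (CC_RANGE[0], 58, CC_RANGE[1]):
        n = total_parents(cc)
        print(f"CC={cc}: parents={n}, matrix={matrix_cells(n)}, F1s={isogenic_f1s(n)}")
```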
