#### **2.2 Trait definition**

Health events of interest were extracted from the herd management software backup files. The wellness traits considered were mastitis (MAST), metritis (METR), retained placenta (RETP), displaced abomasum (DA), ketosis (KETO), and lameness (LAME).

Each wellness trait was defined as a binary event, taking a value of one if the respective health event was recorded at least once during the lactation and zero otherwise. Animals were required to have a lactation record with a valid calving date and lactation number, and a calving interval ranging from 250 to 999 days [23]. Lactations of the same cow without recorded disorders, as well as lactations of all herdmates of an animal without recorded health events, were added as "healthy" records. Phenotype records were checked against the pedigree, and all animals recorded as male, as well as those with incompatible birth and calving dates, were removed. Records were also removed if an animal in her most recent lactation did not reach the opportunity period, defined as the number of days in milk (DIM) by which 90% of all cases of a particular disorder had been recorded, or if the health event was recorded after the highest DIM at which occurrence of the disorder was biologically plausible. Animals not reaching the opportunity period were removed from the analysis regardless of whether they were healthy or sick.
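The record-level edits described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the `Lactation` class, the `trait_value` function, and the example opportunity period (60 DIM) and plausibility limit (305 DIM) are hypothetical, and applying the calving-interval check only to lactations with a known next calving is an assumption.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class Lactation:
    cow_id: str
    calving_date: date
    next_calving_date: Optional[date]  # None for the most recent lactation
    last_dim: int                      # days in milk reached when data were extracted
    event_dims: List[int] = field(default_factory=list)  # DIM of each recorded event


def trait_value(lac: Lactation, opportunity_dim: int, max_plausible_dim: int) -> Optional[int]:
    """Return 1 (sick), 0 (healthy), or None if the record is excluded."""
    if lac.next_calving_date is not None:
        # Completed lactation: calving interval must lie within 250-999 days.
        interval = (lac.next_calving_date - lac.calving_date).days
        if not 250 <= interval <= 999:
            return None
    elif lac.last_dim < opportunity_dim:
        # Most recent lactation that has not reached the opportunity period:
        # excluded whether the cow is healthy or sick.
        return None
    if any(dim > max_plausible_dim for dim in lac.event_dims):
        # Event recorded later than is biologically plausible for this disorder.
        return None
    return 1 if lac.event_dims else 0


# Hypothetical example: a sick cow whose current lactation is only 40 DIM long.
recent = Lactation("cow1", date(2015, 3, 1), None, last_dim=40, event_dims=[12])
print(trait_value(recent, opportunity_dim=60, max_plausible_dim=305))  # None (excluded)
```

In practice the opportunity period and the plausibility limit would differ by trait, since they are derived from the distribution of recorded cases for each disorder.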

Contemporary groups were created by combining the herd, year, and season of calving. Each group was required to have a minimum of 20 lactation records and at least one "sick" and one "healthy" record; otherwise, the entire group was discarded.
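The grouping rule can be sketched as follows. This is only an illustration under stated assumptions: the function name `contemporary_groups` and the record layout are hypothetical, the season boundaries follow those given for the model in Section 2.3, and assigning December to the winter of its own calendar year is a guess, since the text does not say how winter is split across years.

```python
from collections import defaultdict
from datetime import date

SEASON_BY_MONTH = {
    12: "Winter", 1: "Winter", 2: "Winter",
    3: "Spring", 4: "Spring", 5: "Spring",
    6: "Summer", 7: "Summer", 8: "Summer",
    9: "Fall", 10: "Fall", 11: "Fall",
}


def contemporary_groups(records, min_size=20):
    """Group (herd, calving_date, status) records by herd-year-season.

    status is 1 for a "sick" and 0 for a "healthy" lactation record.
    Returns only the groups that pass both edits.
    """
    groups = defaultdict(list)
    for herd, calving_date, status in records:
        # Assumption: December belongs to the winter of its own calendar year.
        key = (herd, calving_date.year, SEASON_BY_MONTH[calving_date.month])
        groups[key].append(status)
    return {
        key: statuses
        for key, statuses in groups.items()
        # Keep groups with >= min_size records and at least one sick and one healthy.
        if len(statuses) >= min_size and 0 < sum(statuses) < len(statuses)
    }
```

Groups in which every record has the same status are dropped because they carry no information for separating the contemporary group effect from disease incidence.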

#### **2.3 Methodology**

Single-step genomic BLUP (ssGBLUP) was the method of choice for creating genomic predictions for wellness traits. ssGBLUP combines all available sources of information (pedigree, phenotypes, and genotypes) in a single evaluation, without the need for post-analysis processing, and it accommodates both genotyped and non-genotyped animals in a straightforward manner [25].

The data were analyzed for each trait separately, using the following threshold model [23]:

$$
\lambda = X\beta + Z_h h + Z_a a + Z_p p + e,\tag{1}
$$

where $\lambda$ is the vector of unobserved liabilities for the given disorder; $\beta$ is the vector of fixed effects of parity (parities 1, 2, 3, 4, and 5+ were considered); $h$ is the random effect of herd, year, and season of calving (HYS), with $h \sim N(0, I\sigma_h^2)$ and variance $\sigma_h^2$, where four seasons were defined within each year: Winter (Dec-Feb), Spring (Mar-May), Summer (Jun-Aug), and Fall (Sep-Nov); $a$ is the random animal effect, with $a \sim N(0, A\sigma_a^2)$, where $\sigma_a^2$ is the additive genetic variance and $A$ is the pedigree relationship matrix; $p$ is the random permanent environment effect, with $p \sim N(0, I\sigma_{pe}^2)$; and $e$ is the random residual, with $e \sim N(0, I)$. $X$, $Z_h$, $Z_a$, and $Z_p$ are incidence matrices corresponding to the fixed effect in $X\beta$ and the random effects of HYS, animal, and permanent environment, respectively.
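As a reading aid, the link between the liability in Eq. (1) and the observed binary record can be written out. This is the standard threshold (probit) formulation rather than a formula quoted from the source. The symbols $y_i$ (observed 0/1 record), $t$ (threshold), $\eta_i$ (the $i$-th element of $X\beta + Z_h h + Z_a a + Z_p p$), and $\Phi$ (standard normal distribution function) are introduced here for illustration only.

$$
y_i = \begin{cases} 1 & \text{if } \lambda_i > t \\ 0 & \text{otherwise} \end{cases}
\qquad\Longrightarrow\qquad
P(y_i = 1 \mid \eta_i) = P(\eta_i + e_i > t) = 1 - \Phi(t - \eta_i) = \Phi(\eta_i - t)
$$

Because the residual has unit variance, the liability scale is fixed, and the remaining variances are expressed relative to a residual variance of 1.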

In ssGBLUP, the inverse of the traditional pedigree relationship matrix, $A^{-1}$, is replaced by the inverse of the $H$ matrix, which is the pedigree relationship matrix augmented with genotypes [26, 27]:

$$H^{-1} = A^{-1} + \begin{bmatrix} \mathbf{0} & \mathbf{0} \\ \mathbf{0} & G^{-1} - A_{22}^{-1} \end{bmatrix} \tag{2}$$

where $G^{-1}$ is the inverse of the genomic relationship matrix and $A_{22}^{-1}$ is the inverse of the pedigree relationship matrix for genotyped animals only. The genomic relationship matrix $G$ was constructed using allele frequencies for each of the 45,245 SNP markers, as described in [28]. By using the "hybrid" relationship matrix $H$, the SNP markers are utilized to better define relationships among all animals in the analysis.
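A small NumPy sketch may help make Eq. (2) concrete. It is written under assumptions: the function names `vanraden_g` and `h_inverse` are hypothetical, and the construction of $G$ shown here (centering 0/1/2 genotypes by twice the allele frequency and scaling by $2\sum p(1-p)$) is one widely used approach that may or may not match the details in [28]. Dense inverses are used for readability; they would not be feasible at the scale of a national evaluation.

```python
import numpy as np


def vanraden_g(genotypes, freqs):
    """Genomic relationship matrix from 0/1/2 genotypes (animals x markers).

    Assumed form: G = ZZ' / (2 * sum(p * (1 - p))), with Z = M - 2p.
    """
    genotypes = np.asarray(genotypes, dtype=float)
    freqs = np.asarray(freqs, dtype=float)
    Z = genotypes - 2.0 * freqs
    return Z @ Z.T / (2.0 * np.sum(freqs * (1.0 - freqs)))


def h_inverse(A_inv, A22, G, genotyped):
    """Build H^-1 by adding (G^-1 - A22^-1) to the genotyped block of A^-1."""
    H_inv = np.array(A_inv, dtype=float, copy=True)
    block = np.ix_(genotyped, genotyped)
    H_inv[block] += np.linalg.inv(G) - np.linalg.inv(A22)
    return H_inv
```

A useful sanity check on any implementation is that the genotyped block of $H$ (the inverse of the result) equals $G$, and that $H^{-1}$ reduces to $A^{-1}$ when $G = A_{22}$.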

Prior to genetic evaluation, variance components for each trait were estimated using the same data and model, but without including genotype information. Heritability of each trait was expressed as the ratio of the additive genetic variance ($\sigma_a^2$) to the sum of all estimated variances:

$$h^2 = \frac{\sigma_a^2}{\sigma_a^2 + \sigma_{pe}^2 + \sigma_h^2 + \sigma_e^2} \tag{3}$$
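Eq. (3) is simple enough to compute by hand, but a short sketch makes the role of the residual variance explicit. The function name and the example variance components are hypothetical; the default residual variance of 1 follows from the residual distribution $N(0, I)$ assumed in the threshold model.

```python
def heritability(var_a: float, var_pe: float, var_h: float, var_e: float = 1.0) -> float:
    """Heritability on the liability scale, following Eq. (3).

    var_e defaults to 1.0 because the threshold model fixes the residual variance.
    """
    total = var_a + var_pe + var_h + var_e
    return var_a / total


# Illustrative (made-up) variance components, not estimates from the study.
h2 = heritability(var_a=0.10, var_pe=0.10, var_h=0.05)
```

Because the herd-year-season variance is included in the denominator, this ratio will be lower than one computed from the animal, permanent environment, and residual variances alone.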

#### **2.4 Software**

All analyses were performed using the BLUPF90 suite of programs created by Prof. Ignacy Misztal and his team at the University of Georgia in Athens (UGA) [29]. First, the data were formatted and renumbered using the program RENUMF90 v. 1.14. The variance components were estimated using the program THRGIBBS1F90 v. 2.116. The genetic evaluation was performed with the program CBLUP90IOD2 v. 3.21, which is appropriate for massive datasets because it uses iteration on data. To accommodate the large number of genotypes, the algorithm for proven and young animals (APY) was applied [30]. The APY algorithm generates the inverse of the genomic relationship matrix ($G^{-1}$) indirectly, using recursion based on a proportionally small subset of animals (proven or core animals). Only the genomic relationship matrix for the core animals needs to be inverted; the elements of $G^{-1}$ for all other animals (young or non-core) are then calculated by recursion, at a cost that grows linearly with their number, which significantly reduces the computational requirements [30]. Computational details of the APY algorithm are described in [31, 32]. In our analysis, the core consisted of 25,000 animals selected at random. Each trait was run in a separate process, but with the same model and $H^{-1}$ matrix.

#### *Genetic Control of Wellness in Dairy Cattle DOI: http://dx.doi.org/10.5772/intechopen.103819*

The reliabilities of estimated breeding values (EBV) were approximated using the program ACCF90GS v. 2.54, which implements an algorithm that combines the contributions of genotypes, pedigree, and phenotypes [33]. Reliability of an EBV is formally defined as the squared correlation between the true (unknown) and estimated breeding values; in practice, reliability shows how well the estimate represents the true breeding value. Higher reliabilities indicate that EBV are more accurate and less likely to change over time as new information is added. Reliability estimates depend on the amount of data available, the heritability of the trait, the connectedness among animals in the population, and the methodology used to estimate them. In our analyses, which involved a very large number of genotypes, the approximation uses the diagonal element of the $G$ matrix, $g_{ii}$, as a proxy for the contribution from genotypes (Daniela Lourenco, University of Georgia, Athens, personal communication, 2016).
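The APY recursion described in this section can be sketched in a few lines of NumPy. This is a dense toy version for illustration, not the BLUPF90 implementation: the function name `apy_inverse` is hypothetical, and the block formula follows the commonly published form of the APY inverse, in which each non-core animal is conditioned on the core animals only and the non-core block of the inverse is diagonal.

```python
import numpy as np


def apy_inverse(G, core):
    """Approximate G^-1 with the APY recursion (dense toy version).

    Only the core block of G is inverted directly. Each non-core animal
    contributes through its regression on the core animals.
    """
    G = np.asarray(G, dtype=float)
    core = np.asarray(core)
    noncore = np.setdiff1d(np.arange(G.shape[0]), core)

    Gcc_inv = np.linalg.inv(G[np.ix_(core, core)])   # the only direct inversion
    Gcn = G[np.ix_(core, noncore)]
    P = Gcc_inv @ Gcn                                # recursion coefficients (core x non-core)

    # Conditional variance of each non-core animal given the core animals:
    # m_i = g_ii - g_ic Gcc^-1 g_ci
    m = G[noncore, noncore] - np.einsum("ij,ij->j", Gcn, P)

    G_inv = np.zeros_like(G, dtype=float)
    G_inv[np.ix_(core, core)] = Gcc_inv + (P / m) @ P.T
    G_inv[np.ix_(core, noncore)] = -P / m
    G_inv[np.ix_(noncore, core)] = (-P / m).T
    G_inv[noncore, noncore] = 1.0 / m                # diagonal non-core block
    return G_inv
```

With a single non-core animal the recursion reproduces the exact inverse, which makes a convenient check. With many non-core animals the result is an approximation whose quality depends on how well the core spans the genomic information. The study used 25,000 randomly chosen core animals.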

#### **2.5 Expression of evaluation results**

The solutions for the random animal effect obtained by the CBLUP90IOD2 program represent raw estimated breeding values (EBV) on the liability scale. To make them easier to interpret, raw EBV for each trait were transformed into probabilities of exceeding the threshold. The threshold represents the estimated point of transition between the two categories of a binary trait (for wellness traits, the transition from healthy to sick). Threshold values for all traits were estimated from the data. For each animal solution, the probability that a normal variable with a mean equal to that solution and a variance of 1 exceeds the threshold was calculated [23]. These probabilities were then converted to percentages by multiplying by 100, divided by 2 to obtain predicted transmitting abilities (PTA), which are defined as half of the EBV, and expressed as differences from the average of the reference population, that is, a group of animals selected to represent relevant individuals from current commercial herds. Higher values of PTA (or genomically enhanced PTA, gPTA, if the animal was genotyped) represent a higher risk of having a disorder. For example, in a reference population with an average mastitis incidence of 20%, an animal with a PTA for mastitis of 2.5 is expected to have offspring with an estimated 22.5% chance of getting mastitis during a lactation. Animals' genetic merit for wellness traits is reported as standardized transmitting abilities (STA) [34]:
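The transformation from liability-scale EBV to PTA can be sketched with the Python standard library. The function names, the example threshold of 0.84 (roughly a 20% incidence when the EBV is zero), and the EBV values are hypothetical. The order of operations (halve the percentage, then subtract the reference-population mean) is one reading of the description above and should be treated as an assumption.

```python
from statistics import NormalDist, mean


def probability_of_disease(ebv_liability: float, threshold: float) -> float:
    """P(X > threshold) for X ~ N(ebv_liability, 1)."""
    return 1.0 - NormalDist(mu=ebv_liability, sigma=1.0).cdf(threshold)


def pta_percent(ebv_by_animal: dict, threshold: float, reference_ids: list) -> dict:
    """PTA in percentage points, relative to the reference-population mean."""
    half_pct = {
        animal: 100.0 * probability_of_disease(ebv, threshold) / 2.0
        for animal, ebv in ebv_by_animal.items()
    }
    # Base: average over the animals chosen to represent current commercial herds.
    base = mean(half_pct[animal] for animal in reference_ids)
    return {animal: value - base for animal, value in half_pct.items()}


# Hypothetical EBV on the liability scale; "a" and "b" form the reference population.
ptas = pta_percent({"a": -0.2, "b": 0.0, "c": 0.3}, threshold=0.84, reference_ids=["a", "b"])
```

By construction the PTA of the reference animals average to zero, so a positive PTA indicates a higher expected disease risk than the reference population.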

$$\text{STA} = \left(\frac{\text{gPTA} - \mu}{\sigma} \times (-5)\right) + 100 \tag{4}$$

where μ and σ represent the mean and the standard deviation of gPTA, respectively. Therefore, a value of 100 represents the average expected disease risk, with animals at 95 or 105 being one standard deviation away from the mean. Larger STA are more desirable for all wellness traits because they represent lower expected disease risk. Selecting for a higher STA is expected to result in reduced incidence of the respective disease (**Figure 1**).
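Eq. (4) translates directly into code. The sketch below uses a hypothetical function name and made-up values for the mean and standard deviation. The source does not specify here which population μ and σ are computed from, so they are passed in as parameters.

```python
def sta(gpta: float, mu: float, sigma: float) -> float:
    """Standardized transmitting ability, following Eq. (4).

    The factor -5 reverses the scale, so lower disease risk gives a higher STA.
    """
    return ((gpta - mu) / sigma) * (-5.0) + 100.0


# Illustrative values only: mean gPTA of 2.0 and standard deviation of 1.5.
average_animal = sta(2.0, mu=2.0, sigma=1.5)      # at the mean
higher_risk_animal = sta(3.5, mu=2.0, sigma=1.5)  # one SD above the mean gPTA
```

An animal whose gPTA is one standard deviation above the mean (higher risk) receives an STA of 95, and one that is one standard deviation below the mean receives 105.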
