**2.1 Data**

Phenotypic data were mostly collected directly from producers after obtaining their signed permission. The main sources of data were backup files from the herd management software packages DairyComp 305 (Valley Agricultural Software, Tulare, CA), PC Dart (Dairy Records Management Systems, Raleigh, NC), and DHI Plus (DHI Computing Services Inc., Provo, UT). Backup files were processed using internally written scripts to extract information on pedigree, production, reproduction, and health events. Because the terminology used to record health events varies across herds, it was standardized as described previously [12, 34]. About 300 herds from across the United States have been providing data.
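
The standardization step can be pictured with a minimal sketch. It assumes that herd-specific event codes are mapped to a common vocabulary through a lookup table and that unrecognized codes are set aside for manual review. The codes, the standardized names, and the function names below are hypothetical illustrations; the actual rules are those given in [12, 34].

```python
# Hypothetical lookup table: herd-specific codes -> standardized event names.
# The real mapping is described in the cited references and is not reproduced here.
STANDARD_TERMS = {
    "mast": "MASTITIS",
    "mastitis": "MASTITIS",
    "da": "DISPLACED_ABOMASUM",
    "lda": "DISPLACED_ABOMASUM",
    "ket": "KETOSIS",
    "ketosis": "KETOSIS",
    "lame": "LAMENESS",
}


def standardize_event(raw_code):
    """Return the standardized event name, or None if the code is unrecognized."""
    key = raw_code.strip().lower()  # ignore case and surrounding whitespace
    return STANDARD_TERMS.get(key)


def standardize_records(records):
    """Split (animal_id, raw_code) pairs into mapped and unmapped lists."""
    mapped, unmapped = [], []
    for animal_id, raw_code in records:
        term = standardize_event(raw_code)
        if term is None:
            # Keep the original code so it can be reviewed by hand.
            unmapped.append((animal_id, raw_code))
        else:
            mapped.append((animal_id, term))
    return mapped, unmapped
```

Keeping unmapped codes in a separate list, rather than dropping them, makes it possible to extend the table as new herds with different recording habits are added.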

The majority of genotypes used in genomic evaluation were obtained in the Zoetis genotyping laboratory. Samples (hair, blood, ear tissue, or, for males, semen) from animals in commercial herds that were submitted to Zoetis for genomic testing were analyzed. After DNA extraction, genotyping was performed using Illumina BeadArray SNP chips, with the number of SNP markers ranging from about 3,000 to over 80,000. Raw genotypes were edited following the criteria described previously [22, 23]. All animals genotyped with lower-density chips (<40,000 markers) were imputed using the program FImpute [24] to the set of 45,245 markers used in genomic evaluation, which were selected based on their call rates and minor allele frequencies.
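
As a rough illustration of the quantities involved, the sketch below computes the call rate and minor allele frequency of a marker and flags chips that fall below the 40,000-marker density stated above. It assumes genotypes coded as 0, 1, or 2 copies of one allele, with `None` for a missing call. The function names and the default thresholds in `select_markers` are placeholders, not the editing criteria of [22, 23].

```python
IMPUTATION_CUTOFF = 40_000  # from the text: chips below this density are imputed


def needs_imputation(n_markers_on_chip):
    """True if the chip is a lower-density chip (<40,000 markers)."""
    return n_markers_on_chip < IMPUTATION_CUTOFF


def call_rate(genotypes):
    """Fraction of animals with a non-missing call at this marker."""
    if not genotypes:
        return 0.0
    called = [g for g in genotypes if g is not None]
    return len(called) / len(genotypes)


def minor_allele_frequency(genotypes):
    """Frequency of the less common allele among called genotypes."""
    called = [g for g in genotypes if g is not None]
    if not called:
        return 0.0
    p = sum(called) / (2 * len(called))  # frequency of the counted allele
    return min(p, 1 - p)


def select_markers(marker_genotypes, min_call_rate=0.90, min_maf=0.01):
    """Keep marker names passing both thresholds (placeholder values)."""
    return [
        name
        for name, genotypes in marker_genotypes.items()
        if call_rate(genotypes) >= min_call_rate
        and minor_allele_frequency(genotypes) >= min_maf
    ]
```

The imputation itself is performed by FImpute and is not reproduced here; the sketch covers only the bookkeeping around it.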
