**2.2 Cohort selection**

The analysis in this chapter used a sample of 314,101 patients with a confirmed endometriosis diagnosis in 2019, drawn from the US healthcare claims patient-level database. The patients were identified using predefined ICD-10 diagnosis codes (**Table 1**). The target cohort was restricted to female patients aged 18 and older. For the control cohort, a random sample of 3 million female patients meeting the same age criterion was selected from the database [21].
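The selection rules above can be illustrated with a minimal Python sketch. Everything in it is an assumption for illustration: the record layout (`sex`, `age`, `dx_codes`), the helper names, the toy patients, and the use of the ICD-10 category prefix `N80` in place of the actual code list in **Table 1**. It also assumes that patients with an endometriosis code are excluded from the control pool, which the text does not state explicitly.

```python
import random

# Assumption: stand-in for the code list in Table 1 (N80 is the ICD-10
# category for endometriosis; the chapter's exact codes may differ).
ENDO_CODE_PREFIXES = ("N80",)


def has_endometriosis_code(patient):
    """True if any diagnosis code starts with one of the assumed prefixes."""
    return any(code.startswith(ENDO_CODE_PREFIXES) for code in patient["dx_codes"])


def is_adult_female(patient):
    """Shared demographic criterion: female, aged 18 or older."""
    return patient["sex"] == "F" and patient["age"] >= 18


def is_target(patient):
    return is_adult_female(patient) and has_endometriosis_code(patient)


def is_control_candidate(patient):
    # Assumption: controls must not carry an endometriosis diagnosis.
    return is_adult_female(patient) and not has_endometriosis_code(patient)


# Hypothetical toy records standing in for the claims database.
patients = [
    {"id": 1, "sex": "F", "age": 34, "dx_codes": ["N80.1"]},
    {"id": 2, "sex": "F", "age": 17, "dx_codes": ["N80.0"]},
    {"id": 3, "sex": "F", "age": 52, "dx_codes": ["E11.9"]},
    {"id": 4, "sex": "M", "age": 40, "dx_codes": ["I10"]},
    {"id": 5, "sex": "F", "age": 29, "dx_codes": ["J45.909"]},
]

target_cohort = [p for p in patients if is_target(p)]
control_pool = [p for p in patients if is_control_candidate(p)]

# Random sample of the control pool (3 million in the chapter; 2 here).
rng = random.Random(42)
control_sample = rng.sample(control_pool, k=min(2, len(control_pool)))
```

In this toy data only patient 1 qualifies for the target cohort, because patient 2 is under 18, and the control pool consists of patients 3 and 5.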

#### *Endometriosis - Recent Advances, New Perspectives and Treatments*

#### **Figure 1.**

*Healthcare claims patient-level database summary.*

#### **Table 1.**

*ICD-10 diagnosis codes of endometriosis.*

#### **Table 2.**

*Comparison between target and control cohorts by age and region, respectively.*

To define a control cohort equal in size to the study target group, a propensity score matching methodology was employed [18]. The algorithm selected controls whose characteristics, or covariates, were similar to those of the target patients. Covariates included patient age and medical history [26, 27]. **Table 2** summarizes the comparison of the target and control cohort distributions by age and Census geographies. The patient age variable was created by grouping ages into ranges, while states were grouped into the US regions [21].
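As a rough illustration of how propensity score matching can produce a control cohort of the same size as the target cohort, the sketch below fits a small logistic regression and then performs greedy 1:1 nearest-neighbour matching without replacement. The chapter does not specify the propensity model or the matching algorithm, so both choices are assumptions, as are the function names, the two covariates (scaled age and a binary medical-history flag), and the synthetic data.

```python
import math
import random


def predict_proba(weights, x):
    """Propensity score: modelled probability of belonging to the target cohort."""
    z = weights[0] + sum(w * v for w, v in zip(weights[1:], x))
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(X, y, lr=0.1, epochs=500):
    """Fit logistic regression by batch gradient descent (bias is weights[0])."""
    weights = [0.0] * (len(X[0]) + 1)
    n = len(X)
    for _ in range(epochs):
        grad = [0.0] * len(weights)
        for xi, yi in zip(X, y):
            err = predict_proba(weights, xi) - yi
            grad[0] += err
            for j, v in enumerate(xi):
                grad[j + 1] += err * v
        weights = [w - lr * g / n for w, g in zip(weights, grad)]
    return weights


def match_controls(target_scores, control_scores):
    """Greedy 1:1 nearest-neighbour matching on the score, without replacement."""
    if len(control_scores) < len(target_scores):
        raise ValueError("need at least as many controls as targets")
    available = set(range(len(control_scores)))
    matches = {}
    for t_idx, t_score in enumerate(target_scores):
        best = min(available, key=lambda c: abs(control_scores[c] - t_score))
        matches[t_idx] = best
        available.remove(best)
    return matches


# Synthetic covariates (assumption): [age / 100, medical-history flag].
random.seed(0)
targets = [[random.uniform(0.18, 0.50), random.choice([0, 1])] for _ in range(20)]
controls = [[random.uniform(0.18, 0.90), random.choice([0, 1])] for _ in range(200)]

X = targets + controls
y = [1] * len(targets) + [0] * len(controls)
weights = fit_logistic(X, y)

target_scores = [predict_proba(weights, x) for x in targets]
control_scores = [predict_proba(weights, x) for x in controls]
matches = match_controls(target_scores, control_scores)
matched_controls = [controls[c] for c in matches.values()]
```

Because each control is used at most once, `matched_controls` has exactly one entry per target patient. Covariate balance between the two groups would still need to be checked after matching, which is the role a comparison such as **Table 2** plays.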
