144 Autoimmune Disorders – Current Concepts and Advances from Bedside to Mechanistic Insights

The false discovery rate (FDR) can also be calculated (Benjamini et al., 2003; Storey, 2003; Strimmer, 2008). By setting the FDR level, it is possible to limit the risk of a false positive identification of a differentially-expressed protein, i.e., at an FDR of 0.05, we expect only about 5% of the proteins reported as differentially expressed to be false positives. However, by doing so, the process of discovery may be compromised by overly stringent criteria. Although proteins displaying the most dramatic changes may appear to be useful biomarkers, it is important to attempt to rationalize their changes in relation to the pathology. For example, acute phase proteins are frequently identified in plasma or serum-based studies as 'specific' biomarkers of a wide range of chronic disorders, including arthritis and cancer, but clearly they are not specific to any one disease (Addona et al., 2009).

Cross-validation procedures can be used to reduce false positives. In the simplest case, one data set is used to build the model (called training) and a second data set generated from an independent patient cohort is used to assess the predictive accuracy of the model (called testing). Another commonly-used validation strategy is known as K-fold cross-validation, where the analysis is repeated over several random splits of a single data set. The data are divided into K subsets and K predictive models are built, each using K−1 subsets for training with the remaining subset held out for a test of predictive accuracy. Although useful initially after discovery, validation based on splitting a single data set is of limited use because confounding factors can introduce systematic biases into both training and test splits.

Given the issues noted above, it is advisable to validate initial 'discoveries' on independent sample sets, perhaps incorporating analysis by orthogonal methods which are more amenable to the requirements of clinical throughput and precision (Dupuy & Simon, 2007). Re-analysis or meta-analysis using raw data from other research groups is another possibility, although data standards, such as the 'minimum information about a proteomics experiment' (MIAPE) (Taylor et al., 2007), often do not extend into the initial design of clinical studies. Consequently, detailed clinical data may not be captured and reported consistently for clinical proteomics experiments, limiting the ability of investigators to independently verify, combine or correlate data from multiple experiments.

For thorough validation, the number of patient samples required should be determined through the use of statistical tools that take into account the imprecision of the analytical method, inter-patient variability and the acceptable threshold of difference that is deemed significant for a given biomarker application (Ye et al., 2009). Patient numbers (biological replicates) and other statistical considerations of power have also been discussed in detail elsewhere (Cairns et al., 2009).

#### **2.6 Feature selection and classifier assessment**

Several multivariate analysis tools are available for the analysis of large multidimensional data sets and some of these have been arranged into commercial software packages. Visual tools, including principal component analysis, hierarchical cluster analysis and heat maps, which display variance, relatedness and patterns in data (respectively), are also available and are useful preliminary aids in data analysis. These analyses strive to represent variance in a graphical fashion and give, for example, an overall view of protein expression prevalence within outcome groups in the case of heat maps, or the 'relatedness' of expression levels between different proteins in the case of hierarchical trees (Marengo et al. 2006; Marengo et al. 2008). Emphasis, however, should be placed on using supervised or semi-supervised methods such as distribution-free learning (kernel-based or Bayesian analysis) or support vector machines (SVM), which allow for advanced categorization and classification of multidimensional data.

**3. The biomarker development process**

Three distinct phases can be delineated within a typical development pipeline: discovery, verification and validation. These can be further subdivided so that there is a reduced number of candidates at each stage, each with an increased probability of utility. In the subsequent sections we aim to clearly segregate these phases in the biomarker 'pipeline' and further expand on the vastly different requirements of each (Figure 1). This process is prefaced by a brief overview of pre-analytical factors which can introduce unwanted bias or variation.

#### **3.1 The discovery phase**

In the discovery phase, proteomic platforms are unsupervised and are used to highlight qualitative and/or quantitative differences in multiple proteins across distinct clinical phenotypes. The process of discovery is focused on assessing many candidates, while minimizing the probability of false positives and negatives.
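One common way to limit false positives when many candidate proteins are tested at once is to control the false discovery rate. The sketch below illustrates the Benjamini-Hochberg step-up procedure, chosen here only as a familiar example; the text does not prescribe a particular procedure. The function name and the p-values are hypothetical.

```python
def benjamini_hochberg(p_values, fdr=0.05):
    """Return one boolean per p-value: True where the test is called
    significant while controlling the false discovery rate at `fdr`."""
    m = len(p_values)
    if m == 0:
        return []
    # Rank p-values from smallest to largest, remembering original positions.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Step-up rule: find the largest rank k with p_(k) <= (k / m) * fdr.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * fdr:
            cutoff_rank = rank
    # Every test ranked at or below the cutoff is called significant.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= cutoff_rank:
            rejected[idx] = True
    return rejected


# Hypothetical p-values for ten candidate proteins (illustration only).
p_values = [0.001, 0.008, 0.039, 0.041, 0.042,
            0.060, 0.074, 0.205, 0.212, 0.216]
calls = benjamini_hochberg(p_values, fdr=0.05)
naive_calls = [p <= 0.05 for p in p_values]
print(sum(naive_calls), "candidates pass an unadjusted P <= 0.05 cut-off")
print(sum(calls), "candidates pass at an FDR of 0.05")
```

With these made-up values, fewer candidates survive FDR control than survive an unadjusted P ≤ 0.05 threshold, which illustrates the trade-off between stringency and the breadth of discovery.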

Discovery by definition requires an analytical approach which does not preempt the identity of the biomarker candidates. Generally speaking, as most discovery methods prioritise the measurement of as many proteins as possible, they have inherently low throughput, are labor intensive and offer a low dynamic range. These characteristics preclude their use in later phases of the biomarker pipeline. It is also important to realize that, as yet, there is no single method available for looking at the complete complexity of the proteome within a given clinical sample. Because we are working with a relatively blunt set of tools in discovery, we need to transition to more precise methods for validation.

Validation of Protein Biomarkers to Advance the Management of Autoimmune Disorders 147

A two-step approach to the discovery phase, though widely used, is not well defined in the literature. In the initial pilot exploration of a small number of individuals, the aim is to gauge the variability of the whole proteome being measured across the cohort, select a suitable sample type, optimize the separation and quantification platform and ultimately calculate appropriate patient numbers to power a second (discovery) round with greater statistical confidence.
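As a rough illustration of how pilot-phase variability can feed into patient numbers for the second round, the sketch below uses the standard normal-approximation formula for comparing the mean level of one protein between two equally sized groups. It is a minimal sketch under stated assumptions, not the method of any cited study: the function name, the example standard deviations and the target difference are all hypothetical, and it ignores the multiple-testing burden of a real discovery experiment.

```python
import math
from statistics import NormalDist


def samples_per_group(biological_sd, technical_sd, min_difference,
                      alpha=0.05, power=0.8):
    """Approximate patients needed per group for a two-sided comparison of
    two group means, n = 2 * ((z_alpha + z_beta) * sd / difference) ** 2.

    Assumes normally distributed measurements (e.g. log-transformed
    abundances) and independent biological and technical variation.
    """
    # Combine inter-patient and analytical variation into one total SD.
    total_sd = math.hypot(biological_sd, technical_sd)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance
    z_beta = NormalDist().inv_cdf(power)           # desired power
    n = 2 * ((z_alpha + z_beta) * total_sd / min_difference) ** 2
    return math.ceil(n)


# Hypothetical pilot estimates on a log2 abundance scale.
n_default = samples_per_group(biological_sd=0.4, technical_sd=0.3,
                              min_difference=0.5)
n_strict = samples_per_group(biological_sd=0.4, technical_sd=0.3,
                             min_difference=0.5, alpha=0.001)
print(n_default, "patients per group at alpha = 0.05")
print(n_strict, "patients per group at alpha = 0.001")
```

Tightening `alpha`, as a crude stand-in for correcting over many proteins, or lowering `min_difference` raises the required cohort size, which is why the pilot estimate of variability matters.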

#### Fig. 1. Biomarker pipeline

The table describes the aim, the likely analytical platform and associated characteristics of each phase in an ideal biomarker discovery pipeline through verification to validation and final pre-market approval. The schema represents the increase in patient sample numbers and decrease in candidate protein numbers as a biomarker study moves from discovery (two-step) through to validation phases; 2DE- 2-dimensional gel electrophoresis, DIGE- difference in-gel electrophoresis, LC-MS- liquid chromatography associated with mass spectrometry, ELISA- enzyme-linked immunosorbent assay, MRM- multiple reaction monitoring mass spectrometry, IVDMIA- in vitro diagnostic multivariate index assay.

#### **3.2 Verification of protein modifications**

Protein modifications are common but are frequently overlooked, especially during the discovery phase. Amongst the most significant modifications are covalent alterations to amino acids (*e.g.,* phosphorylation, nitration or redox changes) and covalent addition of large groups (*e.g.,* glycosylation). These modifications can have dramatic effects on protein function and may play a significant role in a range of arthritides and autoimmune disorders. Because most biomarker candidate identification strategies rely on peptide surrogate-based mass spectrometry, there is added potential to characterize low abundance post-translational modification (PTM) variants. MALDI-TOF is an example of a mode of mass spectrometry that can scrutinize multiple variants of a given protein in a concurrent, swift and relatively sensitive fashion. Several criteria determine accurate structural assignment and the quantification of specific modifications via a peptide-centric approach (Duncan et al., 2010), including spectra search criteria, sequence coverage and database completeness.

Accordingly, changing levels of a modified protein may represent a better biomarker than changes in the total expression levels of a given protein. For example, alterations in the levels of naturally-occurring glycosylation motifs can serve as a marker of inflammation, lymphocyte tolerance and senescence in arthritis (Garcia et al., 2005), *viz.* increased branching of sugar moieties on alpha-1 acid glycoprotein can act as a biomarker of inflammation, whereas decreased branching on the T-cell receptor affects the development of Th1/Th2 cells, increasing susceptibility to autoimmunity (Havenaar et al., 1998; Morgan et al., 2004).

Fig. 2. Protein isoform verification

A depiction of possible qualitative and quantitative changes in protein isoforms between health and a disease state. The illustration shows an isoform of a given protein that is associated with a specific adverse outcome and can only be detected by high 'resolution' proteomic strategies capable of detecting variance in post-translational modifications. Conventional genomic and antibody-based methods will only pick up a change in expression of recognized transcripts or epitopes, giving a high likelihood of missing the significance of the isoform prevalent in a particular disease outcome.

Recent evidence suggests that oxidative modifications to the proteins S100A8 and S100A9 shift their function from macrophage and neutrophil activation in inflammatory arthritis towards a protective role (Lim et al., 2009). In this case, the modification appears to serve as a regulatory switch. Citrullination of arginine side chains has the potential to alter structure, antigenicity and protein function (Wegner et al., 2010). In fact, synthetic peptides modified to mimic possible neo-antigens which trigger an autoimmune response have been used to identify novel diagnostic/prognostic autoantibodies (McLaren et al., 2005; Papini et al., 2009).

Before disease becomes apparent, it is likely that a combination of protein isoforms 'specific' to the disease pathology has been expressed for some time, impacting normal physiological pathways. These disease 'specific' proteins may also be expressed in a benign or developing state of the disease devoid of clinical symptoms and may contain a sub-pool of surrogate markers of chronic inflammation. An example from the field of autoimmune disease is a study of systemic lupus erythematosus patients in whom autoantibodies were detected prior to clinical symptoms (Eriksson et al., 2011). Susceptibility to several other autoimmune diseases, including diabetes and rheumatoid arthritis, can be predicted by long periods of pre-clinical autoantibody expression (Bastra et al., 2001; Rantapaa-Dahlquist et al., 2003). Another recent study indicates that galactosylation of IgG precedes disease onset, correlates with disease activity, and is prevalent in autoantibodies in rheumatoid arthritis patients (Ercan et al., 2010).

Evidently these preclinical biomarker 'screening' studies are unique in that they rely heavily on concerted, prospective biobanking of samples, have generally focused on more easily retrieved antibodies and may incur long 'wait times' until a specific disorder occurs. They do, however, offer a fascinating glimpse of what could be occurring at the protein level prior to disease onset, which arguably could offer a window of opportunity to diagnose earlier, manage the pathology before it becomes clinically symptomatic and possibly prevent aberrant processes altogether. Alterations in protein isoforms may therefore also comprise part of the milieu of pathological changes and thereby serve as biomarkers. Studies aimed at full-length characterization of proteins indicate that preliminary discovery stages may not reflect the full extent of protein variants, owing to the small cohort sizes (and low-throughput techniques) typical of this stage. For example, a study of diabetes patients revealed that, within a cohort of 96 individuals, an average of 3 variants of each protein were observed; a further 8 variants were observed across 1000 individuals (Borges et al., 2010). This highlights the importance of accounting for protein micro-heterogeneity across patient populations and of correlating prevalence with specific disease outcome sub-groups (Figure 2). Statistical evidence of prevalence and analytical limits of detection of a specific group of isoforms should then direct the study towards validation of candidates in a much larger group of multi-center patient populations.

org/ index.php]), this remains a slow process. It requires antibody generation and characterization to establish specificity and utility in different assay formats.

#### **4.1 Multiple reaction monitoring**

Antibody-independent strategies are highly desirable. The most popular of these is based on peptide-centric multiple reaction monitoring (MRM). MRM is a technology that has unique potential for reliable quantification of analytes of low abundance in complex mixtures. In an MRM assay, a predefined precursor ion and one of its fragments are selected by the two mass filters of a triple quadrupole instrument and monitored over time for precise quantification. A series of transitions (precursor/fragment ion pairs) in combination with the retention time of the targeted peptide can constitute a definitive assay (Lange et al., 2008). The combination of MRM, chemistry and software to aid the selection of suitable proteotypic peptides has provided the opportunity to rapidly develop quantitative multiplexed assays of protein expression and post-translational modification that are both highly specific and sensitive (Scheiss et al., 2009). In recent years, significant advances have been made in the measurement of protein expression using MRM on triple quadrupole (QQQ) mass spectrometers (Pan et al., 2009). In this system, one or more peptide ions of unique and known mass are preselected in the first quadrupole (Q1), induced to fragment in the second quadrupole (Q2), and some of the resulting 'product ions' (or fragments) are selected for transmission to the detector in the third quadrupole (Q3) (Figure 3A). MRM supports the simultaneous measurement of multiple proteotypic peptides and synthetic mass variants of them (usually spiked into samples in known amounts). The strategy enables the absolute quantification of multiple proteins (Keshishan et al., 2007; Kuzyk et al., 2009). When MRM is combined with immunoaffinity purification and internal peptide standards, for example SISCAPA, detection is in the sub-femtomolar range (Whiteaker et al., 2010).

In a relatively early demonstration of peptide MRM, assays were developed to simultaneously quantify the expression of sixteen cytochrome P450 enzymes, proteins important in determining susceptibility to adverse drug reactions (Jenkins et al., 2006). Previously, a method was described for the MRM assay of C-reactive protein (CRP) as a means of differentiating erosive from non-erosive RA patients (Kuhn et al., 2004). The same research team then applied the same MRM technique to measure elevated levels in synovial fluid of six additional members of the S100 calcium-binding proteins associated with an erosive subtype of RA (Liao et al., 2004).

#### **4.2 Nucleic acid programmable protein arrays**

The production of antibodies against self-antigens (autoantibodies) is a characteristic feature of many autoimmune diseases. At a clinical level, tests for specific autoantibodies, such as ANA positivity, are routinely employed to aid the diagnosis and track the progress of these diseases. Traditionally, autoantibodies have been identified with a one-antigen-at-a-time, hypothesis-driven approach using methods such as immunofluorescence and ELISA.

Microarrays provide a particularly effective platform for the systematic study of thousands of proteins in parallel because they are sensitive and require low sample volumes (MacBeath & Schreiber, 2000; Zhu et al., 2001). Protein microarrays involve the display of thousands of different proteins with high spatial density on a microscopic surface. Protein microarrays have been applied to autoimmune biomarker studies focused on pre-symptomatic screening and diagnosis, clinical outcome prognosis and therapeutic response prediction (Hueber et
