148 Autoimmune Disorders – Current Concepts and Advances from Bedside to Mechanistic Insights

Recent evidence suggests that oxidative modifications to the proteins S100A8 and S100A9 shift their function from macrophage and neutrophil activation in inflammatory arthritis towards a protective role (Lim et al., 2009). In this case, the modification appears to serve as a regulatory switch. Citrullination of arginine side chains has the potential to alter structure, antigenicity and protein function (Wegner et al., 2010). In fact, synthetic peptides modified to mimic possible neo-antigens which trigger an autoimmune response have been used to identify novel diagnostic/prognostic autoantibodies (McLaren et al., 2005; Papini et al., 2009).

Before disease becomes apparent, it is likely that a particular disease pathology 'specific' protein isoform combination has been expressed for some time, impacting normal physiological pathways. These disease 'specific' proteins may also be expressed in a benign or developing state of the disease devoid of clinical symptoms and may contain a sub-pool of surrogate markers of chronic inflammation. An example from the world of autoimmune disease is presented by a study of systemic lupus erythematosus patients in whom autoantibodies were detected prior to clinical symptoms (Eriksson et al., 2011). Susceptibility to several other autoimmune diseases, including diabetes and rheumatoid arthritis, can be predicted by long periods of pre-clinical autoantibody expression (Bastra et al., 2001; Rantapaa-Dahlquist et al., 2003). Another recent study indicates that galactosylation of IgG precedes disease onset, correlates with disease activity, and is prevalent in autoantibodies in rheumatoid arthritis patients (Ercan et al., 2010). Evidently these preclinical biomarker 'screening' studies are unique in that they rely heavily on concerted, prospective biobanking of samples, have generally focused on more easily retrieved antibodies and may incur long 'wait times' until a specific disorder occurs. They do, however, offer a fascinating glimpse of what could be occurring at the protein level prior to disease onset, which arguably could offer a window of opportunity to diagnose earlier, manage the pathology before it becomes clinically symptomatic and possibly prevent aberrant processes altogether.

Alterations in protein isoforms may therefore also comprise part of the milieu of pathological changes and thereby serve as biomarkers. Studies aimed at full-length characterization of proteins indicate that preliminary discovery stages may not reflect the full extent of protein variants, owing to the small cohort sizes (and low-throughput techniques) typical of this stage. For example, a study of diabetes patients revealed that, within a cohort of 96 individuals, an average of 3 variants of each protein were observed; a further 8 variants were observed across 1000 individuals (Borges et al., 2010). This highlights the importance of accounting for protein micro-heterogeneity across patient populations and of correlating prevalence with specific disease outcome sub-groups (Figure 2). Statistical evidence of prevalence and analytical limits of detection of a specific group of isoforms should then direct the study towards validation of candidates in a much larger group of multi-center patient populations.

**4. Emerging tools for targeted biomarker validation**

The biggest challenge in proteomics remains independent validation of changes 'discovered' in observational investigations. Traditionally, validation has been undertaken by antibody-based approaches, including Western blotting, ELISA and immunohistochemistry (IHC). However, despite major efforts to generate proteome-scale panels of suitable antibodies (most notably the impressive Human Protein Atlas initiative [http://www.proteinatlas.org/index.php]), this remains a slow process. It requires antibody generation and characterization to establish specificity and utility in different assay formats.

#### **4.1 Multiple reaction monitoring**

Antibody-independent strategies are highly desirable. The most popular of these is based on peptide-centric multiple reaction monitoring (MRM), a technology with unique potential for reliable quantification of low-abundance analytes in complex mixtures. In an MRM assay, a predefined precursor ion and one of its fragments are selected by the two mass filters of a triple quadrupole instrument and monitored over time for precise quantification. A series of transitions (precursor/fragment ion pairs) in combination with the retention time of the targeted peptide can constitute a definitive assay (Lange et al., 2008). The combination of MRM, chemistry and software that aids the selection of suitable proteotypic peptides has provided the opportunity to rapidly develop quantitative multiplexed assays of protein expression and post-translational modification that are both highly specific and sensitive (Scheiss et al., 2009). In recent years, significant advances have been made in the measurement of protein expression using MRM on triple quadrupole (QQQ) mass spectrometers (Pan et al., 2009). In this system, one or more peptide ions of unique and known mass are preselected in the first quadrupole (Q1), induced to fragment in the second quadrupole (Q2), and some of the resulting 'product ions' (or fragments) are selected for transmission to the detector in the third quadrupole (Q3) (Figure 3A). MRM supports the simultaneous measurement of multiple proteotypic peptides and their synthetic mass variants (usually spiked into samples in known amounts), a strategy that enables the absolute quantification of multiple proteins (Keshishan et al., 2007; Kuzyk et al., 2009). When MRM is combined with immunoaffinity purification and internal peptide standards, for example SISCAPA, detection is in the subfemtomolar range (Whiteaker et al., 2010).
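The quantification against spiked synthetic mass variants described above reduces to a ratio calculation. The following is a minimal sketch rather than the procedure of any cited study: it assumes that each transition yields an integrated peak area for the endogenous ('light') peptide and for the spiked ('heavy') standard, that the response is linear, and that both forms ionize equally. The names `Transition` and `estimate_amount_fmol`, and all numerical values, are hypothetical.

```python
from dataclasses import dataclass
from statistics import median


@dataclass(frozen=True)
class Transition:
    """One precursor/fragment ion pair monitored for a peptide (hypothetical structure)."""
    precursor_mz: float  # light-form precursor m/z selected in Q1
    fragment_mz: float   # light-form fragment m/z selected in Q3
    light_area: float    # integrated peak area of the endogenous peptide
    heavy_area: float    # integrated peak area of the spiked mass variant


def estimate_amount_fmol(transitions, spiked_fmol):
    """Estimate the endogenous peptide amount from light/heavy area ratios.

    Assumption: linear response and equal ionization of both forms.
    The median ratio limits the influence of a single interfered transition.
    """
    ratios = [t.light_area / t.heavy_area for t in transitions if t.heavy_area > 0]
    if not ratios:
        raise ValueError("no transition with a usable heavy-standard signal")
    return median(ratios) * spiked_fmol


# Illustrative values only; they are not taken from any study.
example = [
    Transition(523.3, 774.4, light_area=2.0e5, heavy_area=4.0e5),
    Transition(523.3, 661.3, light_area=1.5e5, heavy_area=3.0e5),
    Transition(523.3, 548.2, light_area=9.0e4, heavy_area=1.5e5),
]
amount = estimate_amount_fmol(example, spiked_fmol=50.0)
```

In practice the agreement of ratios across transitions, together with co-elution of light and heavy forms at the expected retention time, would also be checked before a value is reported; the sketch omits those checks.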

In a relatively early demonstration of peptide MRM, assays were developed to simultaneously quantify the expression of sixteen cytochrome P450 enzymes, proteins important in determining susceptibility to adverse drug reactions (Jenkins et al., 2006). Previously, a method was described for the MRM assay of C-reactive protein (CRP) as a means of differentiating erosive from non-erosive RA patients (Kuhn et al., 2004). The same research team then applied the MRM technique to measure elevated synovial fluid levels of six additional members of the S100 calcium-binding protein family associated with an erosive subtype of RA (Liao et al., 2004).

#### **4.2 Nucleic acid programmable protein arrays**

The production of antibodies against self-antigens (autoantibodies) is a characteristic feature of many autoimmune diseases. At a clinical level, tests for specific autoantibodies, such as ANA positivity, are routinely employed to aid the diagnosis and track the progress of these diseases. Traditionally, autoantibodies have been identified with a one-antigen-at-a-time, hypothesis-driven approach using methods such as immunofluorescence and ELISA.

Microarrays provide a particularly effective platform for the systematic study of thousands of proteins in parallel because they are sensitive and require low sample volumes (MacBeath & Schreiber, 2000; Zhu et al., 2001). Protein microarrays involve the display of thousands of different proteins with high spatial density on a microscopic surface. Protein microarrays have been applied to autoimmune biomarker studies focused on pre-symptomatic screening and diagnosis, clinical outcome prognosis and therapeutic response prediction (Hueber et al., 2005; Quitana et al., 2004). With particular relevance to the remit of this chapter, conventional printed arrays have been used to study rheumatoid arthritis, systemic lupus erythematosus, multiple sclerosis, hepatitis and encephalomyelitis (Fattal et al., 2010; Hueber et al., 2009; Li et al., 2005; Somers et al., 2009; Song et al., 2010).

Validation of Protein Biomarkers to Advance the Management of Autoimmune Disorders 151

**A** In protein multiple reaction monitoring (MRM), one or more peptides of unique and known mass (proteotypic peptides) are preselected in the first quadrupole (Q1), induced to fragment in Q2 by collisional excitation with a neutral gas in a pressurized cell, and some of the resulting 'product ions' (fragments) are selected for transmission to the detector in the third quadrupole (Q3). **B1** Nucleic acid programmable protein array (NAPPA) spotted with genes of interest. All proteins are tagged at the C-terminus to ensure that only full-length translated proteins can be captured in situ by co-spotted anti-tag antibodies. NAPPA has consistent protein amounts displayed at each spot; most are within two-fold of the average (Ramachandran et al., 2008). Proteins are expressed "just-in-time" for assay, which eliminates concerns about protein stability. **B2** Image of NAPPA with 768 randomly selected genes probed with a synovial fluid sample from a patient with juvenile arthritis. Antibodies in patient samples bind to their antigen targets on the array and are detected by Alexa647-conjugated goat anti-human IgG. **B3** Scatterplot of reactivity on NAPPA between paired plasma and synovial fluid samples from arthritis patients. Median correlation is 0.982. **C1** Matrix-assisted laser desorption ionization time-of-flight (MALDI-TOF) mass spectrometry, whereby proteins or peptides embedded in a crystallized matrix are ionized by a high-frequency laser beam and accelerated through a flight tube by an electrical field; ions 'fly' and reach the detector plate according to their mass:charge ratio. **C2** A spectrum is generated which reflects the energy of a given ion vs the mass:charge ratio (m/z). **C3** A bird's-eye view representation of the spectra reveals distinguishing peaks (\*) from the six samples analysed.

Fig. 3. Targeted identification methods

Nucleic Acid Programmable Protein Array (NAPPA) is an innovative method to produce protein microarrays, where cDNAs encoding proteins of interest are spotted onto activated surfaces and proteins are produced *in situ* using mammalian *in vitro* expression systems (Ramachandran et al., 2004; Ramachandran et al., 2008). The freshly made protein is captured by co-spotted antibodies specific for a 'tag' encoded at the end of the amino acid sequence. This approach circumvents the labor and cost considerations associated with conventional spotting of labile recombinant proteins into arrays. NAPPA technology recently revealed that ankylosing spondylitis patients' autoantibody responses were targeted towards connective, skeletal and muscular tissue, unlike those of RA patients (Wright et al., 2010). In a recent pilot study, a strong correlation was observed between 768 autoantibodies in paired plasma and synovial fluid samples from patients with juvenile arthritis (Figure 3B).
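The paired plasma and synovial fluid comparison can be summarized per patient as a correlation across array features. The sketch below is one plausible way to compute such a summary, assuming a Pearson coefficient per patient followed by the median across patients; the source does not state which correlation measure was used. The patient identifiers and intensities are invented for illustration, with four features standing in for the 768 on the array.

```python
import math
from statistics import median


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical signal intensities per array feature: (plasma, synovial fluid).
paired_samples = {
    "patient_1": ([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0]),
    "patient_2": ([1.0, 2.0, 3.0, 4.0], [1.0, 3.0, 2.0, 4.0]),
    "patient_3": ([1.0, 2.0, 3.0, 4.0], [2.0, 1.0, 4.0, 3.0]),
}

# One coefficient per patient, then the median across patients.
per_patient_r = {
    patient: pearson_r(plasma, synovial)
    for patient, (plasma, synovial) in paired_samples.items()
}
median_r = median(per_patient_r.values())
```

A rank-based coefficient could be substituted if array intensities are strongly skewed; that choice is left open here.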

#### **4.3 Proteomic profiling methods**


Intact protein profiling across clinical cohorts gives a glimpse into the degree of variation evident in a single gene product (Borges et al., 2008a). The same approach may be useful in the study of arthritis. Mass spectrometry-based techniques can potentially distinguish these physical and structural variations and allow the relative abundance of one isoform to be determined (Duncan et al., 2010). By contrast, these variants would be overlooked by conventional ELISA methods (Figure 2). A brief description and recent application of such techniques follows.

*MALDI / SELDI Profiling (Immuno-MALDI):* The matrix-assisted laser desorption ionization (MALDI) mode of mass spectrometry allows the 'soft' ionization of complete proteins which are liable to fragment under conventional ionization methods. The type of mass spectrometer most widely used with MALDI is the time-of-flight (TOF) instrument, mainly due to its large mass range (Figure 3C). Purifying a protein from a clinical sample by immunoprecipitation can greatly reduce the complexity of the proteome being analysed. In one approach, purified polyclonal antibodies that capture the target protein isoforms can be immobilized onto sepharose beads packed within a pipette tip or 'fret' (Borges et al., 2008b). Eluted proteins can then be spotted on a MALDI target plate and spectra obtained. For example, some recent MALDI profiling applications have demonstrated the ability to diagnose early RA and hypertension and to distinguish active SLE (Dai et al., 2010; Long et al., 2010; Reid et al., 2010). Glycosylation heterogeneity of selected inflammation-associated molecules, such as serum amyloid and vitamin D binding protein, has been investigated in cancer and diabetic patients (Rehder et al., 2009; Weiss et al., 2011).
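The dependence of flight time on mass:charge ratio can be illustrated with the idealized relation for a linear TOF analyser: an ion of charge ze accelerated through a potential U acquires kinetic energy zeU = ½mv², so its time over a field-free drift length L is t = L·√(m / (2zeU)). The sketch below encodes only this textbook relation. It ignores initial ion velocity, delayed extraction and reflectron geometry, and the function names, voltage and tube length are assumptions chosen for illustration.

```python
import math

ELEMENTARY_CHARGE_C = 1.602176634e-19  # coulombs
DALTON_KG = 1.66053906660e-27          # kilograms per dalton


def flight_time_s(mz, accel_voltage_v, drift_length_m):
    """Idealized drift time for an ion of given m/z (Da per elementary charge)."""
    mass_per_charge = mz * DALTON_KG / ELEMENTARY_CHARGE_C  # kg per coulomb
    return drift_length_m * math.sqrt(mass_per_charge / (2.0 * accel_voltage_v))


def mz_from_flight_time(time_s, accel_voltage_v, drift_length_m):
    """Invert the relation above: m/z = 2 e U t^2 / L^2, expressed in Da per charge."""
    mass_per_charge = 2.0 * accel_voltage_v * time_s ** 2 / drift_length_m ** 2
    return mass_per_charge * ELEMENTARY_CHARGE_C / DALTON_KG


# Hypothetical instrument settings: 20 kV acceleration, 1 m flight tube.
t_10k = flight_time_s(10_000.0, accel_voltage_v=20_000.0, drift_length_m=1.0)
t_40k = flight_time_s(40_000.0, accel_voltage_v=20_000.0, drift_length_m=1.0)
recovered_mz = mz_from_flight_time(t_10k, accel_voltage_v=20_000.0, drift_length_m=1.0)
```

Because flight time scales with the square root of m/z, a fourfold increase in m/z only doubles the flight time, which is consistent with the large mass range attributed to TOF analysers above.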

As a modification of MALDI, surface-enhanced laser desorption ionization (SELDI) methods can be used to target lower molecular weight proteins (<20 kDa) to differentiate arthritides and therapeutic response (de Seny et al., 2008; Miyame et al., 2005). The technology is currently being developed to affinity capture the protein of interest directly to the mass spectrometry target plate (Brauer et al., 2010).

#### **4.4 Biomarker research and grant funding**

Although proteomics has been full of promise, few validated biomarkers have made their way into the public domain and even fewer influence clinical practice. There is little doubt that validation is a serious bottleneck in the biomarker development process. While there is abundant discussion of approaches to discovery, the tools for validation and their applications have received little attention. It is very often difficult to receive funding from traditional grant programs to validate markers: funding agencies balk at the prospect of funding a 're-measurement' of the same entity in larger independent cohorts. Additionally, the continuum from discovery through to validation is tedious and extends well beyond the time-frame of a typical research grant. In fact, the time from initial discovery to routine use can take up to a decade (Anderson, 2010; Wilson et al., 2007). A recent example illustrates the seven-year journey from discovery to FDA approval for the multivariate diagnostic test OVA1, used to screen ovarian cancer patients (Fung, 2010).

Similarly, when validation fails it is difficult for academic investigators to publish these 'negative' results; when validation succeeds, the emphasis frequently shifts to commercialization rather than publication.

Currently there are few FDA-approved proteomic tests for autoimmune disease. Although there is little doubt that such tests could help the diagnosis and treatment of arthritis, it is a major clinical and financial challenge to develop, validate and market them. Robust validation data, including evidence of sensitivity, specificity and correlation to the existing limited set of clinical or laboratory criteria, are necessary to support clinical utility. Disease activity scores (DAS-CRP and DAS28), for example, combine inflamed joint count and ESR/CRP to document levels of disease activity at a static time point. The measurement of specific proteins that flag a particular patient's status adds objectivity in circumstances where the clinician currently relies on clinical judgment alone.

From a clinician's perspective, it is important to address several questions in a timely fashion for a given patient presenting with autoimmune disease. In each instance, the clinician is attempting to minimize underlying disease and adverse outcomes, such as joint damage in arthritis. Key questions that can currently only be partially answered by clinical observation and patient history include: (a) is this true autoimmune-driven arthritis (i.e., diagnosis), (b) how severe or at what stage is the disease process, (c) what is this patient's likely outcome (i.e., prognosis) and (d) which drugs could abrogate that outcome (i.e., prediction)? Decision-making also extends to selection of therapy: (e) what is the patient-specific titer, (f) which disease subgroups will benefit from a specific therapeutic strategy and (g) when should treatment be terminated?

This chapter has addressed and discussed three key areas for consideration which, if addressed after initial discovery work, could provide solid evidence of candidate biomarkers' clinical utility and commercial viability: (i) limiting bias in study design, (ii) thorough protein isoform verification and (iii) modes of orthogonal and targeted validation.

**6. Glossary- the language of biomarker and proteomic research**

*Bias*- In statistics, bias is systematic favoritism present in data collection, analysis or reporting of quantitative research.

*Biomarker*- or *biological marker*, is a molecular characteristic that is objectively measured and evaluated as an indicator of normal biological processes, pathogenic processes, or pharmacologic responses to a therapeutic intervention.

*Classifier*- in statistics is the formula or criteria for identifying a sub-population based on quantitative information on one or more measurements, traits or characteristics.

*Development pipeline*- represents the process from candidate discovery, through verification, validation and final pre-market approval.

*Diagnostic*- in the context of medicine is any test performed or criteria applied to help determine and/or identify a possible disease or disorder.

*Discovery*- in the context of biomarkers, describes the initial process of observation, identification and quantification of one or more biological molecules which may act as a classifier.

*Isoform*- describes the biological phenomenon of several different structural forms of the same protein, which may arise from alternative gene splicing and single-nucleotide polymorphisms before messenger RNA translation, and from chemical modifications (e.g., phosphorylation or glycosylation) that occur after translation.

*Multiplex*- in the context of protein assay is a method or platform which permits the simultaneous measurement of multiple analytes (dozens or more) in a single test.
