**4. Prostate cancer**

338 Biomarker

native counterparts (S. Garbis, Lubec, & Fountoulakis, 2005; Lubec & Afjehi-Sadat, 2007; Nilsson et al., 2010; van Bentem, Mentzen, de la Fuente, & Hirt, 2008). Currently, more than 150 different types of *in vivo* modifications are possible (Seymour et al., 2006; Shilov et al., 2007). The ability to detect and discriminate these post-translationally modified proteins constitutes a major advance toward a more comprehensive understanding of signaling cascades at the protein level, allowing a more direct appreciation of protein-protein interactions and, consequently, of biological pathways and their networks (Kocher & Superti-Furga, 2007; Mann & Kelleher, 2008; Ong & Mann, 2005; van Bentem et al., 2008). It is assumed that the vast majority of proteins have undergone multiple and diverse *in vivo* modifications that define their induction or silencing status. Such protein traits can only be captured with tandem MS spectra generated at high sensitivity and high resolution, which provide unequivocal evidence for annotating an *in vivo* modification at the precise amino acid location in a single LC-MS experiment (Liu et al., 2007; Mann & Kelleher, 2008; Ong & Mann, 2005; Papayannopoulos, 1995). Conceptually, a vast array of *in vivo* modifications can be captured and stored for later use as a means to provide a multifactorial understanding of biological pathways and their networks. Current biochemical assays such as immunohistochemistry and Western blots fail to account for these intrinsic *in vivo* protein modification traits. It is this limitation that has often resulted in the bias observed between MS and biochemical assay measurements (Diamandis, 2004; Lubec & Afjehi-Sadat, 2007; Nilsson et al., 2010).

The collective LC-MS analysis characteristics constitute a major advance toward an in-depth proteome analysis of fresh-frozen tumor specimens. Advanced proteomics approaches can bridge the gap between the genetic and epigenetic alterations underlying cancer and cellular physiology. Multidimensional liquid chromatography hyphenated with high-resolution tandem mass spectrometry (MDLC-MS-MS), in combination with isobaric tags for relative and absolute quantification (iTRAQ™), applied to whole tissue biopsies of various types of cancer tissue (e.g., breast, prostate, cervical), has played a key role in bridging this gap. In general, a key advantage of 2DLC-MS-MS methods that utilize stable isotope labeling approaches (e.g., cICAT, TMT, iTRAQ) is the ability to conduct multiplex experiments, whereby specimen extracts can be analyzed concurrently under the same experimental conditions. This multiplexing advantage reduces systematic error and improves the signal-to-noise ratio of the precursor MS and product ion MS-MS response, allowing a greater number of proteins to be quantitatively profiled (DeSouza et al., 2005; S. D. Garbis et al., 2008; Glen et al., 2008; Pichler et al., 2011; Wu, Wang, Baek, & Shen, 2006). Advancements made to liquid chromatography and mass spectrometry stand to further potentiate the utility of these isobaric stable isotope tags (Fournier et al., 2007; Pichler et al., 2011). Other key attributes that make MS-based methods the premier choice for analyzing small amounts of clinically valuable and complex biological specimens, along with reduced requirements for stable isotope reagents, are driven by the increased automation and miniaturization imparted by lab-on-a-chip formats (Everley, Krijgsveld, Zetter, & Gygi, 2004; Koster & Verpoorte, 2007; Rubakhin et al., 2011; Tsougeni et al., 2011). These themes are covered within the context of case studies in the analysis of clinical whole tissue biopsies and their sera for prostate cancer.

#### **4.1 The quantitative proteomic profiling of clinical whole tissue biopsies derived from benign prostate hyperplasia and prostate cancer**

Prostate whole tissue biopsies exhibit extensive biological variability, owing to the diversity of human subjects and to the heterogeneity and size of the tissue specimen itself. These variables must be taken into consideration when designing and executing a proteomic study. Tissue procurement, histopathological pre-assessment, storage, handling, pre-analytical processing, and instrumental performance verification with standardization (chromatographic and nano-ESI ionization efficiency; MS and MS-MS sensitivity, resolution, accuracy and precision) all need to be optimized for any given proteomic study. Optimizing these factors minimizes the histopathological, biological, pre-analytical and analytical variability, which is essential to a reproducible and information-rich proteomic output (Buchen, 2011; Cox & Mann, 2011; Diamandis, 2004; Hilario & Kalousis, 2008; Nilsson et al., 2010).

Several multiplex proteomics studies that rely on cysteine-specific isotope-coded affinity tags (cICAT), stable isotope labeling with amino acids in cell culture (SILAC), difference gel electrophoresis (DIGE) and trypsin-mediated ¹⁸O isotope labeling have been successful in detecting differentially expressed proteins in combined specimen samples (DeSouza et al., 2005; Everley et al., 2004; Hood et al., 2005). Despite their advantages, however, intrinsic limitations exist for each of these approaches. The cICAT approach labels only those proteins that yield tractable cysteine-containing peptides upon proteolysis, making it unsuitable as a comprehensive and in-depth protein discovery tool. The cICAT approach has been used for the quantitative proteomic profiling of secondary prostate cancer cell cultures. In one such study, 524 secreted proteins were identified from the LNCaP neoplastic prostate epithelium, of which 9% were found to be differentially expressed (Martin et al., 2004). Another study of the same cell culture model in response to androgen exposure identified 1064 proteins, of which approximately 21% were modulated (Wright et al., 2004).

Another label-based approach for prostate biomarker discovery makes use of ¹⁸O-enriched water. In this approach, H₂¹⁸O is used instead of regular water for the solution-phase trypsinization process, thus allowing trypsin-mediated incorporation of the ¹⁸O stable isotope (¹⁸O labeling) into the proteins extracted from one specimen category (i.e., control, treated or diseased states). This process leads to the exchange of two equivalents of ¹⁶O with two equivalents of ¹⁸O at the carboxyl terminus of the resulting tryptic peptides, termed the "heavy" peptides. The ¹⁸O labeling approach was applied to proteins extracted from benign prostate hyperplasia (BPH) vs. prostate cancer (PCa) cells isolated from a single formalin-fixed paraffin-embedded (FFPE) prostate cancer tissue specimen (Hood et al., 2005). This study resulted in the quantitative profiling of only 68 proteins. The limited protein amounts, along with their cross-linked form, limit the utility of FFPE tissue as a viable specimen source for proteomic assessment. Another confounding factor in the practical utility of the ¹⁸O labeling strategy, which also applies to cICAT labeling, is that only two samples can be analyzed per experiment.
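
The mass difference produced by this double oxygen exchange can be worked out directly. The following Python sketch is only an illustration: the peptide mass is a hypothetical value, and the helper names are not taken from any cited study. It shows why a fully labeled "heavy" peptide appears about 4 Da above its "light" counterpart, and how the spacing between the two signals shrinks with charge state.

```python
# Monoisotopic masses in Da (standard values, rounded to six decimals).
MASS_16O = 15.994915
MASS_18O = 17.999160
PROTON = 1.007276


def heavy_shift_da(n_oxygens: int = 2) -> float:
    """Mass added when n carboxyl-terminal 16O atoms are exchanged for 18O."""
    return n_oxygens * (MASS_18O - MASS_16O)


def mz(neutral_mass: float, charge: int) -> float:
    """m/z of a peptide carrying `charge` protons."""
    return (neutral_mass + charge * PROTON) / charge


# Hypothetical neutral monoisotopic mass of an unlabeled tryptic peptide.
light_mass = 1500.0
heavy_mass = light_mass + heavy_shift_da()

for z in (1, 2, 3):
    spacing = mz(heavy_mass, z) - mz(light_mass, z)
    print(f"z={z}: light m/z {mz(light_mass, z):.4f}, "
          f"heavy m/z {mz(heavy_mass, z):.4f}, spacing {spacing:.4f}")
```

Incomplete exchange (only one ¹⁸O incorporated) would give half this shift, which is one reason the light and heavy isotope envelopes can overlap in practice.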

A gel-based relative quantitative approach that has been used for prostate cancer cells is known as difference gel electrophoresis (DIGE). The DIGE method represents a variant of the classical 2-D gel electrophoresis (2DGE) technique whereby CyDye fluorescence probes are used as tags to covalently modify proteins without affecting their electrophoretic properties. Consequently, the resulting CyDye fluorescence-labeled proteins originating from multiple biological specimens migrate to almost the same location on a 2-D gel. Using this approach, up to three different fluor-labeled samples can be combined and separated by 2DGE in a single experiment, thus allowing better spot matching and a reduction in gel-to-gel non-reproducibility. One fundamental drawback of the DIGE approach is its MS incompatibility because of the ionization suppression effects induced by the fluor-labeling reagents. Moreover, all the intrinsic gel-based limitations also apply to the DIGE approach (S. Garbis et al., 2005; Garcia-Ramirez et al., 2007; Lubec & Afjehi-Sadat, 2007; Wu et al., 2006). The DIGE method was applied to the study of perturbed protein networks in LNCaP prostate cancer cells exposed to both androgen and antiandrogen, resulting in the quantitative profiling of 107 proteins (Rowland et al., 2004).

A quantitative proteomic method combining off-line strong cation-exchange chromatography (SCX) with on-line reverse phase (RP) chromatography hyphenated with high-resolution tandem mass spectrometry (2DLC-MS-MS), in combination with isobaric tags for relative and absolute quantification (iTRAQ™), was developed and applied to the analysis of clinical whole tissue biopsies derived from patients with benign prostate hyperplasia (BPH, n=10) and prostate cancer (PCa, n=10) (S. D. Garbis et al., 2008). Key advantages of this approach include the ability to conduct multiplex experiments, whereby up to eight samples can be analyzed concurrently under the same 2DLC-MS conditions, resulting in reduced systematic error and increased electrospray ionization efficiency, and hence higher sensitivity. In addition, since protein identification and quantification are based on tandem mass spectrometric (MS-MS) evidence, increased selectivity, specificity and confirmatory power are achieved. This study resulted in the reproducible quantitative profiling of 827 proteins, of which 65 were differentially expressed. Access to well-defined human whole prostate tissue biopsies allowed for the investigation of the stromal vs. epithelial cell interaction in the manifestation of prostate cancer. An essential requirement of the iTRAQ 2DLC-MS-MS approach is the use of an effective liquid chromatographic technique to impart sufficient separation of the large number of tryptic peptides generated. This reduces the number of co-eluting peptides that would otherwise produce erroneous product ion MS-MS spectra, compromising both the relative quantification and the protein identification accuracy (Fournier et al., 2007).
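
As a rough illustration of how reporter-ion-based relative quantification can be summarised, the Python sketch below computes a protein-level log2 ratio from 8-plex reporter intensities. The channel-to-specimen assignment, the intensity values and the median-of-log-ratios summary are assumptions made for illustration and are not the processing used in the cited study; real pipelines would also correct for reporter isotope impurities and normalise across channels.

```python
import math
from statistics import median

# Nominal reporter ion labels of an 8-plex isobaric tag set.
# The assignment of channels to specimen groups is hypothetical.
BPH_CHANNELS = ["113", "114", "115", "116"]
PCA_CHANNELS = ["117", "118", "119", "121"]


def peptide_log2_ratio(reporters: dict) -> float:
    """log2 of mean PCa reporter intensity over mean BPH reporter intensity."""
    pca = sum(reporters[c] for c in PCA_CHANNELS) / len(PCA_CHANNELS)
    bph = sum(reporters[c] for c in BPH_CHANNELS) / len(BPH_CHANNELS)
    return math.log2(pca / bph)


def protein_log2_ratio(spectra: list) -> float:
    """Median of the peptide-level log2 ratios assigned to one protein."""
    return median(peptide_log2_ratio(s) for s in spectra)


def make_spectrum(bph_intensity: float, pca_intensity: float) -> dict:
    """Build a toy reporter-ion spectrum with flat intensities per group."""
    spectrum = {c: bph_intensity for c in BPH_CHANNELS}
    spectrum.update({c: pca_intensity for c in PCA_CHANNELS})
    return spectrum


# Three hypothetical MS-MS spectra matched to the same protein.
spectra = [make_spectrum(100.0, 200.0),
           make_spectrum(100.0, 400.0),
           make_spectrum(50.0, 100.0)]

print(f"protein log2(PCa/BPH) = {protein_log2_ratio(spectra):.2f}")
```

Using the median rather than the mean keeps a single interfered spectrum (for example, one distorted by a co-eluting peptide) from dominating the protein-level estimate.
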
The modulated proteins identified were implicated in the inflammation response (Albini et al., 2007; Albini, Tosetti, Benelli, & Noonan, 2005; DeSouza et al., 2005; Goldstraw, Fitzpatrick, & Kirby, 2007; Nelson, DeMarzo, DeWeese, & Isaacs, 2005), androgen regulation (Cheung-Flynn et al., 2005; De Leon et al., 2011; Hildenbrand et al., 2011; McKeen et al., 2011; Milad et al., 1995; Miyoshi et al., 2003; Nelson et al., 2005; M. H. Yang & Sytkowski, 1998), and prostate cancer metastasis (Ablin, Kynaston, Mason, & Jiang, 2011; Dabbous, Jefferson, Haney, & Thomas, 2011; Di Cristofano et al., 2010; Grisendi, Mecucci, Falini, & Pandolfi, 2006; Hale, Price, Sanchez, Demark-Wahnefried, & Madden, 2001; Jiang & Ablin, 2011; Khanna et al., 2004; C. J. Kim, Sakamoto, Tambe, & Inoue, 2011; Krust, El Khoury, Nondier, Soundaramourty, & Hovanessian, 2011; Moretti et al., 2011; Okuda et al., 2000; Planche et al., 2011; Sun, Song, et al., 2011; Sun, Zhao, et al., 2011; Weng, Ahlen, Astrom, Lui, & Larsson, 2005; Yu & Luo, 2006), as essential hallmark features for these prostate cancer tissue specimens. Another interesting finding that also goes toward validating the accuracy of the proteomic method is the differential expression of several prostate-specific cancer markers, such as the prostate-specific transglutaminase, the prostate associated gene 4 protein, the prostatic acid phosphatase, and the prostate-specific membrane antigen (see Figure 2). The prostate-specific transglutaminase in PCa has recently been reported as a potential antitumour target (Ablin et al., 2011; Jiang & Ablin, 2011). Yet another important finding from this study was the identification of proteins reported as potential cancer chemoprevention targets, which are also associated with poor nutritional status and metabolic syndrome (Das et al., 2011; De Nunzio et al., 2011; DeMarzo et al., 2003; Dong, Zhang, Hawthorn, Ganther, & Ip, 2003; Gonzalez-Moreno et al., 2011; Jeronimo et al., 2004; J. Kim et al., 2005; Kuemmerle et al., 2011; Menendez & Lupu, 2007; Nelson et al., 2005; Oh et al., 2006; Sytkowski, Gao, Feldman, & Chen, 2005; Toki et al., 2010; Tsavachidou et al., 2009; Walsh, 2010; C. M. Yang, Yen, Huang, & Hu, 2011; Zeliadt & Ramsey, 2010). These proteins include the retinol binding protein I, selenium binding protein 1, fatty acid synthase, and insulin-regulated lipase, and are often synergistically expressed with other proteins implicated in the inflammation response and androgen regulation.

Fig. 2. A surrogate peptide sequence and its relative quantification indicating the overexpression of prostate-specific membrane antigen (PSMA) in prostate cancer (PCa) vs. benign prostate hyperplasia (BPH), with the corresponding immunohistochemical confirmation for these specimen categories, which effectively corroborates the quantitative proteomic findings (S. D. Garbis et al., 2008).


The Discovery of Cancer Tissue Specific Proteins in Serum: Case Studies on Prostate Cancer 343


#### **4.2 The quantitative proteomic profiling of clinical serum samples derived from benign prostate hyperplasia**

Tissue proteomics is considered a logical first step for the novel discovery of tumour-derived proteins, as these exist in higher concentrations in tissue owing to their more direct proximity to cancer cells (Cravatt et al., 2007; Hanash et al., 2008; Joyce, 2005; Mueller & Fusenig, 2004; Wright et al., 2005). However, it is not well understood how protein expression in tissues reflects measurable levels in the serum or plasma that would allow the pathophysiological status of the respective tissue to be monitored (Anderson, 2010; Barelli, Crettaz, Thadikkaran, Rubin, & Tissot, 2007; Farrah et al., 2011; Hanash et al., 2008; Issaq, Xiao, & Veenstra, 2007). This may partly stem from the fact that the comprehensive analysis of tissue-relevant proteins in less invasive clinical matrices such as plasma or serum has been a daunting task for MS-based methods, despite all their latest technological advancements (Anderson, 2010; Farrah et al., 2011; Hanash et al., 2008). For example, currently available serum and plasma proteomics methods rely on the prior removal of high-abundance proteins (e.g., albumin, IgGs) so that the lower-abundance proteins, where potential biomarkers can be revealed, can be more easily analyzed. Several studies, however, have shown that their removal also results in the co-removal of a significant percentage of these lower-abundance proteins owing to their propensity to bind to the higher-abundance proteins (S. D. Garbis et al., 2011; Granger, Siddiqui, Copeland, & Remick, 2005; Gundry, White, Nogee, Tchernyshyov, & Van Eyk, 2009; Zolotarjova et al., 2005). Additionally, these studies report that no MS-based method to date has managed to fully remove albumin and other high-abundance proteins, despite claims to the contrary. It is estimated that the 20 most abundant proteins in serum and plasma constitute over 99% of the total protein mass found in these matrices.
In fact, the endogenous concentration levels of proteins found in serum or plasma span from the mg/mL level (e.g., albumin, IgGs) down to the low ng/mL level (e.g., Cyclin F, Interleukin 7) (Anderson, 2010; Farrah et al., 2011). This represents a 12-order-of-magnitude concentration range whose lower limit is beyond the detection capability of the fluorescence-based ELISA technique, the most sensitive bioassay technique to date (Rissin et al., 2010). By the same token, the detection of endogenously occurring cleavage products (the serum degradome) originating from both high- and low-abundance proteins may confer greater insight into serum biochemistry and cancer biology (van Winden et al., 2010). This is considered a very important incentive for the proteome-wide analysis of the serum or plasma matrix in the prospecting of mechanism-based biomarker panels.

In an effort to overcome these challenges, an approach termed multidimensional protein identification technology (MudPIT) has been developed (Fournier et al., 2007; S. D. Garbis et al., 2011; Hanash et al., 2008). This approach is principally based on combining two or more different types of liquid chromatographic chemistries so as to increase the overall separation efficiency. This effect on the separation power is referred to as "orthogonal chromatography" and constitutes a powerful tool for the more effective analysis of complex biological matrices (cell cultures, tissues, serum and plasma). Building on this theme, a three-dimensional (3-D) MudPIT variant was developed and applied to the analysis of clinical sera derived from patients with BPH. The tissues from these BPH patients had been analyzed with the iTRAQ 2DLC-MS-MS method discussed in the previous section, which was considered a prerequisite for this proof-of-principle study so as to explore the possibility of finding tissue-specific proteins in the respective serum (S. D. Garbis et al., 2011).
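
One common way to reason about why combining separation chemistries helps is the idealised product rule for peak capacity: if the dimensions are fully orthogonal and no resolution is lost between them, the total peak capacity is the product of the per-dimension peak capacities. The Python sketch below applies that rule. The per-dimension values are hypothetical and are not taken from the cited work, and the result should be read as an upper bound rather than an achieved figure.

```python
from math import prod


def ideal_peak_capacity(per_dimension: list) -> int:
    """Upper-bound peak capacity of a multidimensional separation.

    Assumes fully orthogonal dimensions and no loss of resolution when
    fractions are transferred from one dimension to the next.
    """
    return prod(per_dimension)


# Hypothetical peak capacities for a first, second and third dimension.
two_d = ideal_peak_capacity([30, 200])
three_d = ideal_peak_capacity([10, 30, 200])

print(f"2-D upper bound: {two_d}")
print(f"3-D upper bound: {three_d}")
```

In practice, partially correlated retention mechanisms and coarse fraction collection reduce the achieved peak capacity well below this product.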

The analytical features of the 3-D MudPIT approach included (Figure 3): (1) high-pressure size-exclusion chromatography (SEC) for the pre-fractionation of serum proteins, followed by their dialysis exchange and solution-phase trypsin proteolysis; (2) offline zwitterionic hydrophilic interaction chromatography (ZIC-HILIC) fractionation of the resulting tryptic peptides; and (3) their online analysis with reversed-phase nano ultra-performance chromatography (RP nUPLC) hyphenated to nano-electrospray ionization tandem mass spectrometry. This orthogonal chromatographic strategy imparts a more effective parsing, purification and enrichment of the tryptic peptides when combined with the prior SEC protein pre-fractionation stage. This has the effect of increasing the individual mass density of the tryptic peptides (higher peptide signal intensity per

Fig. 3. **Top HPLC trace:** A representative size-exclusion chromatography (SEC) trace of a pooled serum sample. Calibrant SEC traces are also shown along with their log MW vs. RT (min) linear response curve. **Middle HPLC traces:** Post-SEC sample treatment and ZIC-HILIC tryptic peptide traces corresponding to each SEC protein segment. The ZIC-HILIC peptide fractionation was performed in a peak-dependent manner. **Bottom HPLC trace:** Each lyophilized peptide fraction was reconstituted in MP and individually analyzed with RP C18 nUPLC-nESI-MS2. The resulting product ion MS2 peptide spectra were processed with the Scaffold validation, SpectrumMill and InsPecT software programs (S. D. Garbis et al., 2011).


BPH/PCa prostate tissue study reported by the authors. Such an approach can serve as part of a more systematic serum biomarker discovery study that can eventually lead to the validation of candidate biomarkers over a very large number of specimens from healthy and diseased patient cohorts, typically exceeding 1000 for each group. So far, however, and despite the advancements made in analytical technologies, the discovery and validation of robust protein biomarkers with good specificity and sensitivity has been very disappointing. This low return on investment is due to several factors. One is the lack of functional or mechanistic utility of the candidate biomarkers. This lack of mechanistic relevance also applies to proteins that exhibit a significant differential expression between healthy and disease samples. Another factor is the large biological heterogeneity of the specimens tested. Unless the clinical samples have well-defined inclusion and exclusion criteria, along with effective sample procurement and handling protocols, and are available in numbers sufficient to address the hypothesis at hand (i.e., as determined by power analysis), the analytical output will lack the accuracy and precision needed to be of value to the clinician (Adewale et al., 2008; Anderson, 2010; Barelli et al., 2007; Farrah et al., 2011). Another impediment is the lack of lower-cost, high-throughput validation protocols to cope with the large number of samples that need to be analyzed. This is further compounded by the lack of antibodies for the vast majority of candidate proteins, which are needed for the development of an ELISA kit, the only suitable bioassay for protein measurements in serum or plasma.
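
To make the power analysis requirement concrete, the Python sketch below estimates the number of specimens needed per group for a two-sided comparison of two group means, using the standard normal approximation n = 2((z₁₋α/₂ + z₁₋β)/d)², where d is the standardized effect size. The effect sizes, significance level and power shown are illustrative assumptions, not values from the studies discussed here.

```python
from math import ceil
from statistics import NormalDist


def samples_per_group(effect_size: float,
                      alpha: float = 0.05,
                      power: float = 0.80) -> int:
    """Approximate per-group sample size for a two-sided, two-sample test.

    Uses the normal approximation; `effect_size` is the difference in
    group means divided by the common standard deviation (Cohen's d).
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    return ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)


# Hypothetical effect sizes: a moderate and a small difference between groups.
for d in (0.5, 0.1):
    print(f"effect size {d}: {samples_per_group(d)} specimens per group")
```

Because the required number scales with 1/d², halving the detectable effect size quadruples the cohort, which is why subtle serum markers call for very large validation sets.
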
Yet another limitation relates to the unreliability of a significant number of commercially available ELISA kits due to insufficient antibody validation in terms of selectivity, cross-reactivity, linear dynamic range and sensitivity (Bordeaux et al., 2010; Stoevesandt & Taussig, 2007). An additional factor contributing to the high failure rate of the ELISA assay is that its development is principally based on recombinant protein standards that do not capture the complexity of the protein as it exists in its *in vivo* modification status within the context of its biological matrix. In addition, the level of protein purification in the sample is not high enough for its behavior to be comparable to that observed for the respective highly purified recombinant protein. Moreover, the ELISA assay is not conducive to multiplexing approaches that could have reduced some of the biological variation already discussed. Targeted tandem mass spectrometry methods can overcome these limitations (Gerber, Rush, Stemman, Kirschner, & Gygi, 2003; Jaffe et al., 2008). Examples of these methods include accurate inclusion mass screening (AIMS) and quantitative selected reaction monitoring (Q-SRM). These targeted MS methods specifically account for the amino acid composition of surrogate tryptic peptides: the precursor mass of each peptide is selectively monitored (e.g., with a quadrupole mass filter), the precursor is fragmented (e.g., by CID, HCD, or ETD), and the resulting product ions are detected. This selective monitoring of the precursor-to-product ion reaction of one specific peptide (hence the term SRM) devotes a greater share of the acquisition time to that peptide and thereby enhances its detection in complex mixtures. The SRM detection is therefore based on the molecular signature (i.e., the unique amino acid composition of a peptide) traceable to an information-rich, distinctively annotatable (i.e., by *de novo* peptide sequencing) tandem (MS-MS) spectrum.
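The precursor and product ion selection described above can be illustrated with a minimal Python sketch that derives candidate SRM transitions for a single surrogate peptide. It is not the software used in the cited studies: the function names are hypothetical, the peptide `SAMPLER` is an arbitrary sequence chosen only for illustration, and the sketch assumes unmodified residues, monoisotopic masses, and singly charged y-type product ions.

```python
# Monoisotopic residue masses (Da) of the 20 standard, unmodified amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565   # mass of H2O added to the summed residues
PROTON = 1.007276   # mass of one charge-carrying proton


def precursor_mz(peptide: str, charge: int) -> float:
    """m/z of the intact peptide ion (the value isolated by the first mass filter)."""
    neutral = sum(RESIDUE_MASS[aa] for aa in peptide) + WATER
    return (neutral + charge * PROTON) / charge


def y_ion_mz(peptide: str, n: int, charge: int = 1) -> float:
    """m/z of the y_n product ion, i.e. the C-terminal fragment of n residues."""
    fragment = sum(RESIDUE_MASS[aa] for aa in peptide[-n:]) + WATER
    return (fragment + charge * PROTON) / charge


def srm_transitions(peptide: str, precursor_charge: int = 2, min_len: int = 3):
    """List (precursor m/z, ion label, product m/z) for y-ions of at least min_len residues."""
    q1 = precursor_mz(peptide, precursor_charge)
    return [
        (round(q1, 4), f"y{n}", round(y_ion_mz(peptide, n), 4))
        for n in range(min_len, len(peptide))
    ]


if __name__ == "__main__":
    # Hypothetical example peptide; replace with a real surrogate tryptic peptide.
    for q1, label, q3 in srm_transitions("SAMPLER"):
        print(f"{q1:10.4f} -> {label:<3} {q3:10.4f}")
```

Each printed pair is one precursor-to-product transition. In practice only the few transitions that give the strongest and most interference-free signal would be retained, and those choices have to be verified experimentally.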
Also, the intensity of the tandem spectrum traceable to one specific peptide depends on the relative or absolute concentration level of this peptide (Q-SRM). Such a level of selectivity and specificity is well beyond what can be attained with antibody capture technologies (i.e., ELISA assay) (Rissin et al., 2010). In addition, the detection of a biochemical assay is based on an absorbance reading at a specific wavelength that is highly subject to background signal

chromatographic retention time window) while at the same time reducing their co-elution (improved separation efficiency). It is precisely these chromatographic characteristics that enhanced the nano-electrospray ionization of the eluting peptides and their subsequent tandem mass spectrometry. The end result of this process was the generation of more information-rich tandem mass spectra at improved S/N ratios, which constitutes the ultimate objective of any effective MS-based method.

Consequently, the collective analytical attributes of this milestone 3-D MudPIT analysis of BPH sera resulted in the identification of proteins spanning approximately 12 orders of magnitude in their native abundance in the naturally occurring serum matrix (as measured with bioassay techniques such as ELISA). In addition to this extensive dynamic range coverage, the study identified 1955 proteins with a wide spectrum of biological and physico-chemical properties. A key component of this proteome, however, was the detection of secreted, tissue-specific proteins that were also found to be differentially expressed in the respective BPH tissue (S. D. Garbis et al., 2008). This constitutes a hallmark feature in the effective discovery of serum protein markers that reflect the pathophysiology of a specific organ tissue of interest. An additional performance characteristic of the 3-D method is its accuracy and sensitivity in identifying close to 400 phosphoproteins of potential importance to cancer biology. The identification of the phosphorylated variant of a potential protein marker adds a molecular feature that allows the unique chemical signatures of disease to be captured more precisely. This is based on the notion, already discussed, that a phosphorylated motif may signify the induction or silencing of a potential physiologic protein target. The versatility and adaptability of the method's constituent techniques permit the incorporation of label-based or label-free strategies to add a quantitative dimension to the in-depth proteome analysis of any given biological specimen derived from tissue, blood plasma or serum, and cell culture.

The tissue-surrogate serum proteins detected in this study and in other MudPIT studies allow for the unbiased and in-depth discovery of useful biomarkers without recourse to the targeted antibody capture approach, as is commonly the case. In contrast, the Medical Therapy of Prostatic Symptoms (MTOPS) clinical trial attempted to characterize potential biomarkers that could stratify BPH patients according to their response to medical therapy through the *a priori* use of the ELISA assay (Mullins et al., 2008). However, such an *a priori* approach bypassed the possibility of observing unexpected low-abundance, tissue-specific and secreted proteins that might play a significant role in the differential diagnosis between BPH and PCa. In conclusion, the MudPIT approach is a step forward in the establishment of novel protein markers that can be validated with more targeted approaches, such as those based on the Immuno-MRM techniques discussed below.
