46 Myeloid Leukemia

abnormal expression of proteins can potentially be molecularly targeted, creating more personalized therapy options for AML patients.

The workflow of a typical proteomic project in AML is shown in Figure 1. In this chapter, we focus on reviewing the main proteomic techniques and the various applications of proteomics in AML research, the topics of the next two sections. In the last section, we will discuss the main challenges and issues in AML proteomic research by covering topics related to sample collection considerations and proteomic data analysis techniques.

Figure 1. Typical workflow of a proteomic project in AML with methodology choices for each step.

2. Overview of proteomic techniques

The development of proteomic techniques in the past 20 years has enabled many research studies to identify the roles of proteins and PTMs in biology and human diseases at a large scale. It has also inspired the Human Proteome Project [15], a global effort that aims to "generate the map of protein based molecular architecture of the human body and become a resource to help elucidate biological and molecular function and advance diagnosis and treatment of diseases". Current proteomic approaches can be divided into two sub-categories: mass spectrometry (MS)-based, and antibody-based. Here, we describe the fundamentals of each technique and their recent applications in AML.

#### 2.1. MS-based methods

One intuitive way to identify a protein is by measuring its mass directly. MS is a widely used analytical technique that ionizes a sample (solid, liquid, or gas) and determines the masses of the resulting ions from their mass-to-charge ratios. Ionization can cause the molecules to break into charged fragments; the ions then pass through an electric field (e.g. in a time-of-flight (TOF) analyzer) or a magnetic field that sorts them by their mass-to-charge ratios. The relative abundance of the detected ions as a function of the mass-to-charge ratio is usually presented as a mass spectrum, from which the identity of the molecule is deciphered. MS is often used in tandem with liquid chromatography (termed LC-MS or LC/MS), which separates the components of a liquid sample chromatographically before they pass through the mass spectrometer.
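As a rough illustration of how the mass-to-charge ratio relates to time of flight, the sketch below computes m/z for a protonated ion and its drift time in an idealized linear TOF tube. The function names, the 1 m drift length, and the 20 kV accelerating voltage are assumptions chosen for illustration only; real analyzers deviate from this simple model.

```python
import math

PROTON_MASS_DA = 1.007276               # mass of a proton, in daltons
DALTON_KG = 1.66053906660e-27           # one dalton in kilograms
ELEMENTARY_CHARGE_C = 1.602176634e-19   # elementary charge in coulombs


def mass_to_charge(neutral_mass_da, charge):
    """m/z of a positive ion formed by attaching `charge` protons."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return (neutral_mass_da + charge * PROTON_MASS_DA) / charge


def tof_flight_time_s(mz, drift_length_m=1.0, accel_voltage_v=20_000.0):
    """Drift time in an idealized linear TOF tube (hypothetical geometry).

    An ion accelerated through V volts gains kinetic energy z*e*V = m*v**2/2,
    so t = L / v = L * sqrt((m/z) / (2*e*V)).
    """
    mass_per_charge_kg = mz * DALTON_KG
    return drift_length_m * math.sqrt(
        mass_per_charge_kg / (2.0 * ELEMENTARY_CHARGE_C * accel_voltage_v)
    )


if __name__ == "__main__":
    for mass in (1_000.0, 4_000.0):  # hypothetical peptide masses in Da
        mz = mass_to_charge(mass, charge=1)
        t_us = tof_flight_time_s(mz) * 1e6
        print(f"M = {mass:7.1f} Da  m/z = {mz:9.3f}  t = {t_us:.2f} us")
```

In this model the flight time grows with the square root of m/z, which is why heavier ions of the same charge arrive later at the detector.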

When applying MS to detect proteins, one can take either a "top-down" or a "bottom-up" approach [16–18]. The "top-down" approach ionizes the intact protein directly, and is usually limited to low-throughput single protein studies. On the other hand, the "bottom-up" approach first digests the protein into peptides using enzymes such as trypsin, and then analyzes the peptides using tandem mass spectrometry. The "bottom-up" approaches using LC-MS are also referred to as "shotgun proteomics" [19]. The "bottom-up" approach is more widely adopted than the "top-down" approach in proteomic studies because it is much easier to handle small tryptic peptides and determine their masses with high accuracy than to handle intact protein ions. However, the limited protein sequence coverage by peptides, the loss of PTM information and redundant peptides of ambiguous origin are some of the disadvantages of "bottom-up" approaches. Notably, an intermediate approach, "middle-down", was proposed to break proteins into larger proteolytic peptides (2–20 kDa) instead of small tryptic peptides (which are ~8–25 residues long) using proteases such as OmpT [20]. This hybrid approach potentially combines the benefits of the "top-down" and "bottom-up" approaches and overcomes their drawbacks.
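The digestion step of the "bottom-up" approach can be sketched in silico. The example below assumes the commonly used simplified trypsin rule (cleave after lysine or arginine unless the next residue is proline) and uses made-up sequences; the function names are hypothetical. It also shows how one peptide can map to more than one protein, which is the "ambiguous origin" problem mentioned above.

```python
def tryptic_digest(sequence, missed_cleavages=0):
    """Split a protein sequence with a simplified trypsin rule.

    Cleaves C-terminal to K or R, except when the next residue is P.
    Peptides spanning up to `missed_cleavages` skipped sites are included.
    """
    sequence = sequence.upper()
    fragments = []
    start = 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            fragments.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        fragments.append(sequence[start:])  # C-terminal remainder

    peptides = []
    for i in range(len(fragments)):
        last = min(i + missed_cleavages, len(fragments) - 1)
        for j in range(i, last + 1):
            peptides.append("".join(fragments[i:j + 1]))
    return peptides


def shared_peptides(proteins):
    """Map each peptide found in more than one protein to its source proteins."""
    origin = {}
    for name, seq in proteins.items():
        for peptide in set(tryptic_digest(seq)):
            origin.setdefault(peptide, []).append(name)
    return {pep: names for pep, names in origin.items() if len(names) > 1}


if __name__ == "__main__":
    print(tryptic_digest("MKWVTFRPLLKAA"))
    # Hypothetical sequences that share one tryptic peptide
    print(shared_peptides({"protein_A": "MKAAGR", "protein_B": "TTKAAGR"}))
```

A peptide reported by `shared_peptides` cannot be assigned to a single protein from its mass alone, so protein inference has to resolve it by other evidence.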

Electrospray ionization (ESI) [21] and matrix-assisted laser desorption/ionization (MALDI) [22] are two primary methods for ionizing proteins and peptides. ESI generates ionized molecules by applying a high electric field and dispersing the liquid sample into an aerosol. In contrast, MALDI ionizes the sample by firing laser pulses at the sample mixed with an energy-absorbing matrix. Both methods are considered to be "soft" ways of obtaining ions of large molecules with low fragmentation. The main advantage of ESI is that it produces multiply charged ions, extending the mass detection range of the analyzer. MALDI, on the other hand, is advantageous for its robustness and high speed. ESI is frequently coupled with LC, whereas MALDI is most often used with TOF. A more recent method, surface-enhanced laser desorption/ionization (SELDI) [23], was proposed as an alternative to MALDI. SELDI is similar to MALDI, except that the sample is bound to a surface instead of being mixed with a matrix material. The SELDI surface allows for more retention of analytes and is therefore more suitable for detecting proteins at lower concentrations. SELDI is usually coupled with TOF, and it was shown that SELDI-TOF-MS can detect proteins from as little as 1 μL of serum or as few as 25–50 cells [24], which can be very beneficial when studying clinical samples.
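To illustrate why multiply charged ions extend the usable mass range, the sketch below lists the charge states of a protein that fall inside an analyzer's m/z window, and recovers the neutral mass from two neighbouring peaks of the charge-state series. The 400–2000 m/z window, the example mass, and the function names are assumptions for illustration, and the ions are assumed to be charged by proton attachment only.

```python
PROTON_MASS_DA = 1.007276  # mass of a proton, in daltons


def charge_states_in_window(neutral_mass_da, mz_min=400.0, mz_max=2000.0,
                            max_charge=100):
    """Charge states z whose m/z = M/z + proton mass lies inside the window."""
    return [
        z
        for z in range(1, max_charge + 1)
        if mz_min <= neutral_mass_da / z + PROTON_MASS_DA <= mz_max
    ]


def deconvolve_adjacent_peaks(mz_low, mz_high):
    """Recover (z, M) from two neighbouring peaks of the same protein.

    The peak at mz_high carries z protons and the one at mz_low carries z + 1,
    so z = (mz_low - p) / (mz_high - mz_low) and M = z * (mz_high - p).
    """
    if mz_high <= mz_low:
        raise ValueError("mz_high must be greater than mz_low")
    z = round((mz_low - PROTON_MASS_DA) / (mz_high - mz_low))
    return z, z * (mz_high - PROTON_MASS_DA)


if __name__ == "__main__":
    mass = 16_951.0  # hypothetical protein mass in Da
    states = charge_states_in_window(mass)
    print(f"observable charge states: {states[0]}+ to {states[-1]}+")
    print(deconvolve_adjacent_peaks(mass / 11 + PROTON_MASS_DA,
                                    mass / 10 + PROTON_MASS_DA))
```

With a single charge this protein would sit far above the assumed window, whereas its higher charge states fall well inside it.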

Proteomics in Acute Myeloid Leukemia http://dx.doi.org/10.5772/intechopen.70929 49


To quantify protein levels (termed "quantitative proteomics"), three major groups of methods can be used in the proteomic workflow: label-free, stable isotope labeling, and multiple reaction monitoring [25]. As the name suggests, label-free methods (e.g. spectral counting and peptide peak intensity measurement) do not use any isotope-containing compound to bind to and label proteins [26]. Though easy to perform, inexpensive, high-throughput and with a wider dynamic range, label-free methods are in general less accurate [27]. Stable isotope labeling approaches use differential stable isotopes to label and distinguish samples via either metabolic labeling or chemical labeling. One example of a metabolic labeling approach is stable isotope labeling by amino acids in cell culture (SILAC) [28], which feeds cells from different samples with heavy and light forms of arginine or lysine through the growth medium. SILAC generates precise quantitation of proteins, but can only be applied to living or metabolically active samples. An alternative method, "super-SILAC", was developed to extend SILAC to human tissue samples by using a mixture of SILAC-labeled cell lines as the internal standard [29]. A super-SILAC mix based on five AML cell lines (Molm-13, NB4, MV4-11, THP-1, and OCI-AML3) was recently established for quantifying patient AML cells [30].
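The arithmetic behind SILAC-style quantitation can be sketched as follows. The example assumes that each peptide yields a pair of light (sample) and heavy (labeled standard) intensities and that a protein-level ratio is summarized as the median of its peptide log2 ratios; this summary rule, the function names, and the intensity values are illustrative assumptions, not the procedure of any cited study.

```python
import math
from statistics import median


def protein_log2_ratio(peptide_pairs):
    """Median log2(light / heavy) over the peptides of one protein.

    `peptide_pairs` is a list of (light_intensity, heavy_intensity) tuples;
    pairs with a non-positive intensity are skipped.
    """
    ratios = [
        math.log2(light / heavy)
        for light, heavy in peptide_pairs
        if light > 0 and heavy > 0
    ]
    if not ratios:
        raise ValueError("no usable peptide pairs")
    return median(ratios)


def super_silac_log2_fold_change(sample_a_pairs, sample_b_pairs):
    """Compare two samples through the shared heavy standard.

    Each sample is measured against the same spiked-in heavy mix, so the
    standard cancels out in the difference of the two log2 ratios.
    """
    return protein_log2_ratio(sample_a_pairs) - protein_log2_ratio(sample_b_pairs)


if __name__ == "__main__":
    # Placeholder intensities for one protein in two hypothetical samples
    sample_a = [(200.0, 100.0), (400.0, 100.0), (100.0, 100.0)]
    sample_b = [(100.0, 100.0), (100.0, 100.0)]
    print(super_silac_log2_fold_change(sample_a, sample_b))
```

Using the median rather than the mean keeps a single outlying peptide from dominating the protein-level estimate.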

While most MS-based methods profile proteins from cell lysates, mass cytometry is a fusion technology of MS and flow cytometry that can be used to measure protein levels in single cells [31]. Mass cytometry is also referred to as cytometry by time-of-flight (CyTOF), which is the current commercialized implementation. Mass cytometry overcomes the spectral overlap in flow cytometry by conjugating probes (often antibodies) with heavy-metal isotopes as expression reporters instead of fluorophores. The metal-conjugated antibodies, ionized and detected using the TOF mass spectrometer, greatly increase the number of parameters measurable in single cells because their signals overlap very little. Currently, mass cytometry can be used to detect up to 40 parameters per cell (up to 100 parameters theoretically), including protein levels, PTMs and proteolysis products. Mass cytometry was recently used in pediatric AML to profile both the surface markers and intracellular signaling proteins in single cells [32]. Notably, the study discovered that the surface phenotypes and their regulatory intracellular signaling phenotypes are decoupled in AML, rendering the surface markers unreliable for reporting signaling states. The study also identified a gene signature associated with the primitive signaling phenotype that is predictive of survival.

#### 2.2. Antibody-based methods

The other group of methods for detecting and quantifying proteins is based on the use of antibodies. Antibodies can be engineered to specifically recognize not only proteins but also their PTMs, which is very favorable for profiling kinases and signaling activities. Commonly used techniques such as western blot and enzyme-linked immunosorbent assay (ELISA) already use antibodies to measure protein expression. However, these methods are low-throughput, and they are therefore unsuitable for profiling a large number of proteins or samples in a timely fashion. Using microarray technologies, multiple types of high-throughput antibody-based methods were developed to enable profiling proteins at a much larger scale, including tissue microarrays (TMA) and protein microarrays. TMA is a proteomic technique applied to tissue samples [33]. TMA assembles up to 1000 tissue samples into one paraffin block to enable simultaneous evaluation of biomarkers. Since tissue samples are of more importance in solid tumors than in leukemia, we will focus the discussion on protein microarrays.


Based on the application purpose, protein microarrays can be divided into two categories: analytical protein arrays and functional protein arrays [34]. Functional protein arrays print a large number of individually purified proteins on an array to investigate their biochemical activities. Functional arrays are used mostly in basic research, for example to identify protein-protein, protein-DNA, protein-antibody, protein-lipid, protein-RNA, and protein-small molecule interactions, and to identify substrates or enzymes for protein modifications. On the other hand, analytical protein arrays use well-characterized antibodies to measure the amounts of specific proteins on a large scale. These arrays are widely used in clinical research for biomarker discovery and protein expression profiling, and can be applied to disease diagnosis in the clinic.

There are two types of analytical protein arrays: forward-phase protein array (FPPA) and reverse-phase protein array (RPPA) [35]. The major difference between FPPA and RPPA is whether antibodies or samples are immobilized. In FPPA, various antibodies are printed on a slide as bait molecules, where each spot on the array is one type of antibody. Each slide is then exposed to a single protein lysate (sample), and multiple protein expression levels are measured. The main advantage of FPPA is that a single slide can provide measurements of many proteins simultaneously. However, FPPA needs two highly specific antibodies (similar to "sandwich ELISA") for assaying each protein, and it also requires a higher amount of the protein lysate sample (which is often a luxury in clinical research). In contrast, RPPA immobilizes protein lysates, where each spot on the slide is a sample from a different source or condition. Each slide is then probed with one type of antibody and provides a read-out of the corresponding protein level across all printed samples, allowing for a direct comparison between samples. To profile multiple proteins, one can prepare a batch of identical slides printed with the same samples (which is straightforward to do), and process them in parallel, each slide with a unique type of antibody. RPPA is known to be highly sensitive and robust, and it is particularly advantageous for clinical applications because it requires lower amounts of samples. In the past decade, RPPA was used in multiple research studies to generate protein profiles and identify biomarkers in AML [36–41].
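The difference between the two layouts can be made concrete with a small data-structure sketch. Here an RPPA experiment is modeled as one slide per antibody (spots are samples) and an FPPA experiment as one slide per sample (spots are antibodies); both are rearranged into the same sample-by-protein table. The function names, sample labels, and intensity values are placeholders invented for illustration.

```python
def rppa_slides_to_table(slides):
    """RPPA: {antibody: {sample: intensity}} -> {sample: {antibody: intensity}}.

    Each slide is probed with one antibody, so assembling a per-sample
    profile means collecting that sample's spot from every slide.
    """
    table = {}
    for antibody, spots in slides.items():
        for sample, intensity in spots.items():
            table.setdefault(sample, {})[antibody] = intensity
    return table


def fppa_slides_to_table(slides):
    """FPPA: {sample: {antibody: intensity}}, already one profile per slide."""
    return {sample: dict(spots) for sample, spots in slides.items()}


if __name__ == "__main__":
    # Placeholder read-outs, not measured data
    rppa = {
        "AKT.p308": {"patient_1": 0.8, "patient_2": 1.4},
        "ERK2": {"patient_1": 1.1, "patient_2": 0.9},
    }
    fppa = {
        "patient_1": {"AKT.p308": 0.8, "ERK2": 1.1},
        "patient_2": {"AKT.p308": 1.4, "ERK2": 0.9},
    }
    print(rppa_slides_to_table(rppa) == fppa_slides_to_table(fppa))
```

In the RPPA layout, all samples for a given protein share one slide and one antibody incubation, which is what makes the between-sample comparison direct.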

Compared to MS-based methods, antibody-based methods are less of a de novo discovery approach and provide less coverage of the proteome. This is mainly because antibody-based methods only profile proteins that are known ahead of the experiment, and the coverage of these methods depends on the availability of specific antibodies. It is still an ongoing effort to generate antibodies that specifically recognize all protein isoforms present in the human proteome. The Human Protein Atlas project, started in 2003, maps the expression and location of proteins in cells, normal tissues and cancers using an antibody-based approach. Its latest version (16th release) now includes more than 25,000 antibodies that cover about 86% of all human protein-coding genes [42, 43]. In addition, the quality of antibodies is key to the success of any antibody-based method. Before printing an array, antibodies need to be validated to ensure that they are highly specific and do not cross-react with other proteins in the lysate. Otherwise, the accuracy of the profiling will be compromised by false signals. Antibodypedia (https://www.antibodypedia.com/), a public database containing validation data for more than one million antibodies, is a useful resource for antibody-based research [44].

alignment-based label-free quantitation approaches in LC-MS/MS to distinguish AML from ALL and CD34+ cells from healthy donors [49]. Based on the same data generated in Foss et al.'s study, Elo et al. used a more advanced statistical method (reproducibility optimized test statistics (ROTS)) to identify biomarkers from the proteomic data and from the transcriptomic data. They found that the alignment-based proteomic method was able to generate novel and significant biomarkers that were not detected by the transcriptomic assay [50]. From the proteomic profiles of 151 AML bone marrow samples generated by SELDI-TOF-MS, Xu et al. developed a proteomic-based decision tree model to classify patients into APL, AML-granulocytic, AML-monocytic, ALL, and control (healthy volunteers) [51].

AML subtypes display unique proteomic patterns, which may present therapeutic opportunities for each of these subtypes. In a study of 38 AML-M1/M2 patients and 17 healthy volunteers [52], Luczak et al. demonstrated the use of 2-DE-MS to distinguish between M1 and M2 patients. They identified five proteins that were differentially accumulated between M1 and M2, of which Annexin III, L-plastin and 6-phosphogluconate dehydrogenase were found exclusively in M2. Comparing the protein expression levels across AML FAB classes, Cui et al. identified 23 proteins differentially expressed between the granulocytic lineage (M1, M2, M3) and monocytic lineage (M5), where they found 7 proteins up-regulated in both M2 and M3, and 15 proteins tightly associated with M3 (e.g. cathepsin G) [47]. In an RPPA study of 256 newly diagnosed AML patients [36], 24 of the 51 proteins tested were found to differ significantly in expression between FAB subtypes. The proteins were found to belong to three clusters: (1) total and phosphorylated signal transduction proteins (PKCA, PKCA.p, ERK2, AKT.p308, P38.p, P70S6K, P70S6K.p, and Src.p527), with lower expression in myeloid subtypes (M0, M1, and M2); (2) PTEN and PTEN.p, with lower expression in M6 and M7; (3) apoptosis, cell cycle or differentiation regulating proteins and activated STAT proteins that have higher expression in myeloid subtypes.

Differences in proteomics (expression patterns, protein interaction pathways, and PTMs) were also found between cytogenetic abnormalities. In a study of 42 AML patients using 2-DE MALDI-TOF-MS [53], Balkhi et al. showed that there were significant differences in protein expression levels, protein interaction networks and PTMs between cytogenetic groups. PTMs specific to cytogenetic abnormalities were identified, including a β-O-linked N-acetyl glucosamine (O-GlcNAc) modification of hnRNPH1 in patients with 11q23 translocation, an acetylation of calreticulin in patients with t(8;21), and methylation of hnRNPA2/B1 in patients with t(8;21) and inv(16). In an RPPA study, increased MET phosphorylation levels were found to associate with t(15;17) and t(8;21) cytogenetic subtypes [54].

Proteomic comparisons of relapsed against newly diagnosed patients or patients in remission can reveal biomarkers for early detection of relapse and non-invasive monitoring of minimal residual disease (MRD). Using MALDI-TOF-MS and high performance LC (HPLC)-ESI-MS/MS [55], Bai et al. identified 47 peptides that were differentially expressed between AML and healthy controls. Specifically, they built a quality classifier model based on three peptides (ubiquitin-like modifier activating enzyme 1 (UBA1), isoform 1 of fibrinogen alpha chain precursor and platelet factor 4 (PF4)). UBA1 was up-regulated in newly diagnosed AML, decreased to normal level after complete remission, and then elevated again in relapse, whereas the other two peptides had
