**2. Identification of individual glycation sites**

Like any chronic pathology, diabetes can be recognized efficiently by a set of reliable, well-established methods applied according to universal criteria [81]. However, its first manifestations often go unnoticed by patients, so the disease is usually recognized only after its onset [81]. Early diagnosis of T2DM and a timely start of therapy would slow disease progression and reduce the probability of life-threatening complications. It is therefore very important to develop a panel of biomarkers that allows early and reliable detection of diabetes mellitus.

Although HbA1c, fasting blood glucose and the glucose tolerance test are well-established and universally recognized diagnostic criteria of DM [81], this set of tests is usually unable to detect the short-term excursions of blood glucose concentration that characterize the beginning of pre-diabetes [82]. Therefore, it was proposed that biomarkers based on disease-related structural changes of individual proteins might be more sensitive and, hence, more diagnostically efficient [82]. Among such changes, post-translational modifications (PTMs) represent the most promising source of diagnostic information [82–84]. Accordingly, modified peptides, rather than proteins, represent the best targets in the search for new T2DM biomarkers of this type. Indeed, under *in vivo* conditions proteins can have multiple modification sites, whose patterns can be very heterogeneous both in the diversity of chemical structures (phosphorylation, nitration, carbonylation, glycosylation, methylation, acetylation and many others) and in their relative abundances [85]. Moreover, each individual PTM changes the molecular weight of the target protein, which complicates MS analysis at the protein level [86].

Analysis at the peptide level can offer higher precision than protein-level analysis because individual lysine and arginine residues in a protein molecule differ in their reactivity towards sugars. This effect can be related both to the amino acid environment of the site [87] and to its accessibility to the glycation agent [86, 88]. Moreover, as blood plasma proteins have different half-lives, these markers can potentially cover a broad range of time periods prior to analysis. Thus, in contrast to HbA1c analysis, this approach might provide short-term markers that could capture the short-term fluctuations in blood glucose preceding the onset of DM [84, 89]. Monitoring blood protein glycation in this way might provide an opportunity to detect hyperglycaemia at very early stages of T2DM [90].

For decades, the bottom-up proteomics (BUP) approach has been the method of choice for addressing PTMs in proteins [91]. Accordingly, it is efficiently applied to the analysis of protein glycation and can be used with protein mixtures of any composition and complexity [92]. In its most general form it includes several critical steps: (*i*) separation of proteins, (*ii*) limited proteolysis, (*iii*) separation of the resulting proteolytic peptides, (*iv*) their identification by tandem mass spectrometry (MS/MS) and (*v*) annotation of individual protein sequence tags [79, 83, 89]. Applied to sugar-modified proteins, BUP is used to obtain detailed information about the glycoprotein profile and to map specific glycation sites [93].

BUP requires only several microliters of blood plasma [83, 84]. A short workflow is presented in **Figure 4**. Plasma proteins can be separated by electrophoresis with subsequent in-gel digestion [94], or Amadori-modified proteins can be retained on BAC before digestion *in solution* [89, 92]. Several important aspects need to be considered throughout. First, for successful quantitative BUP analysis it is very important to use the same protein concentration in all samples [79, 80, 95]. Further, tryptic digestion of plasma samples is challenging because of the high complexity of the sample matrix and needs to be performed in the presence of chaotropic agents such as urea [96] or detergents, e.g. sodium dodecyl sulfate (SDS) [83]. The next step is enrichment of glycated peptides on BAC, which also helps to eliminate chaotropic agents [83, 84, 97]. The BAC method is based on covalent binding of the column-bound ligand (*m*-aminophenylboronic acid) to the cis-diol groups of the sugar moiety of the peptides, accompanied by formation of a reversible five-membered ring derivative. After non-bound, unglycated molecules are washed out with an alkaline buffer, the five-membered ring can be hydrolyzed under acidic conditions, and the glycated peptides are eluted with an acidic (pH 2–3) buffer [98, 99]. Prior to MS analysis, the obtained peptides need to be desalted by solid phase extraction (SPE) [83, 84, 100, 101]. Several separation steps (at the protein and/or peptide level, prior to separation by mass-to-charge ratio) and the high specificity of the endoproteases used for digestion provide a high proteome discovery rate and sensitivity [85, 87].
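The proteolysis step of this workflow can be illustrated with a minimal *in silico* digestion sketch. It assumes the commonly used simplified trypsin rule (cleavage C-terminal to K or R unless the next residue is P); the function name `tryptic_digest` and the toy sequence in the usage note are hypothetical, and the altered cleavage behaviour at modified residues is deliberately not modelled.

```python
def tryptic_digest(sequence, missed_cleavages=0):
    """Return peptides from a simplified in silico tryptic digestion.

    Assumption: trypsin cleaves C-terminal to K or R, except before P.
    Modified (e.g. glycated) residues are NOT treated differently here.
    """
    if not sequence:
        return []
    # Cleavage positions are indices where a new peptide starts.
    sites = [i + 1 for i, aa in enumerate(sequence[:-1])
             if aa in "KR" and sequence[i + 1] != "P"]
    bounds = [0] + sites + [len(sequence)]
    peptides = []
    for i in range(len(bounds) - 1):
        # Join up to `missed_cleavages` additional adjacent fragments.
        last = min(i + 1 + missed_cleavages, len(bounds) - 1)
        for j in range(i + 1, last + 1):
            peptides.append(sequence[bounds[i]:bounds[j]])
    return peptides
```

Calling `tryptic_digest("AKRPGKLLR", missed_cleavages=1)` on this toy sequence lists the fully cleaved peptides together with those carrying one missed cleavage, which is the kind of peptide list a database search would then match against MS/MS spectra.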


*Individual Glycation Sites as Biomarkers of Type 2 Diabetes Mellitus*


*DOI: http://dx.doi.org/10.5772/intechopen.95532*


*Type 2 Diabetes - From Pathophysiology to Cyber Systems*


**Figure 4.** *The short workflow for the analysis of individual glycation sites.*

The gel-based strategy was implemented for the analysis of apolipoprotein A-I glycation in human plasma samples [94]. For this, blood samples were obtained from ten T2DM patients affected by end-stage renal disease (ESRD) and ten healthy control individuals. The plasma samples of each group were pooled and then applied onto a Centriplus centrifugal concentrator membrane with a molecular weight cut-off (MWCO) of 30000. After two-dimensional gel electrophoresis (2-DE), the apolipoprotein A-I spots were excised and digested, and the digests were analyzed by matrix-assisted laser desorption/ionization time-of-flight (MALDI-TOF) MS with a standard nitrogen laser (λ = 337 nm). In this study, three glycated peptides from apolipoprotein A-I were identified in T2DM and nephropathic patients [94].

One of the first scientific groups to develop methods for the analysis of individual glycation sites in human plasma proteins was Metz's laboratory. Initially, they investigated *in vitro* glycated proteins in pooled plasma from healthy humans [96]. Glycated proteins were enriched using BAC and then digested with three different proteolytic enzymes (trypsin, Arg-C and Lys-C) to increase sequence coverage. After protein digestion, Amadori-modified peptides were enriched by BAC and analyzed on a linear ion trap-Orbitrap mass spectrometer (LIT-Orbitrap-MS) with the electron-transfer dissociation (ETD) fragmentation option. As a result, 346 unique glycated peptides were identified. Trypsin proved to be the most suitable enzyme for the study of glycated peptides [96].

In addition, Zhang *et al.* performed the first proteomics-based characterization of non-enzymatically glycated proteins in human plasma and erythrocyte membranes from participants with normal glucose tolerance (NGT), impaired glucose tolerance (IGT) and T2DM [102]. This study introduced one additional step: twelve highly abundant plasma proteins were removed from the samples by immunodepletion. Depletion of HSA, immunoglobulin G (IgG), α1-antitrypsin, IgA, IgM, transferrin, haptoglobin, α1-acid glycoprotein, α2-macroglobulin, apolipoprotein A-I, apolipoprotein A-II and fibrinogen from blood plasma enabled the analysis of less abundant plasma proteins. As a result, 260 unique Amadori-modified peptides representing 76 unique glycated proteins from human plasma were identified. Among them, 39 unique glycated proteins, represented by 114 unique glycated peptides, could be detected in human plasma prior to immunodepletion, whereas a further 46 unique glycated proteins (156 unique glycated peptides) were discovered in the low-abundance protein fraction. In the erythrocyte membrane, 75 unique glycated peptides corresponding to 31 unique glycated proteins were identified. This suggests that under diabetic conditions the functions of major structural proteins, major integral proteins of erythrocyte lipid rafts and GAPDH are affected by glycation. Interestingly, the majority of the identified Amadori-modified proteins appeared in all three subject groups, with little variation in the numbers of glycated peptides or glycation sites. No label-free quantification was performed in that study. However, a rough estimate showed that 50 unique glycated peptides from plasma samples and 14 from erythrocyte membranes were up-regulated in both the IGT and T2DM groups compared to the NGT group [102].

The next logical step in the work of Metz's group was the comprehensive identification of glycated peptides in plasma and erythrocytes of control and diabetic subjects, performed in 2011 [87]. After a three-step separation by strong cation exchange chromatography (SCX), BAC and nanoHPLC, followed by mass spectrometric analysis with ETD-based fragmentation, a comprehensive database of glycated peptides, glycation sites and corresponding proteins was built to facilitate the discovery of potential novel markers of diabetes. For selective and specific identification of glycated peptides, the authors established a data-dependent neutral-loss-triggered ETD scan, in which the six most intense ions were first fragmented with CID, and precursor ions producing neutral losses of 3 H2O and 3 H2O + HCHO (characteristic neutral losses of Amadori-modified peptides during CID [103]) were further fragmented using ETD. In total, 7749 unique glycated peptides corresponding to 3742 unique glycated proteins were identified [87], a massive gain in coverage in comparison to the previous study. In general, characteristic neutral losses represent a convenient and powerful tool for the identification of glycation products: they allow identification not only of the monosaccharide involved [99, 100], but also of more complex modifications, such as ADP-glucose-dependent glycation [104].
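The neutral-loss criterion described above can be made concrete with a short calculation. The sketch below computes the expected *m/z* of the fragment ions after loss of 3 H2O and of 3 H2O + HCHO from a precursor of given *m/z* and charge, using standard monoisotopic masses. The function names and the matching tolerance of 0.02 *m/z* are illustrative assumptions, not settings reported in [87].

```python
H2O = 18.010565   # monoisotopic mass of water, Da
HCHO = 30.010565  # monoisotopic mass of formaldehyde, Da

# Characteristic neutral losses named in the text.
NEUTRAL_LOSSES = {
    "-3 H2O": 3 * H2O,
    "-3 H2O - HCHO": 3 * H2O + HCHO,
}

def neutral_loss_mz(precursor_mz, charge):
    """Expected m/z of the precursor after each neutral loss."""
    return {name: precursor_mz - mass / charge
            for name, mass in NEUTRAL_LOSSES.items()}

def matched_losses(precursor_mz, charge, fragment_mzs, tol=0.02):
    """Names of the losses found among the fragment m/z values.

    `tol` is a hypothetical absolute tolerance in m/z units.
    """
    expected = neutral_loss_mz(precursor_mz, charge)
    return [name for name, mz in expected.items()
            if any(abs(f - mz) <= tol for f in fragment_mzs)]
```

For a doubly charged precursor, each loss shifts the signal by half of the neutral mass, so the two characteristic signals appear about 27.02 and 42.02 *m/z* units below the precursor.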

In the work of Bai *et al.* [95], the analysis of glycated HSA peptides by liquid chromatography-ion trap-time-of-flight MS (LC-IT-TOF-MS) revealed 21 glycation sites in serum samples of healthy persons and only 16 glycation sites in those of T2DM patients. Here, BAC was used for enrichment of glycated proteins. The subsequent digestion was carried out by incubation with endoproteinase Glu-C and trypsin. High sequence coverage (88% for GA from a healthy person and 78% for GA from a T2DM patient) was achieved by combining the peptide mass fingerprinting results of the Glu-C and trypsin digests [95].
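How combining two digests raises sequence coverage can be sketched in a few lines. The example below is a simplified illustration, assuming that coverage is the percentage of residues covered by at least one identified peptide; the function names, the toy protein string and the peptide lists in the usage note are hypothetical and are not data from [95].

```python
def covered_positions(protein, peptides):
    """Set of residue indices covered by any of the peptides."""
    covered = set()
    for pep in peptides:
        if not pep:
            continue
        start = protein.find(pep)
        while start != -1:
            covered.update(range(start, start + len(pep)))
            start = protein.find(pep, start + 1)
    return covered

def sequence_coverage(protein, *digests):
    """Percent of residues covered by the union of all digests."""
    covered = set()
    for peptides in digests:
        covered |= covered_positions(protein, peptides)
    return 100.0 * len(covered) / len(protein)
```

With the toy protein `"ACDEFGHIKL"`, a first peptide list `["ACD", "EFG"]` and a second list `["FGHI"]`, each list alone covers only part of the sequence, while `sequence_coverage(protein, list1, list2)` reports the coverage of their union, because overlapping residues are counted once.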

Using a capillary-flow data-independent acquisition (DIA) proteomics approach, 234 glycation sites in human plasma proteins were characterized [100]. In that work, 1508 plasma samples were obtained from overweight/obese non-diabetic adults. For DIA analysis, peptides were loaded onto an RP-UHPLC system coupled on-line to an Orbitrap Fusion Lumos Tribrid mass spectrometer. A full MS scan was performed from 350 to 1650 *m/z*, then 33 DIA segments were acquired with higher-energy collisional dissociation (HCD) at 27%. For data-dependent acquisition (DDA) analysis, the isolation width was set to 1.6 *m/z*, with a 3 s method cycle time and 27% HCD for the dependent MS/MS scans. This resulted in the identification of 242 glycation sites on 70 proteins. In this study, most glycation sites were detected in serum albumin (36 sites), serotransferrin (13) and the Ig kappa constant region (7) [100].


AGE-modified sites have also been a focus of research groups working on DM biomarker discovery. Greifenhagen *et al.* [105] optimized a method for the identification of carboxymethylated (CML-modified) and carboxyethylated (CEL-modified) peptides in tryptic digests of human plasma proteins, based on the precursor ion approach established earlier for Amadori compounds [103]. Verification of the results and identification of individual glycation sites relied on LIT-Orbitrap-MS analysis. Overall, 21 CML-modification sites were identified in 17 proteins, including only two sites (K88 and K396) in HSA [105]. The same procedure was applied to characterize tryptic peptides (and the corresponding glycation sites) with AGE-modified arginine residues [106]. It was shown that 42 plasma proteins are modified at their arginine residues with Glarg, glyoxal-derived dihydroxyimidazolidine (GD-HI), MG-H and methylglyoxal-derived dihydroxyimidazolidine (MGD-HI) [106]. Neither strategy [105, 106] included an enrichment step for AGE-modified peptides, which simplifies the analysis and improves its robustness. Nevertheless, the products can be reliably separated with longer LC gradients, and AGE-modified sites can be assigned not only by characteristic mass increments, but also by the characteristic fragmentation patterns of *in vitro* glycated model peptides [24, 107, 108].
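Assignment by characteristic mass increments can be illustrated with a small lookup. The sketch below derives monoisotopic mass increments from the elemental compositions commonly given for the Amadori (hexose) adduct, CML and CEL; these compositions are standard textbook values added here as an assumption, not figures quoted from [105], and the function names and the 0.01 Da tolerance are hypothetical.

```python
# Monoisotopic atomic masses, Da.
MONO = {"C": 12.0, "H": 1.00782503, "O": 15.99491462}

# Elemental composition added to the residue (assumed standard values).
MODIFICATIONS = {
    "Amadori (hexose)": {"C": 6, "H": 10, "O": 5},
    "CML": {"C": 2, "H": 2, "O": 2},
    "CEL": {"C": 3, "H": 4, "O": 2},
}

def mass_increment(formula):
    """Monoisotopic mass of an elemental composition, Da."""
    return sum(MONO[element] * count for element, count in formula.items())

def candidate_modifications(observed_shift, tol=0.01):
    """Modifications whose increment matches the observed mass shift."""
    return [name for name, formula in MODIFICATIONS.items()
            if abs(mass_increment(formula) - observed_shift) <= tol]
```

A mass shift alone only narrows the candidates, which is why the text stresses that fragmentation patterns of model peptides are used alongside the increments to confirm a site.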

Different MS-based methods were developed for characterization of individual glycation sites in plasma proteins.

**3. Label-free relative quantification of glycation occupancy at individual protein sites**

Label-free relative quantification (LFQ) is a widely used method for biomarker discovery. It is based on relative comparison of the abundances (expressed as peak areas, heights or spectral counts) of individual analytes in control and experimental samples [109].

The first insight into the potential biomarker value of glycated proteolytic peptides was provided by Hoffmann's group in 2010 [110]. Their first pilot study addressed the early glycation patterns of HSA in blood samples obtained from five T2DM patients. The experimental procedure included determination of protein concentration, two steps of trypsin digestion, BAC, filtration on Centricon YM-10 cartridges (to remove high-molecular-mass cleavage products and aggregates), and HPLC separation on a C18 trap column and a C18 nano-column coupled to an electrospray ionization quadrupole-quadrupole-time-of-flight mass spectrometer (ESI-QqTOF-MS). The MS analysis was performed in the information-dependent acquisition (IDA) mode with CID for fragmentation. Tandem mass spectra were automatically processed with MASCOT (Matrix Science Ltd) against the SwissProt database and additionally confirmed by manual interpretation [110]. In most fragment ion spectra, the ions of glycated peptides showed intense signals corresponding to consecutive neutral losses of 18 (−H2O), 36 (−2× H2O), 54 (−3× H2O, pyrylium ion) and 84 (−3× H2O−HCHO, furylium ion) units. These neutral loss patterns are characteristic of peptides containing a carbohydrate moiety [99, 105]. Quantification relied on integration of specific extracted ion chromatograms (XICs, *m/z* ± 0.02) at characteristic retention times (tR). The BUP approach revealed 18 fructosamine-modified peptides in the plasma samples, identified by their fragmentation patterns. Relative quantification showed that 15 glycated peptides were detected with quite similar signal intensities in all T2DM samples, whereas two glycation sites showed dramatically different abundances, which could indicate individual, possibly disease-specific, alterations of glycation patterns [110].

To understand the differences in the levels of site-specific Amadori modifications, observed between healthy individuals and T2DM patients, five blood samples

