**3. Label-free relative quantification of glycation occupancy at individual protein sites**

Label-free relative quantification (LFQ) is a widely used method for biomarker discovery. It is based on relative comparison of the abundances (expressed as peak areas, peak heights or spectral counts) of individual analytes in control and experimental samples [109].

The first insight into the potential biomarker value of glycated proteolytic peptides was provided by Hoffmann's group in 2010 [110]. Their pilot study addressed the early glycation patterns of HSA in blood samples obtained from five T2DM patients. The experimental procedure included protein concentration determination, two steps of trypsin digestion, BAC, filtration on Centricon YM-10 cartridges (to remove high molecular mass cleavage products and aggregates), and HPLC separation on a C18 trap column and a C18 nano-column coupled to electrospray ionization quadrupole-quadrupole-time-of-flight MS (ESI-QqTOF). The MS analysis was performed in the information-dependent acquisition (IDA) mode with CID for fragmentation. Tandem mass spectra were automatically processed with MASCOT (Matrix Science Ltd) against the SwissProt database and confirmed by manual interpretation [110]. In most fragment ion spectra, the glycated peptides showed intense signals corresponding to consecutive neutral losses of 18 (−H2O), 36 (−2× H2O), 54 (−3× H2O, pyrylium ion) and 84 (−3× H2O−HCHO, furylium ion) units. These neutral loss patterns are characteristic of peptides containing a carbohydrate moiety [99, 105]. Quantification relied on integration of specific extracted ion chromatograms (XICs, *m/z* ± 0.02) at characteristic retention times (tR). The BUP approach revealed 18 fructosamine-modified peptides, identified by their fragmentation patterns, in the plasma samples. Relative quantification showed that 15 glycated peptides were detected with similar signal intensities in all T2DM samples, whereas two glycation sites showed dramatically different abundances, which could indicate individual, possibly disease-specific, alterations of glycation patterns [110].
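The XIC-based quantification described above can be illustrated with a minimal sketch. It assumes that each scan is available as a retention time plus a list of (*m/z*, intensity) pairs; the scan layout, the function names, the target *m/z* of 500.25 and all intensities are hypothetical and are not taken from [110]. Only the ± 0.02 extraction window comes from the text.

```python
from typing import List, Tuple

# Hypothetical scan layout: (retention_time_min, [(mz, intensity), ...])
Scan = Tuple[float, List[Tuple[float, float]]]


def extract_xic(scans: List[Scan], target_mz: float, tol: float = 0.02):
    """Sum the intensities within target_mz +/- tol for every scan."""
    return [
        (rt, sum(inten for mz, inten in peaks if abs(mz - target_mz) <= tol))
        for rt, peaks in scans
    ]


def integrate_xic(xic, rt_min: float, rt_max: float) -> float:
    """Trapezoidal peak area over the retention-time window [rt_min, rt_max]."""
    pts = [(rt, inten) for rt, inten in xic if rt_min <= rt <= rt_max]
    area = 0.0
    for (t0, y0), (t1, y1) in zip(pts, pts[1:]):
        area += 0.5 * (y0 + y1) * (t1 - t0)
    return area


# Toy data (assumed values): one peptide signal near m/z 500.25 plus
# unrelated ions that fall outside the extraction window.
SCANS: List[Scan] = [
    (10.0, [(500.251, 0.0), (500.40, 50.0)]),
    (10.1, [(500.249, 100.0)]),
    (10.2, [(500.255, 200.0), (500.60, 10.0)]),
    (10.3, [(500.250, 100.0)]),
    (10.4, [(500.252, 0.0)]),
]

xic = extract_xic(SCANS, target_mz=500.25)
print(integrate_xic(xic, rt_min=10.0, rt_max=10.4))
```

In LFQ, the resulting area for one peptide is then compared across samples, so the extraction window and the retention-time window have to be kept identical for all runs.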

To understand the differences in the levels of site-specific Amadori modifications between healthy individuals and T2DM patients, blood samples from five T2DM patients with poor glycemic control (HbA1c ≥ 6.5%) and four non-diabetic participants were analyzed in a BUP experiment [83]. The sample preparation procedure was modified: the filtration step was replaced with SPE on C18 gel loader StageTips, whereas LC–MS analysis followed the procedure of Frolov and Hoffmann [110]. This strategy revealed 52 glycated peptides in T2DM plasma, representing 47 glycated lysine residues in 12 proteins (HSA, Ig kappa and lambda chain C region, fibrinogen (alpha, beta and gamma chains), complement C3, alpha-2-macroglobulin, serotransferrin, apolipoprotein A-I, and haptoglobin). The Mann–Whitney U-test allowed these peptides to be split into three groups based on the differences in integrated peak areas. Five peptides were detected only in T2DM plasma and formed the first group, the T2DM-specific sites. The second group included 15 peptides detected in T2DM plasma at significantly higher levels than in control plasma. The third group comprised 32 peptides detected in T2DM and control plasma at similar intensities, i.e. peptides that did not exhibit biomarker properties. It should be taken into account that the prevalence of unaffected sites could be explained by the small size of the cohorts, which could be insufficient for reliable conclusions [83].
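The three-group split described above can be sketched for cohorts of this size (five versus four samples), where an exact Mann–Whitney test by enumeration is feasible. This is an illustration only: the function names, the significance level of 0.05, the rule that a peptide absent from all controls counts as T2DM-specific, and the peak areas are assumptions, not details reported in [83].

```python
from itertools import combinations


def mann_whitney_u(x, y) -> float:
    """U statistic for x: pairs with x_i > y_j count 1, ties count 0.5."""
    return sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in x for b in y)


def exact_two_sided_p(x, y) -> float:
    """Exact two-sided p-value by enumerating all group relabelings.

    Feasible only for small cohorts (e.g. 5 vs 4 gives 126 relabelings).
    """
    pooled = list(x) + list(y)
    n_x, n_y = len(x), len(y)
    mean_u = n_x * n_y / 2.0
    observed = abs(mann_whitney_u(x, y) - mean_u)
    hits = total = 0
    for idx in combinations(range(len(pooled)), n_x):
        chosen = set(idx)
        gx = [pooled[i] for i in chosen]
        gy = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        total += 1
        if abs(mann_whitney_u(gx, gy) - mean_u) >= observed - 1e-12:
            hits += 1
    return hits / total


def classify_site(t2dm, control, alpha: float = 0.05) -> str:
    """Assign a peptide to one of three groups (assumed decision rules)."""
    if all(v == 0 for v in control) and any(v > 0 for v in t2dm):
        return "T2DM-specific"
    higher = mann_whitney_u(t2dm, control) > len(t2dm) * len(control) / 2.0
    if higher and exact_two_sided_p(t2dm, control) < alpha:
        return "elevated in T2DM"
    return "unchanged"


# Hypothetical integrated peak areas (arbitrary units) for one peptide.
t2dm_areas = [5.0, 6.0, 7.0, 8.0, 9.0]
control_areas = [1.0, 2.0, 3.0, 4.0]
print(classify_site(t2dm_areas, control_areas))
```

With five and four samples, the smallest attainable two-sided p-value is 2/126, reached only when the groups are completely separated. This makes concrete why such small cohorts leave little room for detecting moderate differences.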

We therefore recently extended this approach to a larger cohort and established an integrated biomarker based on multiple glycation sites [84]. This experiment involved female T2DM patients (n = 20, HbA1c ≥ 7.5%) and age-matched normoglycemic women (n = 18, HbA1c ≤ 6.5%). After nanoLC–MS by the workflow described above, all peptide signals were matched to the most complete glycation site database from Zhang et al. [87, 102] and to the results of our previous work [83] (in total more than 350 sites in plasma proteins). This approach resulted in the identification of 51 Amadori peptides, 42 of which were differentially abundant between diabetic patients and normoglycemic controls. These peptides represented in total nine plasma proteins (HSA, Ig kappa chain C region, complement C4-A, alpha-2-macroglobulin, serotransferrin, apolipoprotein A-I, ceruloplasmin precursor, vitamin D-binding protein precursor and FLJ00385 protein), with half-lives from 2 to 21 days. Based on these differentially modified sites, we proposed an integrated biomarker based on multiple protein-specific Amadori peptides. The validation of this biomarker relied on linear discriminant analysis (LDA) with random sub-sampling of the training set and leave-one-out cross-validation (LOOCV), which resulted in an accuracy, specificity, and sensitivity of 92%, 100%, and 85%, respectively. In this context, it is logical to assume that a biomarker strategy based on multiple specific glycation sites in plasma proteins could substantially increase the efficiency of glycemic control and disease prediction.
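As a worked example of how the three performance figures relate to each other, the sketch below computes them from a confusion matrix. The counts are an assumption: they are one set of numbers consistent with the reported percentages and cohort sizes (20 patients, 18 controls), on the further assumption that all 38 participants were classified once. The actual confusion matrix is not given in the text.

```python
def classification_metrics(tp: int, fn: int, tn: int, fp: int) -> dict:
    """Accuracy, sensitivity and specificity from confusion-matrix counts."""
    total = tp + fn + tn + fp
    return {
        "accuracy": (tp + tn) / total,    # all correct calls / all samples
        "sensitivity": tp / (tp + fn),    # correctly detected patients
        "specificity": tn / (tn + fp),    # correctly detected controls
    }


# Assumed counts (not from the source): 17 of 20 patients and
# 18 of 18 controls classified correctly.
metrics = classification_metrics(tp=17, fn=3, tn=18, fp=0)
for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

Under these assumed counts, the three misclassified samples are all patients called normoglycemic, which is what a specificity of 100% combined with a sensitivity of 85% implies.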

Due to the high heterogeneity of AGE structures and the relatively low abundances of individual AGEs at specific amino acid residues, label-free analysis of modification sites in advanced glycated proteins is rather challenging [105]. Recently, we reported plasma patterns of amide AGEs in patients differing in obesity status and degree of glycemic control, i.e. we compared four cohorts comprising hyperglycemic and normoglycemic lean and obese individuals [111]. Although sample preparation followed our well-established pipeline [105, 106], at the stage of LC–MS analysis we employed gas phase fractionation (18 *m/z* intervals in the overall mass range 100–1400 *m/z*), which allowed higher discovery rates of AGE-modified peptides. As a result, altogether 15 advanced glycated sites in 11 proteins were detected in the plasma of hyperglycemic patients. The relative contents of two sites, representing acetylation at K199 in HSA (LK<sub>acetyl</sub>CASLQK) and formylation at K51 in apolipoprotein A-II (SK<sub>formyl</sub>EQLTPLIK), were significantly (*p* < 0.05) higher in patients with poor glycemic control [111]. Thus, the peptides representing these sites can be considered potential markers of hyperglycemia. The follow-up study, involving larger cohorts and addressing a wider array of AGEs [112], identified 36 sites in 22 highly abundant proteins in individual plasma samples obtained from T2DM patients with long-term disease. The major modifications were Glarg (11 modification sites), CMA (5), *N*<sup>ε</sup>-(formyl)lysine (8), *N*<sup>ε</sup>-(acetyl)lysine (7), and CML (7). No significant changes were observed between the control and T2DM groups [112].

*Individual Glycation Sites as Biomarkers of Type 2 Diabetes Mellitus*

*DOI: http://dx.doi.org/10.5772/intechopen.95532*

*Type 2 Diabetes - From Pathophysiology to Cyber Systems*

Brede and co-authors [101] established a fast, high-throughput analysis of several glycated peptides of HSA. Trypsin digestion was performed in 76% acetonitrile. In this way, the authors omitted BAC enrichment and pre-cleaning with SPE. Before qualitative LC–MS/MS analysis, acetonitrile was evaporated from the samples, and the tryptic peptides were loaded on a C18 reversed-phase trap column and separated on an analytical column coupled on-line to a QqTOF mass spectrometer. Quantitative LC–MS/MS analysis was performed by separation on a BEH C18 column coupled to a Xevo TQ-S triple quadrupole tandem mass spectrometer operated in multiple reaction monitoring (MRM) mode. This method identified only a few glycated peptides from the highly abundant plasma protein HSA, with the modification sites K525, K137, K12, and K414. The glycated peptide containing K525 was used for quantitative analysis. The level of glycation at K525 correlated strongly with HbA1c (r = 0.84) in patients without ESRD. T2DM patients with ESRD had, on average, a higher K525/HbA1c ratio, which provides an excellent incentive for exploring the method as a supplement to HbA1c for detecting increased blood glucose in these patients [101].
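The two summary statistics mentioned above, the correlation between K525 glycation and HbA1c and the mean K525/HbA1c ratio, can be computed as in the following minimal sketch. The function names and all data values are hypothetical and serve only to show the arithmetic; they do not reproduce the measurements of [101].

```python
from math import sqrt


def pearson_r(x, y) -> float:
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def mean_ratio(k525, hba1c) -> float:
    """Average per-patient ratio of K525 glycation level to HbA1c."""
    ratios = [k / h for k, h in zip(k525, hba1c)]
    return sum(ratios) / len(ratios)


# Hypothetical values: HbA1c in %, K525 glycation in arbitrary units.
hba1c = [5.4, 6.0, 6.8, 7.5, 8.9]
k525 = [0.8, 1.1, 1.0, 1.6, 1.9]

print(f"r = {pearson_r(k525, hba1c):.2f}")
print(f"mean K525/HbA1c = {mean_ratio(k525, hba1c):.3f}")
```

Comparing `mean_ratio` between patient groups, for instance with and without ESRD, corresponds to the ratio comparison described in the text.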

In the work of Rathore and co-authors [90], both AGE-modified and Amadori-modified peptides were used for the prediction of pre-diabetes in an integrated biomarker approach. The authors focused on glycation of the major plasma protein, HSA. Based on HbA1c levels, the participants were categorized as healthy (n = 20) or pre-diabetic (n = 20). The digestion strategy relied on RapiGest, a detergent that can be removed from the samples by precipitation with strong acid after digestion, followed by pre-cleaning on C18 zip-tip columns. Tryptic hydrolysates were separated on a C18 reversed-phase column coupled to a Q-Exactive Orbitrap MS operated in parallel reaction monitoring (PRM) mode, based on the precursor *m/z* and charge state information obtained during DDA (targeted label-free approach). Normalized peak areas of glycated peptides were used for a two-tailed, unpaired, non-parametric t-test and two-way ANOVA to determine the significance of glycation. As a result, four CML- or Amadori-modified peptides corresponding to three glucose-sensitive lysine residues (K36, K438, and K549) showed significantly higher abundance in pre-diabetic than in control samples. Additionally, the abundance of three of these peptides (K<sub>Am</sub>QTALVELVK, K<sub>CML</sub>VPQVSTPTLVEVSR and FK<sub>CML</sub>DLGEENFK) was more than 1.8-fold higher in pre-diabetes, which was significantly higher than the differences observed for fasting blood glucose (FBG), 2 h postprandial glucose (PPG), and HbA1c. Further, the four glycated peptides showed a significant correlation with FBG, PPG, HbA1c, triglycerides, very low density lipoproteins (VLDL), and high-density lipoproteins (HDL). This indicates that glycated peptides containing the glucose-sensitive lysine residues K36, K438 and K549 of HSA could be potentially useful markers for the prediction of pre-diabetes [90].

As can be seen from this overview, LFQ offers an essential advantage: relative glycation rates can be quantified at practically all available modification sites in multiple proteins. This approach therefore gives direct access to combining multiple biomarkers through simultaneous consideration of several proteins with different half-lives. This allows monitoring of both long- and short-term fluctuations of blood glucose concentrations, because a protein with an appropriate half-life can be found for any desired observation period.
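The link between protein half-life and observation period can be made concrete with a small sketch. It assumes simple first-order clearance, so that half of the protein molecules present at a given time remain after one half-life; this model, the function names and the placeholder protein labels are assumptions for illustration. The two half-life values are the limits of the 2 to 21 day range mentioned earlier for plasma proteins.

```python
def fraction_remaining(days_ago: float, half_life_days: float) -> float:
    """Fraction of protein molecules formed `days_ago` that is still present.

    Assumes first-order clearance (an illustrative simplification).
    """
    return 0.5 ** (days_ago / half_life_days)


def closest_half_life(window_days: float, half_lives: dict) -> str:
    """Pick the protein whose half-life is closest to the desired window."""
    return min(half_lives, key=lambda name: abs(half_lives[name] - window_days))


# Placeholder labels; 2 and 21 days are the limits of the range cited above.
HALF_LIVES = {"short_lived_protein": 2.0, "long_lived_protein": 21.0}

for window in (3.0, 14.0):
    name = closest_half_life(window, HALF_LIVES)
    share = fraction_remaining(window, HALF_LIVES[name])
    print(f"{window:>4} days -> {name} ({share:.0%} of molecules still present)")
```

Under this model, glycation measured on a short-lived protein mostly reflects the last few days, whereas a long-lived protein integrates glucose exposure over several weeks.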

Currently, information on prospective biomarker sites still needs to be accumulated. In further studies, this information needs to be verified in large cohorts to assess the predictive potential of these markers. At this point, however, the limitations of the label-free approach become critical. Indeed, LFQ is disadvantageous for the analysis of large cohorts because electrospray ionization (ESI) is sensitive to multiple factors, which manifests as matrix effects [113]. Thus, the analysis conditions (e.g. temperature, experimenter, column condition) must be kept as constant as possible, which is difficult to achieve with large batch sizes. Moreover, batch analysis can take rather long, as prolonged gradients are often used to improve peptide separation [114]. To overcome these limitations, absolute quantification strategies can be employed [86].
