### *2.2.6 Native lectin-mediated glycopeptide enrichment*

Lectins are non-catalytic carbohydrate-binding proteins that have been used in a variety of glycan, glycoprotein and glycopeptide enrichment strategies. A common approach uses broad-specificity lectins immobilized on beads to capture a wide spectrum of glycopeptides. For example, the lectins concanavalin A (ConA) and wheat germ agglutinin (WGA) bind to high mannose structures and to GlcNAc or sialic acid residues, respectively. Each has been used to isolate N-glycopeptides from peptide mixtures [27]. However, WGA does not bind exclusively to N-glycopeptides, as it also binds O-β-GlcNAc found on intracellular proteins [28]. Similar strategies have been applied to O-glycopeptide enrichment. For example, the lectins Jacalin and *Vicia villosa* agglutinin (VVA) bind to O-linked Gal(β-1,3)GalNAc and to α- or β-linked terminal N-acetylgalactosamine, respectively [29, 30].

#### **Figure 4.**

*Basic workflow of O-glycopeptide enrichment by O-glycoprotease.*

Lectin-based enrichment strategies have limitations that stem from the natural properties of lectins. First, most lectins bind their substrates rather weakly (Kd of ~10 mM to 1 μM) [31]. Second, the limited specificity of a lectin can introduce bias into an enrichment scheme. Strategies employing multiple lectins (multi-lectin affinity chromatography, M-LAC) have increased glycopeptide recovery and coverage but do not completely solve the problem of lectin specificity bias [32]. Modern cloning and recombinant expression methods now allow mutagenesis and selection of lectins with improved binding properties for glycopeptide enrichment.
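To illustrate why weak binding limits capture efficiency, the short Python sketch below estimates the fraction of glycopeptide bound at equilibrium across the Kd range quoted above. It assumes simple 1:1 binding with the lectin in large excess over glycopeptide, so that the fraction bound is [lectin] / (Kd + [lectin]). The function name `fraction_bound` and the 100 μM effective lectin concentration are illustrative assumptions, not values from the cited studies.

```python
def fraction_bound(lectin_conc_molar: float, kd_molar: float) -> float:
    """Equilibrium fraction of glycopeptide bound for 1:1 binding.

    Assumes the lectin is in large excess, so free lectin is
    approximately equal to total lectin.
    """
    if lectin_conc_molar < 0 or kd_molar <= 0:
        raise ValueError("concentration must be >= 0 and Kd must be > 0")
    return lectin_conc_molar / (kd_molar + lectin_conc_molar)


# Hypothetical effective lectin concentration on the beads (assumption).
LECTIN_CONC = 100e-6  # 100 uM

# Kd values spanning the range quoted in the text (10 mM to 1 uM).
for kd in (10e-3, 1e-3, 100e-6, 1e-6):
    bound = fraction_bound(LECTIN_CONC, kd)
    print(f"Kd = {kd:.0e} M -> fraction bound = {bound:.3f}")
```

Under these assumptions, a lectin with a millimolar Kd captures only a small fraction of its target at equilibrium, whereas a micromolar binder captures nearly all of it, which is one motivation for engineering lectins with higher affinity.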

### *2.2.7 Engineered lectins for N- and O-glycopeptide enrichment*

Structure-guided protein engineering has been used to create lectins with enhanced utility for glycopeptide enrichment. One area of interest has been to engineer binding proteins that can stratify a peptide mixture into different classes of glycopeptides (*e.g.*, N-glycopeptides or O-glycopeptides). Here we summarize recent progress in creating such reagents.

An ideal lectin for N-glycopeptide enrichment would bind to a structurally invariable portion of the N-glycan structure. The trimannosyl chitobiose (Man3GlcNAc2) core glycan is a common feature of all N-glycans (**Figure 1a**). The human Fbs1 protein specifically recognizes this core motif [33, 34]. Fbs1 participates in glycoprotein quality control within the endoplasmic-reticulum-associated degradation (ERAD) system by binding to misfolded glycoproteins that have been retrotranslocated into the cytosol for degradation [35]. As part of an E3 ubiquitin ligase complex, Fbs1 mediates ubiquitination and degradation of glycoproteins by the proteasome [33, 34]. Wild-type (wt) Fbs1 preferentially binds to high mannose N-glycans with sub-micromolar binding affinity (Kd of 0.1–0.2 μM) and only weakly binds to complex N-glycans having terminal sialic acids [36]. To adapt Fbs1 for use as a universal N-glycan/N-glycopeptide binding reagent, Fbs1 variants with greater tolerance for the presence of sialic acids were engineered using a novel plasmid display strategy in which library variants were enriched for their ability to bind immobilized fetuin [37]. An Fbs1 variant (termed Fbs1-GYR) containing S155G, F173Y and E174R substitutions was identified that efficiently binds to both high mannose N-glycans and complex N-glycans (**Figure 5**). Fbs1-GYR is unhindered by sialic acid and core fucose substitution, but does not bind to N-glycans bearing bisecting GlcNAc.

#### *Improving the Study of Protein Glycosylation with New Tools for Glycopeptide Enrichment DOI: http://dx.doi.org/10.5772/intechopen.97339*

Fbs1-GYR is an efficient and substantially unbiased N-glycopeptide enrichment reagent. It enabled a deep characterization of the human serum N-glycoproteome [37], in which Fbs1-GYR enrichment outperformed enrichment by a native lectin mixture of WGA, ConA and RCA120 (WCR). When the same amount of sample was analyzed by MS, Fbs1-GYR enrichment yielded 2.2-fold more N-glycopeptide identifications: an average of 2,142 N-glycopeptide spectra, compared with an average of 965 for the WCR lectin mixture [37]. Fbs1-GYR-mediated enrichment may be performed using the N-glyco FASP method [32] or using Fbs1-GYR immobilized on beads. In the latter case, Fbs1-GYR has been expressed as a fusion to a SNAP-tag, which permits covalent conjugation to benzylguanine beads [37–39].

A lectin (termed 'BGL') from the North American Kurokawa mushroom (*Boletopsis grisea*) was recently shown to have a specificity suitable for enrichment of a broad range of O-glycan and O-glycopeptide structures [40]. BGL is a member of the fungal fruit body lectins (Pfam PF07367), which possess two ligand binding sites, as verified by X-ray crystallography [41, 42]. One site binds to N-glycans possessing outer-arm terminal GlcNAc and the other to O-glycans bearing the TF-antigen disaccharide Galβ1,3GalNAc [40]. Ganatra *et al.* used structure-guided mutagenesis to generate BGL variants with a single ligand binding site [40]. One mutant BGL protein (R103Y) lost the ability to bind N-glycans with a terminal GlcNAc but retained the ability to bind O-glycans bearing the Galβ1,3GalNAc epitope. Both the R103Y BGL variant and wt BGL were shown to specifically isolate O-glycopeptides from proteolyzed fetuin, a peptide mixture that contains N-glycosylated, O-glycosylated and non-glycosylated peptides [40]. As the R103Y BGL variant does not bind to N-glycans, it shows promise as a selective O-glycan/O-glycopeptide enrichment reagent (**Figure 6**). It is plausible that BGL (R103Y) and Fbs1-GYR could be used in tandem to stratify glycopeptide mixtures into enriched pools of O- and N-glycopeptides, respectively.

#### **Figure 5.**

*Fbs1-GYR variant binding to a diverse set of N-glycopeptides is substantially unbiased. Sialylglycopeptide (SGP), an Fbs1 binding substrate, was fluorescently labeled with tetramethylrhodamine (TMR) at the epsilon-amino group of lysine. For simplicity, TMR is only shown in N-glycopeptide structure 1. N-glycans of SGP-TMR (1) were trimmed with different combinations of exoglycosidases to produce asialo-SGP-TMR (2), SGP-TMR without sialic acids and galactose (3) and SGP-TMR without sialic acids, galactose and GlcNAc (4). The trimmed glycopeptides were then added to binding assays with wt Fbs1 or Fbs1-GYR beads in 50 mM ammonium acetate, pH 7.5. The relative binding affinity to wt Fbs1 or Fbs1-GYR is reported as the recovery percentage (TMR fluorescence on beads/input TMR fluorescence). Results represent the mean ± s.e.m. of three replicates. (This figure was originally published within Nature Communications, Volume 8, Article number: 15487 (2017)).*
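The recovery metric described in the Figure 5 caption can be sketched in a few lines of Python. This is a minimal illustration of the arithmetic only (bead fluorescence divided by input fluorescence, summarized as mean ± s.e.m. over three replicates). The helper names and all fluorescence readings are made-up placeholders, not data from the cited study.

```python
from math import sqrt
from statistics import mean, stdev


def recovery_percent(bead_fluorescence: float, input_fluorescence: float) -> float:
    """Recovery (%) = TMR fluorescence on beads / input TMR fluorescence."""
    if input_fluorescence <= 0:
        raise ValueError("input fluorescence must be positive")
    return 100.0 * bead_fluorescence / input_fluorescence


def mean_and_sem(values: list[float]) -> tuple[float, float]:
    """Return the mean and the standard error of the mean (s.e.m.)."""
    if len(values) < 2:
        raise ValueError("need at least two replicates")
    # s.e.m. = sample standard deviation / sqrt(number of replicates)
    return mean(values), stdev(values) / sqrt(len(values))


# Placeholder readings in arbitrary units (NOT real measurements).
bead_readings = [820.0, 790.0, 850.0]
input_reading = 1000.0

recoveries = [recovery_percent(b, input_reading) for b in bead_readings]
avg, sem = mean_and_sem(recoveries)
print(f"recovery = {avg:.1f} +/- {sem:.1f} % (n = {len(recoveries)})")
```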

#### **Figure 6.**

*O-glycosylated peptides/peptiforms identified from Pronase-digested bovine fetuin before or after enrichment with BGL or BGL variant R103Y. Samples 1 and 2 represent replicate samples that were each separately digested with Pronase and subjected to lectin enrichment. Blue bars represent the total number of peptides identified (unglycosylated peptides and O-glycopeptides). Yellow bars represent the number of unique O-glycopeptides/peptiforms identified in each sample. (This figure was originally published within Scientific Reports, Volume 11, Article number: 160 (2021)).*

## **2.3 LC-MS/MS and computer algorithms to search glycopeptides**

### *2.3.1 LC-MS/MS*

To identify intact glycopeptides, information on both the peptide backbone and the appended glycan is required. Four MS/MS fragmentation methods are in common use: collision-induced dissociation (CID), electron-capture dissociation (ECD), electron-transfer dissociation (ETD), and higher-energy collisional dissociation (HCD). CID mainly fragments the glycan through cleavage of glycosidic bonds, whereas ECD/ETD preferentially fragments the peptide backbone and leaves the glycan largely intact. HCD can fragment both the peptide backbone and the glycan, and is widely used in intact glycopeptide MS/MS. Combining different fragmentation methods can improve intact glycopeptide identification.

One recent study reported the analysis of more than 5,600 glycopeptides and 1,545 N-glycosites [43]. This report implemented a new type of tandem MS fragmentation, activated-ion electron transfer dissociation (AI-ETD), and was one of the first glycoproteome profiling studies to use AI-ETD on a quadrupole-Orbitrap-linear ion trap MS system (Orbitrap Fusion Lumos) [44]. Using specialized scan routines, the authors acquired glycopeptide spectra with higher-energy collisional dissociation product-dependent activated-ion electron transfer dissociation (HCD-pd-AI-ETD). This strategy borrows from an established approach in N-glycopeptide analysis, HCD product ion-triggered ETD, in which abundant oxonium ions (*m/z* 204.087, HexNAc) in HCD MS/MS spectra trigger subsequent ETD of the selected precursors [45, 46]. The HCD-pd-AI-ETD method gave a median peptide backbone sequence coverage of 89% and a median glycan sequence coverage of 78% [44]. These values were obtained with informatics tools that apply multiple post-analysis filtering steps. Overall, the filtering strategy aimed to retain no decoy peptide hits while keeping the estimated FDR below 1% for both AI-ETD and HCD spectra.
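The product-dependent triggering logic can be illustrated with a short Python sketch. It checks whether an HCD spectrum contains the HexNAc oxonium ion at *m/z* 204.087 and, if so, flags the precursor for a follow-up electron-transfer scan. This is a simplified illustration and not the instrument's actual control software: the function name, the 20 ppm tolerance and the 5% relative-intensity threshold are all assumptions chosen for the example.

```python
HEXNAC_OXONIUM_MZ = 204.087  # HexNAc oxonium ion m/z quoted in the text


def has_oxonium_ion(
    peaks: list[tuple[float, float]],
    target_mz: float = HEXNAC_OXONIUM_MZ,
    tol_ppm: float = 20.0,            # assumed mass tolerance
    min_rel_intensity: float = 0.05,  # assumed threshold vs. base peak
) -> bool:
    """Return True if a sufficiently intense peak matches the oxonium ion.

    `peaks` is a list of (m/z, intensity) pairs from one HCD MS/MS scan.
    """
    if not peaks:
        return False
    base_peak = max(intensity for _, intensity in peaks)
    if base_peak <= 0:
        return False
    tol_mz = target_mz * tol_ppm * 1e-6  # convert ppm to an m/z window
    return any(
        abs(mz - target_mz) <= tol_mz and intensity / base_peak >= min_rel_intensity
        for mz, intensity in peaks
    )


# Hypothetical spectra for illustration only.
glyco_scan = [(204.0866, 5.0e5), (366.1395, 2.0e5), (500.3000, 1.0e6)]
plain_scan = [(175.1190, 1.0e6), (204.1340, 3.0e5)]

for name, scan in (("glyco_scan", glyco_scan), ("plain_scan", plain_scan)):
    action = "trigger ETD scan" if has_oxonium_ion(scan) else "skip"
    print(f"{name}: {action}")
```

Restricting the slower electron-transfer scans to precursors that show this diagnostic ion concentrates instrument time on likely glycopeptides.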

### *2.3.2 Computer algorithms for intact glycopeptide identification*

The complicated structure of intact glycopeptides makes their MS/MS spectra extremely complex. Therefore, special computer algorithms have been developed to match the MS/MS spectra to both the peptide sequences and the attached glycan compositions. These algorithms include Byonic [47], GPQuest [48], pGlyco 2.0 [49], and O-Pair Search [50]. Among these programs, the Byonic search engine provides high-sensitivity identification of glycopeptides and allows the use of customized databases for both glycans and proteins. Byonic identifies glycopeptides to the level of glycan composition and peptide sequence, and it is suitable for both N-glycopeptide and O-glycopeptide searches. A newly published algorithm, O-Pair Search, is specific for O-glycopeptide searches [50]. Its authors report that O-Pair Search not only reduces search times by up to more than 2,000-fold compared with a Byonic search, but also generates more O-glycopeptide identifications.
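One ingredient common to such searches is matching a precursor mass to a peptide plus a candidate glycan composition. The Python sketch below illustrates only that precursor-mass step, using standard monoisotopic residue masses for common monosaccharides. It is not the scoring algorithm of Byonic, pGlyco or O-Pair Search, which also evaluate fragment ions. The function names, the tiny glycan database, the peptide mass and the 10 ppm tolerance are hypothetical.

```python
# Monoisotopic residue masses (Da) of common monosaccharides.
RESIDUE_MASS = {
    "Hex": 162.052824,
    "HexNAc": 203.079373,
    "Fuc": 146.057909,
    "NeuAc": 291.095417,
}


def glycan_mass(composition: dict[str, int]) -> float:
    """Mass added to a peptide by a glycan of the given composition."""
    return sum(RESIDUE_MASS[sugar] * count for sugar, count in composition.items())


def match_glycans(
    precursor_mass: float,
    peptide_mass: float,
    glycan_db: dict[str, dict[str, int]],
    tol_ppm: float = 10.0,  # assumed precursor mass tolerance
) -> list[str]:
    """Names of glycans whose peptide + glycan mass fits the precursor."""
    matches = []
    for name, composition in glycan_db.items():
        theoretical = peptide_mass + glycan_mass(composition)
        if abs(precursor_mass - theoretical) <= theoretical * tol_ppm * 1e-6:
            matches.append(name)
    return matches


# Hypothetical mini database of N-glycan compositions.
GLYCAN_DB = {
    "Man3GlcNAc2": {"HexNAc": 2, "Hex": 3},
    "Man5GlcNAc2": {"HexNAc": 2, "Hex": 5},
    "A2G2S2": {"HexNAc": 4, "Hex": 5, "NeuAc": 2},
}

peptide = 1000.5  # hypothetical neutral peptide mass (Da)
observed = peptide + glycan_mass(GLYCAN_DB["Man5GlcNAc2"]) + 0.002
print(match_glycans(observed, peptide, GLYCAN_DB))
```

In practice, several peptide and glycan combinations can share nearly the same precursor mass, which is why dedicated search engines rely on fragment-ion evidence and FDR control rather than precursor mass alone.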
