**Bioprocess Engineering of** *Pichia pastoris***, an Exciting Host Eukaryotic Cell Expression System**

Francisco Valero

Additional information is available at the end of the chapter

http://dx.doi.org/10.5772/56407

## **1. Introduction**

Yeasts are the favorite alternative hosts for the expression of heterologous proteins for research, industrial or medical use [1]. As unicellular microorganisms, they share the advantages of bacteria, such as ease of manipulation and fast growth. Unlike bacterial systems, however, they are capable of many of the post-translational modifications performed by higher eukaryotic cells, such as proteolytic processing, folding, disulfide bond formation and glycosylation [2].

Historically, *Saccharomyces cerevisiae* has been the most used yeast host owing to the large body of knowledge accumulated on its genetics, molecular biology and physiology [3-5]. However, it was rapidly found to have certain limitations: low product yields, poor plasmid stability, hyperglycosylation and low secretion capacity. These limitations are now relieved by a battery of alternative yeasts used as cell factories to produce recombinant proteins.

Some of these alternative yeast cell factories are the fission yeast *Schizosaccharomyces pombe* [6], *Kluyveromyces lactis* [7], methylotrophic species such as *Pichia pastoris* [8], *Candida boidinii* [9], *Pichia methanolica* [10] and *Hansenula polymorpha* [11], and the dimorphic species *Yarrowia lipolytica* [12] and *Arxula adeninivorans* [13]. The performance of these alternative hosts frequently surpasses that of *S. cerevisiae* in terms of product yield, reduced hyperglycosylation and secretion efficiency, especially for high molecular weight proteins [14].

Several reviews compare the advantages and limitations of expression systems for foreign genes [15-20]. Among them, *Pichia pastoris* has emerged in the last decade as the favorite yeast cell factory for the production of heterologous proteins. In a search of the ISI Web of Knowledge (Web of Science) with the keywords microorganism + heterologous protein, *P. pastoris* is the preferred host (667 entries), followed by *Candida* and *Schizosaccharomyces* (161 and 124 entries, respectively). Specifically for heterologous lipase production, *P. pastoris* is the most used host [21].

Why did *P. pastoris* emerge as an excellent host system to produce recombinant products? The story started one decade after the oil crisis of the 1970s, when Phillips Petroleum and the Salk Institute Biotechnology/Industrial Associates Inc. (SIBIA, La Jolla, CA, USA) used *Pichia* as a host system for heterologous protein expression [22-24]. Nowadays, more than 500 proteins have been expressed using this system [25], and it has also been selected by several protein production platforms for structural genomics programs [26]. *P. pastoris* combines the ability to grow on minimal medium to very high cell densities (higher than 100 g DCW/L) with secretion of the heterologous protein, which simplifies its recovery. It also performs many of the higher eukaryotic post-translational modifications, such as protein folding, proteolytic processing, disulfide bond formation and glycosylation [24]. However, both N- and O-linked oligosaccharide structures have been shown to be quite different from those of mammalian cells; for example, they are of a heterogeneous high-mannose type. As a consequence, high-mannose type N-glycans attached to recombinant glycoproteins can be cleared rapidly from the human bloodstream, and they can cause immunogenic reactions in humans [27]. Nevertheless, GlycoFi's glycoengineering technology allows the generation of yeast strains capable of replicating the most essential steps of the N-glycosylation pathway found in mammals [28].

But probably the most important characteristic of *P. pastoris* is the existence of a strong and tightly regulated promoter from the alcohol oxidase 1 gene, *PAOX1*. Thus, methanol is used as both carbon source and inducer of heterologous protein production in this system [29].

Daly and Hearn [30] reviewed various aspects of the *P. pastoris* expression system and considered the factors that need to be taken into account to achieve successful recombinant protein expression, particularly when more complex systems are contemplated, such as those used in tandem gene or multiple gene copy experiments. Among these are several genetic and physiological factors, such as the codon usage of the expressed gene, the gene copy number, efficient transcription by using strong promoters, translation signals, translocation determined by the secretion signal peptide, processing and folding in the endoplasmic reticulum and Golgi and, finally, secretion out of the cell, as well as protein turnover by proteolysis, together with the optimization of the fermentation strategy [31].

The objective of this chapter is to review the classic and alternative operational strategies to maximize yield and/or productivity from an industrial point of view, and also how to obtain a reproducible product from batch to batch by applying process analytical technology (BioPAT).

## **2. Host strains and** *PAOX1* **promoter**

Host strains and vectors are available as commercial kits from Invitrogen Corporation (Carlsbad, CA) [32]. *PAOX1* is the preferred promoter. Before designing operational strategies, it is necessary to know the machinery that induces this promoter and how *Pichia* metabolizes methanol.

*PAOX1* is strongly repressed in the presence of glucose, glycerol, ethanol and most other carbon sources, and strongly induced by the presence of methanol [33]. Alcohol oxidase is the first enzyme of the methanol assimilation pathway and catalyzes the oxidation of methanol to formaldehyde [34]. The genome of *Pichia* contains two genes encoding this enzyme, *AOX1* and *AOX2*. Around 85% of the alcohol oxidase activity is due to the *AOX1* gene, whereas the *AOX2* gene accounts for the other 15% [35]. AOX can reach 30% of the total cell protein when the cells are growing on methanol, which compensates for the low affinity of the enzyme for methanol [22].

Three types of *P. pastoris* host strains are available, which vary with regard to their ability to utilize methanol: the wild type or methanol utilization plus phenotype (Mut+), and the strains resulting from deletions in the *AOX1* gene, methanol utilization slow (Muts), or in both *AOX* genes, methanol utilization minus (Mut-) [36].

Although *PAOX1* is the most commonly used promoter, it presents a series of limitations. Oxygen supply becomes a major concern in methanol non-limited fed-batch cultures when high cell densities are desired for a production process using the Mut+ phenotype, since the oxygen transfer capacity of the bioreactor is unable to sustain the metabolic oxygen demand [24]. Another important disadvantage of *PAOX1*, especially with the Mut+ phenotype in large-scale production, is the need to store huge amounts of methanol, which constitutes a potential industrial risk. In addition, methanol has a high heat of combustion (-727 kJ Cmol⁻¹) [37], so considerable heat is generated during growth on this carbon source. This requires rapid and efficient cooling systems, particularly at large scale, where heat losses through the bioreactor walls may be limiting owing to the small surface area to volume ratio. Failure to remove this heat may result in an increase in reactor temperature, affecting the productivity and quality of the recombinant protein [38]. Furthermore, since methanol is mainly derived from petrochemical sources, additional purification steps may be required for the production of certain food products and additives [39].

## **3.** *Pichia* **Process Analytical Technology (PAT)**


4 Protein Engineering - Technology and Application

It is necessary to develop bioprocess optimization and control tools in order to implement Process Analytical Technology (PAT), called BioPAT when it is applied to bioprocesses [40]. This initiative has been promoted by regulatory agencies such as the FDA and EMEA [41]. PAT is a multidisciplinary platform for designing, analyzing and controlling manufacturing through timely measurements of critical quality and performance attributes of raw and in-process materials and processes, with the goal of ensuring final product quality [42]. The final goal is to guarantee consistent product quality at the end of the process, ease the regulatory review of the bioprocess and increase flexibility with respect to post-approval manufacturing changes [43] [Figure 1].

Applied to the *Pichia* cell factory, on-line monitoring of biomass, methanol and product is the dream of all researchers involved in the production of heterologous proteins in this host.

**Figure 1.** Scheme of a process analytical technology (PAT).

Different approaches have been applied for the on-line determination of biomass in *Pichia* fermentations. Multi-wavelength fluorescence coupled with PARAFAC-PLS chemometric methodology provides important qualitative and quantitative bioprocess information [Figure 2; Figure 3]. Biomass and substrate (glycerol or methanol) were determined successfully. The recombinant lipase, the heterologous product, could also be determined on-line in the exponential phase. However, in the stationary phase, where proteolytic problems appear, the product could not be estimated accurately [44-46].

Multi-wavelength fluorescence is not standard equipment in bioprocesses. Thus, when direct biomass quantification methods are not available, biomass can be determined from indirect on-line measurements using software sensors. Two non-linear observers, a nonlinear observer-based estimator (NLOBE) and second-order dynamic tuning (AO-SODE), and a linear estimator, recursive least squares with variable forgetting factor (RLS-VFF), have been applied to the *Pichia* bioprocess to estimate biomass, substrate and specific growth rate from different indirect measurements: carbon dioxide transfer rate (CTR) and oxygen uptake rate (OUR) from conventional infrared and paramagnetic gas analysis, and sorbitol. The AO-SODE algorithm using the on-line OUR measurement was the most efficient approach, demonstrating the robustness of this methodology [47]. A comparison of the performance of the different observers is presented in Table 1.


**Figure 2.** Scheme of the calibration and prediction processes for PARAFAC combined with PLS regression for state variables determination.


Methanol concentration, the inducer substrate, is the most important variable for on-line monitoring because the productivity of the bioprocess is closely related to it. Concentrations of 2-3.5 g/L are reported as optimal for maximizing protein production [48,49]; higher concentrations cause inhibition, and in some cases lower concentrations stop recombinant protein production [50].

Although chromatographic methods such as GC and HPLC are common for off-line analysis, their on-line implementation is unusual because of the low sampling frequency [49].

On-line methods are generally based on liquid-gas equilibrium, analyzing the fermenter exhaust gas [51]. Nowadays, commercial equipment based on this principle is available from Raven Biotech, Figaro Biotech and PTI Instruments [52]. This equipment is quite robust and requires minimal maintenance, although some precautions should be taken to obtain a precise measurement [53].

**Figure 3.** Summary of the application of on-line PARAFAC approach (NOC = Normal Operating Conditions).

Other alternatives are sequential injection analysis [54], Fourier transform mid-infrared spectroscopy [49] and flame ionization [55].

Process optimization can only be completed with an effective measurement of heterologous protein production. Classical methods such as ELISA, SDS-PAGE, Western blots or bioactivity assays are time-consuming, labour-intensive and not applicable to the determination of the product in real time [51]. Methods including perfusion chromatography, specific biosensors and immunonephelometric assays are limited to proteins secreted into the extracellular culture broth and do not cover intracellular protein production [56,57]. To circumvent this problem, a GFP marker can be fused to the recombinant protein so that it can be detected by fluorescence [58]. However, the expression of this fusion protein could cause a loss in the production of the recombinant product. When the recombinant protein has an associated colorimetric reaction, as is the case for enzymes, analytical approaches using flow injection analysis (FIA) or sequential injection analysis (SIA) are widely used [59].

One of the most fully automated *Pichia* bioprocesses has been developed by the group of Professor Luttmann [60]. An example of on-line monitoring and control of a *Pichia* bioprocess producing *Rhizopus oryzae* lipase is presented in Figure 4. The real-time evolution of the main parameters, variables and specific rates of this bioprocess is presented in Figures 5a and 5b.

**Table 1.** Comparison of three different observers for the estimation of biomass, substrate and specific growth rate.

## **4. Operational strategies using** *PAOX1* **Mut+ phenotype**


Some of the operational strategies using the Mut+ phenotype are designed to circumvent the operational problems discussed above. Invitrogen Co. only provides an operational manual for fed-batch growth of *Pichia* [61], mainly derived from the protocols of Brierley and coworkers [62]. Fed-batch fermentation protocols include three phases: a glycerol batch phase (GBP), a transient phase (TP) and, finally, a methanol induction phase (MIP). Normally the GBP and TP are similar for both phenotypes (Mut+ and Muts). The objective of the GBP is the fast generation of biomass prior to induction with methanol. The specific growth rate and yield of *Pichia* growing on glycerol range from 0.18 h⁻¹ and 0.5 g DCW per gram of glycerol [63] to 0.26 h⁻¹ and 0.7 g DCW per gram of glycerol [67]. Brierley and coworkers recommended a maximum glycerol concentration of 6% [62]; higher concentrations inhibit growth [68]. The specific growth rate and yield on glycerol are higher than on methanol (0.12 h⁻¹ and 0.27 g DCW per gram of methanol) [62]. When a higher initial biomass concentration is required, a second step with an exponential glycerol feeding rate is implemented. It is important that dissolved oxygen (DO) stays above 20-30% during the GBP to avoid the production of ethanol.

Once the GBP is finished, as indicated by a spike in the measured DO or a decrease in the CO2 evolution rate (CER), the TP is started. The objective of the TP is to increase the biomass level to generate high cell density cultures, together with the derepression of the AOX1 promoter owing to the absence of excess glycerol prior to the MIP. Different strategies are collected in a set of reviews [32, 34, 51, 52].

**Figure 4.** Bioprocess scheme for on-line monitoring and control of *Pichia pastoris* producing recombinant *Rhizopus oryzae* lipase.

The operational strategy selected for the MIP is one of the most important factors in maximizing heterologous protein production [67]. Strategies using the Mut+ phenotype have to circumvent the problems associated with the maximum methanol consumption capacity pointed out above.

At this point, monitoring and control of the inducer substrate, methanol, is the key issue. High levels of methanol can inhibit cell growth [67], and low levels may not be enough to initiate AOX transcription [8]. Growth inhibition by methanol follows an uncompetitive inhibition model, with a reported critical methanol concentration between 3 and 5.5 g/L depending on the target protein [34]. Thus, a methanol set-point of around 2 g/L seems an optimal value to maximize protein production. Although keeping a constant methanol concentration during the induction phase has positive effects on the production of foreign protein [65], some authors have pointed out that designing an optimal methanol or specific growth rate profile along the MIP maximizes the productivity of the process [68].

**Figure 5.** A.- Example of the on-line monitoring of *Pichia pastoris* producing recombinant *Rhizopus oryzae* lipase. Real time performance of standard fermentation parameters. B.- Example of the on-line monitoring of *Pichia pastoris* producing recombinant *Rhizopus oryzae* lipase. Real time evolution of biomass, substrate and product and their corresponding specific growth rate, methanol consumption rate and lipase production rate.
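The uncompetitive inhibition model mentioned above can be illustrated with a short sketch. It assumes one common form of substrate-inhibited growth kinetics (often called the Haldane or Andrews equation); the chapter does not state which expression the cited works used, and the parameter values below are illustrative placeholders chosen only so that the growth rate peaks near 3 g/L, not values taken from the literature cited here.

```python
import math


def specific_growth_rate(s, mu_max, k_s, k_i):
    """Substrate-inhibited growth: mu = mu_max * S / (K_s + S + S^2 / K_i)."""
    return mu_max * s / (k_s + s + s * s / k_i)


def critical_concentration(k_s, k_i):
    """Substrate level where mu peaks (d mu / dS = 0): S_crit = sqrt(K_s * K_i)."""
    return math.sqrt(k_s * k_i)


# Illustrative parameters (assumptions, not data from the chapter).
MU_MAX = 0.12  # 1/h
K_S = 0.5      # g/L, saturation constant
K_I = 20.0     # g/L, inhibition constant

s_crit = critical_concentration(K_S, K_I)
for s in (0.5, 2.0, s_crit, 10.0):
    mu = specific_growth_rate(s, MU_MAX, K_S, K_I)
    print(f"S = {s:5.2f} g/L -> mu = {mu:.4f} 1/h")
```

In this form the growth rate rises with methanol up to the critical concentration and falls beyond it, which is why a set-point somewhat below the critical value leaves a safety margin against accidental accumulation.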

It is quite difficult to compare the performance of different fed-batch strategies applied to different heterologous proteins. Moreover, the selection of the fed-batch strategy depends on the facilities available to monitor methanol or other key variables such as biomass or recombinant product.

Simple strategies, such as adding pulses of methanol at different time intervals, should be limited to basic studies aimed at obtaining a quantity of recombinant protein for preliminary characterization or structural studies; they are not realistic from an industrial point of view.

Several strategies have been proposed to optimize the methanol feeding rate with the final objective of maximizing protein production and to get a reproducible bioprocess:

## **5. DO-stat control**

*Pichia* cells utilize methanol through the oxidative pathway only when oxygen is non-limiting [34]. Thus, DO must be controlled above a minimum level of around 20% [69]. However, oxygen limitation has been used successfully to control methanol uptake during single-chain antibody fragment production [70,71], and other groups have proposed using oxygen as the growth-limiting nutrient instead of methanol to circumvent the problem of high oxygen demand, observing 16-55% improvements in product concentrations [72,73]. Recently, an oxygen-limited process has been developed and optimized for the production of monoclonal antibodies in a glycoengineered *P. pastoris* strain, using oxygen uptake rate as the scale-up parameter from 3 L laboratory scale to 1200 L pilot plant scale. Scalability and productivity were improved while reducing oxygen consumption and cell growth [74-76]. On the other hand, excessively high DO levels are cytotoxic and reduce cell viability [77].

Although different DO-stat control strategies have been developed [77-80], this approach cannot detect a possible accumulation of methanol. In that situation the DO signal increases because of the inhibitory effect of methanol on growth, and the response of the DO controller is to increase the methanol feeding rate, aggravating the problem. This is particularly problematic at the beginning of the induction phase, when *AOX1* is not yet strongly induced and the AOX activity in the cells is growth-rate limiting but constantly increasing as a result of the induction [32].

## **5. Methanol open-loop control**


In this simple strategy, the (exponential) methanol feeding rate profile is obtained from mass balance equations with the objective of maintaining a constant specific growth rate (µ) under methanol-limiting conditions (no accumulation of methanol should be observed). To implement a preprogrammed exponential feeding rate strategy, the biomass concentration and volume at the beginning of the MIP have to be known, and a constant biomass/substrate yield is assumed throughout the induction phase. This strategy has problems in terms of robustness and process stability because, although open-loop systems are easy to implement, they do not respond to perturbations of the bioprocess. To avoid this problem, the set-point of µ is fixed far from µmax, which diminishes the productivity of the process. Nevertheless, this simple strategy has been applied successfully in different bioprocesses [81-84]. On the other hand, when the recombinant protein affects the growth of the host so that its µmax is lower than that of the wild-type strain, as in the production of *Rhizopus oryzae* lipase under methanol-limiting conditions, production stops a few hours after the beginning of the MIP (personal communication of the author).
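As a minimal sketch of how such a preprogrammed profile can be computed, the snippet below uses the standard substrate-limited fed-batch balance, in which all methanol fed is consumed for growth at a constant yield, so that F(t) = µ·X0·V0·exp(µ·t) / (Y_XS·S_feed). This expression is a textbook simplification assumed here (no maintenance term, no residual methanol), not necessarily the exact equation used in the cited works. The yield of 0.27 g DCW per gram of methanol is the value quoted earlier in the chapter; the set-point, initial biomass, initial volume and feed concentration (taken as roughly that of pure methanol) are hypothetical.

```python
import math


def exponential_feed_rate(t_h, mu_set, x0, v0, y_xs, s_feed):
    """Feed rate (L/h) at time t_h (h) that sustains growth at mu_set (1/h).

    x0: biomass at the start of induction (g DCW/L)
    v0: broth volume at the start of induction (L)
    y_xs: biomass/substrate yield (g DCW per g methanol), assumed constant
    s_feed: methanol concentration in the feed (g/L)
    """
    return mu_set * x0 * v0 * math.exp(mu_set * t_h) / (y_xs * s_feed)


# Hypothetical operating point, for illustration only.
MU_SET = 0.05    # 1/h, deliberately well below mu_max on methanol
X0 = 30.0        # g DCW/L
V0 = 3.0         # L
Y_XS = 0.27      # g DCW/g methanol (value quoted in the text)
S_FEED = 790.0   # g/L, approximately pure methanol (assumption)

for hour in range(0, 25, 6):
    rate = exponential_feed_rate(hour, MU_SET, X0, V0, Y_XS, S_FEED)
    print(f"t = {hour:2d} h -> F = {1000.0 * rate:.1f} mL/h")
```

Because the schedule depends only on the clock and on the initial estimates, any error in X0 or in the assumed yield propagates through the whole induction phase, which is the robustness problem described above.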

## **6. Methanol closed-loop control**

In the previous strategies, methanol concentration is neither measured on-line nor directly controlled [51]. Thus, accurate monitoring and control of methanol concentration is required. As noted above, different analytical approaches have been implemented for on-line monitoring of methanol concentration in *Pichia* fermentations. Analytical devices based on liquid-gas equilibrium, which analyze the exhaust gas from the fermenter, are the most widely used. A set of methanol sensors is available on the market from Raven Biotech, Figaro Electronics, PTI Instruments and Frings America [52, 85]. The first attempts aimed to maintain the methanol concentration throughout the induction phase at a constant, optimal value to maximize protein production or bioprocess productivity. In recent years, however, some approaches have sought to define an optimal variable methanol set-point as a function of the different stages of the induction phase. A scheme of both methanol feeding strategies, open and closed loop, is presented in Figure 6.

**Figure 6.** Scheme of methanol feeding strategies: open loop and closed loop control.

Different methanol concentration control algorithms and strategies have been proposed. On-off control is the simplest feedback control strategy and has been used by different authors [81, 85-88]. However, *Pichia* fermentation, like bioprocesses in general, is characterized by complex and highly non-linear process dynamics. For this reason, this control strategy is inadequate for precise control of methanol concentration in the bioreactor, because it can result in a methanol concentration that fluctuates around the set-point [34]. With the Muts phenotype, where the methanol consumption rate is lower than with the Mut+ phenotype, this control algorithm performs better.

**Figure 7.** Comparison of the performance of the different methanol control algorithms in *Pichia pastoris* bioprocess producing recombinant lipase.

A proportional-integral (PI) or proportional-integral-derivative (PID) control algorithm is a more effective approach. Nevertheless, the optimal settings of the PID controller (gain KC, integral time constant τI and derivative time constant τD) are hard to ascertain by trial-and-error tuning or other empirical methods. Some authors have applied the Bode stability criterion to obtain the parameters of this kind of controller, with good results on methanol regulation in short fermentations [77,88]. Because of the dynamics of the system, the optimal control parameters may vary significantly during the fermentation. Moreover, the significant response time of both the on-line methanol determination and the biological system has promoted the development of other control alternatives [34].
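As a rough illustration of the PI/PID approach described above, the following minimal Python sketch implements a discrete PID law acting on the methanol feed rate. The names `kc`, `tau_i` and `tau_d` mirror KC, τI and τD; the clamping to a maximum pump rate `u_max`, the simple anti-windup rule and all numerical values are illustrative assumptions, not settings taken from the cited works.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PID:
    """Discrete positional PID controller (illustrative sketch only)."""
    kc: float      # controller gain KC (feed-rate units per g/L of error)
    tau_i: float   # integral time constant tau_I (h)
    tau_d: float   # derivative time constant tau_D (h)
    u_max: float   # assumed maximum pump rate; output is clamped to [0, u_max]
    integral: float = 0.0
    prev_error: Optional[float] = None

    def update(self, set_point: float, measured: float, dt: float) -> float:
        """Return the feed rate for one sampling interval of length dt (h)."""
        error = set_point - measured
        # No derivative action on the very first sample.
        derivative = 0.0 if self.prev_error is None else (error - self.prev_error) / dt
        candidate = self.integral + error * dt
        u = self.kc * (error + candidate / self.tau_i + self.tau_d * derivative)
        # Anti-windup: keep the new integral only if the output is not saturated.
        if 0.0 <= u <= self.u_max:
            self.integral = candidate
        self.prev_error = error
        return min(max(u, 0.0), self.u_max)
```

Setting `tau_d = 0` reduces the law to a PI controller. Because the optimal parameters may drift during the fermentation, a fixed-parameter controller of this kind would probably need re-tuning or gain scheduling in practice.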

A predictive control algorithm coupled with a PI feedback controller has been implemented successfully in heterologous *Rhizopus oryzae* lipase production. It is based on the on-line calculation of the methanol uptake from the substrate mass balance in fed-batch cultivations, which requires the first time derivative of methanol concentration for each time interval. This predictive part is coupled to a feedback term (PI) that regulates the addition, aiming at stabilizing the signal around the set-point [89]. Although this strategy was implemented in the Muts phenotype, it has also been applied successfully in the Mut+ phenotype. A comparison of the performance of the different control algorithms is presented in Figure 7.
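The idea of a predictive term computed from the substrate mass balance plus a PI correction can be sketched as follows. This is a minimal sketch under stated assumptions, not the algorithm of [89]: it assumes the fed-batch balance dS/dt = F(S0 - S)/V - r, a backward difference for dS/dt, and hypothetical function and parameter names.

```python
def methanol_uptake_rate(feed, s_feed, s_now, s_prev, volume, dt):
    """Volumetric methanol uptake rate r (g L^-1 h^-1).

    Rearranged fed-batch balance: r = F (S0 - S) / V - dS/dt,
    with dS/dt approximated by a backward difference over dt (h).
    """
    ds_dt = (s_now - s_prev) / dt
    return feed * (s_feed - s_now) / volume - ds_dt


def predictive_pi_feed(r_uptake, s_set, s_now, s_feed, volume,
                       integral, dt, kc, tau_i):
    """Return (feed rate in L/h, updated integral of the error).

    Predictive term: feed that balances the estimated uptake at the set-point.
    Feedback term: PI correction on the set-point error.
    """
    error = s_set - s_now
    integral = integral + error * dt
    feed_forward = r_uptake * volume / (s_feed - s_set)
    feedback = kc * (error + integral / tau_i)
    return max(feed_forward + feedback, 0.0), integral


# Hypothetical numbers: pure methanol feed (about 790 g/L), 2 L broth, 2 g/L set-point.
r = methanol_uptake_rate(feed=0.01, s_feed=790.0, s_now=2.0, s_prev=2.0,
                         volume=2.0, dt=0.1)
feed, integral = predictive_pi_feed(r, s_set=2.0, s_now=2.0, s_feed=790.0,
                                    volume=2.0, integral=0.0, dt=0.1,
                                    kc=0.002, tau_i=1.0)
```

When the measured concentration sits at the set-point and is not changing, the feedback term vanishes and the predictive term alone reproduces the current feed rate. The numerical derivative amplifies measurement noise, so a real implementation would likely filter the methanol signal first.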

Model-based on-line parameter estimation and on-line optimization algorithms have been developed to determine optimal inducer feeding rates. Continuous fermentation using methanol was performed via on-line methanol measurement and control using a minimal-variance controller and a semi-continuous Kalman filter [90].

## **7. Strategies to minimize oxygen demand**

The standard fed-batch fermentation without oxygen limitation is named methanol non-limited fed-batch (MNLFB). Independently of the strategy selected, high cell density cultures of the Mut+ *P. pastoris* phenotype in laboratory bioreactors present the problem of oxygen supply, since the oxygen transfer capacity of the bioreactor is unable to sustain the metabolic oxygen demand [91]. When the biomass reaches values higher than 60 gDCW/L, oxygen limitation appears, even when mixtures of air and oxygen or pure oxygen are used. Different approaches have been published to overcome this disadvantage:
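The mismatch between oxygen transfer capacity and oxygen demand can be made concrete with the usual balance OTR = kLa (C* - C) against OUR = qO2 X. The sketch below is only an illustration: the function names are hypothetical and the values of kLa, C* and qO2 are placeholders, not data from the cited works.

```python
def oxygen_transfer_rate(kla, c_sat, do_fraction):
    """OTR = kLa * (C* - C) in g O2 L^-1 h^-1, with C = do_fraction * C*."""
    return kla * c_sat * (1.0 - do_fraction)


def oxygen_uptake_rate(q_o2, biomass):
    """OUR = qO2 * X in g O2 L^-1 h^-1 (qO2 in g O2 gDCW^-1 h^-1, X in gDCW/L)."""
    return q_o2 * biomass


def biomass_at_oxygen_limit(kla, c_sat, q_o2, do_fraction=0.0):
    """Biomass concentration at which OUR equals the available OTR."""
    return oxygen_transfer_rate(kla, c_sat, do_fraction) / q_o2


# Placeholder values (assumptions): kLa = 800 h^-1, C* = 0.0075 g/L,
# qO2 = 0.1 g O2 gDCW^-1 h^-1, DO held at 25% of saturation.
x_limit = biomass_at_oxygen_limit(kla=800.0, c_sat=0.0075, q_o2=0.1, do_fraction=0.25)
```

Enriching the inlet gas with oxygen raises C*, and lowering qO2 (for example through a lower specific growth rate) raises the biomass that can be sustained, which is the lever that the strategies listed next act on.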

Temperature-limited fed-batch (TLFB). In this strategy the common methanol limitation is replaced by temperature limitation in order to avoid oxygen limitation at high cell density [92]. The temperature controller is programmed to maintain a DO set-point of around 25%: when DO falls below the set-point, the culture temperature is decreased [32]. With this approach cell death decreases drastically and proteolysis of the product is also reduced, although the specific growth rate diminishes and this sometimes affects the productivity of the bioprocess negatively [92]. This strategy has been applied successfully in the production of different heterologous proteins [92-96].

Methanol limited fed-batch strategy (MLFB). The strategy is applied once the DO value under non-limited conditions falls below the set-point (around 25%). At this point the methanol feeding rate is controlled in order to maintain the DO set-point. From then on, methanol concentration diminishes from the methanol set-point to limiting conditions. Although specific productivity can diminish, production of the heterologous product does not stop [84, 91, 97-98].
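Both strategies use the DO signal as the controlled variable and differ only in the manipulated variable. The following sketch shows one control step for each; the step size, gain, temperature bounds and maximum feed rate are hypothetical values chosen for illustration, and the simple incremental rules are an assumption rather than the controllers used in the cited works.

```python
def tlfb_step(do_percent, temperature, do_set=25.0, step=0.5,
              t_min=12.0, t_max=30.0):
    """TLFB: lower the culture temperature (deg C) when DO is below the
    set-point, raise it otherwise, within assumed bounds."""
    if do_percent < do_set:
        temperature -= step
    else:
        temperature += step
    return min(max(temperature, t_min), t_max)


def mlfb_step(do_percent, feed_rate, do_set=25.0, gain=0.02, feed_max=20.0):
    """MLFB: scale the methanol feed rate in proportion to the DO error,
    so that the feed decreases when DO is below the set-point."""
    feed_rate *= 1.0 + gain * (do_percent - do_set)
    return min(max(feed_rate, 0.0), feed_max)
```

For example, with DO at 20% the temperature rule moves a 30 °C culture to 29.5 °C, while the feed rule reduces a feed rate of 10 (arbitrary units) to 9.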

## **8. Operational Strategies using** *PAOX1* **Muts Phenotype**

The Mut+ phenotype under *PAOX1* is probably the most commonly used *P. pastoris* strain. However, as has been commented throughout the chapter, it presents important operational problems related to oxygen and heat demand and to methanol safety requirements. From the biological point of view, the Muts phenotype can be used instead, since it requires less oxygen supply and heat removal. However, its specific growth rate on methanol as sole carbon source is very low compared with Mut+, and low levels of biomass are produced [34,50]. Although from the bioprocess engineering point of view the slow operational conditions facilitate the control and reproducibility of the bioprocess, the fermentation time increases and sometimes the productivity of the process decreases drastically.

## **9. Mixed substrates**


All the strategies previously described for Mut+ phenotype can be applied to Muts phenotype, but to increase cell density and process productivity, as well as to reduce the induction time, a typical approach is the use of a multicarbon substrate in addition to methanol. It is a simple strategy to increase the energy supply to recombinant cells and the concentration of the carbon sources in the culture broth [81, 86, 88].

One of the most frequently selected co-substrates is glycerol. Several authors have reported that the use of mixed feeds of glycerol and methanol during the induction phase increases productivity and feeding rates [99]. An advantage of using glycerol as co-substrate is that its enthalpy of combustion, -549.5 kJ C-mol⁻¹ [100], is lower in magnitude than that of methanol, -727 kJ C-mol⁻¹ [37]. Thus, less heat will be released using mixed substrates than using methanol alone. In addition, oxygen consumption is also reduced, since less oxygen is necessary for the oxidation of glycerol [38]. Any method that reduces the heat production and oxygen consumption rates without affecting productivity would therefore be clearly advantageous.
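The effect of the feed composition on heat release can be estimated from the two enthalpies quoted above. The sketch below computes the enthalpy of combustion of a mixed feed per C-mol as a linear combination; the function name is hypothetical, and the result is only an upper bound on the heat released, since it assumes complete oxidation and ignores the carbon retained in biomass and product.

```python
DH_GLYCEROL = -549.5  # kJ per C-mol, value quoted in the text [100]
DH_METHANOL = -727.0  # kJ per C-mol, value quoted in the text [37]


def mixed_feed_enthalpy(methanol_cmol_fraction):
    """Enthalpy of combustion of a glycerol-methanol feed (kJ per C-mol of feed).

    methanol_cmol_fraction is the fraction of the fed carbon supplied as methanol.
    """
    if not 0.0 <= methanol_cmol_fraction <= 1.0:
        raise ValueError("fraction must lie between 0 and 1")
    x = methanol_cmol_fraction
    return x * DH_METHANOL + (1.0 - x) * DH_GLYCEROL


def relative_heat_reduction(methanol_cmol_fraction):
    """Fractional reduction in combustion enthalpy relative to methanol alone."""
    return 1.0 - mixed_feed_enthalpy(methanol_cmol_fraction) / DH_METHANOL
```

Under these assumptions, a feed supplying half of its carbon as glycerol has a combustion enthalpy of -638.25 kJ C-mol⁻¹, about 12% lower in magnitude than methanol alone.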

However, glycerol is reported to repress the expression of alcohol oxidase and, subsequently, the expression of the target protein [101]. Thus, the rational design of operational strategies for the addition of both substrates in fed-batch fermentation, while avoiding glycerol repression, is the key point of the bioprocess. Different strategies have been developed for the Mut+ phenotype [24, 32, 52, 102-105]. One of the most widely applied is a pre-programmed exponential feeding rate with an optimum methanol-glycerol ratio [38, 106], or a similar strategy that maintains a residual methanol concentration between 1 and 2 g l⁻¹ [78]. The effect of different methanol-glycerol ratios at constant feeding rate has also been studied in the production of mouse α-amylase [107].
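A pre-programmed exponential feeding rate is commonly derived from a substrate balance under limiting conditions. The sketch below uses the textbook expression F(t) = µ X0 V0 exp(µt) / (Y S0); it assumes a constant biomass yield, negligible residual substrate and no maintenance term, and all names and numbers are illustrative rather than taken from the cited works.

```python
import math


def exponential_feed_rate(t, mu_set, x0, v0, yield_xs, s_feed):
    """Open-loop feed rate (L/h) that sustains a constant specific growth rate.

    t         time since the start of the feed (h)
    mu_set    desired specific growth rate (h^-1)
    x0, v0    biomass concentration (gDCW/L) and volume (L) at t = 0
    yield_xs  biomass yield on the limiting substrate (gDCW/g), assumed constant
    s_feed    substrate concentration in the feed (g/L)
    """
    return mu_set * x0 * v0 * math.exp(mu_set * t) / (yield_xs * s_feed)


# Hypothetical example: mu = 0.06 h^-1, 40 gDCW/L in 2 L, Y = 0.5, 500 g/L feed.
f0 = exponential_feed_rate(0.0, 0.06, 40.0, 2.0, 0.5, 500.0)
```

Because the profile is fixed in advance, it shares the limitation of every open-loop strategy: it cannot react if the culture grows more slowly than assumed, in which case substrate accumulates.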

One important feature shown in these works is that, although the maximum specific growth rate of *P. pastoris* is around 0.2 h⁻¹, the optimum specific growth rate in the Mut+ phenotype is around 0.06 h⁻¹, well below the maximum value. It seems that, even with glycerol under limiting conditions, high specific growth rates diminish the productivity of the bioprocess.

For this reason the use of different carbon sources other than glycerol may improve operational strategies on fed-batch cultures [99]. In contrast with glycerol, sorbitol accumulation during the induction phase does not affect the expression level of recombinant protein [108].

In shake flasks, the inhibitory effect of sorbitol on cell growth appears at concentrations around 50 g l⁻¹ [99]. Hence, control of the residual sorbitol concentration during the induction phase is less critical than with mixed feeds of glycerol and methanol. In addition, less oxygen is consumed during mixed-substrate growth on sorbitol and methanol than with the combination of glycerol and methanol or with methanol as sole carbon source [99]. However, sorbitol has the disadvantage that its maximum specific growth rate is very low, around 0.02 h⁻¹, similar to the value obtained for the Muts phenotype growing on methanol as sole carbon source. Thus, the fermentation time is long and sometimes the increase in production is not reflected in the productivity of the bioprocess.
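The practical weight of these specific growth rates is easier to see as doubling times, t_d = ln 2 / µ. The short sketch below evaluates this relation for the values of µ quoted in the text; the helper names are hypothetical and exponential growth at constant µ is assumed.

```python
import math


def doubling_time(mu):
    """Biomass doubling time (h) for a specific growth rate mu (h^-1)."""
    if mu <= 0:
        raise ValueError("mu must be positive")
    return math.log(2.0) / mu


def time_to_reach(x0, x_target, mu):
    """Time (h) to grow from x0 to x_target at constant mu."""
    return math.log(x_target / x0) / mu


for mu in (0.2, 0.06, 0.02):
    print(f"mu = {mu} h^-1 -> doubling time = {doubling_time(mu):.1f} h")
```

The doubling time rises from about 3.5 h at 0.2 h⁻¹ to about 11.6 h at 0.06 h⁻¹ and about 34.7 h at 0.02 h⁻¹, which is why a higher final titre on sorbitol does not necessarily translate into a higher volumetric productivity.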

Some different operational strategies have been implemented using sorbitol as co-substrate [99, 102, 106,109-114].

Arnau et al. [102,113] designed an operational strategy using the Muts phenotype to compare the co-substrates sorbitol and glycerol in the production of *Rhizopus oryzae* lipase. The induction phase started with a pre-programmed exponential feeding rate of sorbitol or glycerol, with the objective of maintaining a constant specific growth rate under substrate-limiting conditions. The methanol set-point was maintained using a predictive control algorithm coupled with PI feedback control [89]. A set of different specific growth rates and methanol set-points was tested. When sorbitol was used as co-substrate, the specific growth rates tested did not have a significant influence on the specific production rate of the bioprocess, probably because the use of a co-substrate improved the energetic state of the cells, partially overcoming the unfolded protein response (UPR) and the secretion problems observed in the production of this recombinant fungal lipase. The key parameter in terms of protein production was the methanol set-point selected. The optimal methanol concentration was 2 g l⁻¹; lower and higher concentrations diminished specific production rates. The product/biomass yield and the volumetric and specific productivities were 1.25-1.35 fold higher than using methanol as sole carbon source [113].

Irrespective of any economic reasons for using sorbitol or glycerol as co-substrate, one of the key advantages of glycerol over sorbitol is the higher µ it supports (0.2 h⁻¹ versus 0.02 h⁻¹) and the subsequent potential increase in the productivity of the bioprocess. However, for the Muts phenotype this potential advantage is ineffective, because when the specific growth rate on glycerol exceeds the µmax of *P. pastoris* growing on methanol as sole carbon source (around 0.014 h⁻¹), a repression of the *AOX* promoter is clearly observed, reflected in a drastic decrease in methanol consumption rates. Additionally, when the ratio µGly/µMeOH was larger than 4, an important decrease in all the productivity parameters of ROL was observed. Furthermore, the proteolytic activity detected when glycerol was used as co-substrate is another important drawback [102]. In conclusion, sorbitol gave better results than glycerol as co-substrate in the heterologous production of *Rhizopus oryzae* lipase.

*PAOX1* is strongly repressed by glucose at the transcription level. For this reason, few authors report positive results using this substrate. Nevertheless, a real-time parameter-based controlled glucose feeding strategy has been developed successfully for the recombinant production of phytases [115]. Mixtures of glucose and methanol have also been used in continuous cultures producing recombinant trypsinogen [116].

## **10. Alternative promoters**

An important set of inducible promoters derived from genes that code for enzymes involved in methanol metabolism is used as an alternative to the classical *PAOX1*. A summary of the main alternative promoters is presented in Table 2. The formaldehyde dehydrogenase promoter *PFLD1*, inducible by methanol or methylamine [116], the dihydroxyacetone synthase promoter *PDHAS* [101], and the peroxisomal matrix protein gene promoter *PEX8*, inducible by methanol or oleate [118], are some examples. Another inducible promoter is that of isocitrate lyase 1, *PICL1*. This promoter is induced by ethanol and repressed by glucose in the exponential phase, but not in the stationary phase [119].

**Table 2.** Summary of the main inducible and constitutive alternative promoters to *PAOX1*.


However, these alternative promoters have operational problems similar to those of *PAOX1*, especially when methanol is still used as inducer, because of the associated safety problems. This is the cause of a strong demand for alternative regulated promoters [120]. Among the alternatives, the constitutive glyceraldehyde-3-phosphate dehydrogenase promoter *PGAP* is the most commonly used [121]. Other constitutive promoters are the translation elongation factor 1-α promoter *PTEF1* [122], the promoter of *YPT1*, a GTPase involved in secretion [123], and the promoter of 3-phosphoglycerate kinase, a glycolytic enzyme, *PPGK1* [124].

Stadlmayr *et al.* [120] identified 24 novel potential regulatory sequences from microarray data and tested their applicability to drive the expression of both intracellular and secreted recombinant proteins over a broad range of expression levels. Although the production of the model proteins did not exceed the values obtained with the constitutive promoter *PGAP*, higher transcription levels at certain growth phases were detected with the translation elongation factor EF-1 promoter *PTEF1* and with *PTHI1*, the promoter of a protein involved in the synthesis of the thiamine precursor.

Among them, only the inducible *PFLD1* and the constitutive *PGAP* have been applied in routine production processes, especially the latter.

The *FLD1* gene codes for an enzyme that plays an important role in the catabolism of methanol as carbon source, as well as in the metabolism of methylated amines as nitrogen source [125]. *PFLD1* is strongly and independently induced either by methanol as carbon source or by methylamine as nitrogen source [117]. Preliminary experiments to find an alternative carbon source to methanol showed that sorbitol, a carbon source that does not repress the synthesis of the enzymes of methanol metabolism, also allows the induction of *PFLD1* by methylamine [126]. This suggests that the use of sorbitol as carbon source combined with methylamine as nitrogen source could be the basis for the development of methanol-free fed-batch fermentation. In fact, a methanol-free high cell density fed-batch strategy has been developed for the recombinant production of *Rhizopus oryzae* lipase. This fed-batch strategy has the same phases as a standard *PAOX1* process. The GBP is similar, but glycerol and ammonia, as carbon and nitrogen sources, are supplied in a stoichiometric relation so that both substrates are exhausted at the end of the GBP. The TP consists of a sorbitol-methylamine batch phase (SMBP), whose objective is the adaptation of the cells to the carbon and nitrogen sources used in the induction phase. Finally, in the methylamine induction phase (MAIP), two strategies have been implemented: a pre-programmed feeding rate that ensures a constant specific growth rate under sorbitol-limiting conditions, or maintaining a set-point of methanol at high specific growth rate [127]. The results showed that recombinant protein production is favored with the second strategy. When the performance of the bioprocess was compared with that of the classical *PAOX1* promoter, the results were quite similar in terms of process productivity [63].

The production of this recombinant lipase under *PFLD1* triggers the unfolded protein response (UPR), detected at the transcriptional level [128]. To overcome this problem, two cell engineering strategies have been developed and applied successfully: the constitutive expression of the induced form of the *Saccharomyces cerevisiae* unfolded protein response transcriptional factor Hac1, and the deletion of the *GAS1* gene, which encodes a β-1,3-glucanosyltransglycosylase that is GPI-anchored to the outer leaflet of the plasma membrane and plays a key role in yeast cell wall assembly [129]. This is an example of how the co-expression of proteins or the deletion of genes affects bioprocess engineering.

The great advantage of the constitutive GAP promoter is that the cloned heterologous protein is expressed along with cell growth, provided the protein is not toxic to the cells [130]. This promoter is more suitable for large-scale production because the hazard and cost associated with the storage and delivery of large volumes of methanol are eliminated [131]. It is also more suitable for the implementation of continuous cultures, which have rarely been described using *PAOX1* [134]. Thus, the features of the GAP expression system may contribute significantly to the development of cost-effective methods for large-scale production of heterologous recombinant proteins [132-133]. The efficiency of *PGAP* compared with *PAOX1* generally depends on the protein expressed, although a better-optimized operational strategy can sometimes mask the comparison.

In general, the substrates used with this promoter are glucose or glycerol. The standard operational strategy is a batch phase on glycerol followed by a fed-batch phase on glucose under open-loop control. The selection of the optimal sequence of both substrates is still under study. For instance, the production of rPEPT2 in cells growing on glucose was approximately 2 and 8 times higher than in cells grown on glycerol and methanol, respectively [135].

When using this expression system, the specific production rate increases asymptotically to a maximum value with increasing µ [68]. Maurer *et al.* developed a model to describe growth and product formation, and used it to optimize the feeding profile of glucose-limited fed-batch cultures to increase volumetric productivity under aerobic conditions [68]. Under hypoxic conditions, where growth is controlled by carbon source limitation while oxygen limitation is applied to modulate metabolism and heterologous protein productivity, an increase in specific productivity has been observed. This strategy has additional benefits, including lower aeration and lower final biomass concentration [73].

In conclusion, *PGAP* is the most promising alternative to the classical *PAOX1* promoter.
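An asymptotic dependence of the specific production rate on µ can be represented by a saturation-type expression such as qP = qP,max µ / (K + µ). The sketch below uses this form purely as an assumed example of asymptotic behaviour; it is not necessarily the model of [68], and `qp_max` and `k` are hypothetical parameters.

```python
def specific_production_rate(mu, qp_max, k):
    """Saturation-type kinetics: qP rises with mu and approaches qp_max.

    mu      specific growth rate (h^-1)
    qp_max  asymptotic specific production rate (product per gDCW per h)
    k       value of mu at which qP is half of qp_max (h^-1)
    """
    if mu < 0:
        raise ValueError("mu must be non-negative")
    return qp_max * mu / (k + mu)


# Hypothetical parameters: qp_max = 1.0 (arbitrary units), k = 0.05 h^-1.
rates = [specific_production_rate(mu, 1.0, 0.05) for mu in (0.0, 0.05, 0.10, 0.15)]
```

With this form, each further increase in µ yields a smaller gain in qP, so a feeding profile that maximizes volumetric productivity has to balance faster product formation against faster biomass accumulation.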

## **Acknowledgements**


This work was supported by the project CTQ2010-15131 of the Spanish Ministry of Science and Innovation, 2009-SGR-281 and the Reference Network in Biotechnology (XRB) (Generalitat de Catalunya).

## **Author details**

Francisco Valero\*

Address all correspondence to: Francisco.Valero@uab.cat

Department of Chemical Engineering, Engineering School, Universitat Autònoma de Barcelona, Bellaterra, Barcelona, Spain

## **References**


[9] Sakai Y, Akiyama M, Kondoh H, Shibano Y, Kato N. High level secretion of fungal glucoamylase using the *Candida boidinii* gene expression system. Biochimica Biophysica Acta 1996;1308 81-87.

[10] Raymond CK, Bukowski T, Holderman SD, Ching AF, Vanaja E, Stamm MR. Development of the methylotrophic yeast *Pichia methanolica* for the expression of the 65 kilodalton isoform of human glutamate decarboxylase. Yeast 1998;14 11-23.

[11] Kang HA, Gellissen G. *Hansenula polymorpha*. In: Gellissen G (ed) Production of recombinant proteins – novel microbial and eukaryotic expression systems. Weinheim: Wiley-VCH; 2005.

[12] Madzak C, Nicaud JM, Gaillardin C. *Yarrowia lipolytica*. In: Gellissen G (ed) Production of recombinant proteins – novel microbial and eukaryotic expression systems. Weinheim: Wiley-VCH; 2005.

[13] Böer E, Gellissen G, Kunze G. *Arxula adeninivorans*. In: Gellissen G (ed) Production of recombinant proteins – novel microbial and eukaryotic expression systems. Weinheim: Wiley-VCH; 2005.

[14] Madzak C, Gaillardin C, Beckerich JM. Heterologous protein expression and secretion in the non-conventional yeast *Yarrowia lipolytica*: a review. Journal of Biotechnology 2004;109 63-81.

[15] Yin J, Li G, Ren X, Herrler G. Select what you need: A comparative evaluation of the advantages and limitations of frequently used expression systems for foreign proteins. Journal of Biotechnology 2007;127 335-347.

[16] Böer E, Steinborn G, Kunze G, Gellissen G. Yeast expression platforms. Applied Microbiology and Biotechnology 2007;77 513-523.

[17] Graf A, Dragosits M, Gasser B, Mattanovich D. Yeast systems biotechnology for the production of heterologous proteins. FEMS Yeast Research 2009;9 335-348.

[18] Idiris A, Tohda H, Kumagai H, Takegawa A. Engineering of protein secretion in yeast: strategies and impact on protein production. Applied Microbiology and Biotechnology 2010;86 403-417.

[19] Porro D, Gasser B, Fossati T, Maurer M, Branduardi P, Sauer M, Mattanovich D. Production of recombinant proteins and metabolites in yeasts. Applied Microbiology and Biotechnology 2011;89 939-948.

[20] Çelik E, Çalik P. Production of recombinant proteins by yeast cells. Biotechnology Advances 2012;30 1108-1118.

[21] Valero F. Heterologous expression system for lipases: A review. Methods in Molecular Biology 2012;861 161-178.

[22] Cregg JM, Vedvick TS, Raschke WV. Recent advances in the expression of foreign genes in *Pichia pastoris*. Bio/Technology 1993;11 905-910.

[23] Lin Cereghino GP, Cregg JM. Applications of yeast in biotechnology: protein production and genetic analysis. Current Opinion in Biotechnology 1999;10 422-427.


[37] Weast RC. Handbook of Chemistry and Physics. Boca Raton (Florida): CRC Press Inc; 1980.

[38] Jungo C, Marison I, von Stockar U. Mixed feed of glycerol and methanol can improve the performance of *Pichia pastoris* cultures: A quantitative study based on concentration gradients in transient continuous cultures. Journal of Biotechnology 2007;128 824-837.

[39] Macauley-Patrick S, Fazenda ML, McNeil B, Harvey LM. Heterologous protein production using the *Pichia pastoris* expression system. Yeast 2005;22 249-270.

[40] Junker BH, Wang HY. Bioprocess monitoring and computer control: key roots of the current PAT initiative. Biotechnology and Bioengineering 2006;95(2) 325-336.

[41] FDA. Guidance for industry: PAT, a framework for innovative pharmaceutical manufacturing and quality assurance. Food and Drug Administration, Rockville.

[42] Wechselberger P, Seifert A, Herwig C. PAT method to gather bioprocess parameters in real-time using simple input variables and first principle relationships. Chemical Engineering Science 2010;65 5734-5746.

[43] Teixeira AP, Duarte TM, Carrondo MJT, Alves PM. Synchronous fluorescence spectroscopy as a novel tool to enable PAT applications in bioprocesses. Biotechnology and Bioengineering 2011;108(8) 1852-1861.

[44] Surribas A, Geissler D, Gierse A, Scheper T, Hitzmann B, Montesinos JL, Valero F. State variables monitoring by in situ multi-wavelength fluorescence spectroscopy in heterologous protein production by *Pichia pastoris*. Journal of Biotechnology 2006;124 412-419.

[45] Surribas A, Amigo JM, Coello J, Montesinos JL, Valero F, Maspoch S. Parallel factor analysis combined with PLS regression applied to the on-line monitoring of *Pichia pastoris* cultures. Analytical and Bioanalytical Chemistry 2006;385 1281-1288.

[46] Amigo JM, Surribas A, Coello J, Montesinos JL, Maspoch S, Valero F. On-line parallel factor analysis. A step forward in the monitoring of bioprocesses in real time. Chemometrics and Intelligent Laboratory Systems 2008;92 44-52.

[47] Barrigón JM, Ramón R, Rocha I, Valero F, Ferreira EC, Montesinos JL. State and specific growth estimation in heterologous protein production by *Pichia pastoris*. AIChE Journal 2012;58(10) 2966-2979.

[48] Cunha AE, Clemente JJ, Gomes R, Pinto F, Thomaz M, Miranda S, Pinto R, Moosmayer D, Donner P, Carrondo MJT. Methanol induction optimization for scFv antibody fragment production in *Pichia pastoris*. Biotechnology and Bioengineering 2004;86 458-467.

[49] Schenk J, Marison IW, von Stockar U. A simple method to monitor and control methanol feeding of *Pichia pastoris* fermentations using mid-IR spectroscopy. Journal of Biotechnology 2007;128 344-353.


[61] Invitrogen corporation http://www.invitrogen.com (accessed 3 September 2012)

[62] Brierley RA, Bussineau C, Kosson R, Melton A, Siegel RS. Fermentation development of recombinant *Pichia pastoris* expressing heterologous gene: bovine lysozyme. Annals of the New York Academy of Sciences 1990;589 350-362.

[63] Cos O, Resina D, Ferrer P, Montesinos JL, Valero F. Heterologous protein production of *Rhizopus oryzae* lipase in *Pichia pastoris* using the alcohol oxidase and formaldehyde dehydrogenase promoters in batch and fed-batch cultures. Biochemical Engineering Journal 2005;26 86-94.

[64] Jahic M, Rotticci-Mulder JC, Martinelle M, Hult K, Enfors SO. Modeling of growth and energy metabolism of *Pichia pastoris* producing a fusion protein. Bioprocess Biosystem Engineering 2002;24 385-393.

[65] Chiruvolu V, Eskridge K, Cregg J, Meagher M. Effects of glycerol concentration and pH on growth of recombinant *Pichia pastoris* yeast. Applied Biochemistry and Biotechnology 1998;75 163-173.

[66] Zhang W, Inan M, Meagher MM. Fermentation strategies for recombinant protein expression in the methylotrophic yeast *Pichia pastoris*. Biotechnology Bioprocess Engineering 2000;5 275-287.

[67] Maurer M, Kühleitner M, Gasser B, Mattanovich D. Versatile modeling and optimization of fed-batch processes for the production of secreted heterologous proteins with *Pichia pastoris*. Microbial Cell Factories 2006;5(37) 1-10.

[68] Sing S, Gras A, Vandal CF, Ruprecht J, Rana R, Martinez M, Strange PG, Wagner R, Byrne B. Large-scale functional expression of WT and truncated human adenosine A2A in *Pichia pastoris* bioreactor cultures. Microbial Cell Factories 2008;7 28.

[69] Khatri NK, Hoffmann F. Impact of methanol concentration on secreted protein production in oxygen-limited cultures of recombinant *Pichia pastoris*. Biotechnology and Bioengineering 2005;93 871-879.

[70] Khatri NK, Hoffmann F. Oxygen-limited control of methanol uptake for improved production of a single-chain antibody fragment with recombinant *Pichia pastoris*. Applied Microbiology and Biotechnology 2006;72 492-498.

[71] Charoenrat T, Ketudat-Cairns M, Sthendahl-Andersen H, Jahic M, Enfors SO. Oxygen-limited fed-batch process: an alternative control for *Pichia pastoris* recombinant protein processes. Bioprocess and Biosystem Engineering 2005;27 399-406.

[72] Baumann K, Maurer M, Dragosits M, Cos O, Ferrer P, Mattanovich D. Hypoxic fed-batch cultivation of *Pichia pastoris* increases specific and volumetric productivity of recombinant proteins. Biotechnology and Bioengineering 2008;100 177-183.

[73] Potgieter TI, Kersey SD, Mallem MR, Nylen AC, d'Anjou M. Antibody expression kinetics in glycoengineered *Pichia pastoris*. Biotechnology and Bioengineering 2010;106(6) 918-927.


main V by a recombinant *Pichia pastoris*: A simple system for the control of methanol concentration using a semiconductor gas sensor. Journal of Fermentation and Bioengineering 1998;86 482-487.

[86] Guarna MM, Lesnicki GJ, Tam BM, Robinson J, Radziminski CZ, Hasenwinkle D, Boraston A, Jervis E, Macgillivray RTA, Turner RFB, Kilburn DG. On line monitoring and control of methanol concentration in shake-flasks cultures of *Pichia pastoris*. Biotechnology and Bioengineering 1997;56 279-286.

[87] Zhang WH, Smith LA, Plantz BA, Siegel Vl, Meagher MM. Design of methanol feed control in *Pichia pastoris* fermentations based upon a growth model. Biotechnology Progress 2002;18 1392-1399.

[88] Cos O, Ramón R, Montesinos JL, Valero F. A simple model-based control for *Pichia pastoris* allows a more efficient heterologous protein production bioprocess. Biotechnology and Bioengineering 2006;95(1) 145-154.

[89] Curvers S, Brixius P, Klauser T, Thömmes J, Weuster-Botz D, Takors R, Wandrey C. Human chymotrypsinogen B production with *Pichia pastoris* by integrated development of fermentation and downstream processing. Part I. Fermentation. Biotechnology Progress 2001;17 495-502.

[90] Surribas A, Stahn R, Montesinos JL, Enfors SO, Valero F, Jahic M. Production of a *Rhizopus oryzae* lipase from *Pichia pastoris* using alternative operational strategies. Journal of Biotechnology 2007;130 291-299.

[91] Jahic M, Wallberg F, Bollok M, García P, Enfors SO. Temperature limited fed-batch technique for control of proteolysis in *Pichia pastoris* bioreactor cultures. Microbial Cell Factories 2003;2 1-6.

[92] Siren N, Weegar J, Dahlbacka J, Kalkkinen N, Fagervik K, Leisola M, von Weymarn N. Production of recombinant HIV-1 Nef (negative factor) protein using *Pichia pastoris* and a low-temperature fed-batch strategy. Biotechnology and Applied Biochemistry 2006;44 151-158.

[93] Yang M, Johnson SC, Murthy PN. Enhancement of alkaline phytase production in *Pichia pastoris*: Influence of gene dosage, sequence optimization and expression temperature. Protein Expression and Purification 2012;84(2) 247-254.

[94] Dragosits M, Frascotti G, Bernard-Granger L, Vazquez F, Giuliani M, Baumann K, Rodriguez-Carmona E, Tokkanen J, Parrilli E, Wiebe MG, Kunert R, Maurer M, Gasser B, Sauer M, Branduardi P, Pakula T, Saloheimo M, Penttila M, Ferrer P, Tutino ML, Villaverde A, Porro D, Mattanovich D. Influence of Growth Temperature on the Production of Antibody Fab Fragments in Different Microbes: A Host Comparative Analysis. Biotechnology Progress 2011;27(1) 38-46.

[95] Dragosits M, Stadlmann J, Albiol J, Baumann K, Maurer M, Gasser B, Sauer M, Altmann F, Ferrer P, Mattanovich D. The Effect of Temperature on the Proteome of Recombinant *Pichia pastoris*. Journal of Proteome Research 2009;8(3) 1380-1392.


[107] Resina D, Cos O, Ferrer P, Valero F. Developing high cell density fed-batch cultivation strategies for heterologous protein production in *Pichia pastoris* using the nitrogen source-regulated *FLD1* promoter. Biotechnology and Bioengineering 2005;91 760-767.

[108] Sreekrishna K, Brankamp RG, Kropp KE, Blankenship DT, Tsay JT, Smith PL, Wierschke JD, Subramaniam A, Birkenberger LA. Strategies for optimal synthesis and secretion of heterologous proteins in the methylotrophic yeast *Pichia pastoris*. Gene 1997;190 55-62.

[109] Inan M, Meagher MM. Non-repressing carbon sources for alcohol oxidase (*AOX1*) promoter of *Pichia pastoris*. Journal of Bioscience and Bioengineering 2001;92 585-589.

[110] Thorpe ED, D'Anjou MC, Daugulis A. Sorbitol as a non-repressing carbon source for fed-batch fermentation of recombinant *Pichia pastoris*. Biotechnology Letters 1999;21 669-672.

[111] Xie JL, Zhou QW, Pen D, Gan RB, Qin Y. Use of different carbon sources in cultivation of recombinant *Pichia pastoris* for angiostatin production. Enzyme and Microbial Technology 2005;36 210-216.

[112] Arnau C, Ramón R, Casas C, Valero F. Optimization of the heterologous production of *Rhizopus oryzae* lipase in *Pichia pastoris* system using mixed substrates on controlled fed-batch bioprocess. Enzyme and Microbial Technology 2010;46 494-500.

[113] Çelik E, Çalik P, Oliver SG. Fed-batch methanol feeding strategy for recombinant protein production by *Pichia pastoris* in the presence of co-substrate sorbitol. Yeast 2009;26 473-484.

[114] Hang H, Ye XY, Guo M, Chu J, Zhuang Y, Zhang M, Zhang S. A simple fermentation strategy for high-level production of recombinant phytase by *Pichia pastoris* using glucose as the growth substrate. Enzyme and Microbial Technology 2009;44 185-188.

[115] Paulova L, Hyka P, Branska B, Melzoch K, Kovar K. Use of a mixture of glucose and methanol as substrates for the production of recombinant trypsinogen in continuous cultures with *Pichia pastoris* Mut+. Journal of Biotechnology 2012;157 180-188.

[116] Shen S, Sulter G, Jeffries TW, Cregg JM. A strong nitrogen source-regulated promoter for controlled expression of foreign genes in the yeast *Pichia pastoris*. Gene 1998;216(1) 93-102.

[117] Liu H, Tan X, Russell KA, Veenhuis M, Cregg JM. *PER3*, a gene required for peroxisome biogenesis in *Pichia pastoris*, encodes a peroxisomal membrane protein involved in protein import. Journal Biological Chemistry 1995;270 10940-10951.

[118] Menendez J, Valdes I, Cabrera N. The ICL1 gene of *Pichia pastoris*, transcriptional regulation and use of its promoter. Yeast 2003;20(13) 1097-1108.


tinuous constitutive *Pichia pastoris* expression system. Biotechnology and Bioengineering 2004;31 330-334.

[132] Wu JM, Lin JC, Chieng LL, Lee CK, Hsu TA. Combined use of GAP and AOX1 promoter to enhance the expression of human granulocyte-macrophage colony-stimulating factor in *Pichia pastoris*. Enzyme and Microbial Technology 2003;33 453-459.

[133] Delroise JM, Dannau M, Gilsoul JJ, El Mejdoub T, Destain J, Portetelle D, Thonart P, Haubruge E, Vandelbol M. Expression of a synthetic gene encoding a *Tribolium castaneum* carboxylesterase in *Pichia pastoris*. Protein Expression and Purification 2005;42 286-294.

[134] Zhang A-L, Luo J-X, Zhang T-Y, Pan Y-W, Tan Y-H, Fu C-Y, Tu F-z. Recent advances on the GAP promoter derived expression system of *Pichia pastoris*. Molecular Biology Reports 2009;36 1611-1619.

[135] Döring F, Klapper M, Theis S, Daniel H. Use of the glyceraldehyde-3-phosphate dehydrogenase promoter for production of functional mammalian membrane transport proteins in the yeast *Pichia pastoris*. Biochemical and Biophysical Research Communications 1998;250 531-535.

**Chapter 2**

## **Chromatography Method**

Jingjing Li, Wei Han and Yan Yu

Additional information is available at the end of the chapter

http://dx.doi.org/10.5772/56265

## **1. Introduction**

The term 'chromatography' was first used by the Russian scientist Mikhail Tsvet in 1900 to describe the phenomenon in which a mixture of pigments, carried by a solvent moving along paper, separated into its components. Because the pigments have different colors, the phenomenon was named 'chromato-graphy', which literally means 'color writing' [1]. Today the term generally refers to a family of techniques for the separation of mixtures [2].

Every chromatographic method involves two phases, a mobile phase and a stationary phase. The mobile phase carries compounds across the surface of the stationary phase, and their movement is retarded by interaction with the stationary phase. Compounds are retarded to different degrees according to the strength of this interaction and are thereby separated.

Chromatography was first performed on paper or thin layers to separate small molecules, termed planar chromatography (Figure 1A). Column chromatography was developed later; here the stationary phase is manufactured as porous particles packed in a column, and the mobile phase flows through the narrow channels between the particles [3]. If the mobile phase is a gas and the stationary phase is a liquid, the technique is termed gas chromatography [4] and is used to separate volatile compounds (Figure 1B). If the mobile phase is a liquid and the stationary phase is a solid, it is termed liquid chromatography [5] and is widely used to separate small compounds or biological macromolecules (Figure 1C).

Liquid chromatography is the most popular technique in protein purification and analysis. The liquid mobile phase containing the proteins flows through the column, and the proteins are separated by interaction with the medium. A stationary phase composed of porous particles supplies much more surface than traditional planar chromatography, so the loading capacity is greatly increased and even grams of protein can be purified in one cycle. Furthermore, the column format makes it possible to apply high pressure, which drives the mobile phase much faster and completes a separation within a short time; this is termed high pressure liquid chromatography [6]. At the same time, the uniform particle size achieved by high manufacturing quality gives column chromatography a much higher resolution than before. This combination of high loading capacity, high flow rate, and high resolution has made column chromatography the most rapidly developing protein separation technique of the last two decades.

© 2013 Li et al.; licensee InTech. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

**Figure 1.** Different types of chromatography

Several basic types of chromatography have been developed based on different separation properties (Table 1). This chapter describes both the principles and the applications of these techniques.


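| Property | Technique |
|---|---|
| Net charge | Ion exchange chromatography |
| Hydrophobicity | Hydrophobic interaction chromatography and reverse phase chromatography |
| Biorecognition | Affinity chromatography |
| Size | Size exclusion chromatography |

**Table 1.** Different chromatography techniques and corresponding protein properties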

## **2. Ion-exchange chromatography**

Ion-exchange chromatography (IEXC) was introduced to protein separation in the 1960s and plays a major role in the purification of biomolecules [7]. IEXC separation is based on reversible electrostatic interactions between charged solutes and an oppositely charged medium. The technique is straightforward in both theory and operation, so beginners can master it easily.

Ion exchange refers to the exchange of ions between two electrolytes or between an electrolyte solution and a complex, for example NiSO4 + Ca2+ = CaSO4 + Ni2+. When one of the electrolytes is immobilized on a resin, termed the exchanger, the exchange takes place at the interface between the liquid phase and the solid phase, for example:

$$\text{R-O-CH}_2\text{-COOY} + \text{X}^+ \rightarrow \text{R-O-CH}_2\text{-COOX} + \text{Y}^+ \tag{1}$$

Here R denotes the base matrix of the resin; the ion X+ exchanges with Y+ and is adsorbed by the resin.

The exchange reaction is reversible, and its direction depends on the concentrations and ionization constants of the electrolytes. In Equation 1, if the concentration of ion Y+ increases, X+ is desorbed:

$$\text{R-O-CH}_2\text{-COOX} + \text{Y}^+ \rightarrow \text{R-O-CH}_2\text{-COOY} + \text{X}^+ \tag{2}$$

The ion Y+ can be any cation, such as Na+ or H+. The two equations represent binding and elution in IEXC.

According to these two equations, the binding of a protein to the exchanger is a dynamic equilibrium between adsorption and desorption. The equilibrium constant Kd is:

$$K_d = \frac{[\text{X}^+]\,[\text{R-O-CH}_2\text{-COOY}]}{[\text{Y}^+]\,[\text{R-O-CH}_2\text{-COOX}]} \tag{3}$$

As the mobile phase moves, protein molecules in the mobile phase are carried forward and adsorbed by the medium downstream, while adsorbed proteins are released from the stationary phase into the mobile phase. Proteins thus move forward through continuous adsorption and desorption. At the same ionic strength, the higher the Kd of a protein, the larger the fraction distributed in the mobile phase and the faster it moves. Conversely, proteins with a smaller Kd are retarded more than those with a larger Kd. In fact, all kinds of adsorption chromatography are based on this equilibrium mechanism.
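The link between Kd, salt concentration, and migration can be made concrete with a small numerical sketch. Rearranging Equation 3 gives the ratio of free to bound protein as Kd·[Y+]/[R-O-CH2-COOY]. The function name, the example values, and the assumption of equal phase volumes (so that a concentration ratio can be read as an amount ratio) are illustrative and do not come from the chapter.

```python
def mobile_fraction(kd, counter_ion_conc, sites_with_counter_ion):
    """Fraction of protein X+ found in the mobile phase, from Equation 3.

    Kd = [X+][R-COOY] / ([Y+][R-COOX])  =>  [X+]/[R-COOX] = Kd*[Y+]/[R-COOY]
    Assumption (illustrative): both phases have the same volume, so the
    concentration ratio equals the ratio of amounts.
    """
    free_to_bound = kd * counter_ion_conc / sites_with_counter_ion
    return free_to_bound / (1.0 + free_to_bound)


# Hypothetical numbers: same protein (Kd = 0.5) at low and high salt.
low_salt = mobile_fraction(kd=0.5, counter_ion_conc=0.05, sites_with_counter_ion=1.0)
high_salt = mobile_fraction(kd=0.5, counter_ion_conc=1.0, sites_with_counter_ion=1.0)
# A protein with a larger Kd at the same low salt concentration.
weak_binder = mobile_fraction(kd=2.0, counter_ion_conc=0.05, sites_with_counter_ion=1.0)

print(f"low salt:  {low_salt:.3f}")
print(f"high salt: {high_salt:.3f}")
print(f"larger Kd: {weak_binder:.3f}")
```

Under these assumptions, raising the counter-ion concentration or the Kd both increase the mobile fraction, which matches the qualitative statements above.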

#### **2.1. Isoelectric point of protein**


Proteins are ampholytes: the carboxyl and amino groups of their side chains and of the two termini can ionize, so a protein carries both positive and negative charges. The positive charges are typically contributed by ionized arginine, lysine, and histidine residues, and the negative charges principally by aspartate and glutamate residues. At a certain pH the total positive charge of a protein equals its total negative charge; the net charge is then 0, and this pH is defined as the isoelectric point (pI) of the protein. When the solution pH is higher than the pI, more carboxyl groups are ionized and the protein is negatively charged, and vice versa (Figure 2). The pI of a protein can be determined by several experimental methods, and an approximate value can also be calculated. Once the primary structure of a protein is known, the pI can be calculated with software or with simple web tools such as:

http://web.expasy.org/compute\_pi/

http://www.scripps.edu/~cdputnam/protcalc.html
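How such a calculation may work can be sketched in a few lines: the net charge at a given pH is summed over all ionizable groups with the Henderson-Hasselbalch relation, and the pI is the pH at which this sum crosses zero. The pKa values below are generic textbook-style assumptions, not values from this chapter or from the tools listed above; published pKa sets differ, so results vary between programs. Cysteine and tyrosine are included as minor acidic groups, which is also an assumption of this sketch.

```python
# Assumed pKa values (illustrative only; real tools use their own tables).
PKA_BASIC = {"N_term": 9.0, "K": 10.5, "R": 12.5, "H": 6.0}
PKA_ACIDIC = {"C_term": 2.0, "D": 3.9, "E": 4.1, "C": 8.3, "Y": 10.1}


def net_charge(sequence, ph):
    """Approximate net charge of a protein sequence at a given pH."""
    seq = sequence.upper()
    charge = 0.0
    for group, pka in PKA_BASIC.items():
        count = 1 if group == "N_term" else seq.count(group)
        # Basic groups carry +1 when protonated.
        charge += count / (1.0 + 10 ** (ph - pka))
    for group, pka in PKA_ACIDIC.items():
        count = 1 if group == "C_term" else seq.count(group)
        # Acidic groups carry -1 when deprotonated.
        charge -= count / (1.0 + 10 ** (pka - ph))
    return charge


def isoelectric_point(sequence, tol=1e-4):
    """Find the pH of zero net charge by bisection between pH 0 and 14."""
    low, high = 0.0, 14.0
    while high - low > tol:
        mid = (low + high) / 2.0
        if net_charge(sequence, mid) > 0:
            low = mid   # still positive: the pI lies at higher pH
        else:
            high = mid  # negative or zero: the pI lies at lower pH
    return (low + high) / 2.0


basic_peptide = "KKKKRRGA"   # hypothetical lysine/arginine-rich peptide
acidic_peptide = "DDDEEEGA"  # hypothetical aspartate/glutamate-rich peptide
print(f"basic peptide pI  ~ {isoelectric_point(basic_peptide):.1f}")
print(f"acidic peptide pI ~ {isoelectric_point(acidic_peptide):.1f}")
```

Bisection is sufficient here because the modelled net charge decreases monotonically as the pH rises.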

## **2.2. Selection of exchanger**

The exchangers in IEXC are composed of a base matrix and functional groups coupled to the surface of the matrix. The base matrix consists of nonporous or porous spherical particles with a charge-free surface to which the functional groups are linked. A porous matrix offers a large surface area for protein binding and therefore a high binding capacity, but sacrifices some resolution because of diffusion between the outside and the inside of the particles. A nonporous matrix, in contrast, has a limited binding capacity but provides high resolution in micropreparative or analytical separations.

Like porosity, particle size influences the resolution of all kinds of chromatography, including IEXC. Small, uniform particles facilitate the efficient transfer of molecules between the mobile and stationary phases and give high resolution, but they increase the resistance of the column, which then needs a higher pressure or a longer separation time. Small particles are therefore preferred for analytical separations, whereas large particles are used more in large-scale production.

The selectivity of ion exchange media depends mainly on the nature and degree of substitution of the functional groups, also called ligands. The media are classified into anion exchangers and cation exchangers. The ligands of anion exchangers are positively charged, so anions bind to and exchange on them. Conversely, the ligands of cation exchangers are negatively charged, and cations exchange on them. The commonly used exchangers are named after their functional groups and are listed in Table 2.


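| Exchanger | Ligand | Charged group |
|---|---|---|
| Strong cation | Sulfopropyl (SP) | -CH2CH2CH2SO3⁻ |
| Weak cation | Carboxymethyl (CM) | -O-CH2COO⁻ |
| Strong anion | Quaternary ammonium (Q) | -N⁺(CH3)3 |
| Weak anion | Diethylaminoethyl (DEAE) | -N⁺H(C2H5)2 |
| Weak anion | Diethylaminopropyl (ANX) | -N⁺H(C2H5)2 |

**Table 2.** Commonly used exchangers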

Ion exchangers are classified as weak or strong according to the ionization properties of their ligands. Strong exchangers carry ligands with a high ionization coefficient (Figure 2). They are fully charged over the pH range 1~13, and within this range a change in pH does not influence the charge of the exchanger. Strong exchangers can therefore be used over almost the whole pH range. Weak ion exchangers, in contrast, have weak electrolytes as functional groups, whose ionization is influenced by the solution pH. They can therefore offer a selectivity different from that of strong ion exchangers.

**Figure 2.** Charge properties of the common types of ion exchangers and of an example protein at different pH values. (Modified from Ion Exchange Chromatography & Chromatofocusing: Principles and Methods, GE Healthcare)
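The pH dependence of a weak exchanger can be illustrated with the Henderson-Hasselbalch relation applied to the ligand itself. The sketch below is a simplified model: the ligand pKa values are hypothetical round numbers chosen for illustration, not measured values for any commercial medium, and each ligand is treated as a single independent ionizable group.

```python
def charged_fraction(ph, pka, kind):
    """Fraction of ligand groups that carry a charge at a given pH.

    kind="acid": cation-exchange ligand, charged (negative) when deprotonated.
    kind="base": anion-exchange ligand, charged (positive) when protonated.
    """
    if kind == "acid":
        return 1.0 / (1.0 + 10 ** (pka - ph))
    if kind == "base":
        return 1.0 / (1.0 + 10 ** (ph - pka))
    raise ValueError("kind must be 'acid' or 'base'")


# Hypothetical pKa values for a weak cation and a weak anion exchanger.
WEAK_CATION_PKA = 4.0
WEAK_ANION_PKA = 9.5

for ph in (3.0, 5.0, 7.0, 9.0, 11.0):
    cation = charged_fraction(ph, WEAK_CATION_PKA, "acid")
    anion = charged_fraction(ph, WEAK_ANION_PKA, "base")
    print(f"pH {ph:4.1f}: weak cation {cation:.2f}, weak anion {anion:.2f}")
```

In this model a weak cation exchanger loses most of its charge below its pKa and a weak anion exchanger above its pKa, whereas a strong exchanger would correspond to a pKa far outside the working pH range.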

## **2.3. Surface charge of protein**


The mobile phase in IEXC is an aqueous solution with a suitable pH and ionic strength. The pH determines the charge of the protein: a pH lower than the pI gives the protein a positive net charge, and vice versa. Note, however, that IEXC is based on electrostatic interaction, and the interaction between a protein and an ion exchanger depends more on the charge distribution on the protein surface than on the net charge (Figure 3). Charge is not distributed evenly between the surface and the interior, so a solution whose pH differs only slightly from the pI cannot ensure that the protein exhibits the expected surface charge. In practice, the pH is typically set at least 1 unit above or below the pI of the target protein.

**Figure 3.** Different charge distributions of proteins.
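The rule of thumb of keeping the buffer pH at least 1 unit away from the pI can be written as a small helper. This is only a first guess based on net charge; as noted above, the surface charge distribution ultimately decides the binding behavior. The function name and return strings are hypothetical.

```python
def suggest_exchanger(protein_pi, buffer_ph, margin=1.0):
    """Suggest an exchanger type from the protein pI and the buffer pH.

    margin is the minimum pH distance from the pI (1 unit in the text).
    """
    if buffer_ph <= protein_pi - margin:
        # pH below pI: net positive protein binds a negatively charged ligand.
        return "cation exchanger"
    if buffer_ph >= protein_pi + margin:
        # pH above pI: net negative protein binds a positively charged ligand.
        return "anion exchanger"
    return "pH too close to pI"


# Midkine example from section 2.6.1: pI 9.7, sample adjusted to pH 6.2.
print(suggest_exchanger(9.7, 6.2))
# Hypothetical acidic protein at slightly alkaline pH.
print(suggest_exchanger(5.0, 8.0))
```

For the Midkine values the helper suggests a cation exchanger, which agrees with the SP column used in that example.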

#### **2.4. Mobile phase**

The mobile phase is composed of a pH buffer system and neutral salt ions. The buffering ions should carry the same charge as the exchanger; otherwise they bind to the exchanger ahead of the eluent ions and cause significant pH fluctuation during elution. Commonly used buffers are given in Table 3.


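| Buffer | pH range at 20 mM |
|---|---|
| **Buffers for cation exchange chromatography** | |
| Citric acid | 2.6~3.6 |
| Acetic acid | 5.3~6.3 |
| MES | 5.8~6.8 |
| Phosphate buffer | 6.3~7.3 |
| HEPES | 7.1~8.1 |
| **Buffers for anion exchange chromatography** | |
| Bis-tris | 6.0~7.0 |
| Tris-HCl | 7.5~8.5 |
| TEA | 7.4~8.8 |
| Ethanolamine | 9.0~10.0 |
| Piperidine | 10.5~11.5 |

**Table 3.** Commonly used buffers for cation and anion exchange chromatography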

Besides pH, the ionic strength also influences protein binding. A typical IEXC experiment includes a binding stage and an elution stage. As Equations 1 and 2 indicate, proteins tend to be adsorbed by the exchanger at low ionic strength and desorbed at high ionic strength. The ionic strength should therefore be low enough during binding to ensure protein adsorption, and is then increased to elute the proteins. In IEXC the ionic strength is usually adjusted by adding a concentrated NaCl solution.

## **2.5. Operation**

## *2.5.1. Binding process*

All solutions used in column chromatography, including the sample solution, should be degassed and filtered (0.22 or 0.45 µm membrane) to prevent air bubbles or particles from clogging the column. Before sample loading, the column should be equilibrated with 2 column volumes (CV) of initial buffer. The sample is then loaded at the same flow rate, after which 3 CV of initial buffer are run to wash off unbound impurity proteins.
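Because the binding stage is specified in column volumes, the buffer volumes scale directly with the column size. The helper below is a minimal sketch of that arithmetic; the function name and the 400 ml sample volume are hypothetical, while the 2 CV equilibration and 3 CV wash come from the text.

```python
def binding_stage_volumes(column_volume_ml, sample_volume_ml,
                          equilibration_cv=2.0, wash_cv=3.0):
    """Liquid volumes (ml) passed through the column during the binding stage."""
    equilibration_ml = equilibration_cv * column_volume_ml
    wash_ml = wash_cv * column_volume_ml
    return {
        "equilibration_ml": equilibration_ml,
        "sample_ml": sample_volume_ml,
        "wash_ml": wash_ml,
        "total_ml": equilibration_ml + sample_volume_ml + wash_ml,
    }


# Hypothetical run: 50 ml column, 400 ml of diluted sample.
volumes = binding_stage_volumes(column_volume_ml=50.0, sample_volume_ml=400.0)
for step, ml in volumes.items():
    print(f"{step}: {ml:.0f} ml")
```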

### *2.5.2. Elution*


Although proteins can be separated under constant solvent composition, termed isocratic elution, most tightly adsorbed proteins would take a very long time to elute.

In practice, the most common strategy is to accelerate the exchange of proteins by increasing the ionic strength of the initial buffer. The most widely used agent is NaCl, which conveniently raises the concentrations of the cation Na+ and the anion Cl- at the same time without significantly changing the pH of the solution. Proteins can be eluted with a linear gradient of ionic strength, a stepwise gradient, or a combination of the two (Figure 4). Stepwise gradient elution is used for group separation: in each step, one group of proteins with similar charge properties is eluted together. It is often used in large-scale production. A linear gradient can be seen as an infinite number of tiny steps in which proteins are eluted and separated one by one; it is used more in preliminary experiments or analytical separations. The usual strategy in practice combines linear and stepwise gradients. As shown in Figure 4C, part of the impurities is eluted first by a step elution, and the target protein is then separated from similarly charged proteins by a linear elution.
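A combined step and linear gradient can be described as a simple function of the elution volume. The sketch below borrows the salt concentrations from the Midkine example in section 2.6.1 (a 0.5 M NaCl step followed by a linear gradient to 1.0 M); the step length of 3 CV and the gradient length of 10 CV are hypothetical values chosen only for illustration.

```python
def salt_concentration(volume_cv, step_conc=0.5, step_length_cv=3.0,
                       final_conc=1.0, gradient_length_cv=10.0):
    """NaCl concentration (M) delivered after volume_cv column volumes of elution.

    A step at step_conc is held for step_length_cv, then the concentration
    rises linearly to final_conc over gradient_length_cv and stays there.
    """
    if volume_cv < 0:
        raise ValueError("volume_cv must be non-negative")
    if volume_cv <= step_length_cv:
        return step_conc
    progress = min((volume_cv - step_length_cv) / gradient_length_cv, 1.0)
    return step_conc + progress * (final_conc - step_conc)


for cv in (0.0, 3.0, 8.0, 13.0, 20.0):
    print(f"{cv:5.1f} CV -> {salt_concentration(cv):.2f} M NaCl")
```

Setting `gradient_length_cv` very small approximates a second step, while a long gradient approaches the purely linear case.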

Another elution method is to change the surface charge of the proteins by changing the pH of the elution buffer. Typically, in cation IEXC an increase in pH decreases the positive surface charge and weakens the interaction between proteins and exchanger. Conversely, in anion IEXC the pH is decreased to elute the protein. Proteins elute at a pH close to their pI. Note that a change in pH can also alter the charge of weak exchangers in certain ranges, so a weak exchanger may give a different resolution in these ranges. pH elution is less used in practice, however, because some proteins precipitate at a pH near their pI and clog the column. In addition, it is hard to keep the ionic strength constant while changing the pH, which makes reproducibility worse.

**Figure 4.** Different strategies of gradient elution.

#### **2.6. Feature and application**

IEXC is one of the most frequently used chromatographic techniques for protein separation. Adsorption and elution take place under mild conditions, so the native activity of the protein is well maintained during the chromatographic process.

#### *2.6.1. Purification of recombinant human Midkine by SP column*

A recombinant human Midkine (pI = 9.7) was expressed by yeast fermentation and separated by IEX chromatography on an SP column. The fermentation culture, which contained a high concentration of potassium phosphate buffer (100 mM), was diluted with pure water to lower the conductivity to <10 mS/cm and adjusted to pH 6.2 with Na2HPO4 solution. A 50 ml Sepharose FF column with a maximum loading capacity of 70 mg/ml was used to capture a total of 200 mg of protein in the sample solution. A fraction of the non-target protein was eluted by stepwise elution with 0.5 M NaCl, and a linear gradient from 0.5 to 1.0 M NaCl was then used to separate the target protein from the other impurities.

**Figure 5.** Cation IEXC of rhMK (result of Shixiang Jia, Ping Tu et al. General regeneratives (shanghai) limited, Shanghai, PR China)
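The figures quoted for the rhMK capture step can be checked with a few lines of arithmetic. The sketch below is only an illustration: the helper names are hypothetical, and the rule that a protein carries a net positive charge below its pI (and therefore binds a cation exchanger such as SP) is the general IEXC principle rather than a detail reported for this experiment.

```python
def exchanger_for(pi, buffer_ph):
    """Pick the exchanger type from the sign of the protein's net charge."""
    if buffer_ph < pi:
        return "cation exchanger"   # protein is net positive below its pI
    if buffer_ph > pi:
        return "anion exchanger"    # protein is net negative above its pI
    return "none (no net charge at pH = pI)"


def column_capacity_mg(column_volume_ml, capacity_mg_per_ml):
    """Maximum protein load of the column in mg."""
    return column_volume_ml * capacity_mg_per_ml


# Values quoted in the text for rhMK.
capacity = column_capacity_mg(50, 70)   # 50 ml x 70 mg/ml
load_fraction = 200 / capacity          # 200 mg of total protein loaded

print(exchanger_for(pi=9.7, buffer_ph=6.2))
print(f"capacity {capacity} mg, load uses {load_fraction:.1%} of it")
```

With these numbers the 200 mg load is far below the nominal capacity of the column, so binding capacity is not the limiting factor in this capture step.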

## **3. Hydrophobic interaction chromatography**

Hydrophobic interaction chromatography (HIC) is based on the interactions between the hydrophobic surfaces of proteins and hydrophobic ligands on the medium [8]. It has been used in protein separation for more than half a century, although there is still no widely accepted theory of the hydrophobic interaction.

The principle of HIC parallels that of salting out. In aqueous solution, hydrogen bonds form between water molecules and the protein surface. Through these hydrogen bonds, the side chains of the protein adsorb water molecules and form an ordered water film around it. The water film prevents protein molecules from aggregating and precipitating. Amino acid side chains differ in their ability to form hydrogen bonds. Hydrophobic amino acids, such as isoleucine, valine, leucine, and phenylalanine, tend to lose their ordered water as the ionic strength of the solution increases. The relative hydrophobicity of amino acids is defined by the change in Gibbs free energy when they are transferred from aqueous solution to a nonpolar solvent [9]. The distribution of hydrophobic amino acids on the protein surface determines the hydrophobicity of the protein. As the salt concentration increases, proteins associate with each other and precipitate in order of decreasing hydrophobicity. This process is termed fractional salting out (Figure 6B).

In HIC, the salt concentration is held at an appropriate value, for example 1 M (NH4)2SO4. At this concentration, the hydrophobic interaction is still not strong enough to precipitate proteins. However, the hydrophobic medium, termed the adsorbent, can adsorb proteins through the strongly hydrophobic ligands coupled to it (Figure 6C). When the protein solution flows through the HIC column, proteins with sufficient hydrophobicity are adsorbed, while proteins with weak hydrophobicity flow through with the mobile phase. Adsorbing weakly hydrophobic proteins therefore requires a higher salt concentration or a more hydrophobic medium to strengthen the hydrophobic interaction.

### **3.1. Stationary phase**

HIC media are composed of a base matrix and a ligand. The base matrix serves as the support on which the hydrophobic ligand is immobilized. To avoid disturbing the hydrophobic interactions between proteins and ligand, the matrix should have an inert surface. Cross-linked agarose is one of the most widely used matrices; it has a porous structure, high binding capacity, high flow rate, and good physical and chemical stability. Silica and synthetic copolymer materials are also widely used matrices.

Hydrophobic ligands are attached to the surface of the base matrix by covalent bonds, for example through a glycidyl ether linkage for agarose and a silyl ether linkage for silica gel. Widely used ligands for HIC are linear-chain alkanes and phenyl. The strength of the hydrophobicity increases with the length of the carbon chain. Butyl (C4) and octyl (C8) are commonly used linear-chain ligands. Phenyl is another widely used ligand; it not only has about the same hydrophobicity as a pentyl ligand but also has the potential for π-π interactions with proteins rich in aromatic groups.

**Figure 6.** Salting out process and adsorption between protein and adsorbent. (A) A protein can disperse in salt-free solution. (B) When the salt concentration increases, the ordered water molecules are taken up and proteins tend to aggregate and precipitate. (C) At a moderate salt concentration, the hydrophobic interaction between protein molecules is not strong enough to cause salting out, but it can result in proteins being adsorbed by the hydrophobic matrix.

Before separating a new protein, it is a good idea to screen different media by pretests on small prepacked columns. The pretests should start with the least hydrophobic medium. An ideal medium should first have an appropriate hydrophobicity, so that the target protein is adsorbed at a certain salt concentration. The less hydrophobic a protein is, the more hydrophobic the medium must be in order to capture it. In addition, the medium should release the protein as the salt concentration decreases. If proteins are bound too tightly to be eluted, organic solvent must be added to increase the elution power, which may inactivate the proteins.
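The screening procedure amounts to a simple search from the least to the most hydrophobic medium. The sketch below assumes that each pretest is summarised by two yes/no observations; the order of the media follows the ligand discussion in section 3.1, while the data structure, the function name and the example results are hypothetical.

```python
# Media ordered from least to most hydrophobic (butyl < phenyl < octyl,
# taking phenyl as roughly equivalent to a pentyl ligand).
MEDIA_BY_HYDROPHOBICITY = ["butyl", "phenyl", "octyl"]


def choose_medium(pretests):
    """Return the least hydrophobic medium that binds AND releases the target.

    `pretests` maps a medium name to a dict with two booleans:
    'binds'  (adsorbed at the starting salt concentration) and
    'elutes' (desorbed when the salt concentration is lowered).
    """
    for medium in MEDIA_BY_HYDROPHOBICITY:
        result = pretests.get(medium)
        if result is None:
            continue  # this medium was not tested
        if result["binds"] and result["elutes"]:
            return medium
    return None  # nothing suitable: adjust the salt or test other media


# Hypothetical pretest outcomes for an imaginary target protein.
example = {
    "butyl": {"binds": False, "elutes": True},
    "phenyl": {"binds": True, "elutes": True},
    "octyl": {"binds": True, "elutes": False},
}
print(choose_medium(example))
```

Stopping at the first medium that both binds and releases the target avoids media that hold the protein so tightly that organic solvent would be needed for elution.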

## **3.2. Mobile phase**

In contrast to IEXC, the initial buffer in HIC requires a high concentration of salt ions, which preferentially take up the ordered water molecules from the protein surface and promote the hydrophobic interaction. This power varies among ions. An ion that raises the surface tension of water more strongly tends to strengthen the interaction between proteins and HIC media more, although the underlying mechanism is still not clear. The Hofmeister series lists the common ions according to their power to increase the surface tension of water [10].

Anions: HPO4²⁻ > SO4²⁻ > C2H3O2⁻ > F⁻ > Cl⁻ > Br⁻ > I⁻ > ClO4⁻ > SCN⁻

Cations: N(CH3)4⁺ > Cs⁺ > Rb⁺ > NH4⁺ > K⁺ > Na⁺ > Li⁺ > Ca²⁺ > Mg²⁺

The molal surface tension of salts is ranked below.

MgCl2 > Na2SO4 > K2SO4 > (NH4)2SO4 > MgSO4 > Na2HPO4 > NaCl > LiCl > KSCN

This series does not hold for every protein, because besides the effect on surface tension, specific interactions between ions and proteins also appear to affect the hydrophobic interaction. The hydrophobic interaction seems to be affected more by anions than by cations. For example, MgCl2 is weaker than (NH4)2SO4 in promoting the hydrophobic interaction.
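For quick comparisons, the two ion series can be kept as ordered lists. The sketch below simply encodes the order given in the text; the function name and the plain-text ion labels are hypothetical, and the ranking is only a first guide because ion-specific effects can override it.

```python
# Ions in the order listed in the text, strongest effect first.
ANIONS = ["HPO4(2-)", "SO4(2-)", "C2H3O2-", "F-", "Cl-", "Br-", "I-", "ClO4-", "SCN-"]
CATIONS = ["N(CH3)4+", "Cs+", "Rb+", "NH4+", "K+", "Na+", "Li+", "Ca(2+)", "Mg(2+)"]


def stronger_ion(ion_a, ion_b):
    """Return whichever of two ions ranks higher in its series."""
    for series in (ANIONS, CATIONS):
        if ion_a in series and ion_b in series:
            # A lower index means a higher position in the series.
            return ion_a if series.index(ion_a) <= series.index(ion_b) else ion_b
    raise ValueError("both ions must come from the same series")


print(stronger_ion("Cl-", "SO4(2-)"))
print(stronger_ion("Na+", "NH4+"))
```

Comparing sulfate with chloride in this way reflects the MgCl2 versus (NH4)2SO4 example: the anion ranking, not the cation ranking, matches the observed behaviour.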

In practice, (NH4)2SO4 is one of the most widely used salts; a 1 to 1.5 M (NH4)2SO4 solution is sufficient for most protein separations. If it does not give the desired result, changing the concentration or switching to another salt, such as Na2SO4 or NaCl, should be considered. The disadvantage of (NH4)2SO4 is that NH4+ tends to form ammonia gas at high OH- concentration, so it should be used at pH < 8.0. When a high concentration of salt is added to the sample, some highly hydrophobic proteins are likely to precipitate. Therefore, always remember to filter or centrifuge the sample solution to remove particles once the unstable proteins have fully aggregated.
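Preparing the start buffer is a matter of simple arithmetic. The sketch below computes the mass of (NH4)2SO4 for a buffer made up to a final volume, using a molar mass of 132.14 g/mol; the function name and the pH guard are illustrative assumptions based on the pH < 8.0 recommendation. It does not account for the volume change that occurs when solid salt is added directly to a sample.

```python
AMMONIUM_SULFATE_G_PER_MOL = 132.14  # molar mass of (NH4)2SO4


def ammonium_sulfate_grams(molarity, final_volume_l, buffer_ph):
    """Mass of (NH4)2SO4 needed for a buffer made up to `final_volume_l`."""
    if buffer_ph >= 8.0:
        # NH4+ tends to release ammonia at high pH (see text).
        raise ValueError("use (NH4)2SO4 below pH 8.0")
    return molarity * final_volume_l * AMMONIUM_SULFATE_G_PER_MOL


# 1 M and 1.5 M buffers, 1 litre each, at pH 7.0.
for molarity in (1.0, 1.5):
    grams = ammonium_sulfate_grams(molarity, 1.0, 7.0)
    print(f"{molarity} M -> {grams:.1f} g per litre")
```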

The pH of the solution also has a complex effect on the strength of the hydrophobic interaction, and the mechanism is not very clear. In general, an increase in pH weakens the hydrophobic interaction [11], possibly because of an increase in net surface charge. However, work by Hjerten et al. showed that an increase in pH can, on the contrary, increase the retention of some proteins [12].

The effect of temperature on the hydrophobic interaction is also complex. An increase in temperature can promote the hydrophobic interaction for some proteins but weaken it for others. The effect still cannot be reliably predicted from theory.

#### **3.3. Elution**

As in IEXC, isocratic elution with constant solvent composition cannot elute proteins efficiently. A gradient decrease in ionic strength is the most widely used elution method in HIC. As the ionic strength decreases, proteins are desorbed in order of increasing surface hydrophobicity.

As the salt concentration decreases, proteins regain their ordered water molecules and are eluted in order of increasing hydrophobicity. A linear or stepwise decrease in salt concentration is employed to elute proteins in HIC. As with IEXC, a simple linear gradient gives even resolution across the whole gradient range and is therefore usually used in screening experiments or analytical separations, but it takes more time. Stepwise gradient elution is preferred in large-scale preparative separation. It saves time and solution and yields a more concentrated product. However, this strategy usually cannot be applied until appropriate elution conditions have been found through preliminary linear gradient runs. A typical linear gradient elution profile is shown in Figure 7.

**Figure 7.** A typical linear gradient elution spectrum of HIC

In addition, adding a neutral nonpolar agent, such as a detergent, to the elution buffer can promote the elution of more hydrophobic proteins, such as membrane proteins or apolipoproteins. However, nonpolar solutions may cause irreversible inactivation, so they should be avoided in HIC. If the target protein cannot be eluted in salt-free aqueous solution, switching to a less hydrophobic medium should be considered. High concentrations of organic solvent can nevertheless be used for column regeneration, to wash away tightly bound compounds.

pH and temperature are two important factors in protein retention, but they are usually not used as variable parameters in elution because their effects are hard to control. The pH and temperature should therefore be kept consistent between batches to ensure good reproducibility.

#### **3.4. Features**

HIC separates proteins on the basis of differences in their hydrophobicity. It combines the reversibility of the hydrophobic interaction with the precision of column chromatography to yield excellent separation. With a suitable medium and suitable conditions, HIC can capture almost any protein, and it is suited to capturing, concentrating, or polishing proteins.

The selectivity of HIC is orthogonal to that of IEXC and SEC, because it is based on the hydrophobicity of proteins, a property entirely different from the net surface charge exploited in IEXC and the molecular size exploited in SEC. Using HIC in series with IEXC or SEC therefore gives much better separation than using any one of them alone.

## **4. Reversed-phase chromatography**

Reversed-phase chromatography (RPC) is so named because the polarity of the mobile and stationary phases is reversed compared with normal-phase chromatography [13]. In normal-phase chromatography, the mobile phase is an organic solvent and the stationary phase is a hydrophilic resin. RPC, conversely, uses hydrophobic adsorbents as the stationary phase, which is the same as HIC in theory. In practice, however, the two methods differ in many ways, mainly because of the different degree of substitution of hydrophobic ligands on the medium surface. As shown in Table 4, the ligand density in RPC is an order of magnitude higher than in HIC, which means that a protein molecule binds more ligands when it is adsorbed. These strong forces can extract proteins from aqueous solution without the help of neutral salt, and the adsorbed proteins cannot be eluted until nonpolar solvents are used. RPC is therefore less used in the preparation of active proteins. However, its excellent resolution makes it the most important analytical chromatography technique. Liquid chromatography-mass spectrometry is an important extended application of the technique.


| | RPC | HIC |
|---|---|---|
| Interaction | Hydrophobic interaction | Hydrophobic interaction |
| Ligand | C4~C18 alkyl | C2~C8 alkyl or aryl |
| Substitution degree | Several hundred µmol/ml gel | 10–50 µmol/ml gel |
| Capture condition | Salt-free solution | High-salt solution |
| Elution | Increase nonpolarity | Decrease ionic strength |
| Application | Protein analysis; preparative separation of polypeptides or oligonucleotides | Preparative separation of proteins |

**Table 4.** Comparison between RPC and HIC

## **4.1. Stationary phase**

As with HIC, RPC media are composed of an inert base matrix and hydrophobic ligands on its surface.

The base matrix for reversed-phase media is generally silica or a synthetic organic polymer such as polystyrene. Silica was the first material used as a base matrix for RPC; it has excellent mechanical strength and is chemically stable under acidic conditions. Its disadvantage is chemical instability in aqueous solutions at high pH. A silica matrix can dissolve at high pH, so prolonged exposure above pH 7.5 is not recommended. In addition, because of incomplete substitution or long-term use, some underivatized silanol groups are exposed to the mobile phase; these become negatively charged at high pH and interact ionically with proteins. The resulting mixed-mode retention decreases resolution, with significant broadening and tailing of peaks. RPC on a silica matrix is therefore often performed at low pH (<3).

Loading capacity and resolution are determined by the particle size of the resin: in general, smaller particles give higher resolution but lower loading capacity. Resin with a diameter of 3 to 5 µm is preferred for analytical separation. Because such small particles are hard to pack well, these media are usually supplied as prepacked columns. As the diameter increases, the loading capacity increases but the resolution decreases. Media with a diameter of 15 µm or larger are generally used in preparative separation.

A porous structure is used to increase the loading capacity of RPC media. In general, the pore size is 10 to 30 nm. Media with a pore size of 10 nm are used predominantly for small peptides or small molecules. Media with pore sizes of 30 nm or greater are used to purify large peptides or proteins.

The ligands used in RPC are linear alkyl chains of different lengths, and chain length is the main factor determining the selectivity of the medium. In general, a medium with longer-chain ligands is more hydrophobic. Oligonucleotides and small organic molecules, which are less hydrophobic, need more hydrophobic media, such as C18 media, to provide sufficient adsorption. In contrast, large peptides and proteins generally have more hydrophobic sites and need less hydrophobic adsorbents, such as C4 or C8. Selectivity and loading capacity are also influenced by the degree of substitution. For large peptides and proteins, increasing the degree of substitution has the same effect as increasing the carbon chain length.
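The rules of thumb in this section can be collected into a small lookup. The sketch below is a hedged summary rather than a vendor recommendation: the function and its category names are hypothetical, the numbers are the ones quoted in the text, and lumping small peptides, oligonucleotides and small organic molecules into a single 'small' category is a simplification.

```python
def suggest_rpc_medium(purpose, solute):
    """Suggest particle size, pore size and ligand for an RPC separation.

    purpose: 'analytical' or 'preparative'
    solute:  'small' (small peptides, oligonucleotides, small organic
             molecules) or 'large' (large peptides, proteins)
    """
    if purpose not in ("analytical", "preparative"):
        raise ValueError("purpose must be 'analytical' or 'preparative'")
    if solute not in ("small", "large"):
        raise ValueError("solute must be 'small' or 'large'")

    # Smaller particles: higher resolution, lower loading capacity.
    particle = "3-5 um" if purpose == "analytical" else ">=15 um"
    # Larger solutes need wider pores and less hydrophobic ligands.
    pore = "10 nm" if solute == "small" else ">=30 nm"
    ligand = "C18" if solute == "small" else "C4 or C8"
    return {"particle_diameter": particle, "pore_size": pore, "ligand": ligand}


print(suggest_rpc_medium("analytical", "large"))
```

Such a lookup only gives a starting point; the final choice still depends on pretests with the actual sample.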

### **4.2. Mobile phase**

### *4.2.1. Organic solvent*

Typically, the sample is loaded onto the column in aqueous solution and eluted by decreasing the polarity of the solution. Elution power increases as polarity decreases. Although many organic solvents have sufficient elution power, only a few can be used in RPC because of the requirements for viscosity and ultraviolet (UV) transparency. High viscosity hinders the diffusion of solutes between the mobile and stationary phases, so a highly viscous solvent reduces resolution. UV absorption by the solvent interferes with the UV detection of solutes. Acetonitrile and methanol are the two most widely used organic modifiers because of their moderate viscosity and good UV transparency. Although isopropanol and n-propanol have higher elution power, they are used only to clean and regenerate columns because of their high viscosity.

Note that all solvents used in RPC should be HPLC grade to minimize damage to the resin or samples by impurities.

*4.2.2. pH*

pH can influence protein hydrophobicity by changing the charge of the protein [14]. In practice, two proteins with the same retention time can often be separated simply by changing the pH of the solution, and *vice versa*. At present there is no effective method to predict the effect; trying different pH values is the only way to optimize the resolution.

However, as described above, media based on a silica matrix are not suited to high pH because of the exposed silanol groups. Silica-based RPC should therefore be run at low pH, generally between 2 and 3. Strong acids, such as trifluoroacetic acid (TFA) or ortho-phosphoric acid, are typically used to adjust the pH.

## *4.2.3. Ion-pairing agent*

The retention time of solutes such as proteins, peptides, or nucleotides can be modified by adding ion-pairing agents to the solution [15]. An ion-pairing agent ionizes and releases positive or negative ions, which bind to the sample molecules through ionic interactions and thereby modify their hydrophobicity. For example, under very acidic conditions most proteins are positively charged. A negative ion-pairing agent binds to the positively charged groups, and this neutralization generally increases the hydrophobicity of the protein. TFA is used not only for pH control but is also the most common negative ion-pairing agent. Triethylamine is used as a positive ion-pairing agent under neutral and alkaline conditions.

### **4.3. Elution**

A simple linear gradient is often used in RPC. The eluent is a mixture of buffer A and buffer B delivered by a mixing pump. Buffer A is generally the start buffer, to which 0.1 to 0.5% TFA is added to control the pH and act as an ion-pairing agent. Buffer B is typically 0.1 to 0.5% TFA in pure organic solvent, such as acetonitrile or methanol. A gradient of buffer B from 0% to 90% or more over 30 to 60 min is often used.
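A gradient of this kind can be expressed as the percentage of buffer B delivered at a given time, or inverted to find when a given percentage is reached. The sketch below assumes a strictly linear ramp; the function names and the default of 0 to 90% B over 45 min are illustrative choices within the ranges quoted above.

```python
def percent_b(t_min, start_b=0.0, end_b=90.0, duration_min=45.0):
    """Percentage of buffer B at time `t_min` of a linear gradient."""
    if duration_min <= 0:
        raise ValueError("duration_min must be positive")
    fraction = min(max(t_min / duration_min, 0.0), 1.0)  # hold at the ends
    return start_b + (end_b - start_b) * fraction


def time_to_reach(target_b, start_b=0.0, end_b=90.0, duration_min=45.0):
    """Time (min) at which the gradient reaches `target_b` percent B."""
    if end_b == start_b:
        raise ValueError("start_b and end_b must differ")
    if not min(start_b, end_b) <= target_b <= max(start_b, end_b):
        raise ValueError("target_b lies outside the gradient")
    return duration_min * (target_b - start_b) / (end_b - start_b)


# Gradient table for a 0-90% B run over 45 min, sampled every 15 min.
for t in (0, 15, 30, 45):
    print(f"{t:>2} min: {percent_b(t):.0f}% B")
```

If buffer B is pure organic solvent, %B approximates the organic content of the eluent at that time, which helps when a linear scouting run is used to estimate the solvent strength at which a peak elutes.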

### **4.4. Application**

The application of RPC to proteins is focused mainly on analytical separation and purity checks. On the one hand, RPC has the highest resolution of the related techniques; on the other hand, the harsh binding and desorption conditions usually denature proteins, so it is not suited to preparative separation. Good reproducibility of retention time and a low limit of detection make it the favored method for checking protein purity. In addition, RPC is the chromatographic technique most readily coupled with mass spectrometry, since its high resolution can separate a complex sample, such as serum, into single components that are immediately analyzed by mass spectrometry.

## **5. Size exclusion chromatography (gel filtration chromatography)**

Size exclusion chromatography (SEC), also termed gel filtration chromatography, separates proteins according to differences in molecular size [16]. Unlike chromatography techniques based on adsorption, molecules in SEC do not bind to the surface of the medium but are retarded by its porous structure. As shown in Figure 8, SEC media are composed of porous material. The pores, however, are much smaller than those of the matrices used in adsorption chromatography, and they are not uniform in size. The pores of adsorption media are large enough to admit all molecules without selectivity. The pores of SEC media, by comparison, are smaller: they selectively admit molecules of appropriate size and exclude larger ones. Smaller molecules travel longer and more winding paths inside the medium, whereas larger molecules take straight paths outside it. Smaller molecules are therefore retarded more than larger ones.

## **5.1. Stationary phase**

The resolution of SEC is influenced by many parameters of the stationary phase, including column volume, particle size, and pore size distribution [17].

SEC matrices are usually composed of polymers cross-linked to form a three-dimensional network. The matrix is manufactured as small spherical particles. On the surface and inside the particles, channels and pores of different sizes are formed by controlling the degree of cross-linking. The selectivity of a medium depends on the distribution of pore sizes and can be described by a selectivity curve (Figure 9). For example, the medium Superdex 200 (GE) has a linear selectivity range of 1 × 10⁴ to 6 × 10⁵, which means that solutes with a molecular mass (Mw) in this range are differentially retarded. Molecules larger than the upper limit are completely excluded from the interior of the medium because no pores are large enough to admit them; their distribution coefficient (Kd) is 0. Conversely, molecules smaller than the lower limit are free to enter any channel, so they are maximally retarded without selectivity and have Kd = 1. Solutes with Mw between the two extremes enter the channels to different degrees, have Kd between 0 and 1, and are retarded differentially.
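The behaviour of the three classes of solute can be sketched numerically. The code below uses the standard definition Kd = (Ve - V0)/(Vt - V0), where Ve is the elution volume, V0 the void volume and Vt the total liquid volume of the column; this formula is not given in the text and is stated here as an assumption. It also approximates the selectivity curve as a straight line in log10(Mw) between the two limits, which is a simplification of the real curve in Figure 9.

```python
import math


def kd_from_volumes(ve, v0, vt):
    """Distribution coefficient from elution, void and total liquid volumes."""
    return (ve - v0) / (vt - v0)


def kd_estimate(mw, lower=1e4, upper=6e5):
    """Rough Kd from molecular mass, log-linear inside the selectivity range.

    Defaults are the Superdex 200 limits quoted in the text.
    """
    if mw >= upper:
        return 0.0  # fully excluded: elutes in the void volume
    if mw <= lower:
        return 1.0  # fully included: maximally retarded
    return (math.log10(upper) - math.log10(mw)) / (
        math.log10(upper) - math.log10(lower)
    )


for name, mw in [("IgG", 1.5e5), ("albumin", 6.6e4)]:
    print(f"{name}: Kd ~ {kd_estimate(mw):.2f}")
```

Under this approximation the larger solute receives the smaller Kd and elutes first, and any two solutes outside the range on the same side receive the same Kd and cannot be resolved.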

The media with narrow linear range often employed in group separations, by which solutes are simply separated into two groups. A typical application is protein desalting by a G25 column (Figure 9). On the contrary, the media with wide linear range usually used to separate similar components (Figure 9), such as using superdex 200 to separate IgG (Mw=1.5 x 105 ) and albumin (Mw=6.6 x 104 ).

**Figure 8.** In SEC, large molecules run through the space between the media particles along a shorter path, while smaller molecules run through the channels inside the medium along a longer path.

**Figure 9.** Selectivity curves of Superdex 200 and G25 media.

The height of the packed bed affects both resolution and separation time. A larger bed height often gives better resolution for the same sample volume, but the separation takes longer to run (Figure 10C).

Particle size also affects resolution and separation time. Smaller resin particles provide more efficient mass transfer between the mobile and stationary phases and therefore give higher resolution. However, smaller particles also increase flow resistance and generally prolong the separation.

## **5.2. Mobile phase**

An unparalleled advantage of SEC among chromatography techniques is its wide compatibility with various solutions. Because SEC separates proteins by molecular size rather than by interactions between solutes and media, the pH and polarity of the mobile phase generally have only a slight influence on the retention of compounds.

Since SEC has no concentrating effect on eluted solutes, the volume of each component's elution peak is proportional to the sample volume. Increasing the sample volume therefore decreases the resolution (Figure 10B).

High viscosity of the mobile phase affects resolution through its influence on mass transfer between the mobile and stationary phases, which causes peak broadening and tailing (Figure 10D).

It should be noted that ionic interactions between proteins and the resin may take place at low ionic strength, so 0.15 M NaCl is generally added to avoid them.

#### **5.3. Elution**


SEC has no definite elution step, since molecules are not adsorbed by the media. After the sample is loaded, a buffer, usually the same as the initial buffer, is pumped through for two column volumes until all solutes are eluted.

**Figure 10.** The factors affecting resolution of SEC.

#### **5.4. Application**

SEC has the mildest separation conditions, since the composition of the mobile phase does not need to change during the whole process. This is a useful property for separating proteins that are unstable to changes in pH, ionic strength or polarity. SEC is often used as a polishing step after a sample has been crudely separated by other chromatography techniques, especially for separating monomers from polymers. Monomers and polymers usually cannot be separated by IEXC or HIC because of their similar charge and hydrophobic properties, but SEC can separate them well by their different molecular sizes.

#### *5.4.1. Purification of recombinant human Midkine by SP column and SEC column*

A recombinant human Midkine (rhMK, Mw = 14 kDa) was expressed in an *E. coli* BL21 strain in the form of inclusion bodies. The inclusion bodies were denatured with 6 M guanidinium chloride and renatured by 10-fold dilution in renaturation buffer. The renatured protein was separated by IEXC and SEC (Figure 11). Because of incorrectly formed intermolecular disulfide bonds, a fraction of the rhMK molecules formed different polymers, which could not be separated from monomers by IEXC and were eluted as a mixture (Figure 11A). To isolate the bioactive monomers, a Sephadex G-75 column, which has a fractionation range of 3,000 to 80,000 Da, was used to separate monomers from polymers. Non-reducing SDS-PAGE demonstrated that the purity of the monomers reached 95% in the target peak.

**Figure 11.** Purification of E. coli rhMK by IEXC and SEC. (result of Shixiang Jia, Ping Tu et al. General regeneratives (shanghai) limited, Shanghai, PR China)


## **6. Affinity chromatography**

Affinity chromatography (AC) broadly refers to a series of techniques that separate proteins on the basis of a reversible interaction between proteins and their specific ligands coupled to a chromatography matrix [18]. The affinity interactions derive from a wide range of biorecognition events, including interactions between (1) enzymes and substrate analogues, inhibitors or cofactors [19], (2) antibodies and antigens [20], (3) membrane receptors and ligands [21], (4) nucleic acids and complementary sequences, histones, nucleic acid polymerases or nucleic acid binding proteins, (5) small biological molecules and their receptors or carrier proteins [22], and (6) metal ions and proteins carrying a polyhistidine sequence.

Affinity interactions always result from a combination of different types of interaction, including electrostatic interactions, hydrophobic interactions, van der Waals forces and hydrogen bonding. Highly specific interactions provide extremely high selectivity, by which a target protein can easily be separated in one step with an increase in purity of several thousand-fold and high recovery.

## **6.1. Media**

Developing an AC medium is much more complex than developing media for other chromatography techniques. It requires not only a specific ligand but also a complex coupling process that attaches the ligand to the matrix without significantly reducing its binding activity. For this reason, more and more ready-to-use matrices with active ligands already coupled have been developed commercially to meet different separation needs. If no suitable ligand is available, one can either develop a specific affinity medium or use alternative purification techniques.

## *6.1.1. Base matrix*

The most commonly used material is agarose or cross-linked agarose. The hydroxyl groups on the sugar residues are easily derivatized for covalent attachment of a ligand or spacer arm, and the porous structure provides good flow rates and high capacity.

### *6.1.2. Spacer arms*

The binding site of a target protein is often located deep within the molecule. Because of steric interference, a small ligand coupled directly to the matrix always shows a lower affinity for the target protein than it does in its free state. To overcome this, spacer arms, typically linear molecules of different chain lengths, are used to bridge the ligand and the matrix. In general, a spacer arm is necessary when coupling ligands with Mw < 1000 and is not needed for larger ligands (Figure 12). An ideal spacer arm should have reactive groups at both ends, through which it can be covalently coupled to the matrix and the ligand respectively. After coupling, the arm should be chemically stable to avoid reacting with other solutes, and hydrophilic to avoid hydrophobic interactions with proteins.

**Figure 12.** The influence of spacer arms on small or large ligands. A spacer arm is often necessary for coupling small ligands, where it ensures efficient binding between ligands and target proteins (A), but is not necessary for large ligands (B).

The chain length of commonly used spacer arms varies from 4 to 12 atoms. They are often coupled to the agarose matrix by stable ether links at one end and to the ligand by other chemical bonds at the opposite end.

### *6.1.3. Ligand coupling*


A ligand coupling procedure generally consists of three steps. First, a group on the matrix or spacer arm is activated by an activating agent. Next, the activated group reacts with a functional group on the ligand molecule. Finally, residual unreacted groups are blocked with a blocking agent [23]. A matrix can be coupled to a ligand through a chemical group on the matrix itself or through groups on spacer arms. A variety of spacer arms are available for coupling to functional groups on ligands, such as amino, hydroxyl, carboxyl and thiol groups (Figure 13).

**Figure 13.** Commonly used spacer arms and immobilization procedures of ligands. (A) Ligands are coupled directly to the matrix by reaction between a cyanogen bromide activated hydroxyl on the matrix and an amino group on the ligand. (B) Ligands are coupled to spacer arms by reaction between an N-hydroxysuccinimide activated carboxyl and an amino group on the ligand. (C) Ligands are coupled to spacer arms by reaction with an epoxy group. (D) Coupling through condensation between a free amino and a free carboxyl group. (E) Coupling through a disulfide bond or an addition reaction between silanol and a double bond in the ligand, such as N=N or C=N.

## *6.1.4. Steric interference*

For a small ligand, attention should be paid to steric interference even if a spacer arm has been used. A small ligand has few copies of each functional group, sometimes only one. A poor choice that gives the wrong spatial orientation on coupling is likely to cause a serious decrease in binding capacity or even complete failure. In contrast, large ligands have several equivalent groups through which coupling can take place, so a large proportion of couplings leave sufficient space for binding to target molecules (Figure 14). Therefore, when coupling a small ligand, it is important to choose a functional group that does not introduce significant steric interference. Structural information can be obtained from databases of X-ray crystal structures or NMR structures, or predicted by computational biology.

**Figure 14.** The influence of steric interference on small and large ligands. (A) For a small ligand, an inappropriate coupling orientation is likely to result in steric interference and inefficient adsorption. (B) This is less likely to happen with large ligands.

#### **6.2. Binding and elution**


An ideal binding buffer should be optimized to ensure efficient interaction between target molecules and ligands while minimizing nonspecific interactions. Since the ligand-protein interaction results from a combination of electrostatic attraction, hydrophobic interaction and hydrogen bonds, the binding conditions can be optimized with respect to each of these.

Adsorbed proteins can be eluted by changing the pH, ionic strength or polarity. The pH can be decreased to pH 2~3 to reduce the charge on the interacting surfaces of the proteins. For example, immunoglobulin can be adsorbed on a protein A column and eluted with a glycine buffer at pH 3.0. The eluted sample should, however, be neutralized as soon as possible to avoid damage under the extreme conditions.

Ionic interactions can also be weakened by adding a neutral salt; for example, 1 M NaCl is frequently used in practice.

A specific elution can be performed by adding competitors of either the ligand or the target protein to the elution buffer. An ideal competitor should have a moderate dissociation constant for the ligand or the target molecule, so that it can elute the target at high concentration but can easily be removed from the column by washing or separated from the target protein by dialysis. Two classic applications are the affinity chromatography of glutathione S-transferase (GST) and of polyhistidine [24] (Figure 15).

During binding, the flow rate should be kept relatively low to ensure effective binding capacity.

**Figure 15.** Different elution mechanisms in GST affinity chromatography and metal chelate interaction chromatography. (A) In GST purification, GST is captured by a medium with immobilized glutathione and then dissociated by adding excess reduced glutathione. The excess glutathione is eluted together with the target protein and removed by dialysis. (B) In nickel ion chelate interaction chromatography, a protein with a polyhistidine sequence is adsorbed by a medium with immobilized nickel ions through chelation between the nickel ion and the imidazolyl groups of the polyhistidine sequence. The protein is eluted by adding a high concentration of imidazole, a competitor of the imidazolyl groups on the protein. Finally, the small competitor is washed away from the column with binding buffer.

## **6.3. Tag purification strategy**


AC typically separates proteins on the basis of interactions between ligands and local domains of target proteins. In most cases, other domains do not interfere with these interactions. The tag purification strategy was therefore devised to separate recombinant proteins rapidly by fusion expression and co-separation [25].

First, the target protein is expressed as a fusion with a tag protein. The fusion protein is then purified on an affinity column that is specific to the tag. If the tag needs to be removed afterwards, a site-specific protease is used to cleave the fusion protein, and the freed tag is finally separated from the target protein by running the same column once more.

The ideal tag protein should (1) have an economical affinity chromatography medium for convenient separation, (2) be very stable in bioactivity, and (3) express well, which helps to increase the expression of the target protein. Commonly used tags include the GST tag, FLAG tag, S tag, Strep tag and His tag.

## **6.4. Application**

Affinity chromatography is a rapid and efficient chromatography technique. Highly specific biorecognition gives the technique extremely high selectivity, by which a protein or a group of proteins can be separated from a crude sample in one step and reach a satisfactory purity. However, this excellent performance depends on complex production technology. Developing each novel medium requires many trials to find a suitable ligand and to couple it properly to the matrix. Developing a new specific affinity medium is worth the time and effort for large-scale protein production; for small-scale preparation in experimental research, alternatives such as tag purification or other chromatography techniques are a better choice.

## **7. Summary**

This chapter introduces the principles and applications of several basic chromatography techniques. Different techniques separate proteins according to different properties, including net surface charge, hydrophobicity, molecular size and affinity interaction. Affinity chromatography has the highest selectivity and can purify target proteins in one step to > 95% purity. However, because suitable ligands are difficult to obtain and immobilize, this technique is not used as widely as the others. HIC and RPC are both based on hydrophobic interaction. RPC is widely used in analytical separation because of its high resolution, but it is less used for preparative separation of proteins, since the high nonpolarity of the eluent is likely to cause irreversible inactivation of proteins. IEXC, HIC and SEC separate proteins under mild conditions and are suitable for large-scale separation of active proteins. However, their resolution is comparatively lower, and it is hard to purify a protein from a complex mixture with any single one of them. An ideal purification can be achieved by combining several techniques.

## **Author details**

Jingjing Li1 , Wei Han1 and Yan Yu2

1 Laboratory of Regeneromics, School of Pharmacology, Shanghai Jiao Tong University, Shanghai, China

2 School of Agriculture and Biology, Shanghai Jiao Tong University, Shanghai, China

## **References**


[1] Zechmeister, L. Early history of chromatography, *Nature*, (1951).

[2] Dorsey, J. G, Foley, J. P, & Cooper, W. T. Liquid chromatography: theory and methodology, *Anal Chem*, (1990). R-356R.

[3] Hough, L, Jones, J. K, & Wadman, W. H. Application of paper partition chromatography to the separation of the sugars and their methylated derivatives on a column of powdered cellulose, *Nature*, (1948).

[4] Thijssen, H. A. Gas-liquid chromatography. A contribution to the theory of separation in open hole tubes, *J Chromatogr*, (1963).

[5] Dorsey, J. G, Cooper, W. T, & Wheeler, J. F. Liquid chromatography: theory and methodology, *Anal Chem*, (1994). R-546R.

[6] Cashman, P. J, & Thornton, J. I. High speed liquid adsorption chromatography in criminalistics. I. Theory and practice, *J Forensic Sci Soc*, (1971).

[7] Woods, M. C, & Simpson, M. E. Purification of sheep pituitary follicle-stimulating hormone (FSH) by ion exchange chromatography on diethylaminoethyl (DEAE)-cellulose, *Endocrinology*, (1960).

[8] Melander, W. R, Corradini, D, & Horvath, C. Salt-mediated retention of proteins in hydrophobic-interaction chromatography. Application of solvophobic theory, *J Chromatogr*, (1984).

[9] Biswas, K. M, Devido, D. R, & Dorsey, J. G. Evaluation of methods for measuring amino acid hydrophobicities and interactions, *J Chromatogr A*, (2003).

[10] Zhang, Y, & Cremer, P. S. Interactions between macromolecules and ions: The Hofmeister series, *Curr Opin Chem Biol*, (2006).

[11] Porath, J, Sundberg, L, & Fornstedt, N. Salting-out in amphiphilic gels as a new approach to hydrophobic adsorption, *Nature*, (1973).

[12] Parente, E. S, & Wetlaufer, D. B. Relationship between isocratic and gradient retention times in the high-performance ion-exchange chromatography of proteins. Theory and experiment, *J Chromatogr*, (1986).

## **Protein-Protein and Protein-Ligand Docking**

Alejandra Hernández-Santoyo, Aldo Yair Tenorio-Barajas, Victor Altuzar, Héctor Vivanco-Cid and Claudia Mendoza-Barrera

Additional information is available at the end of the chapter

http://dx.doi.org/10.5772/56376

## **1. Introduction**

Molecular interactions, including protein-protein, enzyme-substrate, protein-nucleic acid, drug-protein and drug-nucleic acid interactions, play important roles in many essential biological processes, such as signal transduction, transport, cell regulation, gene expression control, enzyme inhibition, antibody–antigen recognition, and even the assembly of multi-domain proteins. These interactions very often lead to the formation of stable protein–protein or protein-ligand complexes that are essential for biological function. The tertiary structure of proteins is necessary to understand the binding mode and affinity between interacting molecules. However, it is often difficult and expensive to obtain structures of complexes by experimental methods such as X-ray crystallography or NMR. Docking computation is therefore considered an important approach for understanding protein-protein and protein-ligand interactions [1-3]. The number of three-dimensional protein structures determined by experimental techniques keeps growing: structure databases such as the Protein Data Bank (PDB) and the Worldwide Protein Data Bank (wwPDB) hold over 88000 protein structures, many of which play vital roles in critical metabolic pathways and may be regarded as potential therapeutic targets. Specific databases containing structures of binary complexes, together with information about their binding affinities, have also become available, such as PDBBIND [4], PLD [5], AffinDB [6] and BindDB [7]. As a result, molecular docking procedures keep improving and are more important than ever [8].

Molecular docking is a widely used computer simulation procedure to predict the conforma‐ tion of a receptor-ligand complex, where the receptor is usually a protein or a nucleic acid molecule and the ligand is either a small molecule or another protein (Figure 1).

© 2013 Hernández-Santoyo et al.; licensee InTech. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

**Figure 1.** Elements in molecular docking.

The accurate prediction of the binding modes between ligand and protein is of fundamental importance in modern structure-based drug design. The most important application of docking software is virtual screening, in which the most interesting and promising molecules are selected from an existing database for further research. This places demands on the computational method used: it must be fast and reliable. Another application is the study of molecular complexes.

Since the pioneering work of Kuntz *et al*. [9] during the early 1980s, significant progress has been made in docking research to improve computational speed and accuracy. In recent years, several important steps beyond this point have been taken. Handling the flexibility of the protein receptor efficiently is currently considered one of the major challenges in the field of docking. The binding-site location and binding orientation can be greatly influenced by protein flexibility. In fact, X-ray structure determination of protein–ligand complexes frequently reveals ligands with a buried surface area in the range of 70–100%, which can only be achieved as a consequence of protein flexibility [3]. Many docking suites and algorithms have shown significant progress in predicting near-native binding poses by using biophysical and biochemical information in combination with bioinformatics.

## **2. Theory**

Modeling the interaction of two molecules is a complex problem. Many forces are involved in intermolecular association, including hydrophobic, van der Waals, or stacking interactions between aromatic amino acids, hydrogen bonding, and electrostatic forces. Modeling the intermolecular interactions in a ligand-protein complex is difficult because there are many degrees of freedom and because the effect of solvent on the binding association is insufficiently understood. The process of docking a ligand to a binding site tries to mimic the natural course of interaction between the ligand and its receptor via the lowest-energy pathway [3]. There are simple methods for docking rigid ligands with rigid receptors and flexible ligands with rigid receptors, but general methods of docking that treat both ligands and receptors as conformationally flexible remain problematic. Docking protocols can be described as the combination of a search algorithm and a scoring function (Figure 2).

**Figure 2.** Methods used for protein-ligand docking.


The search algorithm should create an optimum number of configurations that include the experimentally determined binding modes. Although a rigorous searching algorithm would go through all possible binding modes between the two molecules, this search would be impractical due to the size of the search space and amount of time it might take to complete. As a consequence, only a small amount of the total conformational space can be sampled, so a balance must be reached between the computational expense and the amount of the search space examined. Some common searching algorithms include molecular dynamics, Monte Carlo methods, genetic algorithms, fragment-based, point complementary and distance geometry methods, Tabu, and systematic searches. On the other hand, scoring function consists of a number of mathematical methods used to predict the strength of the non-covalent interaction called the binding affinity. In all the computational methodologies, one important problem is the development of an energy scoring function that can rapidly and accurately describe the interaction between the protein and ligand. Several reviews on scoring are available in the literature [10-12].
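As an illustration of how a stochastic search samples only part of the pose space, the following is a minimal Metropolis Monte Carlo sketch. It is a toy under stated assumptions, not the procedure of any docking program: the "pose" is just a 2D translation, `toy_score` is a hypothetical quadratic scoring function whose minimum stands in for the binding site, and the step size, temperature and step count are arbitrary illustrative values.

```python
import math
import random

def toy_score(pose, site=(2.0, -1.0)):
    """Hypothetical scoring function: lower is better, minimum at `site`."""
    dx, dy = pose[0] - site[0], pose[1] - site[1]
    return dx * dx + dy * dy

def monte_carlo_search(score, start, steps=5000, step_size=0.5,
                       temperature=0.1, seed=0):
    """Metropolis Monte Carlo over 2D poses; returns the best pose found."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    current, current_score = start, score(start)
    best, best_score = current, current_score
    for _ in range(steps):
        # Propose a small random displacement of the current pose.
        candidate = (current[0] + rng.uniform(-step_size, step_size),
                     current[1] + rng.uniform(-step_size, step_size))
        cand_score = score(candidate)
        delta = cand_score - current_score
        # Accept improvements always, worse poses with Boltzmann probability.
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current, current_score = candidate, cand_score
            if current_score < best_score:
                best, best_score = current, current_score
    return best, best_score

best_pose, best_score = monte_carlo_search(toy_score, start=(10.0, 10.0))
print(best_pose, best_score)
```

The number of steps is the knob that trades computational expense against the amount of search space examined; a real docking search would also sample rotations and ligand torsions.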

There are three important applications of scoring functions in molecular docking. The first is to determine the binding mode and site of a ligand on a protein. The second is to predict the absolute binding affinity between protein and ligand, which is particularly important in lead optimization. The third application, and perhaps the most important one in structure-based drug design, is to identify potential drug hits/leads for a given protein target by searching a large ligand database. In recent years, different scoring functions have been developed that exhibit different accuracies and computational efficiencies. Some of the commonly used types are force-field, empirical, knowledge-based and consensus scoring.

The protein-ligand docking procedure can be typically divided into two parts: rigid body docking and flexible docking [9].


## **3. Experimental docking procedures**

There are a number of excellent reviews of molecular docking methods and a large number of publications comparing the performance of a variety of molecular docking tools [1-3], [13]. Below, we describe the four-step procedure adopted in this study to perform the molecular docking.

### **3.1. Target selection**

Ideally, the target structure should be determined experimentally by either X-ray crystallography or nuclear magnetic resonance and downloaded from the PDB; however, docking has also been performed successfully against homology models or threading models. The model should be of good quality, which can be tested using validation software such as Molprobity [14]. After selecting the model, it must be prepared by removing the water molecules from the cavity, stabilizing charges, filling in the missing residues, and generating the side chains, all according to the available parameters. At this point, the receptor should be in its biologically active, stable state.

### **3.2. Ligand selection and preparation**

The type of ligands chosen for docking depends on the goal. Ligands can be obtained from various databases, e.g. ZINC and/or PubChem, or they can be sketched with the Chemsketch tool [8]. Often it is necessary to apply filters to reduce the number of molecules to be docked. Examples include the net charge, molecular weight, polar surface area, solubility, commercial availability, similarity thresholds, pharmacophores, synthetic accessibility, and absorption, distribution, metabolism, excretion, and toxicology properties. Researchers often design their own molecules, such as those we generated for the example described in section 5 of this work.
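As a minimal sketch of such a pre-docking filter, the snippet below keeps only candidates that satisfy three of the criteria mentioned above. The `Ligand` record, the threshold values, and the four candidate entries are invented for illustration and are not taken from this study.

```python
from dataclasses import dataclass


@dataclass
class Ligand:
    name: str
    mol_weight: float          # g/mol
    net_charge: int            # formal charge in units of e
    polar_surface_area: float  # in square angstroms


def passes_filters(lig, max_weight=500.0, max_abs_charge=1, max_psa=140.0):
    """Return True if the ligand satisfies every (assumed) threshold."""
    return (
        lig.mol_weight <= max_weight
        and abs(lig.net_charge) <= max_abs_charge
        and lig.polar_surface_area <= max_psa
    )


# Invented candidates; only the first one passes all three filters.
candidates = [
    Ligand("ligand_a", 320.4, 0, 75.0),
    Ligand("ligand_b", 712.9, 0, 98.0),    # too heavy
    Ligand("ligand_c", 410.2, -2, 60.0),   # net charge too large
    Ligand("ligand_d", 250.1, 1, 150.0),   # polar surface area too large
]

selected = [lig.name for lig in candidates if passes_filters(lig)]
```

In practice the thresholds would be chosen to suit the target and the goal of the docking campaign.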

### **3.3. Docking**

This is the last step, in which the ligand is docked onto the receptor and the interactions are checked. The scoring function generates a score that is used to select the best ligand pose. As noted above, the docking itself can be rigid or flexible:

**1.** *Rigid Docking*. This approximation treats both the ligand and the receptor as rigid and explores only six degrees of translational and rotational freedom, hence excluding any kind of flexibility. Most docking suites employ a rigid body docking procedure as a first step.

**2.** *Flexible Docking*. A more common approach is to model the ligand flexibility while assuming a rigid protein receptor, thereby considering only the conformational space of the ligand. Ideally, however, protein flexibility should also be taken into account, and some approaches in this regard have been developed. There are three general categories of algorithms to treat ligand flexibility: systematic methods, random or stochastic methods, and simulation methods [3]. Due to the large size of proteins and their multiple degrees of freedom, their flexibility may be the most challenging issue in molecular docking. The methods to address the flexibility of proteins can be grouped into soft docking, side-chain flexibility, molecular relaxation and protein ensemble docking. They were described by Huang *et al*. [1].

### **3.4. Evaluating docking results**

The success of docking algorithms in predicting a ligand binding pose is normally measured in terms of the root-mean-square deviation (RMSD) between the experimentally observed heavy-atom positions of the ligand and the one(s) predicted by the algorithm. The flexibility of the system is a major challenge in the search for the correct pose. The number of degrees of freedom included in the conformational search is a central aspect that determines the search efficiency [3]. Performance is usually considered good when the RMSD is less than 2 Å.
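The RMSD criterion can be sketched in a few lines. This example assumes that both poses are expressed in the same receptor frame and that the atoms are listed in the same order, so no superposition is performed; the coordinates are invented.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two matched lists of (x, y, z)."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


# Invented heavy-atom positions in angstroms: the predicted pose is the
# experimental one shifted by 1 angstrom along z.
experimental = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.5, 0.0)]
predicted = [(0.0, 0.0, 1.0), (1.5, 0.0, 1.0), (1.5, 1.5, 1.0)]

value = rmsd(experimental, predicted)
is_good_pose = value < 2.0  # the 2 angstrom threshold quoted in the text
```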

### **3.5. Docking software description**

There are many algorithms available to assess and rationalize ligand-protein or protein-protein interactions, and their number is constantly increasing. Speed and accuracy are key features for obtaining successful results in docking approaches. Several algorithms share common methodologies with novel extensions focused on obtaining a fast method with accuracy as high as possible. The most common docking programs include AutoDock [15], DOCK [9], FlexX [16], GOLD [17], ICM [18], ADAM [19], DARWIN [20], DIVALI [21], and DockVision [22].

## **4. Application of molecular docking to a particular case — Biopolymers docked to dengue virus E protein**

In recent decades, the incidence of dengue disease has increased dramatically around the world. About 2.5 billion people (two fifths of the world population) are at risk of contracting the disease. Every year, dengue virus (DENV) infects more than 50 million people, with approximately 22 000 fatal cases [23]. The disease is endemic in more than 100 countries of Africa, the Americas, the Eastern Mediterranean, Southeast Asia, and the Western Pacific, the last two regions being the most affected. Before 1970, only nine countries had suffered from Hemorrhagic Dengue (HD) epidemics, a number that had more than quadrupled by 1995. There are four antigenically distinct, but closely related, serotypes of dengue virus, which is a member of the genus *Flavivirus* in the family *Flaviviridae* [24]. Each serotype has genotypes, which are virulent at several levels; nevertheless, the virulence factors are not fully established [25]. A better understanding of the mechanisms and the molecules involved in the key steps of the DENV transmission cycle may lead to the identification of new anti-dengue targets [26]. In fact, the presence of two or more serotypes in the same geographical region implies a growing risk to the population of contracting Hemorrhagic Dengue or Dengue Shock Syndrome (DSS) due to a phenomenon known as Antibody-Dependent Enhancement (ADE). As a result, the diagnosis and treatment of dengue disease has become a global problem.

The mature DENV virion contains three structural proteins: capsid protein (C), membrane protein (M), and envelope protein (E). In particular, the DENV E glycoprotein (51-60 kDa, ~495 aa), found on the viral surface, is important in the initial attachment of the viral particle to the host cell, as it contains two N-linked glycosylation sites at Asn-67 and Asn-153. While the glycosylation site at position 153 is conserved in most flaviviruses, the site at position 67 is thought to be unique to dengue virus. N-linked oligosaccharide side chains on flavivirus E proteins have been associated with viral morphogenesis, infectivity, and tropism [27, 28]. In addition, E protein is closely associated with the lipid envelope and contains one or more cellular receptor-binding sites and a fusion peptide [29]. It is found as a homodimer on the surface of the mature virion, and inside the cell it forms a prM-E heterodimer with the prM protein. E protein is the principal component of the virion surface, containing the antigenic determinants (epitopes) responsible for the neutralization of the virus and the hemagglutination of erythrocytes, thereby inducing an immunological response in the infected host [29].

On native virions, the elongated three-domain E molecule is positioned tangentially to the virus envelope in a head-to-tail homodimeric conformation. Upon penetration of the virion into the target cell endosome, E dimers are converted into stable target-cell membrane-inserted homotrimers that reorient themselves vertically to promote virus-cell fusion at low pH [30]. Furthermore, there is a great deal of evidence that E protein contains the majority of molecular markers for pathogenicity. Comparing the nucleotide sequence of the E protein gene in different flaviviruses has demonstrated perfect conservation of 12 cysteine residues, which form six disulfide bridges. The structural model for the E protein was refined by Mandl and co-workers [31], who correlated the structural properties of different epitopes with disulfide bonds [32].

### **4.1. Biopolymers as potential adjuvant carriers**

The aim of this work is to study the docking of monomers of polyvinylpyrrolidone (PVP), chitosan (CS), and chitosan-tripropylphosphate-chitosan (CS/TPP/CS) with E protein of dengue virus in order to use them as potential adjuvant carriers. Given their structure, these polymers have specific molecular anchor sites that are expected to be exploited to induce antigenic specificity to the conserved regions of dengue virus. Several authors report that the E protein produces immunity and confers protection against infection in mice with low levels of neutralizing antibodies [33-35]. Because of the dual role of its receptors as well as the cell entry through membrane fusion, the E protein, apart from being the most exposed protein, is the main target against which the neutralizing antibodies are produced to inhibit its functions.

At present, the biggest challenge in developing an efficient dengue vaccine is to achieve a lifelong protective immune response to all 4 serotypes (DEN1-4) simultaneously. Although several vaccines are currently being developed, so far only a live attenuated chimeric dengue-yellow fever (YF) vaccine has reached phase 3 clinical trials. The candidate vaccines can be divided into the following types: (a) live attenuated, (b) DEN-DEN and DEN-YF live chimeric virus, (c) inactivated whole virus, (d) live recombinant, (e) DNA, and (f) subunit vaccines [36].

Chitosan is a polycationic polymer composed of D-glucosamine and N-acetyl-D-glucosamine units linked by β(1→4)-glycosidic bonds. It is produced by deacetylation of chitin, which is extracted from the shells of crabs and shrimp. It is a linear, hydrophilic, positively charged, water-soluble biopolymer that can form thin films, hydrogels, porous scaffolds, fibers, and micro- and nanoparticles under mildly acidic conditions. As a polycationic polymer, it has a high affinity for associating macromolecules such as insulin, pDNA, siRNA, and heparin, among others, with antigenic molecules, protecting them in turn from hydrolytic and enzymatic degradation [37].

Polyvinylpyrrolidone (poly(N-vinyl-2-pyrrolidone), PVP) has chemical, physical and physiological properties that have been exploited in various industries, including but not limited to the medical, pharmaceutical, cosmetic, food, and textile industries, due to its biological compatibility, low toxicity, tackiness, resistance to thermal degradation in solution, and inert behavior in salt and acidic solutions [38, 39]. It is a water-soluble homopolymer with a wide range of molecular weights (2.5 to 1,200 kDa), with molecules of between 12 and 1350 monomers and end-to-end distances ranging from 2.3 to 93 nanometers. It is physically and chemically stable: it tolerates heating in air for up to 16 hours at 100 °C, and its appearance does not change after 2 months at 24 °C in 15% HCl. When heated with strong bases such as lithium carbonate, trisodium phosphate or sodium metasilicate, it generates a precipitate due to ring opening and subsequent crosslinking of chains. Yen-Jen *et al*. studied its effect as a drug deliverer and intracellular acceptor [40].

#### *4.1.1. Molecular docking*


In this work, the molecular docking calculations were performed using the AutoDock program. It uses a Lamarckian genetic algorithm (LGA) and a scoring function based approximately on the AMBER force field, which consists of five terms: 1) a Lennard-Jones 12-6 dispersion/repulsion term, 2) a directional 12-10 hydrogen bonding term, 3) a Coulombic electrostatic potential, 4) an entropic term, and 5) a pairwise desolvation term. The scaling factors of these terms are empirically calibrated using a set of 30 structurally known protein-ligand complexes whose affinities have been experimentally determined. AutoDock has become widely used due to its good accuracy and high versatility; moreover, version 4.0 of AutoDock added flexibility for side chains in the receptor.
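The functional form of the first three terms can be sketched as follows. This is not AutoDock's code or parameter set: the well depths, equilibrium distances and charges are hypothetical, the directional weighting of the hydrogen-bond term is omitted, and a constant dielectric is assumed for simplicity.

```python
COULOMB_CONSTANT = 332.06  # kcal*angstrom/(mol*e^2), converts e^2/angstrom to kcal/mol


def lennard_jones_12_6(r, well_depth, r_eq):
    """12-6 dispersion/repulsion with its minimum (-well_depth) at r = r_eq."""
    a = well_depth * r_eq ** 12
    b = 2.0 * well_depth * r_eq ** 6
    return a / r ** 12 - b / r ** 6


def hydrogen_bond_12_10(r, well_depth, r_eq):
    """12-10 hydrogen-bond term with its minimum (-well_depth) at r = r_eq.

    The angular (directional) weighting is left out of this sketch.
    """
    c = 5.0 * well_depth * r_eq ** 12
    d = 6.0 * well_depth * r_eq ** 10
    return c / r ** 12 - d / r ** 10


def coulomb(r, q1, q2, dielectric=1.0):
    """Electrostatic energy of two point charges (constant dielectric assumed)."""
    return COULOMB_CONSTANT * q1 * q2 / (dielectric * r)


# Hypothetical atom pair evaluated at its equilibrium separation.
lj_energy = lennard_jones_12_6(r=3.5, well_depth=0.15, r_eq=3.5)
hb_energy = hydrogen_bond_12_10(r=1.9, well_depth=5.0, r_eq=1.9)
el_energy = coulomb(r=3.0, q1=0.4, q2=-0.4)
```

In a scoring function of the kind described above, each such term would be multiplied by its empirically calibrated scaling factor and summed over all atom pairs.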

#### *4.1.2. Model preparation*

In this study we used the E protein of dengue virus. It consists of a dimer with 394 amino acids (aa) per monomer and, as mentioned before, it is the main component of the dengue virus envelope. E protein enters the cell by fusion with the membrane following a conformational change produced by low pH, which converts the dimer into a trimer in which the fusion peptide between domains II and III is exposed. When the pH is lower than 6.3, the dimers dissociate, making domains I and II rotate outwards and exposing the fusion loop, which interacts with the endosomal membrane of the cell. Domain III then rotates backwards to pull domains I and II, which were already bound to the cell membrane by the fusion peptide, thus bringing the cell membrane together with the membrane of the virus in order to release the RNA [27, 29, 41-45]. It is important to mention that Bressanelli showed that the domains retain their neutral-pH structure but that their relative orientation is altered [27]. For best results during the molecular docking process, we optimized the original model of the dengue virus protein E (PDB code 1OKE) with a number of refinement and validation cycles using the Phenix and Molprobity programs, respectively. Figure 3 shows the corrected model of the dimer and trimer.

**Figure 3.** Dengue virus protein E. (A) Dimeric protein after geometric and refining corrections with Phenix program. (B) Trimeric protein that represents a postfusion state (C) Top view of the trimeric form. Domain I (aa1-52,132-193,280-296) is in red, domain II (aa52-132, 193-280) in yellow and domain III (aa296-394) in blue color.

#### *4.1.3. Ligand preparation*


To prepare the ligands, we utilized the linear PVP and CS monomers, and CS/TPP/CS chains (Figure 4). The coordinates of those ligands were obtained using the SMILES program [46].

**Figure 4.** Ligands used in molecular docking. (A) PVP monomer, (B) CS monomer, and (C) CS/TPP/CS chain.

#### *4.1.4. Molecular docking*

Molecular docking was performed by means of the AutoDock program, which combines rapid grid-based energy evaluation with an efficient search of torsional freedom. This program uses a semi-empirical free energy force field to evaluate the conformations during the docking simulations. The force field is parameterized using a large number of protein-inhibitor complexes for which the inhibition constants (Ki) are known. The force field evaluates binding in two steps, starting from a state in which the ligand and the protein are separated. In the first step, the intramolecular energies are estimated for the transition from the unbound state to the conformations that the ligand and the protein adopt in the bound state. In the second step, the intermolecular energies of combining the ligand and the protein in their bound conformations are evaluated. The force field includes six pair-wise evaluations (V) and an estimate of the conformational entropy lost upon binding (ΔSconf):

$$
\Delta G = \left( V_{bound}^{L-L} - V_{unbound}^{L-L} \right) + \left( V_{bound}^{P-P} - V_{unbound}^{P-P} \right) + \left( V_{bound}^{P-L} - V_{unbound}^{P-L} + \Delta S_{conf} \right) \tag{1}
$$

where L refers to the ligand and P to the "protein" in a ligand-protein docking. Each of the pair-wise energetic terms includes evaluations for dispersion/repulsion, hydrogen bonding, electrostatics, and desolvation [47].
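Equation (1) can be evaluated with a few lines of arithmetic. The sketch below uses invented energies in kcal/mol; it assumes a rigid receptor (so the protein-protein term cancels) and takes the unbound protein-ligand energy as zero because the two molecules are far apart in that state.

```python
def binding_free_energy(v_ll_bound, v_ll_unbound,
                        v_pp_bound, v_pp_unbound,
                        v_pl_bound, v_pl_unbound,
                        delta_s_conf):
    """Sum the three bracketed groups of Equation (1)."""
    ligand_term = v_ll_bound - v_ll_unbound
    protein_term = v_pp_bound - v_pp_unbound
    complex_term = v_pl_bound - v_pl_unbound + delta_s_conf
    return ligand_term + protein_term + complex_term


# Invented values for illustration only (kcal/mol).
delta_g = binding_free_energy(
    v_ll_bound=-1.2, v_ll_unbound=-1.5,    # ligand strain on binding: +0.3
    v_pp_bound=-50.0, v_pp_unbound=-50.0,  # rigid receptor: contributes 0
    v_pl_bound=-8.0, v_pl_unbound=0.0,     # intermolecular interaction: -8.0
    delta_s_conf=1.5,                      # entropic penalty on binding
)
```

With these numbers the favorable intermolecular energy of -8.0 is partly offset by ligand strain and the entropic penalty, giving about -6.2 kcal/mol.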

The calculations can be summarized in the following four steps: (1) preparation of the coordinate files using AutoDockTools, (2) pre-calculation of atomic affinities using AutoGrid, (3) docking of the ligands using AutoDock, and (4) analysis of the results using AutoDockTools.
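The four steps can be laid out as a dry-run plan that only builds and prints command lines without executing anything. The file names are hypothetical, and the script names and flags reflect common AutoDock 4 and AutoDockTools usage as an assumption; they should be checked against the installed version. The grid and docking parameter files (`.gpf`, `.dpf`) are assumed to have been prepared in step 1 as well.

```python
# Each step maps to zero or more command lines (lists of arguments).
steps = [
    ("1. prepare coordinate files (AutoDockTools)", [
        ["prepare_receptor4.py", "-r", "receptor.pdb", "-o", "receptor.pdbqt"],
        ["prepare_ligand4.py", "-l", "ligand.pdb", "-o", "ligand.pdbqt"],
    ]),
    ("2. pre-calculate atomic affinity maps (AutoGrid)", [
        ["autogrid4", "-p", "receptor.gpf", "-l", "receptor.glg"],
    ]),
    ("3. dock the ligand (AutoDock)", [
        ["autodock4", "-p", "ligand_receptor.dpf", "-l", "ligand_receptor.dlg"],
    ]),
    # The analysis step is interactive in AutoDockTools, so no command is listed.
    ("4. analyse the results (AutoDockTools)", []),
]


def render_plan(plan):
    """Return the plan as printable lines: a title, then indented commands."""
    lines = []
    for title, commands in plan:
        lines.append(title)
        for command in commands:
            lines.append("    " + " ".join(command))
    return lines


plan_lines = render_plan(steps)
print("\n".join(plan_lines))
```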

### **4.2. Results**

#### *4.2.1. Amino acids of interest in the dengue virus infection mechanism*

Several amino acids in loop conformations are involved in the trimerization of the DENV E protein. Of particular interest are those located between domains I and II, including aa268-270 (the kl loop). Also important are the loop for fusion to the host cell located between domains II and III, which subsequently is exposed in the trimer with the aa98-111 fusion peptide, and the C-terminal of domain III, which holds the protein to the virus membrane. Other important amino acids were mentioned by Mazumder [48], who made a structural analysis of the dengue virus E protein of the 4 serotypes in order to find the conserved and exposed sites as well as the T-cell epitopes. In our study, we additionally considered the sites of interest described by Yorgo Modis [42, 43] (Table 1).

In addition to the ten conserved regions presented in Table 1, we predicted around 740 E proteins of the 4 serotypes, some of which are included in the same Table 1. The variability of their sequences was quantified using Shannon's entropy [48], which ranged from 0.3 to 1.1 bits.
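Shannon's entropy of an alignment column is H = -Σ p·log2(p), where p is the frequency of each residue at that position; conserved positions score near 0 bits. A minimal sketch follows, using a four-sequence alignment invented for illustration rather than data from this study.

```python
import math
from collections import Counter


def shannon_entropy(column):
    """Shannon entropy (in bits) of the residues in one alignment column."""
    counts = Counter(column)
    total = len(column)
    return sum(-(n / total) * math.log2(n / total) for n in counts.values())


# Invented aligned fragments: positions 2 and 5 vary, the others are conserved.
alignment = ["NRDFV", "NRDFV", "NKDFV", "NRDFI"]

# zip(*alignment) walks the alignment column by column.
entropies = [shannon_entropy(col) for col in zip(*alignment)]
conserved_positions = [i + 1 for i, h in enumerate(entropies) if h == 0]
```

Here the fully conserved columns have an entropy of 0 bits, while a column with a 3:1 residue split has about 0.81 bits.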



| Amino acid | Dimer configuration | Trimer configuration | Function |
| --- | --- | --- | --- |
| L191, T268-I270 | This region is known as the kl loop. Without the ligand (β-octylglucoside, BOG), it forms a salt bridge and a hydrogen bond with β-strands i and j of the counterpart dimer. | The ligand is present; however, the kl loop does not adopt the open conformation present in the dimer-ligand pair. | A hinge allowing movement of domains I and II, as well as conformational changes when the pH of the endosome varies. |
| V382-G385 | C terminal that attaches domain III to premembrane (prM) virus. | It combines to create a trimeric contact with the other two domains. | It holds and folds domains I and II, acting as a zipper. It is the most variable region among the 4 serotypes. |
| D98-G111 | The loop is protected between domains II and III; it contains a fusion peptide that is formed by hydrophobic residues. | It is exposed only in the trimeric conformation during the conformational changes, and it maintains its structure. | It is the region of highest interest since it is the binding receptor for the host cell. In the trimeric form it fuses and binds the cell and virus membranes. It is the most conserved region among the 4 serotypes. |
| N37, P207 | Exposed. | Exposed. | Conserved epitope in the 4 serotypes. |

**Table 1.** Sites of interest identified in the structure of the dengue virus E protein according to Yorgo Modis [42, 43, 48].

The analyzed regions are N8-G14, V24-D42, R73-E79, V97-S102, D192-M196, V208-W220, V252-H261, G281-C285, E314-T319, E370-G374, and K394-G399. The hidden amino acids that become exposed in the trimer are M1, H244, K246, G254, G330 and K344, and the exposed residues that become hidden are S16, Q52, Q167, S169, P243, D290, Q293, S331 and E343. It is worth mentioning that we have identified at least 14 conserved negative sites in at least 3 of the 4 serotypes (C3, C60, R73, T189, F213, A267, F306, T319, S376, F392, K394, S424, G445, and V485). This finding is important because it has been demonstrated that epitopes with negative sites work better as vaccines than those with positive sites, as they are less likely to change due to their functional restrictions.

#### *4.2.2. Dengue virus E protein — PVP docking*

The docking of PVP molecules to the E protein of dengue virus showed that the interface of domains I and II was the most energetically favorable binding site (Figure 5). The interaction between protein and ligand takes place through 8 hydrogen bonds with the amino acids Asn124, Lys202, and Asp203 (Table 2). This region is extremely important for the pivotal role it plays in the conformational changes triggered by low pH, which in turn are closely related to the infectivity of the virus. In particular, the PVP molecule, which interacts with aa124, 202, 203 in the E protein-BOG ligand complex, could act as a blocker of the kl hairpin (aa268-270), which is responsible for the conformational changes in the E protein at low pH. In other words, through steric hindrance due to its proximity to these amino acids, it could inhibit their function as a hinge for conformational changes, thereby preventing the hinge action between domains I and II, which in turn could stiffen the area. Alternatively, if the BOG ligand is absent, the molecule could be internalized into the hydrophobic pocket and replace it, but subsequent molecular simulations would be required to determine how it would act at low pH, in order to find out whether the conformational changes would occur or be inhibited. PVP is well known to be highly stable at acidic pH and high temperatures, so its structural integrity is expected to remain intact. Mutations in the kl hairpin amino acids that replace long hydrophobic side chains with shorter ones raise the pH threshold at which conformational changes occur. As a result, the site can be regarded as a potential trigger in the virus replication cycle and a good candidate for inhibiting its function (Figure 5).


| PVP molecule | Amino acid | Distance (Å) |
| --- | --- | --- |
| O1 | OD1-Asp 203 | 2.75 |
| O2 | OD1-Asp 203 | 2.44 |
| O3 | N-Asp 203 | 2.64 |
| O3 | N-Lys 202 | 2.58 |
| O4 | N-Lys 202 | 3.37 |
| O4 | O-Asn 124 | 2.92 |
| O5 | O-Asn 124 | 3.32 |
| O5 | N-Asn 124 | 2.87 |

**Table 2.** Polar interactions of the PVP molecule with dengue virus E protein.

**Figure 5.** Dengue virus E protein with PVP ligand. The lower part shows a close up of the docking area.

#### *4.2.3. Dengue virus E protein — CS docking*

The docking of CS molecules to the E protein of dengue virus resulted in an interaction at the interface of domains I and II of the protein (see Figure 6). The CS ligand binds to seven amino acids of E protein through ten hydrogen bonds (see Table 3). The elongated CS molecule settles into a channel formed on the surface of domain II. Additionally, it interacts with amino acids near the kl hinge or loop at the interface of domains I and II. There is a remarkable similarity between the BOG and NAG complexes. The amino acid-CS interactions shown in Table 3 (aa65, 68, 202, 249, 251, 272, 273) suggest that the mechanism of action of this molecule is similar to that of the PVP ligand. Additionally, it lies very close to the conserved region V252-H261, which forms a channel in the 4 serotypes. This finding is of the highest importance since CS could well serve as a ligand for the 4 serotypes, and it could be even more useful in the development of a chimeric vaccine with the four domains III of E protein, similar to the chimeric vaccine developed in India at the International Centre for Genetic Engineering and Biotechnology.


**Table 3.** CS molecule interactions with dengue virus E protein.

#### *4.2.4. Dengue virus E protein — CS/TPP/CS docking*

In this case, we used the CS and TPP monomers, taking into account that the CS units are linked by β-(1→4) bonds. As with the E protein–PVP docking, the CS/TPP/CS ligand docked between domains I and II, although we observed more interactions than in the case of the PVP monomer. Table 4 and Figure 7 show the interactions with seven amino acids and with the BOG ligand. The CS-TPP-CS complex interacts with amino acids 49, 124, 126, 200, 202, 203 and 271, and the docking results suggest that the three molecules are attracted most strongly to the area formed by the hydrophobic pocket, indicating a direct interaction with the oxygen atoms of the BOG ligand.
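The residue lists quoted in the text for the three ligands can be compared directly to see how far their binding sites overlap. The Python sketch below is a minimal illustration, assuming the residue numbers exactly as listed in the text for PVP, CS and CS/TPP/CS; the function names are hypothetical, and the comparison says nothing about binding energies.

```python
from itertools import combinations

# Contact residues (E protein numbering) as listed in the text for each ligand.
CONTACT_RESIDUES = {
    "PVP": {124, 202, 203},
    "CS": {65, 68, 202, 249, 251, 272, 273},
    "CS/TPP/CS": {49, 124, 126, 200, 202, 203, 271},
}


def shared_residues(contacts):
    """Residues contacted by every ligand in the mapping."""
    return set.intersection(*contacts.values())


def pairwise_overlap(contacts):
    """Sorted list of shared residues for each pair of ligands."""
    return {
        (a, b): sorted(contacts[a] & contacts[b])
        for a, b in combinations(sorted(contacts), 2)
    }


print("contacted by all ligands:", sorted(shared_residues(CONTACT_RESIDUES)))
for pair, residues in pairwise_overlap(CONTACT_RESIDUES).items():
    print(pair, residues)
```

Under these assumptions, residue 202 is the only one contacted by all three ligands, and every PVP contact residue (124, 202, 203) is also contacted by CS/TPP/CS, which is consistent with both ligands docking at the interface of domains I and II.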



**Figure 6.** Docking of dengue virus E protein with CS ligand. The interaction takes place at an interface between the two monomers.

| CS/TPP molecule atom | Amino acid atom | Distance (Å) |
|---|---|---|
| N ring 1 | O-Ser 274 | 3.43 |
| O3 ring 1 | OE2-Glu 49 | 2.90 |
| O5 ring 1 | NE2-Gln 271 | 3.08 |
| O5 ring 1 | OE1-Gln 271 | 2.66 |
| O5 ring 1 | O3-BOG | 2.74 |
| O1 TPP1 | O-Ser 274 | 3.03 |
| O3 TPP1 | O4-BOG | 3.14 |
| O1 TPP2 | OD1-Asp 203 | 2.83 |
| O3 TPP2 | O4-BOG | 3.30 |
| O1 TPP3 | O-Lys 202 | 2.19 |
| O1 TPP3 | OD1-Asp 203 | 3.18 |
| O3 TPP3 | NE2-Gln 200 | 3.43 |
| O4 TPP | O-Lys 202 | 2.51 |
| O1 ring 2 | N-Lys 202 | 2.89 |
| O3 ring 2 | ND2-Asn 124 | 2.89 |
| O3 ring 2 | OD1-Asn 124 | 2.38 |
| O4 ring 2 | OE2-Glu 126 | 2.49 |

**Table 4.** Docking of CS/TPP/CS molecule with dengue virus E protein.

**Figure 7.** CS/TPP/CS molecule docking with the dengue virus E protein.

## **5. Conclusions**

We have reviewed the key concepts and current experimental procedures, including recent advances in protein flexibility, ligand sampling, and scoring functions. In addition, challenges and possible future directions were addressed in this chapter. As an example of a protein-ligand study, we analyzed the interaction between the dengue virus E protein and the biopolymers polyvinylpyrrolidone (PVP) and chitosan (CS), and we confirmed that PVP, CS, and CS/TPP/CS can fulfill the function of adjuvant carriers in the potential development of a chimeric vaccine against the four serotypes of dengue virus. Furthermore, the ring-shaped molecules showed affinity for a site of vital importance in the infection and replication cycle of the virus, which puts us on the path to developing an inhibitor of the aforementioned conformational changes (see Figures 5-7). Their binding to the E protein is possible owing to the high affinity observed in the simulations. However, further molecular simulations are required to determine the behavior of the protein in the absence of the BOG ligand and under different environmental conditions, such as low pH.

## **Acknowledgements**

This work has been supported by Fomix-Veracruz (2009-128001) and CONACyT-Mexico (CB2008-105491, CMB).

## **Author details**

Alejandra Hernández-Santoyo1, Aldo Yair Tenorio-Barajas2, Victor Altuzar2, Héctor Vivanco-Cid3 and Claudia Mendoza-Barrera2\*

\*Address all correspondence to: omendoza@uv.mx

1 Instituto de Química, Universidad Nacional Autónoma de México, Mexico, D.F., Mexico

2 Laboratorio de Nanobiotecnología, Centro de Investigación en Micro y Nanotecnología, Universidad Veracruzana, Boca del Rio, Veracruz, Mexico

3 Instituto de Investigaciones Médico-Biológicas, Universidad Veracruzana, Boca del Río, Veracruz, Mexico



**Section 2**
