14 State of the Art in Biosensors - General Aspects

In addition to providing a less laborious method for detecting protein variants and reaction conditions for generating soluble recombinant protein, the split GFP complementation assay also serves as an assay of aggregation in living cells. For example, aggregates of the microtubule-associated protein tau are found in neurofibrillary tangles, but their role in the pathology of Alzheimer's disease and Parkinson's disease is not clear [51]. The split GFP complementation assay enables monitoring of the aggregation process in living mammalian cells [52,53] and was validated using GFP1-10 and GFP11-tau variants. Cells containing soluble tagged protein show visible fluorescence, but aggregates have little or no fluorescence. Protein aggregates of GFP11-tau sequestered the GFP11 tag, leading to decreased complementation of GFP1-10 and decreased fluorescence. Thus the split GFP complementation assay with GFP11-tagged tau can serve as an *in vivo* model for studying factors that influence aggregation.

#### **2.2. GFP biomarkers for single molecule imaging**

It is also possible to use GFP biomarkers for single-molecule localization, a form of super-resolution microscopy. High-affinity single-chain camelid antibodies (nanobodies) to GFP can be used to deliver organic fluorophores to GFP-tagged proteins, which are in turn used in single-molecule "nanoscopy" [54, 55]. This approach combines the molecular specificity of genetic tagging with the high photon yield of organic dyes. Additionally, by varying the buffer conditions, many organic dyes can be made photoswitchable. The small size of camelid antibodies and their high affinity allow access to regions that are generally inaccessible to conventional antibodies and to targets that are expressed at very low levels [56].

A caution is in order: overexpression of FRET biomarkers in transgenic animals may perturb endogenous signaling pathways and even retard animal development [57]. Additionally, in compact tissue such as brain tissue, cell type identification is particularly tedious due to the diffuse expression of the biomarkers.

#### **3. GFP biosensors**

Biosensors are distinct from biomarkers in that they are not linked to the expression of a specific gene product. Biosensors may function *in vivo* or *in vitro*. GFP variants that exhibit analyte-sensitive properties are genetically encoded biosensors, acting *in vivo*. GFP biosensors containing amino acid substitutions that enable detection of pH changes, specific ions (Cl- or Ca2+), reactive oxygen species, redox state, and specific peptides have been reported [39, 58-60]. In addition, modifications have been reported that enable selective activation (irreversible or reversible) of the fluorescence [61,62]. Genetically encoded GFP biosensors may be single GFP domains or FRET pairs. In the following subsections we describe selected examples of GFP-based biosensors used *in vivo* or *in vitro*, with special emphasis on computationally designed biosensors.

#### **3.1.** *In vivo* **pH biosensors**

Within the cell, pH varies from the neutral pH of the cytosol to the acidic pH of the lysosome lumen, and protons may serve as cellular signals. Genetically encoded pH biosensors enable subcellular detection of pH and can provide insight into the regulation of cellular activities by pH. Addition of an intracellular targeting tag directs the pH biosensor to particular subcellular compartments.

Many GFP variants show sensitivity to pH, which results from protonation and deprotonation of the chromophore (see **Maturation of the GFP Chromophore**; reviewed in [58]). The rapid and reversible response of EGFP to pH changes in cells enabled EGFP to be used as an intracellular pH indicator [63] in place of chemical pH indicators such as fluorescein. A range of GFP-based pH biosensors has been generated by modifying wtGFP and EGFP with amino acid substitutions, primarily in and around the region of the chromophore.

Two classes of GFP pH indicators have been described: ratiometric and nonratiometric [58, 64]. In the ratiometric pH indicators, the chromophore environment is such that the GFP biosensor has two sets of excitation/emission spectra, one that varies with pH and another that does not. For these GFP variants, a calibration curve can be generated for the ratio of the spectra versus the pH. Nonratiometric GFP variants, such as EGFP [63] or ecliptic GFP [64], have pH-dependent emission from the anionic (deprotonated) chromophore but almost no fluorescence from the neutral (protonated) chromophore. These variants report pH changes within cells either as single-molecule pH sensors or in tandem with a pH-insensitive fluorescent partner (described below).
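As a rough illustration of the calibration step for a ratiometric indicator, the sketch below models the measured ratio as a single-site titration curve and inverts it to read pH from a ratio. It assumes that the reference signal in the denominator is pH independent, as described above. The function names, the pKa, and the limiting ratios are hypothetical placeholders, not values reported for any of the variants discussed here.

```python
import math

def ratio_from_ph(ph, pka, r_acid, r_base):
    """Ratio predicted by a single-site titration between two limiting ratios.

    r_acid and r_base are the ratios at very low and very high pH
    (hypothetical calibration constants).
    """
    frac_deprotonated = 1.0 / (1.0 + 10.0 ** (pka - ph))
    return r_acid + (r_base - r_acid) * frac_deprotonated

def ph_from_ratio(ratio, pka, r_acid, r_base):
    """Invert the calibration curve; only valid strictly between the limits."""
    low, high = sorted((r_acid, r_base))
    if not low < ratio < high:
        raise ValueError("ratio lies outside the calibrated range")
    return pka + math.log10((ratio - r_acid) / (r_base - ratio))

if __name__ == "__main__":
    # Hypothetical calibration: pKa 7.0, ratio 0.5 at low pH, 2.0 at high pH.
    measured = ratio_from_ph(6.5, pka=7.0, r_acid=0.5, r_base=2.0)
    print(round(ph_from_ratio(measured, pka=7.0, r_acid=0.5, r_base=2.0), 3))
```

In practice the calibration constants would be fitted to ratios measured in buffers of known pH; the inversion is only meaningful within the roughly two pH units around the pKa where the curve is steep.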

Ratiometric GFP pH biosensors have been generated by modification of a few key amino acids in the vicinity of the chromophore. Ratiometric pHluorin (RaGFP), the first ratiometric GFP described, contains a key S202H mutation and shows a pH-dependent change in excitation ratio between pH 5.5 and pH 7.5 [64]. The S202H mutation was shown to be important for the ratiometric property; pHluorins lacking S202H were nonratiometric. Another class of ratiometric GFP pH sensors, the deGFPs, was generated by mutagenesis of the S65T GFP variant [65], resulting in substitutions H148G (deGFP1) or H148C (deGFP4) and T203C. The deGFPs are dual-emission ratiometric GFPs that emit both blue and green light; blue emission decreases with increasing pH while green emission increases.

Variants pH GFP (H148D) [66] and E2 GFP (F64L/S65T/T203Y/L231H) [67] function as dual-excitation ratiometric pH indicators, with pH-dependent excitation at 488 nm and relatively pH-independent excitation at 458 nm. In addition to its pH-sensing properties, fluorescence emission from E2 GFP is affected by the concentration of certain ions, including Cl-. The chloride ion sensitivity of E2 GFP is a key component of the GFP-based chloride ion and pH sensor ClopHensor [68] (discussed in section **Fluorescent proteins as intrinsic ion sensors**).

In addition to single-molecule pH biosensors, ratiometric pH biosensors have been constructed from tandem fluorescent protein variants, in which a pH-sensitive GFP variant is linked to a less sensitive or pH-insensitive GFP. GFpH and YFpH are tandem FRET pairs for the detection of pH changes in the cytosol and nucleus of living cells. GFpH combines GFPuv, which has low pH sensitivity, with pH-sensitive EGFP, and YFpH combines GFPuv and EYFP [58, 69]. Not all tandem GFP biosensors are FRET pairs, however. pHusion is a ratiometric tandem GFP biosensor in which mRFP (pH insensitive) is tethered to EGFP (pH sensitive) via a linker. pH is determined from the ratio of EGFP to mRFP fluorescence. The pHusion biosensor was developed for analysis of intracellular and extracellular pH in developing plants [60].

#### **3.2.** *In vivo* **FRET-based biosensors**

Genetically encoded FRET-based biosensors can be applied in a variety of capacities to visualize intracellular spatiotemporal changes in real time. The evolution of these applications has progressed from cell culture systems that transiently express FRET biosensors to transgenic mouse models that express them in a heritable manner [57]. Production of transgenic mice with FRET biosensors arose in an effort to enhance our understanding of the differences that exist between tissue culture and living systems. Transgenic FRET GFP biosensor systems are very efficient and their fluorescence signals are easily distinguished from autofluorescence, which is analyte-independent fluorescence. The sensors themselves can be used to probe a variety of pathways for the activity of signaling enzymes as well as a number of post-translational modifications.

#### *3.2.1. Detection of enzyme activity*

In transgenic animal models, FRET biosensors can be used to study PKA activation by cAMP, ERK activation by TPA, and their association with various physiological changes [57]. PKA and ERK are enzymes that transfer the γ-phosphate of ATP to a number of protein substrates, thereby effecting a conformational change. Kinase-induced conformational changes are important because they are involved in the control of a number of critical cellular processes, including glycogen synthesis, hormonal response, and ion transport [70]. A number of signaling cascades that involve kinases require a means of dynamic control and spatial compartmentalization of the kinase activity, a requirement that highlights the need for a mechanism to continuously track kinase activity in different compartments and signaling microdomains *in vivo*.

Traditional methods of assaying kinase activity fail to capture its dynamics, a void that is filled by genetically encoded FRET-based biosensors. These sensors are constructed so that the substrate protein of the kinase of interest is flanked by a fluorescent protein pair in such a way that the conformational change imparted by phosphorylation translates into a change in the FRET signal (Figure 7) [70]. These biosensors can be localized to particular sites of interest with the aid of appropriate targeting signal sequences, allowing the imaging of site-specific kinase activity. G-protein coupled receptors, when used in a biosensor, provide a mechanism for transducing drug-mediated effects on PKA activity into a light signal. Transgenic mice expressing FRET-based biosensors provide an ideal system for studying the pharmacodynamics of these drugs.
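To make the link between conformational change and FRET signal concrete, the following sketch evaluates the standard Förster relation, E = 1 / (1 + (r/R0)^6), for an "open" and a "closed" sensor conformation. The donor-acceptor distances and the Förster radius used in the example are illustrative assumptions, not measured values for any sensor described in this chapter.

```python
def fret_efficiency(r_nm, r0_nm):
    """Forster transfer efficiency for one donor-acceptor pair.

    r_nm:  donor-acceptor distance in nm
    r0_nm: Forster radius in nm (distance at which efficiency is 0.5)
    """
    if r_nm <= 0 or r0_nm <= 0:
        raise ValueError("distances must be positive")
    return 1.0 / (1.0 + (r_nm / r0_nm) ** 6)

if __name__ == "__main__":
    r0 = 5.0  # assumed Forster radius, nm (illustrative)
    open_state = fret_efficiency(7.0, r0)    # sensor before phosphorylation
    closed_state = fret_efficiency(4.0, r0)  # sensor after the switch closes
    print(f"open: {open_state:.2f}, closed: {closed_state:.2f}")
```

Because efficiency falls off with the sixth power of distance, a change of a few nanometres around R0 is enough to produce a large change in the acceptor/donor emission ratio, which is why a modest conformational switch can serve as a readout.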


**Figure 7.** Representation of the mode of action of an intramolecular FRET biosensor containing a molecular switch. The sensor domain and ligand domain of the construct are connected by a flexible linker, with CFP and YFP serving as the donor and acceptor for the FRET pair. This switch can perceive various molecular events, such as protein phosphorylation, through binding to the ligand domain. This in turn induces an interaction between the ligand and sensor domains that facilitates a global change in the conformation of the biosensor, which serves to increase the FRET efficiency from the donor to the acceptor (CFP to YFP in this case) [71].

When these biosensors are used to study the signaling events in wound healing, the strength and duration of the fluorescent signals depend on the location within the tissue (tissue depth has a negative impact on the intensity of the fluorescent signal), the proximity to the site of injury, and the contributions made by chemical mediators (drugs) in sustaining kinase activity [57]. These model systems provide a means of visualizing in real time the agonist/antagonist pharmacodynamics associated with a plethora of signaling molecules, which need not be limited to PKA and ERK activity. They also provide a tool for resolving the maze of upstream signaling pathways that contribute to chemotaxis in the animals.

Genetically encodable FRET GFP biosensors have proven useful in characterizing the dynamic phosphorylation-dependent regulation of small GTPases [70]. Ras GTPases play essential roles in regulating cell growth, cell differentiation, cell migration, and lipid vesicle trafficking. Upon binding GTP, the G-protein Ras recruits the serine/threonine kinase Raf. FRET biosensors for GTPase activity such as Raichu-Ras (Ras and Interacting protein CHimeric Unit for RAS) use this Ras-Raf interaction as the basis for the molecular switch. Raichu-Ras uses H-Ras as the sensor domain and the Ras Binding Domain (RBD) of Raf as the ligand domain to construct a molecular switch that in turn is sandwiched by the FRET pair CFP/YFP (Figure 7). Such a design allows Ras activation to be monitored in living cells on the basis of fluctuations in the FRET signal.

#### *3.2.2. Detection of antioxidant activity and reactive oxygen species*

FRET-based GFP biosensors can also be employed *in vitro* as an alternative tool for high-throughput screening assays. These assays are simple, inexpensive, reproducible and highly specific. A good example is the use of bacterial cell-based assays to screen various substances for antioxidant activity [72]. To this end, *E. coli* biosensor strains were produced that carry plasmids fusing the sodA (manganese superoxide dismutase) and fumC (fumarase C) promoters to GFP genes, called sodA::gfp and fumC::gfp respectively. These strains were used to evaluate the antioxidant activity of a number of phenolic and flavonoid compounds in comparison with two more conventional assays, the DPPH radical scavenging and SOD activity assays. After paraquat treatment of *E. coli* cultures to induce oxidative stress, the putative antioxidant compounds were added, and both GFP fluorescence and cell culture density were measured to determine the role played by the respective compounds in reducing free radical accumulation and intracellular oxidative stress. Genes sodA and fumC are turned on by SoxR and OxyR, respectively, which are the two main regulatory proteins involved in oxidative stress sensing. GFP fluorescence is therefore diminished by successful antioxidants. These constructs are important because they function as alternative screening tools that can be used to assess the activity of compounds with therapeutic potential against oxidative stress. Antioxidants have been shown to play a role in disease prevention.
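One plausible way to combine the two readouts mentioned above (GFP fluorescence and culture density) is to normalize fluorescence by optical density and express each compound's effect as a percent reduction relative to the paraquat-only control. The sketch below shows that arithmetic; the function names and the numbers in the example are hypothetical, and the source does not state that the authors of [72] analyzed their data this way.

```python
def normalized_fluorescence(gfp, od600, blank_gfp=0.0):
    """Blank-corrected GFP signal per unit of culture density."""
    if od600 <= 0:
        raise ValueError("culture density must be positive")
    return (gfp - blank_gfp) / od600

def percent_reduction(stressed, treated):
    """Drop in normalized reporter signal relative to the stressed control."""
    if stressed <= 0:
        raise ValueError("stressed control signal must be positive")
    return 100.0 * (stressed - treated) / stressed

if __name__ == "__main__":
    # Hypothetical plate-reader values (arbitrary fluorescence units, OD600).
    paraquat_only = normalized_fluorescence(gfp=1200.0, od600=0.6)
    with_compound = normalized_fluorescence(gfp=600.0, od600=0.5)
    print(f"{percent_reduction(paraquat_only, with_compound):.1f} % reduction")
```

Normalizing by density guards against mistaking growth inhibition for antioxidant activity, since a compound that simply kills cells would also lower raw fluorescence.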

#### *3.2.3. Detection of calcium ions*

FRET-based and single-domain Ca2+ sensors have been constructed using the allosteric effect of calcium binding to the receptors calmodulin or troponin [73]. In one construct, the CFP/YFP pair is separated by a linker containing a calmodulin domain and a calmodulin ligand peptide called M13. When Ca2+ is present, it binds to the calmodulin domain, inducing a conformational change and binding of the proximal M13 peptide sequence. M13 binding shortens the linker, bringing CFP within FRET distance of YFP and changing the emission from cyan to yellow. The Ca2+ binding affinity was found to be highly variable depending on conditions, around 0.3 µM with a Hill coefficient of n=4. When used *in vivo*, the calmodulin-based biosensors suffered from interference by endogenous host proteins and did not always work [73]. To remedy this, the calmodulin/M13 linker was replaced with troponin C, whose N-to-C distance is shortened by Ca2+ binding, resulting in FRET. Using another strategy, calmodulin and M13 peptide sequences were separated by a circularly permuted EGFP, which was quenched in the absence of Ca2+ but recovered fluorescence upon Ca2+-induced binding of the calmodulin to M13. Improved genetically encoded Ca2+ indicators have been used *in vivo* to trace action potentials in neurons, with response times in the millisecond range [73, 74], becoming competitive with synthetic indicators and recording electrodes.
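The affinity and Hill coefficient quoted above can be turned into a response curve with the Hill equation. The sketch below is a minimal illustration that treats 0.3 µM as the half-saturation concentration and n = 4 as the Hill coefficient; the function name is hypothetical, and real sensors deviate from this idealized curve, as the reported variability suggests.

```python
def hill_fraction(ca_um, k_half_um=0.3, n=4.0):
    """Fraction of sensor in the Ca2+-bound state according to the Hill equation.

    ca_um:     free Ca2+ concentration in micromolar
    k_half_um: concentration giving half-maximal response (from the text)
    n:         Hill coefficient (from the text)
    """
    if ca_um < 0 or k_half_um <= 0:
        raise ValueError("concentrations must be non-negative and k_half positive")
    x = (ca_um / k_half_um) ** n
    return x / (1.0 + x)

if __name__ == "__main__":
    for ca in (0.1, 0.3, 0.6, 1.0):
        print(f"{ca:4.1f} uM Ca2+ -> {hill_fraction(ca):.3f} bound")
```

With n = 4 the response is steep: the sensor moves from mostly unbound to mostly bound over roughly a three-fold change in Ca2+ concentration around the half-saturation point.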

#### **4.** *In vitro* **applications**

GFP has great potential to work as an *in vitro* biosensor. Because of its remarkable stability, it can be used and manipulated in multiple ways to impart sensor functionality to the protein. Several approaches are described here, including creating a chimeric protein with antibody fragments, linking fluorescent proteins to quantum dots, manipulating the amino acid sequence to create analyte pores, as well as sequence manipulation that provides increased halide ion and/or pH sensitivity.

#### **4.1. GFP-antibody chimeric proteins**


The goal of GFP-antibody chimeric proteins (GFPAbs) is to convert a multi-step experimental process for locating molecules via antibodies and enzyme-linked secondary antibodies into a one-step process using a GFPAb. This molecule could then work as a detection reagent in flow cytometry, for intracellular targeting, or in fluorescence-based ELISAs [38]. However, in order to replace antibodies in these techniques, it is important to achieve the same nanomolar sensitivity that is found in natural antibodies. To do this, the authors of [38] inserted two antigen-binding loops into the GFP structure, counting on cooperativity in binding to enhance affinity.

It became clear that adding loops impinges on the integrity of the native GFP structure. The binding loops must be placed such that their presence in the fluorescent protein does not jeopardize its structural fidelity, or that of the chromophore. There are only a few locations in the molecule that are amenable to such insertions: turn regions β4/β5 (residue 102), β7/β8 (residue 172) and β8/β9 (residue 157). The latter two are too far apart in three-dimensional space to provide for cooperative binding (see Figure 5). The β4/β5 and β8/β9 loop regions are in close proximity, but these do not easily accommodate random loop insertions.

In [38], directed evolution with yeast surface display [75] was used to find sequences that stabilized the folded conformation in the context of loop insertions. The yeast secretory pathway does not allow unfolded protein to reach the surface of the cell, so only mutants that yield fully folded GFP were displayed by yeast cells. Directed evolution revealed several mutations that conferred additional stability and increased fluorescence in the context of inserted loops: D19N, F64L, A87T, Y39H, V163A, L221V, and N105T. The F64L mutation has been shown to increase fluorescence of GFP and also to shift the excitation maximum to 488 nm. Y39H and N105T have been shown to improve refolding kinetics and refolding stability, respectively. V163A is linked to improved folding as a result of its increased expression in yeast surface display [38]. These mutations accommodated the insertion of antigen-binding loops from antibodies raised against streptavidin-phycoerythrin, biotin-phycoerythrin, TrkB, or GAPDH, all while maintaining 40% of the fluorescence and 60% of the expression of wild-type GFP. With dual loop insertion, dissociation constants as low as 3.2 nM have been achieved [38]. The success of this construct means that molecules such as GAPDH can be located within cells without having to engineer a second round of antibodies, saving both time and resources.
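To give a feel for what a 3.2 nM dissociation constant means in an assay, the sketch below computes the equilibrium fraction of binder occupied at a given antigen concentration. It assumes simple 1:1 binding with antigen in large excess over the binder, which is a simplification; the function name is hypothetical.

```python
def fraction_bound(antigen_nm, kd_nm=3.2):
    """Equilibrium occupancy for 1:1 binding with antigen in excess.

    antigen_nm: free antigen concentration in nM
    kd_nm:      dissociation constant in nM (3.2 nM is the best value in the text)
    """
    if antigen_nm < 0 or kd_nm <= 0:
        raise ValueError("antigen must be non-negative and Kd positive")
    return antigen_nm / (kd_nm + antigen_nm)

if __name__ == "__main__":
    for conc in (0.32, 3.2, 32.0):
        print(f"{conc:6.2f} nM antigen -> {fraction_bound(conc):.2f} occupied")
```

Under this model, half of the binder is occupied at an antigen concentration equal to the dissociation constant, and about 90% at ten times that concentration.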

#### **4.2. A chimeric fluorescent biosensor based on allostery**

A general method for developing a biosensor for a specific receptor-ligand interaction has been described [76], in which a receptor protein is inserted into the GFP sequence between strand 8 and strand 9. The insertion puts enough strain on GFP that its fluorescence is reduced. Binding of the ligand to the GFP-receptor chimera may then impart enough of a change in its conformation to cause a change in fluorescence, since the β8/β9 loop is fairly close in space to the chromophore. This change may be found by plate screening for fluorescence. In [76], the receptor Bla1 was cloned into the loop, and random mutations were made to this construct. Mutant constructs that detected the Bla1 ligand BLIP were identified by a visual screen of colonies before and after the induced expression of BLIP. Using this method, a double mutant was found that detected BLIP *in vitro* with micromolar affinity. In principle, this method could be used to generate a sensor for any ligand that can be expressed in bacteria or added exogenously, as long as a receptor protein exists that can be inserted into the GFP loop.

#### **4.3. FRET-based biosensors using quantum dots**

FRET-based *in vitro* biosensors may be constructed by linking fluorescent proteins to quantum dots (QDs). QDs are inorganic nanocrystals whose absorption and emission spectra are dictated by the size of the QD. For example, a QD may be engineered to absorb ultraviolet light and emit light at 550 nm, which overlaps well with the excitation spectrum of mCherry, a red fluorescent protein of the GFP-like family [77], and produces FRET when the two fluorophores are in close proximity.

In order to make the FRET emission analyte-dependent, the QD was linked to mCherry via an N-terminal linker peptide that contained a protease cleavage site and a six-histidine tag. The imidazole side chains of the histidines coordinate with the zinc atoms of the CdSe-ZnS core-shell semiconductor of the QD [77]. Multiple mCherry molecules can be coordinated with each QD. Splitting of mCherry from the QD by a protease may be detected by the loss of FRET. By placing the caspase-3 cleavage sequence into the linker between mCherry and the QD, the FRET complex becomes a biosensor for the presence of caspase-3, glowing red at 610 nm in the absence of the protease and reverting to the yellow fluorescence of the QD at 550 nm when the protease is present (Figure 8).

**Figure 8.** QD-FRET, showing emission of the chromophore only when in close proximity to the QD. When the two are split by caspase activity, FRET is lost. Figure used with permission from [78].
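Because several acceptor proteins can sit on one QD, the FRET efficiency depends on how many remain attached. A commonly used approximation for a central donor surrounded by n identical, equidistant acceptors is E = n R0^6 / (n R0^6 + r^6). The sketch below uses it to show how efficiency falls as a protease removes acceptors; the distance, Förster radius, and acceptor counts are illustrative assumptions, not values from [77] or [78].

```python
def qd_fret_efficiency(n_acceptors, r_nm, r0_nm):
    """FRET efficiency for one QD donor with n equidistant identical acceptors.

    n_acceptors: number of fluorescent proteins still attached to the QD
    r_nm:        QD centre to acceptor chromophore distance, nm
    r0_nm:       Forster radius of the QD/acceptor pair, nm
    """
    if n_acceptors < 0 or r_nm <= 0 or r0_nm <= 0:
        raise ValueError("n must be non-negative and distances positive")
    k = n_acceptors * r0_nm ** 6
    return k / (k + r_nm ** 6)

if __name__ == "__main__":
    r, r0 = 6.0, 5.0  # assumed geometry, nm (illustrative)
    for n in (6, 3, 1, 0):  # acceptors left as cleavage proceeds
        print(f"{n} acceptors -> E = {qd_fret_efficiency(n, r, r0):.2f}")
```

With no acceptors left the efficiency is zero and only QD emission remains, which corresponds to the protease-present state described above.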

GFP/QD FRET emission may also be manipulated by pH-induced changes in the spectral overlap, without having to spatially separate the QD from the fluorescent protein. It has been shown that fluorescent proteins such as GFP and mOrange experience a shift in excitation and emission spectra with changes in pH [78]. At a slightly acidic pH, there is very little spectral overlap between the QD emission and the mOrange excitation, which means that the QD emission is seen, in this case around 520 nm. However, as the pH increases, the excitation spectrum of mOrange shifts such that there is more overlap with the QD emission, which causes an increase in FRET. The result is an upward shift in the emission wavelength with increasing pH. It is important to note that, because the hydrogen ion concentration fluctuates, the histidine-QD coordination complex becomes unstable. To remedy this problem, a covalently linked quantum dot must be used.

#### **4.4. Fluorescent proteins as intrinsic ion sensors**


Fluorescent proteins, especially E2 GFP, have been shown to be sensitive not only to pH changes but also to the concentration of certain ions, particularly chloride ions. E2 GFP provides an avenue for single-domain ratiometric analysis of pH because it has two excitation and emission peaks, and only the longer-wavelength emission peak is pH dependent [68]. The pH can therefore be determined from the ratio of green fluorescence to cyan fluorescence. By coupling E2 GFP to another fluorescent protein in a fusion construct, it is also possible to measure the intracellular chloride ion concentration. For example, DsRed is neither pH nor chloride ion sensitive, so it can be used to measure chloride ion concentration based on the ratio of its fluorescence to the cyan emission of E2 GFP.
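The ratiometric readout described above can be sketched in a few lines of Python. This is only an illustration, not the published E2 GFP calibration: the function names, the two-state titration model, and the parameters `r_min`, `r_max` and `pka` are assumptions that would have to be replaced by an experimental calibration curve.

```python
import math

def emission_ratio(green: float, cyan: float) -> float:
    """Ratio of the pH-dependent green signal to the cyan reference signal."""
    if cyan <= 0:
        raise ValueError("cyan reference signal must be positive")
    return green / cyan

def ratio_from_ph(ph: float, r_min: float, r_max: float, pka: float) -> float:
    """Assumed two-state titration curve: the ratio rises from r_min to r_max."""
    return r_min + (r_max - r_min) / (1.0 + 10.0 ** (pka - ph))

def ph_from_ratio(ratio: float, r_min: float, r_max: float, pka: float) -> float:
    """Invert the assumed titration curve to estimate pH from a measured ratio."""
    if not r_min < ratio < r_max:
        raise ValueError("ratio lies outside the calibrated range")
    return pka + math.log10((ratio - r_min) / (r_max - ratio))

# Hypothetical calibration constants and intensities, for illustration only.
R_MIN, R_MAX, PKA = 0.5, 2.5, 7.0
estimated_ph = ph_from_ratio(emission_ratio(300.0, 200.0), R_MIN, R_MAX, PKA)
```

Because both signals come from the same protein, doubling the sensor concentration doubles both intensities and leaves the ratio unchanged, which is the practical advantage of a ratiometric sensor over a single-intensity readout.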

**Figure 9.** The analyte channel through which copper ions can pass to the interior of the barrel structure and quench the fluorescence of the chromophore. Used with permission from [80].

A few modifications can make GFP sensitive to the concentration of other ions. For example, superfolder GFP can be made sensitive to copper ions by mutating the arginine at position 146 to a histidine, which, as previously mentioned, coordinates well with metal ions [79]. GFP can also be made sensitive to ions by creating channels in the structure through which small analytes can pass and access the chromophore (Figure 9). Mutating position 165 from a phenylalanine to a glycine opens a channel that is about 4 Å wide. This allows small analytes such as copper ions to enter the hydrophobic core of the protein and quench fluorescence [80]. GFP, thanks to its stability, has shown a remarkable tolerance for modification, and thus shows great promise for visualizing a large variety of intracellular and extracellular substances.

#### **5. Computationally designed LOO-GFPs**

Recent work in the Bystroff lab has focused on programming GFP to accept any desired protein as a binding partner, like an antibody, and to switch on fluorescence only when the targeted protein is bound. The strategy combines Leave-One-Out split reconstitution with computational design and high throughput screening.

Leave-One-Out (LOO) was described earlier (**"Leave-One-Out" GFP**) as a technique for developing split proteins that spontaneously reconstitute function. Fluorescence is recovered in LOO-GFP when the left-out piece is encountered in the analyte. Because LOO-GFP binds to the left-out segment and fluoresces [39, 40], a promising application is to engineer novel LOO-GFP molecules that recognize and sense desired peptides derived from other sources such as viruses, bacteria and parasites. By modifying the site of one of the eleven β-strands to complement the shape of a given target peptide, the engineered LOO-GFP molecules will report the presence of specific target proteins, and therefore their host organism, through a simple fluorescence readout (Figure 10). LOO-GFP biosensors can be engineered by generating mutations that accommodate the shape and charge of a desired target peptide. The target peptide may be made available for binding by denaturing the target protein.

**Figure 10.** LOO-GFP peptide biosensors. Engineering LOO-GFP molecules to accommodate desired target peptides creates specific sensing tools in which fluorescence is reconstituted upon adding back the left-out peptide, signaling detection.

Theoretically, this goal could be achieved by random mutation followed by high throughput screening to find mutants that glow in the presence of a peptide. However, random mutation would be extremely inefficient. Computational protein design methods offer a much better alternative for rationally generating sequence diversity before the labor-intensive experimental screen.

#### **5.1. Computer-aided protein design**

Computational protein design predicts protein sequences that fold into predefined protein structures. Proteins are described as a set of atoms with 3D spatial coordinates and physical/chemical properties [81-84]. Instead of mutating residues experimentally, mutations are explored *in silico* and selected using a computed goodness of fit (Figure 11). Mutations predicted to cause collisions between atoms, leave unsatisfied hydrogen bonding partners, cause charge-charge repulsion, or employ rare amino acid side chain conformations are down-weighted by assigning them a higher energy value. To facilitate the search for the best mutations, amino acid side chains are discretized into rotational isomers (called rotamers) [85-87]. Protein sequences that preserve the desired functionalities, such as the binding of a ligand, are obtained by searching the space of all side chain rotamers for the minimum free energy. There are a few reviews of the methods used [88].

**Figure 11.** Computational protein design coupled with design library generation [89]. The entire designed sequence space of selected residues is computationally screened to determine the global minimum energy configuration (GMEC) for the given structure. Starting from the GMEC, sequence space is explored to obtain sub-optimal sequences that are also potentially predicted to be functional. A DNA library is constructed to cover all predicted sequences, and candidates are screened experimentally to select clones with desired functions. Information from analyzing obtained mutants is utilized to validate and improve the computational protein design strategy, and provides a better starting model for iterative optimization.
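The rotamer search described in this section can be illustrated with a toy example. The sketch below is not DEEdesign and uses no real force field: the three positions, the (amino acid, rotamer index) choices and all energy values are made up, and the search is a brute-force enumeration that is only feasible for a handful of positions.

```python
from itertools import product

# Hypothetical self energies: position -> {(amino acid, rotamer index): energy}.
SELF_ENERGY = {
    0: {("S", 0): 1.0, ("S", 1): 0.4, ("T", 0): 0.7},
    1: {("L", 0): 0.2, ("I", 0): 0.5},
    2: {("G", 0): 0.3, ("A", 0): 0.1},
}
# Hypothetical pairwise energies; pairs that are not listed contribute zero.
PAIR_ENERGY = {
    ((0, ("T", 0)), (1, ("L", 0))): 2.0,   # a steric clash is penalized
    ((0, ("S", 1)), (1, ("I", 0))): -0.5,  # a favorable contact
    ((1, ("L", 0)), (2, ("A", 0))): -0.3,  # a favorable contact
}

def total_energy(assignment):
    """Sum the self and pairwise energies for one rotamer choice per position."""
    energy = sum(SELF_ENERGY[pos][choice] for pos, choice in enumerate(assignment))
    for i in range(len(assignment)):
        for j in range(i + 1, len(assignment)):
            key = ((i, assignment[i]), (j, assignment[j]))
            energy += PAIR_ENERGY.get(key, 0.0)
    return energy

def global_minimum():
    """Enumerate every combination and return the lowest-energy one (the GMEC)."""
    positions = sorted(SELF_ENERGY)
    candidates = product(*(SELF_ENERGY[p] for p in positions))
    best = min(candidates, key=total_energy)
    return best, total_energy(best)

gmec, gmec_energy = global_minimum()
gmec_sequence = "".join(aa for aa, _ in gmec)
```

The number of combinations grows exponentially with the number of designed positions, which is why practical design programs rely on pruning or stochastic sampling rather than enumeration.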

#### **5.2. Protein biosensors versus other methods for detecting pathogens**

Biosensors for specific proteins and pathogens offer potential advantages over the current state of the art, notably speed and simplicity. Laboratory diagnosis of infections commonly includes pathogen isolation using culture, direct antigen detection, or detection of pathogen-specific DNA and/or RNA by polymerase chain reaction (PCR). The isolation method requires a culture system inoculated with a specimen, followed by the examination of specific characteristics produced by pathogens, such as the cytopathogenic effect of viruses and the distinct metabolism of bacteria. Although culture-based methods have higher detection sensitivity, they generally take three to ten days for diagnosis. Alternatively, immunoassays utilize pathogen-specific antibodies and secondary anti-antibodies to detect and report a pathogen. Most rapid diagnostic tests take only 15 to 30 min, but raising specific antibodies against pathogens is time-consuming and expensive. Third, molecular diagnosis using PCR takes advantage of gene amplification and provides highly sensitive detection from minute amounts of pathogen genome within a short time. However, the need for real-time PCR and gel electrophoresis apparatus and reagents means that PCR will not be possible in all settings where a simple biosensor test would be. PCR also assumes that nucleic acid is present, but some disease agents, such as anthrax toxin, snake venom and the agent of bovine spongiform encephalopathy, contain no genetic material. All of these considerations point to a need for a diagnostic tool for proteins that is fast and easy to use, and suitable for rural, point-of-care facilities in developing nations.

The following describes how the computer-aided design of LOO-GFP was done, and the encouraging but preliminary results. The process has three steps: (1) the selection of a target peptide sequence from the genome of the pathogen, (2) the computational design of the LOO-GFP•target complex, and (3) the experimental screening of a library of potential biosensor sequences.

#### **5.3. Target peptide selection**

A target peptide for detection must be unique in order to avoid false positives, and must be conformable to the LOO-GFP binding site, which is the site of one of the eleven β-strands of GFP. From the examination of GFP and homologous fluorescent protein structures and sequences, we defined a set of signature patterns for each β-strand. These patterns define the limits of mutation. For example, no position within a target peptide may be a proline, since the peptide must be hydrogen bonded on both sides to the neighboring β-strands. Cysteines are also disallowed, for experimental reasons. Target peptides are selected by searching the sequences of the target organism for a match to the signature pattern. Other considerations include the location of protease recognition sites, cellular location, and protein expression levels.
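The search for candidate target peptides can be sketched as a sliding-window scan. The actual signature patterns are not given in the text, so the `SIGNATURE` dictionary below is a hypothetical stand-in, and the flanking residues of `TOY_PROTEIN` are invented so that the example is self-contained. Only the proline and cysteine exclusions come from the description above.

```python
FORBIDDEN = set("PC")  # proline and cysteine are disallowed in a target peptide

# Hypothetical signature: window index -> amino acids allowed at that index.
# Indices that are not listed are unconstrained.
SIGNATURE = {2: {"H"}, 4: {"V", "I", "L"}}

def matches_signature(window, signature):
    """True if the window avoids forbidden residues and fits the signature."""
    if FORBIDDEN & set(window):
        return False
    return all(window[i] in allowed for i, allowed in signature.items())

def find_target_peptides(protein, signature, length=12):
    """Return (start index, peptide) for every window that passes the filters."""
    hits = []
    for start in range(len(protein) - length + 1):
        window = protein[start:start + length]
        if matches_signature(window, signature):
            hits.append((start, window))
    return hits

# Invented toy sequence with the peptide from the case study embedded in it.
TOY_PROTEIN = "MKTPASSHEVSLGVSSAYCQ"
candidates = find_target_peptides(TOY_PROTEIN, SIGNATURE)
```

In practice each hit would still have to be checked for uniqueness, for example with a BLAST search, and for a usable protease site flanking the peptide.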

In the case study described here, a twelve-residue peptide (SSHEVSLGVSSA) was selected from the hemagglutinin (HA) sequence of avian influenza virus H5N1, using the signature pattern of GFP β-strand 7. The target peptide retains the sequence pattern of the wild type β-strand 7, and it can be released by chymotrypsin digestion of the HA protein. A BLAST search of all known protein sequences confirmed that the HA target sequence occurs only in hemagglutinin from influenza virus type A.

#### **5.4. Computational pre-screening of candidate biosensor sequences**

To engineer customized LOO-GFP biosensors that sense a given peptide we developed a software suite called DEEdesign. DEEdesign uses a combination of physical properties and statistical knowledge to energetically evaluate the fitness of rotamers in protein structures, along with sampling algorithms to search the space of all possible mutations. The parameters used in the fitness scoring system are trained by a machine learning technique to reproduce the true side chain conformations in high-resolution crystal structures [90]. Sequence space is searched using one of two methods: Monte Carlo [91], in which random mutations are accepted or rejected based on the calculated energy, or the dead-end elimination theorem (DEE), which holds that if energies can be decomposed into pairwise terms, then a solution to the problem of finding the lowest energy set of mutations can be found by a process of successive elimination [92].
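The idea behind dead-end elimination can be shown on a toy problem. The sketch below applies the simplest form of the criterion: a rotamer is discarded if its best possible energy is still worse than the worst possible energy of some alternative at the same position. It is not the DEEdesign implementation, and the rotamer labels and energy values are made up.

```python
# Hypothetical self energies: position -> {rotamer label: energy}.
SELF_E = {
    0: {"S0": 1.0, "S1": 0.4, "T0": 0.7},
    1: {"L0": 0.2, "I0": 0.5},
    2: {"G0": 0.3, "A0": 0.1},
}
# Hypothetical pair energies keyed (i, r, j, s) with i < j; missing pairs are zero.
PAIR_E = {
    (0, "T0", 1, "L0"): 2.0,
    (0, "S1", 1, "I0"): -0.5,
    (1, "L0", 2, "A0"): -0.3,
}

def pair(pair_e, i, r, j, s):
    """Symmetric lookup of the interaction between rotamer r at i and s at j."""
    key = (i, r, j, s) if i < j else (j, s, i, r)
    return pair_e.get(key, 0.0)

def dee_prune(self_e, pair_e):
    """Repeatedly remove rotamers that cannot be part of the minimum-energy solution."""
    alive = {i: set(rots) for i, rots in self_e.items()}
    changed = True
    while changed:
        changed = False
        for i in alive:
            others = [j for j in alive if j != i]
            # Lowest and highest energy each rotamer could contribute, given survivors.
            best = {r: self_e[i][r] + sum(min(pair(pair_e, i, r, j, s) for s in alive[j])
                                          for j in others) for r in alive[i]}
            worst = {r: self_e[i][r] + sum(max(pair(pair_e, i, r, j, s) for s in alive[j])
                                           for j in others) for r in alive[i]}
            bound = min(worst.values())
            dead = {r for r in alive[i] if best[r] > bound}
            if dead:
                alive[i] -= dead
                changed = True
    return alive

survivors = dee_prune(SELF_E, PAIR_E)
```

In this small example the elimination leaves a single rotamer per position. On realistic problems it usually only shrinks the search space, and the surviving combinations must still be enumerated or sampled, for instance by Monte Carlo.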

However, inaccuracies in design due to the imperfect scoring system, the use of discretized side chains, and the lack of precise modeling of backbone flexibility affect the reliability of the method. Therefore, instead of relying on the accuracy of the single lowest energy protein sequence, DEEdesign provides an ensemble of plausible mutants, all with reasonably low calculated energy scores. These are assembled into a single amino acid profile, from which a library of nucleotide sequences is derived, employing degenerate codons for those positions in the sequence that have more than one possible amino acid.
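Collapsing an ensemble into a per-position profile, and estimating the size of the resulting library, can be sketched as follows. The ensemble shown is hypothetical, and the sketch assumes that each position encodes exactly the amino acids in the profile. A real degenerate codon may encode additional amino acids, which would make the library larger than this estimate.

```python
from math import prod

def amino_acid_profile(ensemble):
    """List the amino acids observed at each position of equal-length sequences."""
    length = len(ensemble[0])
    if any(len(seq) != length for seq in ensemble):
        raise ValueError("all sequences in the ensemble must have the same length")
    return [sorted({seq[i] for seq in ensemble}) for i in range(length)]

def library_complexity(profile):
    """Number of distinct protein sequences encoded by combining all positions."""
    return prod(len(choices) for choices in profile)

# Hypothetical low-energy ensemble of four-residue designs.
ENSEMBLE = ["SLAV", "SIAV", "TLGV"]
profile = amino_acid_profile(ENSEMBLE)
complexity = library_complexity(profile)
```

Three designed sequences give a library of eight here, because the profile combines the choices at each position independently. This is how a modest ensemble can expand into a library of thousands of sequences.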

In our case study, residues 143-154 NSHNVYITADKQ of β-strand 7 were mutated *in silico* to the target peptide sequence SSHEVSLGVSSA from HA. All residues within 7 Å of the target were mutated to all amino acids within the constraints of the evolutionary history of GFP, where the latter was derived from a multiple sequence/structure alignment of 34 fluorescent proteins, augmented by additional homologous sequences. If an amino acid was found at a given position in the evolutionary history of GFP, then that amino acid was allowed in the course of the sequence space search, otherwise it was disallowed. DEE and Monte Carlo were used to search this sequence space, identifying an ensemble of low-energy sequences such that the total complexity of the sequence space of the ensemble was only about ten thousand unique sequences, a number that can be efficiently screened on petri plates. The ensemble of sequences was back-translated to DNA and divided into overlapping degenerate-codon oligonucleotides of 60 bases each by the program DNAWorks [93]. The set of mixed oligos was assembled by PCR into a gene library for screening, using the protocols of gene assembly mutagenesis [94].
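The strand replacement in the case study can be made concrete by listing the substitutions it implies. The two sequences and the residue numbering (143-154) are taken from the text above, while the function name is introduced here only for illustration.

```python
WILD_TYPE_STRAND = "NSHNVYITADKQ"  # GFP β-strand 7, residues 143-154 as quoted above
TARGET_PEPTIDE = "SSHEVSLGVSSA"    # HA-derived target peptide
FIRST_RESIDUE = 143

def substitutions(wild_type, target, first_residue):
    """List point substitutions in the usual notation, e.g. 'N143S'."""
    if len(wild_type) != len(target):
        raise ValueError("sequences must have the same length")
    return [
        f"{old}{first_residue + offset}{new}"
        for offset, (old, new) in enumerate(zip(wild_type, target))
        if old != new
    ]

strand_changes = substitutions(WILD_TYPE_STRAND, TARGET_PEPTIDE, FIRST_RESIDUE)
```

Nine of the twelve strand positions change, and only S144, H145 and V147 are retained. This gives a sense of how much of the surrounding barrel may need to be redesigned to accommodate the new strand.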

#### **5.5. Experimental screening and diversity generation by** *in vitro* **evolution**

The computationally generated library for the H5N1 LOO-GFP biosensor had a complexity of around 10,000 sequences and was relatively easy to screen in a low- to medium-throughput manner by looking for colonies that were fluorescent when the biosensor was co-expressed with its target peptide sequence. We fused the target peptide to an intein [95] so that it would be cleaved immediately after expression and would exist as a free peptide.

However, mutations that are distant from the binding site of the target peptide (i.e., >10 Å away) may still indirectly affect the binding of the target or the folding of LOO-GFP, and they are not easily captured in the computational design process because of time and memory limitations. To expand the screening, candidate mutant genes can be subjected to rounds of *in vitro* evolution, such as error-prone PCR [96] and/or DNA shuffling [24].

We demonstrated the first proof-of-concept for designing LOO-GFP biosensors by combining computational protein design and *in vitro* evolution. DEEdesign was used to create a set of degenerate oligonucleotide primers for gene assembly. DNA shuffling was performed directly on this set of genes to further increase the diversity of the constructed library, since gene assembly mutagenesis does not ensure complete representation of all anticipated sequences [94]. DNA shuffling also introduces random mutations beyond those predicted for the gene variants.

Potential candidates for LOO-GFP biosensors were plate-screened in *E. coli* that co-expressed the biosensor gene library and the HA peptide fused to a carrier, intein. Expression of the peptide and the biosensor library was induced simultaneously, and the intensity of fluorescence was monitored under 488 nm excitation 24 hours after induction at room temperature. Two potential LOO-GFP biosensors, DS1 and DS2, that produced elevated fluorescence intensity in the presence of the HA peptide were found (Figure 12). DS1 and DS2 contained nine and sixteen mutations, respectively; seven of those mutations came from the DEEdesign prediction and the remainder from *in vitro* evolution.

**Figure 12.** Potential LOO-GFP biosensors against HA target peptides of influenza virus. (A) Time course study of fluorescence recovery upon expression of biosensor variants with [+] and without [-] HA peptides. Protein expression was induced with 0.5 mM IPTG at room temperature. Fluorescence was recorded every hour for 4 hours and after 24 hours. All pictures were taken with the same digital camera settings. (B) Multiple sequence alignment of LOO7 and the DS1 and DS2 mutants. Mutations introduced by computational design (green) and *in vitro* evolution (red) in the DS1 and DS2 mutants are shown.

When co-expressed with the HA peptide, the DS1 mutant exhibited target-dependent maturation of the chromophore, while in the absence of the peptide it showed barely detectable fluorescence even after 24 hours, indicating a specific interaction between the DS1 mutant and the HA peptide. The DS2 mutant showed faster recovery of fluorescence, within four hours in the presence of the HA peptide; however, a higher degree of nonspecific auto-fluorescence was also observed after 24 hours. Chromophore formation in the DS1 mutant showed a greater dependence on the left-out peptide (i.e., the HA peptide) than in the DS2 mutant *in vivo*, implying better folding of the designed LOO-GFP molecule and making DS1 the better HA-specific LOO-GFP biosensor.
