**1. Introduction**

Green fluorescent protein (GFP) is a 27 kDa protein consisting of 238 amino acid residues [1]. GFP was first identified in the jellyfish *Aequorea victoria* by Osamu Shimomura *et al.* in 1961 during studies of aequorin, a Ca²⁺-activated photoprotein. Aequorin and GFP are both localized in the light organs of *A. victoria*, and GFP was discovered accidentally when the blue light emitted by aequorin excited GFP to emit green light. Unlike most fluorescent proteins, which contain chromophores distinct from the amino acid sequence of the protein, GFP generates its chromophore internally through a reaction involving three amino acid residues [2]. Because the chromophore is encoded by the protein sequence itself, GFP can be easily cloned into numerous biological systems, both prokaryotic and eukaryotic, which has paved the way for its utilisation in a variety of biological applications, most notably in biosensing.

#### **1.1. The three-dimensional structure**

The molecular structure of GFP was first determined in 1996 using X-ray crystallography [1]. One of the most obvious features of its tertiary structure is a beta barrel composed of 11 mostly antiparallel beta strands. The molecular structure of GFP is illustrated in Figure 1, along with a cartoon representation showing the organization of the secondary structure elements that compose the beta barrel. Each beta strand is 9 to 13 residues in length and hydrogen bonds with adjacent beta strands to create an enclosed structure. The bottom of the barrel contains both termini and two distorted helical crossover segments, and the top has one short crossover and one distorted helical crossover segment. The beta barrel (sometimes referred to as a "beta can" because it contains a central alpha-helical segment) consists of three anti-parallel three-stranded beta-meander units and a two-stranded beta-hairpin (shown in blue, green, yellow, and red, respectively, in Figure 1). The very distorted central alpha helix contains three residues which participate in an autocatalyzed cyclization/oxidation maturation reaction that generates the *p*-hydroxybenzylidene-imidazolidone chromophore. In the unfolded state, the chromophore is non-fluorescent, presumably because water molecules and molecular oxygen can interact with it and quench the fluorescent signal [3]. The closed beta barrel structure is therefore essential for fluorescence, because it shields the chromophore from bulk solvent.

© 2013 Crone et al.; licensee InTech. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

**Figure 1.** Tertiary structure of GFP as determined by X-ray crystallography (PDB code 2B3P). Shown on the bottom is a cartoon depicting the secondary structure elements; all beta strand pairings are anti-parallel except β1 to β6, which is parallel. Numbers indicate the start and end of each secondary structure element.

The interior of the GFP beta barrel is unusually polar. There is an interior cavity filled with four water molecules on one side of the central helix, while the other side contains a cluster of hydrophobic side chains, which is more typical of a protein core. Several polar side chains interact with and stabilize the GFP chromophore. Three of these, His148, Thr203, and Ser205, form hydrogen bonds with the phenolic hydroxyl group of the chromophore. Arg96 and Gln94 interact with the carbonyl group of the imidazolidone ring. Figure 2 depicts these stabilizing hydrogen bonding interactions with the chromophore. Additionally, a number of internal residues interact with and stabilize Arg96, a side chain that is known to be required for the maturation of the chromophore. Specifically, Thr62 and Gln183 form hydrogen bonds with the protonated form of Arg96, stabilizing a buried positive charge within the GFP beta barrel, which in turn stabilizes a partial negative charge on the carbonyl oxygen of the imidazolidone ring.

#### **1.2. Thermodynamic and kinetic properties**

Wild type GFP has a number of characteristics that can complicate its application to biosensing. One is its tendency to aggregate in the cell, especially when expressed at high concentrations. Aggregation is typically caused by exposed hydrophobicity, which may be due to hydrophobic patches on the surface of the protein, to low thermostability, or to slow folding. Surface hydrophobic-to-hydrophilic mutations decrease the aggregation tendency of GFP [4], but some biosensing applications require surface mutations that may increase aggregation. Most likely, GFP's low *in vivo* solubility is due to its extremely slow folding and unfolding kinetics. Refolding of GFP consists of at least two observable phases, depending on the variant and the method used to measure the kinetics. Multi-phase folding kinetics indicates the existence of multiple parallel folding pathways, some fast and some slow, holding out the hope that engineered GFPs could be made to fold faster by favoring the faster folding pathway. Indeed, GFP has been engineered to eliminate the slowest phase of folding, as discussed later in this chapter. For Cycle3, a mutant whose chromophore matures correctly at 37 °C, the rate constants of the kinetic phases range from 10 s⁻¹ to 10⁻² s⁻¹ [5] (folding half-lives on the order of 0.1 s to 100 s). Although it folds slowly, GFP unfolds *extremely* slowly, with a rate of 10⁻⁶ s⁻¹ (t1/2 = 8 days) in 3.0 M GndHCl [6]; when extrapolated to 0 M GndHCl, the theoretical unfolding half-life is on the order of 22 years. Once folded to its native state, GFP is phenomenally kinetically stable.
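The half-lives quoted for these kinetic phases follow from first-order kinetics, where t1/2 = ln 2 / k. The short Python sketch below is illustrative only: the function name `half_life_seconds` is our own, and the rate constants are simply the values cited in the text.

```python
import math

SECONDS_PER_DAY = 86_400


def half_life_seconds(rate_constant_per_s: float) -> float:
    """Half-life of a first-order process, t1/2 = ln(2) / k."""
    if rate_constant_per_s <= 0:
        raise ValueError("rate constant must be positive")
    return math.log(2) / rate_constant_per_s


# Unfolding in 3.0 M GndHCl, k = 1e-6 s^-1 (value quoted in the text)
unfolding_days = half_life_seconds(1e-6) / SECONDS_PER_DAY
print(f"unfolding half-life: {unfolding_days:.1f} days")

# Fastest and slowest Cycle3 folding phases quoted in the text
for k in (10.0, 1e-2):
    print(f"k = {k:g} s^-1 -> t1/2 = {half_life_seconds(k):.2f} s")
```

The unfolding rate reproduces the 8-day figure. The exact folding half-lives come out near 0.07 s and 69 s, so the 0.1 s to 100 s range is best read as an order-of-magnitude statement.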

**Figure 2.** Stereo image of the hydrogen bonding patterns of the internal GFP residues with the chromophore (green), including four crystallographic waters (cyan). Drawn from superfolder GFP, PDB ID 2B3P.

#### **1.3. Maturation of the chromophore**

The chromophore of native GFP is generated by an internal, autocatalytic reaction involving three residues on the interior alpha helix. Cyclization and oxidation of the internal residues Ser65, Tyr66, and Gly67 generate a *p*-hydroxybenzylidene-imidazolidone chromophore that absorbs light maximally at 395 nm and 475 nm [1]. Excitation at either absorption peak results in emission of green light at 508 nm. Interestingly, the side chains of the chromophore triplet 65-SerTyrGly-67 can be mutated to other side chains without loss of function. Tyrosine 66 can be mutated to any aromatic side chain [7]. This allows for the synthesis of numerous GFP variants in which the chromophore structure or its surrounding environment is altered to absorb and emit light at different wavelengths, producing a wide array of fluorescent protein colors [8].

The three-step mechanism for the spontaneous generation of the chromophore consists of cyclization, oxidation, and dehydration [9]. Figure 3 illustrates the mechanism, beginning with the original triplet of amino acids. The slow step in chromophore maturation is the diffusion of molecular oxygen into the active site of the closed beta barrel (step 3). The positioning of side chains surrounding the chromophore is crucial for stabilizing the intermediates in the process of chromophore maturation, especially Arg96, which stabilizes the enolate form of intermediate 1 by forming a salt bridge with the negatively charged oxygen atom, and Glu222, which receives protons from the water molecules to cycle between the protonated and deprotonated states. The two coplanar aromatic rings of the chromophore adopt the *cis* conformation across the Tyr66 alpha-beta carbon double bond. Photobleaching, the light-induced loss of fluorescence, is caused by short-wavelength light that causes the chromophore to isomerize to the *trans* form, accompanied by distortion of its planar geometry and of the surrounding side chain packing [10]. This type of photobleaching appears to be a slowly reversible process for GFP and other fluorescent proteins.

**Figure 3.** Mechanism of the maturation of the GFP chromophore. Steps 1-6 include the cyclization and oxidation steps, while step 7 indicates two possible pathways for the dehydration step. Used with permission from [9].

The two spectral absorbances of the GFP chromophore have been found to be highly sensitive to pH changes [11]. At physiological pH, GFP exhibits maximal absorption at 395 nm while absorbing lesser amounts of light at 475 nm. However, increasing the pH to about 12.0 causes the maximal absorption of light to occur around 475 nm while diminishing the absorption at 395 nm. The two absorption maxima correspond to different protonation states of the chromophore. The pKa of the side chain hydroxyl group of Tyr66 is about 8.1 [12]: the neutral (protonated) chromophore absorbs maximally at 395 nm, while the anionic (deprotonated) form absorbs maximally at 475 nm. At acidic pH below 6 or alkaline pH above 12, fluorescence is diminished as GFP is denatured and the chromophore is quenched.
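The shift between the two absorption bands can be pictured with the Henderson-Hasselbalch relation. The sketch below is a deliberately simplified model, not a description of the measured spectra: it assumes a single titratable site with the pKa of 8.1 quoted above, ignores any coupling to neighbouring residues, and does not model denaturation at the pH extremes. The function name `anionic_fraction` is our own.

```python
def anionic_fraction(ph: float, pka: float = 8.1) -> float:
    """Fraction of chromophore in the deprotonated (anionic) state.

    Henderson-Hasselbalch for a single site: [A-]/([HA] + [A-]).
    """
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


# Population of the anionic (475 nm) and neutral (395 nm) forms versus pH
for ph in (6.0, 7.4, 8.1, 10.0, 12.0):
    frac = anionic_fraction(ph)
    print(f"pH {ph:4.1f}: anionic {frac:6.1%}, neutral {1 - frac:6.1%}")
```

In this model the neutral form dominates below the pKa and the anionic form dominates above it, which matches the qualitative trend described for the 395 nm and 475 nm bands.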

#### **1.4. Wavelength variants and FRET**

Starting with homologous green and red fluorescent proteins, a rainbow of different-colored fluorescent proteins has been developed. Mutating Tyr66 of the GFP chromophore to a tryptophan produces cyan fluorescence, while a histidine mutation produces blue fluorescence. Mutating a threonine on beta strand 10 to a tyrosine introduces a pi-stacking interaction which produces yellow fluorescence. See [3] for more details. At the other end of the color spectrum, the coral-derived DsRed fluorescent protein, a structural homolog of GFP, was diversified into the mFruits library, producing eight fluorescent proteins with emission maxima ranging from 537 to 610 nm [13]. Far-red fluorescent proteins, which have potential for use in deep tissue imaging due to the tissue penetration of these wavelengths, have been discovered [14-16], while others have been developed in the lab [17] and even using computational approaches [18]. Further enhancement of these wavelength-shifted variants has improved their biophysical properties and made them available to more applications.

GFP and its derivatives have seen significant use as fluorescent pairs for Förster Resonance Energy Transfer (FRET) experiments. FRET can occur when the emission spectrum of one chromophore (the donor) overlaps with the excitation spectrum of another chromophore (the acceptor). If the two chromophores are physically close (on the order of a few nanometers) and in the correct orientation, then excitation of the first chromophore will excite the second chromophore through non-radiative energy transfer and produce fluorescence at the second chromophore's emission wavelength (Figure 4). This phenomenon can be used to detect when two fluorescent proteins (FPs) are within a certain distance, which may be induced by a ligand-dependent conformational change in a linking domain between the two fluorescent proteins, or by binding of interacting domains fused to fluorescent proteins. The canonical pairing for FRET using fluorescent proteins is cyan fluorescent protein (CFP) and yellow fluorescent protein (YFP) [19], but this pairing has issues concerning overlapping emission spectra, stability to photobleaching, and sensitivity to the chemical environment. The study in [20] had the goal of producing a cyan fluorescent protein more suitable for use in FRET experiments. Other pairings, such as GFP and the DsRed-based red fluorescent protein variant mCherry, have been proposed as consistent, reliable alternatives [21]. A full review of the development and usage of fluorescent proteins as tools for FRET can be found in [22]. The genetic and physical ease of use of GFP-derived fluorescent proteins, in conjunction with their wide range of colors and spectral overlaps, makes them ideal molecules for the design of FRET-based biosensors.

**Figure 4.** Illustration of the FRET phenomenon using the traditional CFP/YFP donor/acceptor pairing. a) If the two fluorescent moieties are too far apart, excitation of the donor molecule only produces observable emission from the donor. b) When in range, excitation of the donor is propagated to the acceptor molecule through non-radiative energy transfer, and emission from the acceptor is observed.
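The steep distance dependence that makes FRET useful as a proximity sensor is captured by the standard Förster expression, E = 1 / (1 + (r / R0)^6), where R0 is the donor-acceptor distance at which half of the excitation energy is transferred. The sketch below evaluates this expression; the function name and the value of R0 (5 nm) are assumptions chosen for illustration, not measured values for the CFP/YFP pair or any other pair discussed here.

```python
def fret_efficiency(distance_nm: float, forster_radius_nm: float) -> float:
    """Energy-transfer efficiency E = 1 / (1 + (r / R0)^6)."""
    if distance_nm <= 0 or forster_radius_nm <= 0:
        raise ValueError("distances must be positive")
    return 1.0 / (1.0 + (distance_nm / forster_radius_nm) ** 6)


R0_NM = 5.0  # assumed, illustrative Forster radius; not a measured value

# Efficiency falls from near 1 to near 0 over a few nanometers around R0
for r in (2.5, 5.0, 7.5, 10.0):
    print(f"r = {r:4.1f} nm -> E = {fret_efficiency(r, R0_NM):.3f}")
```

The spectral-overlap and orientation requirements mentioned in the text are folded into R0, so a change in either shifts the distance range over which a given pair reports.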

#### **1.5. Mutants with improved features**

Because of the aforementioned slow folding, low solubility and slow chromophore maturation, a significant effort has been put forth to improve these properties in GFP. The strategies range from specific, directed rational mutations based on structural and biophysical information to fully randomized approaches such as error-prone PCR [23] and DNA shuffling [24]. By mutating the chromophore residue serine 65 to a threonine (S65T) and phenylalanine 64 to a leucine (F64L), an "enhanced" GFP (EGFP, gi:27372525) was produced with the excitation maximum shifted from ultraviolet to blue and with better folding efficiency in *E. coli* [25]. Blue excitation is favorable because it matches the wavelengths of laser light used in modern cell sorting machines. Three rounds of DNA shuffling produced a mutant of GFP termed "cycle3" or GFPuv (gi:1490533), which contains three point mutations at or near the surface of the protein (F100S, M154T, V164A). This mutant has 16- to 18-fold brighter fluorescence than wild type GFP, attributed to a reduction of surface hydrophobicity and, consequently, of the *in vivo* aggregation that prevents chromophore maturation [6]. Combining these sets of mutations produces a "folding reporter" GFP (gi:83754214) which is monomeric and highly fluorescent [26], but does not fold and fluoresce strongly when fused to poorly folding proteins. Four rounds of DNA shuffling starting with this GFP variant produced a mutant with six additional mutations, called "superfolder" GFP (gi:391871871), which can fold even when fused to a poorly folding protein [27]. Superfolder GFP also showed increased resistance to chemical denaturation and faster refolding kinetics. This GFP variant also has exceptional tolerance to circular permutation compared to the "folding reporter" mutant of GFP (circular permutation is discussed in Section 1.6, Sequential rearrangements and truncations). A common theme emerges from these sets of mutations: a reduction in surface hydrophobicity leads to reduced aggregation tendency, which increases the fraction of chromophore able to mature and, consequently, the brightness of the protein *in vivo*. The hydrophobic surface of wild type GFP is hypothesized to serve as a binding site for aequorin in the jellyfish [4].

Mutating surface polar residues to increase the net charge, called "supercharging", may be one solution to the problem of aggregation. Starting from the knowledge that the net surface charge does not often affect protein folding or activity, the study in [28] demonstrated that mutating the surface residues either to majority positive or to majority negative side chains does not significantly affect fluorescence. Furthermore, these "supercharged" variants of GFP showed increased resistance to both thermally and chemically induced aggregation with a minimal decrease in thermal stability. The only side effects are the unwanted binding of positively supercharged GFP to DNA, and the formation of a fluorescent precipitate when oppositely supercharged variants are mixed.

Disulfide bonds have been known to confer additional stability to proteins. Two externally placed disulfides were engineered into cycle3 GFP, one predicted to have no effect on stability, the other predicted to have a stabilizing effect [29]. The predictions, based on estimations of local disorder, were correct. Adding a disulfide where the chain is more disordered improved stability the most.

In recent, unpublished work in our lab [30], a faster-folding GFP has been made by eliminating a conserved *cis*-peptide bond. The slowest phase of folding of superfolder GFP has been known to be related to *cis*/*trans* isomerization of a peptide bond preceding a proline [5]. We targeted Pro89 for mutation, since the peptide bond is *cis* at that position in the crystal structure, but modeling studies suggested that a simple point mutation would not work. Instead, we added two residues to create a longer loop, and then selected new side chains for four residues based on modeling. The new variant, called "all-trans" or AT-GFP, folds faster and lacks the slow phase. A 2.7 Å crystal structure, in progress, shows clearly that the backbone is indeed composed of all-*trans* peptide bonds in the new loop region.

All of the variants discussed so far are derived from *Aequorea* GFP, but homologous fluorescent proteins from other species have also played a role in advancing the science. Rational design of a homologous GFP from the marine arthropod *Pontellina plumata* resulted in "TurboGFP", which folds and matures much faster than EGFP and shows reduced *in vitro* aggregation relative to its parent protein [31]. TurboGFP and its parent protein lack *cis*-peptide bonds, which are known to contribute to the slow phase of GFP folding [5]. The crystal structure of TurboGFP reveals a pore leading to the chromophore, which mutagenesis shows to be a key component of fast maturation [31]. This makes sense, since the diffusion of molecular oxygen into the core is the rate-limiting step in chromophore formation. This result represents the first successful designed improvement to a non-Cnidarian fluorescent protein. Random directed mutagenesis of beta strands 7 and 8 in the cyan fluorescent protein derivative mCerulean produced a mutant with six mutations and a T65S reversion mutation in the chromophore. This construct, termed mCerulean3, has an increased quantum yield and demonstrates minimal photobleaching and photoswitching effects, making it a better FRET donor molecule [20].

A novel fluorescent protein was developed using the consensus engineering approach, synthesizing a consensus sequence gene from 31 homologs of the monomeric Azami green protein, a distant homolog of *Aequorea* GFP. The resulting protein, CGP (consensus green protein), has comparable expression to the parent protein with increased brightness and slightly decreased stability [32]. A novel directed evolution process was then carried out on CGP to stabilize it by inserting destabilizing loops into the protein, then evolving it to tolerate the insertions, then removing the destabilizing loops. After three rounds of this process, a mutant called eCGP123 demonstrated exceptional thermal stability compared to CGP and the parent Azami green protein [33]. Distantly related fluorescent proteins have contributed much to the structural and biophysical understanding and application of the larger family.

#### **1.6. Sequential rearrangements and truncations**

Circular permutation is the repositioning of the N- and C-termini of the protein to different regions of the sequence, connecting the original termini with a flexible peptide linker to produce a continuous, shuffled polypeptide. Many proteins retain their structure and function after permutation, provided the permutation site is not disruptive to secondary structural elements. This process demonstrates the tolerance of the protein's overall structure to significant rearrangements of primary sequence [34], enabling the design of biosensors based on split GFPs as discussed later.
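At the sequence level, circular permutation amounts to a rotation of the chain with a linker spliced in where the original termini meet. The sketch below illustrates this on a placeholder string; the function name `circular_permute`, the toy sequence, and the lowercase linker are all hypothetical and are not taken from any published construct.

```python
def circular_permute(sequence: str, new_start: int, linker: str = "") -> str:
    """Return a circular permutant that begins at 1-based position new_start.

    The original C-terminus is joined to the original N-terminus through
    `linker`, and the chain is reopened just before `new_start`.
    """
    if not 1 <= new_start <= len(sequence):
        raise ValueError("new_start must lie within the sequence")
    cut = new_start - 1
    return sequence[cut:] + linker + sequence[:cut]


toy = "ABCDEFGHIJ"  # placeholder sequence, not GFP
linker = "ggs"      # hypothetical linker, lowercase so it stands out

print(circular_permute(toy, 6, linker))  # FGHIJggsABCDE
```

The permutant has the same residues in the same cyclic order; only the positions of the chain ends change, which is why the fold can survive when the new termini fall in a tolerant region.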

GFP's rigid structure, extreme stability and unique post-translational chromophore formation reaction do not suggest that it would tolerate circular permutation, and for the most part, it does not. None of the permutations that disrupt beta strands form the chromophore, and about half of the permutations in loop regions cannot form the chromophore. However, one particular permutation, starting the protein at position 145 (just before beta strand 7), expresses and fluoresces well, although it is less stable and less bright than the wild type GFP [34]. This circular permutant can also tolerate protein fusions to its new termini (positions 145 and 144 in wild type numbering), and position 145 in the wild type can accept a full protein insertion, such as calmodulin or a zinc finger binding domain [35]. The "superfolder" GFP reported in [27] was able to fluoresce after 13 of the 14 possible circular permutations, whereas the folding-reporter GFP only tolerated 3 of those 14 permutations. Figure 5 summarizes permutation and loop insertion results.

**Figure 5.** (a) The wild type GFP and (b) rewired GFP topology, drawn using the TOPS conventions [37]. Solid lines are connections at the top, dashed lines at the bottom of the barrel. (c) Green dots mark locations of the termini in viable circular permutants. Orange dots mark places where long insertions have been made [38]. Green arrows mark beta strands that can be left out and added back to reconstitute fluorescence. Red lines are connections created in rGFP3, rewired GFP [36]. Topological changes and truncations are the least tolerated in the N-terminal 6 beta strands.

Circular rearrangements preserve the overall "ordering" of the secondary structural elements; however, non-circular rearrangement of the secondary structural elements is also possible. Using rational computational modeling and knowledge about GFP's folding pathway, the authors of [36] designed a "rewired" GFP with the same fluorescence properties and stability as a variant of superfolder GFP, but with the beta strands connected in a different order. These experiments demonstrate the selective robustness of GFP's structure to large-scale rearrangements in sequence, which has implications for deciphering the GFP folding pathway, as well as for the design of split-GFP biosensors.

#### **1.7. "Leave-One-Out" GFP**

GFP's rigid structure, extreme stability and unique post-translational chromophore forma‐ tion reaction do not seem to suggest that it would tolerate circular permutation, and for the most part, it does not. All permutations that disrupt beta strands do not form the chromo‐ phore, and about half of the permutations in loop regions cannot form the chromophore. However, one particular permutation, starting the protein at position 145 (just before beta strand 7) expresses and fluoresces well, although it is less stable and less bright than the wild type GFP [34]. This circular permutation can also tolerate protein fusions to its new ter‐ mini (positions 145 and 144 in wild type numbering), and position 145 in the wild type can accept a full protein insertion, such as calmodulin or a zinc finger binding domain [35]. The "superfolder" GFP reported in [27] was able to fluoresce after 13 of the 14 possible circular permutations, whereas the folding-reporter GFP only tolerated 3 of those 14 permuta‐

**Figure 5.** a) The wild type GFP, and (b) rewired GFP topology as drawn using the TOPS conventions [37]. Solid lines are connections at the top, dashed lines at the bottom of the barrel. (c) Green dots mark locations of the termini in viable circular permutants. Orange dots mark places where long insertions have been made [38] Green arrows mark beta strands that can be left out and added back to reconstitute fluorescence. Red lines are connections created in rGFP3, rewired GFP [36]. Topological changes and truncations are the least tolerated in the N-terminal 6 beta strands.

Circular rearrangements preserve the overall "ordering" of the secondary structural ele‐ ments; however, non-circular rearrangement of the secondary structural elements is also possible. Using rational computational modeling and knowledge about GFP's folding path‐ way, [36] designed a "rewired" GFP with identical fluorescence properties and stability as a variant of superfolder GFP, but with the beta strands connected in a different order. These experiments demonstrate the selective robustness of GFP's structure to large-scale rear‐ rangements in sequence, which has implications for deciphering the GFP folding pathway,

as well as for design of split-GFP biosensors.

tions.Figure 5summarizes permutation and loop insertion results.


GFP can also be engineered to omit one of its secondary structural elements, either at one end of the sequence or, by truncating a circular permutant, in the middle. Truncation may be accomplished either at the genetic level or at the protein level, the latter by using proteolysis and gel filtration. Constructs missing one secondary structure element have been named "Leave-One-Out" or LOO, borrowing the term from a method for statistical cross-validation. When synthesized directly via the genetic approach, LOO-GFPs are non-fluorescent or weakly fluorescent. However, if co-expressed with the omitted piece, fluorescence sometimes develops *in vivo*, depending on which of the secondary structure elements was left out [39,40]. Expressing the full-length GFP and removing the beta strand by proteolysis, denaturation and gel filtration produces similar results [41]. A complete beta barrel is necessary for chromophore maturation. Once the chromophore has matured, LOO-GFP develops fluorescence rapidly upon introduction of the omitted beta strand from an external source.
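The sequence-level operations described above can be sketched in a few lines of Python. This is a minimal illustration only, not a construct design tool: the function names are hypothetical, the toy sequence stands in for the real GFP sequence, and the linker that real circular permutants usually need between the original termini is deliberately left out.

```python
from typing import Tuple


def circular_permute(seq: str, new_start: int) -> str:
    """Rewrite seq so that the 1-based residue `new_start` comes first.

    Assumption: the original termini are joined directly (no linker).
    """
    if not 1 <= new_start <= len(seq):
        raise ValueError("new_start must be a valid 1-based position")
    i = new_start - 1
    return seq[i:] + seq[:i]


def leave_one_out(seq: str, start: int, end: int) -> Tuple[str, str]:
    """Split seq into (truncated protein, omitted segment).

    The segment spans 1-based residues start..end inclusive and must lie at
    one terminus; an internal element has to be moved to a terminus by
    circular permutation before it can be truncated.
    """
    if not 1 <= start <= end <= len(seq):
        raise ValueError("start and end must satisfy 1 <= start <= end <= len(seq)")
    if start != 1 and end != len(seq):
        raise ValueError("segment is internal; apply circular_permute first")
    return seq[:start - 1] + seq[end:], seq[start - 1:end]


# Toy example: omit the internal segment "DEF" (residues 4 to 6).
toy = "ABCDEFGHIJ"
permuted = circular_permute(toy, 4)           # segment is now N-terminal
truncated, omitted = leave_one_out(permuted, 1, 3)
print(permuted, truncated, omitted)
```

In this toy case the omitted segment plays the role of the peptide that would be supplied separately to reconstitute fluorescence.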

That Leave-One-Out works is non-intuitive. In general, protein folding is an all-or-none process and leaving out any whole secondary structure element leads to an unfolded protein which aggregates in the cell. Yet, [40] has shown that it is possible to reconstitute LOO-GFP after truncation at several positions in the sequence. The key to understanding why LOO is sometimes possible is in the protein folding pathway. Although folding appears to be an all-or-none process by most experimental metrics, it proceeds along a loosely defined sequence of nucleation and condensation events called a folding pathway [42]. If the sequence segment that is removed is in the part of the protein that folds last, then a kinetic intermediate exists whose structure closely resembles the native state with one piece removed. This intermediate need not be the lowest energy state and may not be visible by equilibrium measurements, but its minute presence diminishes the energetic barrier of folding enough that the addition of a peptide can push the protein to the folded state. In short, Leave-One-Out uses the idea that some cyclically permuted, truncated proteins are natural sensors of the part left out.

*In vivo* solubility experiments performed on twelve LOO-GFPs (individually omitting each of the 11 beta strands and the alpha helix) showed that tolerance to the removal of a particular secondary structural element (SSE), as measured by solubility, differs significantly from one element to another. The variability is best explained in terms of the order of folding of the SSEs. SSEs that are required for the early steps in folding leave a more completely unfolded polypeptide behind when they are left out. SSEs that fold late are not required for most of the folding pathway, so omitting them leaves behind a mostly folded protein, which is more soluble. Leave-One-Out solubility analysis thus provides a unique insight into the folding pathway of GFP [40]. Omitting strand 7 (LOO7-GFP) appears to be the least detrimental to the overall structure of GFP, suggesting that strand 7 folds last. Binding kinetics data for LOO7-GFP and its missing beta strand, supplied as a synthetic peptide, give a Kd value of roughly 0.5 µM [11]. Surprisingly, when the central alpha helix is omitted by circular permutation and proteolysis, it can be reintroduced as a synthetic peptide to the "hollow" GFP barrel, whereupon chromophore maturation proceeds and produces fluorescence [41]. However, refolding from the denatured state was required.
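The dissociation constant reported above for LOO7-GFP and its missing strand can be read through a simple one-site binding model. The sketch below is an illustration only and is not the analysis used in [11]. It assumes a single binding site at equilibrium and a peptide in large excess over the protein, so that the free peptide concentration is approximately the total added. The function name is hypothetical, and concentrations are expressed in the same (arbitrary) unit as Kd.

```python
def fraction_bound(peptide_conc: float, kd: float) -> float:
    """Equilibrium fraction of LOO protein occupied by peptide.

    One-site model: bound / total = [P] / (Kd + [P]).
    Assumes free peptide is approximately equal to total peptide.
    """
    if peptide_conc < 0:
        raise ValueError("peptide_conc must be non-negative")
    if kd <= 0:
        raise ValueError("kd must be positive")
    return peptide_conc / (kd + peptide_conc)


# Concentrations given as multiples of Kd (Kd set to 1 in its own unit).
for multiple in (0.1, 1.0, 9.0):
    print(multiple, round(fraction_bound(multiple, 1.0), 3))
```

Under these assumptions half of the protein is occupied when the peptide concentration equals Kd, and about 90% is occupied at nine times Kd, which gives a rough sense of the concentration range over which such a sensor responds.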

Some LOO-GFPs also show interesting reactions to ambient light. LOO11-GFP (beta strand 11 omitted) does not bind strand 11 when kept completely in the dark, but does bind it upon irradiation with light [43]. Raman spectroscopy showed that, in the dark, the chromophore assumes a *trans* conformation, and that light induces a switch to the native *cis* conformation. After irradiation, the chromophore relaxes back to the *trans* conformation. Following up on this result, [44] used a circularly permuted LOO10-GFP construct (beta strand 10 omitted) together with two synthetic forms of strand 10, the wild-type strand and a strand carrying a mutation that causes yellow-shifted fluorescence, and showed that light irradiation increased the frequency of "peptide exchange" between the two forms. The presence of this peptide exchange suggests that the *cis*/*trans* isomerization of the chromophore requires partial unfolding of the protein.
