**3. Applications**

352 Current Trends in X-Ray Crystallography

followed the same protocol as Chothia and Lesk (Chothia & Lesk, 1986), but considering only membrane proteins. Using the LGA server (http://proteinmodel.org/), we aligned the structures and sequences of the cores of all the membrane proteins with known three-dimensional structure and produced the graph in Figure 1. The structural divergence between two evolutionarily related proteins is measured as their Root Mean Square Deviation (RMSD).

Fig. 1. RMSD versus Percent Sequence Identity of membrane proteins. The core of all the membrane proteins found in the Membrane Proteins with Known Structure Database (http://blanco.biomol.uci.edu/mpstruc/listAll/list) was aligned at the structural level and at the sequence level.

Figure 1 shows the RMSD of the backbone of the core of pairs of evolutionarily related proteins as a function of the percent identity between their amino acid sequences. The definition of the "core" of a structure differs between methods. It can be intuitively seen as the internal, closely packed, evolutionarily conserved part of the structure that contains most of the repetitive secondary structure elements (Tramontano, 2006). For practical purposes, we considered as the core of a protein all the amino acids present in secondary structure elements and those regions not diverging by more than 3 Å, as Chothia and Lesk did (Chothia & Lesk, 1986).

As stated before, Comparative Modeling is based on the idea that evolutionarily related proteins share similar three-dimensional structures. That is, if we want to predict the structure of a protein, we can search a database for an evolutionarily related protein of known structure and use the latter as a template for building the structural model of our protein of interest (Tramontano, 2006). Importantly, Figure 1 indicates that the procedure is also valid for membrane proteins. In this regard, the remarkable improvements in membrane protein crystallography, together with comparative modeling techniques, will allow the characterization of a large number of membrane proteins in the near future.
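The two quantities plotted in Figure 1, RMSD and percent sequence identity, can be illustrated with a minimal Python sketch. It assumes that the two sets of core backbone coordinates have already been superimposed and paired by a structural alignment tool, and that the two sequences are already aligned. The function names `rmsd` and `percent_identity` are illustrative only and are not part of the LGA server or of the protocol described here. Percent identity is computed over aligned, non-gap columns, which is one of several conventions in use.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD (in the units of the input, e.g. Å) between two equally long
    lists of (x, y, z) points that are assumed to be already superimposed."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared_sum = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared_sum / len(coords_a))


def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of identical residues over the columns of a pairwise
    alignment in which neither sequence has a gap (assumed convention)."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b) if a != gap and b != gap]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


# Toy data, not taken from any real structure.
toy_rmsd = rmsd([(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
                [(0.0, 0.0, 0.0), (1.0, 0.0, 2.0)])
toy_identity = percent_identity("ACD-F", "ACEGF")
```

Each pair of related proteins contributes one point to a plot such as Figure 1, with the percent identity on one axis and the core RMSD on the other.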

In the following sections we present two cases in which homology modelling has been applied to membrane proteins, namely ion channels and G-protein-coupled receptors. Both cases are emblematic of the difficulties that have hampered structure determination and modelling of membrane proteins for many years. Fortunately, the last few years have seen an explosion of newly solved crystal structures that has revolutionised the field. As a result, several mechanisms have been understood, and functional features could be extended to several members of each family.

In both cases we review the advances in X-ray crystallography and how we have combined the recently solved crystal structures with homology modelling and molecular biology experiments to characterize functional mechanisms.

### **3.1 Ion channels**

Ion channels are integral membrane proteins that function as molecular sensors of physical and chemical stimuli and convert these stimuli into biological signals vital for every living organism. In other words, ion channels are the doors and windows of the cell: they open and close in response to precise stimuli and allow the entrance and exit of very accurately selected 'visitors'. As molecular transducers of mechanical, electrical, chemical, thermal or electromagnetic (light) stimuli, ion channels contribute to changes in electrical, chemical or osmotic activity within cells by gating between the two basic conformations in which they exist, open and closed. Through gating, ion channels regulate the permeation of ions (and in some cases other solutes) across the hydrophobic core of the cell membrane, and thereby affect cellular activity. Because of the well-known difficulties in obtaining high-resolution 3D structures of ion channels by X-ray crystallography, alternative strategies based on computational biology tools are currently used to investigate their biophysical properties (for a review of ion channel modelling see Giorgetti & Carloni, 2003).

The last two decades have been exceptionally exciting for research in the field of ion channels. Remarkable progress has resulted from the use of multidisciplinary approaches to gain insight into the structure and function of ion channels and their role in various aspects of cell physiology and signal transduction. Molecular biology and genetics have provided the sequences of a very large number of ion channel proteins and have helped identify their contribution to various cellular functions. The patch clamp technique has provided the means to study the functional properties of single ion channels with unprecedented precision. X-ray and electron crystallography have provided structural snapshots of a number of ion channel molecules at near-atomic resolution, whereas magnetic resonance spectroscopy and fluorescence spectroscopy have provided means to access the dynamics of these molecules. At the time of writing, there are 14 unique ion channel structures (http://blanco.biomol.uci.edu/mpstruc/listAll/list), for a total of about 45 crystal structures of ion channels solved in different activation states and co-crystallized with different ligands and ions. Moreover, more than two thirds of the solved structures were obtained in the last two years, showing how rapidly the field is developing.

Using the structural and functional information obtained by these experimental techniques, computer-assisted molecular modeling has brought ion channels to life by allowing the exploration of the features underlying the molecular events that shape their function. Indeed, the multidisciplinary approach to the study of ion channels has yielded an unprecedented wealth of new data.

The next section illustrates examples of applications to a specific ion channel.

#### **3.1.1 CNG channels**

Cyclic nucleotide-gated (CNG) channels are ion channels expressed in several sensory and non-sensory cells (Kaupp & Seifert, 2002; Matulef & Zagotta, 2003). The best-characterized members of the family are those involved in sensory transduction in vertebrate photoreceptors and in olfactory sensory neurons (Kaupp & Seifert, 2002; Matulef & Zagotta, 2003). In their native forms, CNG channels are heterotetramers (Kaupp & Seifert, 2002; Matulef & Zagotta, 2003). CNG channels are differentially sensitive to cyclic nucleotides (CNs): CNG channels from vertebrate rod photoreceptors are selectively activated by cGMP binding, whereas CNG channels from olfactory neurons are activated by either cAMP or cGMP binding.

Fig. 2. Left panel: Topology of CNG channels. The plot shows the topologically relevant elements of CNG channels analyzed in the main text. The P-helix is depicted in blue, while the S6 helix and the C-linker are shown in yellow and red, respectively. The C-linker domain connects the transmembrane domain with the cyclic nucleotide binding domain. S4 is the positively charged voltage sensor of the channel. The proposed conformational changes involve the C-linker, the S6 transmembrane helix and the filter P-helix. Right panel: Three-dimensional configuration of the tetrameric filter region of CNG channels, as obtained through the modeling procedure.

Several groups worked for years accumulating electrophysiological data for mutant and wild-type CNG channels in different activation states, in order to characterize the biophysical properties underlying the functioning of these channels. In particular, most of the available experimental information comes from electrophysiological investigations of mutant channels performed on the CNGA1 channel from bovine rods (Craven & Zagotta, 2006; Kaupp & Seifert, 2002; Matulef & Zagotta, 2003), a channel with a primary structure of 690 residues (Kaupp et al., 1989). Although accessibility analysis based on Cysteine Scanning Mutagenesis (CSM) (Akabas et al., 1992; Karlin & Akabas, 1998) has shown that CNG and potassium channels share the same gross topology (Becchetti et al., 1999; Flynn & Zagotta, 2001; Giorgetti et al., 2005; Liu & Siegelbaum, 2000; Matulef & Zagotta, 2003; Mazzolini et al., 2009; Nair et al., 2009), these two families of channels have different functional properties. Indeed, while voltage-gated potassium channels are extremely selective (MacKinnon, 2003) and their gating strongly depends on membrane voltage (Bezanilla, 2005; Bezanilla, 2008; Swartz, 2004; Swartz, 2008), CNG channels have a low ion selectivity and their gating is only weakly voltage dependent (Kaupp & Seifert, 2002).

Here, we review some recent results of an extensive combined experimental/computational structural study of the widely characterized homotetrameric cyclic nucleotide-gated (CNG) channel from *bovine rod*. As stated before, CNG channels are tetrameric, and each subunit consists of two domains arranged as shown in Figure 2: (i) a transmembrane domain formed by six transmembrane helices (S1–S6) and a pore helix (P-helix), with the same topology as voltage-gated potassium channels (Becchetti et al., 1999; Sesti et al., 1995); (ii) a cytoplasmic domain formed by the cyclic nucleotide binding domain (CNBD), which is linked to the transmembrane domain through the so-called C-linker region. The pore, non-selective between sodium and potassium, is believed to gate via a conformational change of the S6 transmembrane helix (TMH) initiated by the binding of cyclic nucleotides to the binding domains. This conformational change is then transmitted to the pore via coupling with the four P-helices (Johnson & Zagotta, 2001; Matulef et al., 1999). During the last five years, we and our collaborators have provided a molecular basis for characterizing the mechanisms underlying the functioning of these channels by constructing homology models of the transmembrane region of the CNGA1 channel. The models include the S6 helix and the P-helix-loop (P-helix plus pore wall, or filter), along with the N-terminal section of the C-linker. All the modeled regions have been extensively characterized by a large amount of experimental data.

Models of the P-helix-loop and S6 are based on the KcsA X-ray structure, whose topology has been suggested to be similar to that of CNG channels (Becchetti et al., 1999). The C-linker domain, on the other hand, was modeled using the C-linker of the *mouse* Hyperpolarization-activated and Cyclic-nucleotide-modulated (mHCN) channel in its ligand-bound state, for which the X-ray structure has been recently solved (Zagotta et al., 2003).

The homology models were then refined by including an extensive dataset of spatial constraints inferred from electrophysiological measurements on cysteine mutants. A large set (about 50) of structural constraints among Cα atoms was inferred from measurements of the electrophysiological properties of the channel in the presence of metal ions (Becchetti et al., 1999; Becchetti & Roncaglia, 2000; Flynn & Zagotta, 2001; Johnson & Zagotta, 2001; Liu & Siegelbaum, 2000; Mazzolini et al., 2010; Nair et al., 2009). For example: (i) cadmium can block the channel when it binds to at least two cysteine residues; (ii) the mild oxidizing agent copper phenanthroline (CuP) favors disulfide bridge formation between two cysteines separated by a distance of 6 to 11 Å. The electrophysiological results were then converted into distance constraints by a statistical analysis of the PDB. Specifically, we searched for all the proteins co-crystallized with cadmium atoms and extracted the mean distances for cysteines bound to the cadmium atoms. Then, taking into account that the channels are homotetramers, the reversible or irreversible character of the cadmium blockage was converted into distance restraints by purely geometrical considerations. These and other agents were included in the solutions to characterize different features of the channel mechanisms (Mazzolini et al., 2010; Nair et al., 2009). The procedure followed by us and by our experimental collaborators consisted of a series of iterative steps that extended over more than three years of successive cycles of modeling followed by experiments and vice versa.
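The geometrical reasoning behind such restraints can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure actually used for the CNGA1 models: it assumes an ideal fourfold (C4) symmetric homotetramer, in which symmetry-equivalent atoms lying at a distance r from the pore axis are separated by r·√2 on adjacent subunits and by 2r on opposite subunits, and it checks Cα–Cα distances in a model against lower and upper bounds. The 6 to 11 Å window is the CuP range quoted in the text; the residue number, the coordinates, the tighter second window and the function names are all hypothetical.

```python
import math

# CuP cross-linking window quoted in the text, in Å.
CUP_WINDOW = (6.0, 11.0)


def intersubunit_distances(radius):
    """Distances between symmetry-equivalent atoms in an ideal C4 tetramer.

    `radius` is the distance (Å) of the atom from the fourfold pore axis.
    Returns (distance to adjacent subunit, distance to opposite subunit).
    """
    return math.sqrt(2.0) * radius, 2.0 * radius


def restraint_violations(ca_coords, restraints):
    """Return the restraints whose Cα-Cα distance lies outside [lower, upper].

    ca_coords maps (chain, residue_number) to an (x, y, z) tuple;
    restraints is a list of (atom_i, atom_j, lower, upper) tuples.
    """
    violations = []
    for atom_i, atom_j, lower, upper in restraints:
        d = math.dist(ca_coords[atom_i], ca_coords[atom_j])
        if not lower <= d <= upper:
            violations.append((atom_i, atom_j, round(d, 2)))
    return violations


# Hypothetical example: one residue (the number 100 is made up) placed
# 5 Å from the pore axis on each of the four subunits A to D.
ca_coords = {
    ("A", 100): (5.0, 0.0, 0.0),
    ("B", 100): (0.0, 5.0, 0.0),
    ("C", 100): (-5.0, 0.0, 0.0),
    ("D", 100): (0.0, -5.0, 0.0),
}
restraints = [
    (("A", 100), ("B", 100), *CUP_WINDOW),  # adjacent subunits
    (("A", 100), ("C", 100), 6.0, 9.0),     # made-up tighter window
]
violations = restraint_violations(ca_coords, restraints)
```

In an iterative modeling and experiment cycle, a list of violated restraints of this kind would point to the regions of the model that need to be revised before the next round.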


This experiment-guided computational model allowed us to gain insights into the structural basis of the CNG channel gating mechanism. We have suggested several mechanical features underlying channel functioning. For example, we have hypothesized a bending and a counterclockwise rotation of the N-terminal section of the C-linker. This motion is suggested to be transmitted upwards, causing the upper part of S6 to rotate counterclockwise and producing the conformational changes needed for the opening and closing of the filter region (Giorgetti et al., 2005). The procedure that allowed us to unravel the gating mechanism was an iterative theoretical/experimental effort that permitted not only the generation of hypotheses but also their experimental validation. Indeed, on the basis of our models, and using cysteine-scanning mutagenesis, our experimental collaborators were also able to show that, in the presence of the mild oxidizing agent copper phenanthroline (CuP), certain cysteine mutations locked the channel in either the closed or the open state, depending on the state the channel was in at the time of CuP application (Nair et al., 2006).

This kind of work has very few precedents, owing to the general difficulties in expressing different mutants. Indeed, some of the suggested mutants carried double or triple mutations in a single protein.

### **3.2 GPCRs**

G protein-coupled receptors (GPCRs), or seven-transmembrane-helix receptors (Figure 3), are membrane-embedded proteins responsible for the communication between the cell and its environment (Sakmar et al., 2002). Malfunction of these receptors is involved in many major diseases, which makes GPCRs one of the most exploited targets of the pharmaceutical industry (Schertler, 1998). About 5500 GPCR sequences are publicly available. The total number of GPCRs with and without introns in the human genome has been estimated to be approximately 950, of which 500 are odorant or taste receptors and 450 are receptors for endogenous ligands (Takeda et al., 2002). Binding constants are available for approximately 30000 ligand–receptor combinations (Horn et al., 1998).

This wealth of sequences, ligands, and mutations is in stark contrast with the small amount of structural information available. Indeed, until a few years ago rhodopsin was the only structurally characterized GPCR, and it is still considered a prototypical member of the superfamily. The first X-ray structures of rhodopsin reflected the dark-adapted ground state of the bovine receptor, captured in different crystals at resolutions of 2.8 Å (Palczewski et al., 2000), 2.65 Å (Li et al., 2004), 2.2 Å (Okada et al., 2004), 3.4 Å (Standfuss et al., 2007) and 4.15 Å (Salom et al., 2006). Very recently, the structure of a GPCR in its empty state, opsin, has been solved (Park et al., 2008), opening an exciting new era in the study of GPCRs. This new structure followed another very recent key event, the crystallization of a second GPCR, the beta2-adrenergic receptor (Bokoch et al., 2010; Cherezov et al., 2007; Hanson et al., 2008; Rasmussen et al., 2007).

Nearly all medicines are discovered by trial and error. Nevertheless, most pharmaceutical companies have large research departments that use every imaginable technique to design drugs. Homology modelling, as a tool to obtain structural information, is one of those techniques. In the past, bacteriorhodopsin (Henderson & Schertler, 1990; Luecke et al., 1998; Pebay-Peyroula et al., 1997; Takeda et al., 1998) was often used as a modelling template, but since 2000 the three-dimensional coordinates of bovine rhodopsin (Palczewski et al., 2000) have been available. Over the years, bovine rhodopsin has been shown to be a much better template for GPCR homology modelling than bacteriorhodopsin.


Fig. 3. Upper panel: Topology of GPCRs. TM, EL and IL denote transmembrane helices, extracellular loops and intracellular loops, respectively. Lower panel: Three-dimensional arrangement of a GPCR: model of the TAS2R38 receptor.

As can be seen in Figure 3, GPCRs are structurally characterized by an extracellular N-terminus, followed by seven transmembrane (7-TM) α-helices (TM-1 to TM-7) connected by three intracellular (IL-1 to IL-3) and three extracellular loops (EL-1 to EL-3), and finally an intracellular C-terminus. The GPCR arranges itself into a tertiary structure resembling a barrel, with the seven transmembrane helices forming a cavity within the plasma membrane that serves as a ligand-binding domain and is often covered by EL-2.

Rhodopsin has been for several years an extraordinarily valuable system for understanding the structure and mechanism of activation of G-protein-coupled receptors (GPCRs). Rhodopsin is highly specialized for the detection of light, and exhibits functional and biochemical characteristics that differentiate it from GPCRs expressed in other tissues, such as those specialized in detecting diffusible hormones and neurotransmitters. Crystal structures have recently also been determined for the human β2 adrenoreceptor (β2AR) (Bokoch et al., 2010; Cherezov et al., 2007; Hanson et al., 2008; Rasmussen et al., 2007), a receptor for adrenalin and noradrenalin that is involved in the regulation of cardiovascular and pulmonary function by the sympathetic nervous system. β2AR was the first non-rhodopsin GPCR to be cloned and is one of the most extensively studied members of this family (Lefkowitz, 2000). These structures provide a long-awaited opportunity to compare and contrast with rhodopsin.

Knowledge Based Membrane Protein Structure Prediction:

From X-Ray Crystallography to Bioinformatics and Back to Molecular Biology 359

The structures of the β2AR provide exciting new insights into the mechanisms of activation in several ways. In addition, a large body of mutagenesis, biophysical and computational data exists for the β2AR and closely related receptors. The new structures provide a structural scaffold that may allow these studies to be interpreted and validated or rejected, and testable hypotheses to be generated for future studies.

One of the most important achievements of crystallography in the last few years was to provide the community with a way of comparing rhodopsin against other members of the superfamily with different functions. Indeed, rhodopsin evolved for the efficient detection of light: it is present in only one organ and serves only one purpose. In the dark it has almost no activity toward its corresponding G protein, but just one photon can photoisomerize its covalently bound ligand, retinal, changing it from an inverse agonist to a full agonist. By contrast, the β2AR, like many other GPCRs, has a broader range of signalling behaviour: it couples to more than one G protein and to G-protein-independent pathways, and responds to a large spectrum of cognate molecules (Lefkowitz & Shenoy, 2005). Comparing the structures can help in gaining insights into the structural basis for these functional differences. In rhodopsin, the binding pocket, specialized for the covalent binding of retinal, is covered by a β-sheet lid formed by the second extracellular loop (ECL2) and a small domain formed by the N terminus, which protect *cis*-retinal from hydrolysis. This structure would limit access for diffusible agonists and is not present in the β2AR. By contrast, in the β2AR ECL2 forms a helix that is constrained by two disulfide bonds such that there is open access to the ligand-binding pocket.

Very recently, the structures of two further GPCRs were solved, i.e. the human adenosine A2A receptor (Jaakola et al., 2008; Lebon et al., 2011; Xu et al., 2011) and the turkey β1 adrenergic receptor, β1AR (Moukhametzianov et al., 2011; Warne et al., 2008; Warne et al., 2011), also giving valuable information that allowed the generalization of several structural/functional features conserved across the families. Indeed, all the available structural information, along with molecular biology experiments and functional assays combined with extensive computational calculations, was applied by us and collaborators to unravel the binding site and the gating mechanisms of a very particular family of GPCRs, i.e. the bitter taste receptors (Biarnés et al., 2010).

#### **3.2.1 Bitter taste receptors**

Mammals have been prevented, during evolution, from ingesting toxic compounds because of their strong bitter taste (Behrens & Meyerhof, 2009; Meyerhof, 2005; Mueller et al., 2005; Soranzo et al., 2005). This protection mechanism has been carried out for millions of years by a family of about 30 bitter taste receptors (TAS2Rs) expressed in taste receptor cells (Adler et al., 2000; Behrens et al., 2007; Chandrashekar et al., 2000; Matsunami et al., 2000; Shi & Zhang, 2006). TAS2Rs were shown to belong to the GPCR superfamily, despite their low sequence identity with, for example, rhodopsin (Adler et al., 2000; Chandrashekar et al., 2000; Matsunami et al., 2000). The binding of a bitter compound to its cognate TAS2R triggers a downstream cascade of events inside the cell, typical of GPCR signaling pathways (Chandrashekar et al., 2006), leading to the production of an electrical signal, i.e. bitter taste perception (Behrens & Meyerhof, 2009). Thus, although natural selection has relaxed its constraints in this sense, i.e. we are not walking around tasting plants that we do not know, bitter taste is becoming more and more important in matters of food taste, cuisine and pharmaceutics. A complete characterization of the events giving rise to taste perception is therefore needed. One of the most interesting bitter taste receptors, also from the evolutionary point of view, is the human receptor for phenylthiocarbamide (PTC) and propylthiouracil (PROP), i.e. the TAS2R38 receptor. Indeed, among the polymorphisms present in this receptor, the most pronounced ones affect its perception of PROP and PTC. In fact, differences in the perception of PROP and PTC have divided the human population into tasters and non-tasters. Although these astonishing results were reported in the early nineties, the polymorphisms in the hTAS2R38 gene underlying the observed phenotype were identified only recently by Kim and coworkers (Kim et al., 2003). Indeed, the taster/non-taster quality originates in three non-synonymous hTAS2R38 polymorphisms, i.e. the haplotypes code for either the amino acids PAV (P49, A262, V296), constituting the taster variant of hTAS2R38, or AVI in the corresponding positions for the non-taster variant (Bufe et al., 2005; Kim et al., 2003). Up to now, the molecular/structural basis of bitter taste sensing has been analyzed by very few studies; those on hTAS2R16 and hTAS2R38 relied on computations only (Floriano et al., 2006; Miguet et al., 2006).
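The relation between the three polymorphic positions and the taster phenotype can be written down as a small lookup. The following is only an illustrative sketch: the function names and the dictionary-based input are assumptions made for this example, and haplotypes other than the two common ones (PAV and AVI) are deliberately left unassigned rather than guessed.

```python
# Illustrative sketch (not part of the cited studies): label an hTAS2R38
# haplotype from the residues found at positions 49, 262 and 296.
POSITIONS = (49, 262, 296)  # 1-based residue numbers of the polymorphic sites

def haplotype(residues):
    """Return the three-letter haplotype string, e.g. 'PAV'.

    `residues` maps a 1-based position to a one-letter amino acid code.
    """
    return "".join(residues[p] for p in POSITIONS)

def classify(residues):
    """Label the two common haplotypes; anything else stays unassigned."""
    labels = {"PAV": "taster", "AVI": "non-taster"}
    return labels.get(haplotype(residues), "unassigned")

if __name__ == "__main__":
    print(classify({49: "P", 262: "A", 296: "V"}))  # PAV variant
    print(classify({49: "A", 262: "V", 296: "I"}))  # AVI variant
```

Keeping the positions in a single tuple makes the haplotype string follow the same order as the names PAV and AVI.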

In addition, three experimentally guided structure-activity studies are now available, all of which addressed hTAS2Rs distantly related to hTAS2R38 (Brockhoff et al., 2010; Pronin et al., 2004; Sakurai et al., 2010). First-principles (Floriano et al., 2006) and homology modeling approaches based on bovine rhodopsin (Miguet et al., 2006) have been used to predict the structure of the widely studied bitter taste receptor hTAS2R38 (Bufe et al., 2005; Khafizov et al., 2007; Kim et al., 2003; Kleinau et al., 2007). Both works agree that more computational refinement and/or experimental validation is needed.

Very recently (Biarnés et al., 2010) we used a combined, iterative experimental/computational approach aimed at identifying the hTAS2R38 residues involved in binding to one of its main agonists, i.e. PTC, as well as in receptor activation. We used state-of-the-art bioinformatics approaches based on multiple sequence alignment across the whole family of GPCRs, combined with structural bioinformatics tools. We also used homology modeling techniques, because sequences on their own were not likely to be sufficient to identify residues in the binding site, as ligand pockets vary largely in position and orientation across this family (Jaakola et al., 2008). Furthermore, extensive virtual docking experiments were carried out to predict the putative binding cavities for PTC. In fact, homology modeling and molecular docking have been shown to guide the design of site-directed mutagenesis experiments satisfactorily, in spite of the limited power of the structural predictions (Ballesteros & Weinstein, 1992). Indeed, the proposed receptor positions were then studied and validated or rejected by site-directed mutagenesis experiments and by measurements of receptor activation, recording intracellular calcium levels following agonist administration.
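One common way to turn a docked pose into a list of candidate positions for mutagenesis is a simple distance criterion: keep every residue that has at least one atom within a cutoff of any ligand atom. The sketch below illustrates that generic idea only. It is not the protocol used in the cited work, and the 4.0 Å cutoff, the function name and the toy coordinates are all assumptions made for the example.

```python
import math

def residues_near_ligand(receptor_atoms, ligand_atoms, cutoff=4.0):
    """Return sorted residue numbers with any atom within `cutoff` Å of the ligand.

    receptor_atoms: iterable of (residue_number, (x, y, z)) tuples
    ligand_atoms:   iterable of (x, y, z) tuples
    The cutoff value is an assumption, not taken from the text.
    """
    ligand_atoms = list(ligand_atoms)
    hits = set()
    for resnum, xyz in receptor_atoms:
        # A residue is selected as soon as one of its atoms is close enough.
        if any(math.dist(xyz, lig) <= cutoff for lig in ligand_atoms):
            hits.add(resnum)
    return sorted(hits)

if __name__ == "__main__":
    # Toy coordinates in Å; residue numbers 10-12 are placeholders.
    receptor = [
        (10, (0.0, 0.0, 0.0)),
        (10, (1.5, 0.0, 0.0)),
        (11, (10.0, 0.0, 0.0)),
        (12, (0.0, 3.0, 0.0)),
    ]
    ligand = [(0.0, 0.0, 3.0), (0.0, 1.0, 3.0)]
    print(residues_near_ligand(receptor, ligand))
```

In practice the coordinates would come from the homology model and the docked ligand pose, and the resulting list would only be a starting point for the experimental validation described above.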

We have thus proposed that hTAS2R38 activation upon PTC binding is reminiscent of the transition of the G-protein/opsin complex to free rhodopsin (Scheerer et al., 2008). Indeed, we were able to identify some of the residues directly involved in the interaction with the ligand, those that define the shape of the binding cavity and, more importantly, the residues participating in receptor activation. In our model, TMs 5, 6 and 7 change conformation upon ligand binding; in particular, TM6 tilts around the helical bundle upon G-protein binding. Similar sequences of events have also been suggested to play a role in the activation of all GPCRs (Altenbach et al., 2008).

Becchetti, A. et al. (1999) Cyclic nucleotide-gated channels. Pore topology studied through the accessibility of reporter cysteines. *J. Gen. Physiol*, 114, 377-392.

Becchetti, A. and Roncaglia, P. (2000) Cyclic nucleotide-gated channels: Intra- and extracellular accessibility to Cd2+ of substituted cysteine residues within the P-loop. *Pflugers Archiv European Journal of Physiology*, 440, 556-565.

Behrens, M. and Meyerhof, W. (2009) Mammalian bitter taste perception. *Results Probl Cell Differ*, 47, 203-220.

Behrens, M. et al. (2007) Gustatory expression pattern of the human TAS2R bitter receptor gene family reveals a heterogenous population of bitter responsive taste receptor cells. *J. Neurosci*, 27, 12630-12640.

Bezanilla, F. (2005) Voltage-gated ion channels. *IEEE Trans Nanobioscience*, 4, 34-48.

Bezanilla, F. (2008) How membrane proteins sense voltage. *Nat. Rev. Mol. Cell Biol*, 9, 323-332.

Biarnés, X. et al. (2010) Insights into the binding of Phenyltiocarbamide (PTC) agonist to its target human TAS2R38 bitter receptor. *PLoS ONE*, 5, e12394.

Bokoch, M.P. et al. (2010) Ligand-specific regulation of the extracellular surface of a G-protein-coupled receptor. *Nature*, 463, 108-112.

Brockhoff, A. et al. (2010) Structural requirements of bitter taste receptor activation. *Proc. Natl. Acad. Sci. U.S.A*, 107, 11110-11115.

Bufe, B. et al. (2005) The molecular basis of individual differences in phenylthiocarbamide and propylthiouracil bitterness perception. *Curr. Biol*, 15, 322-327.

Chandrashekar, J. et al. (2000) T2Rs function as bitter taste receptors. *Cell*, 100, 703-711.

Chandrashekar, J. et al. (2006) The receptors and cells for mammalian taste. *Nature*, 444, 288-294.

Cherezov, V. et al. (2007) High-resolution crystal structure of an engineered human beta2-adrenergic G protein-coupled receptor. *Science*, 318, 1258-1265.

Chothia, C. and Lesk, A.M. (1986) The relation between the divergence of sequence and structure in proteins. *EMBO J*, 5, 823-826.

Craven, K.B. and Zagotta, W.N. (2006) CNG and HCN channels: two peas, one pod. *Annu. Rev. Physiol*, 68, 375-401.

Floriano, W.B. et al. (2006) Modeling the human PTC bitter-taste receptor interactions with bitter tastants. *J Mol Model*, 12, 931-941.

Flynn, G.E. and Zagotta, W.N. (2001) Conformational changes in S6 coupled to the opening of cyclic nucleotide-gated channels. *Neuron*, 30, 689-698.

Giorgetti, A. and Carloni, P. (2003) Molecular modeling of ion channels: structural predictions. *Curr Opin Chem Biol*, 7, 150-156.

Giorgetti, A., Nair, A.V., et al. (2005) Structural basis of gating of CNG channels. *FEBS Lett*, 579, 1968-1972.

Giorgetti, A., Raimondo, D., et al. (2005) Evaluating the usefulness of protein structure models for molecular replacement. *Bioinformatics*, 21 Suppl 2, ii72-76.

Hanson, M.A. et al. (2008) A specific cholesterol binding site is established by the 2.8 A structure of the human beta2-adrenergic receptor. *Structure*, 16, 897-905.

Henderson, R. and Schertler, G.F. (1990) The structure of bacteriorhodopsin and its relevance to the visual opsins and other seven-helix G-protein coupled receptors. *Philosophical transactions of the Royal Society of London. Series B: Biological sciences*, 326, 379-389.

Horn, F. et al. (1998) GPCRDB: An information system for G protein-coupled receptors. *Nucleic Acids Research*, 26, 275-279.
