258 Drug Discovery

**2.5. Enzymatic mechanisms of palmitoylation**

The physical and chemical mechanisms that result in enzymatic palmitoylation have yet to be defined clearly, but some progress has been made using purified proteins. It has been established that mutation of the cysteine in the DHHC motif of all PATs studied to date abolishes autoacylation of the PAT and palmitoylation of the substrate [23, 56, 62, 72]. This literature, as well as discussion of potential physical mechanisms for the reaction, has been reviewed recently [3, 73].

**3. Palmitoyl-Cysteine prediction**

Prior to the discovery of PATs, attempts were made to define stretches of amino acids that were preferred for palmitoylation. Palmitoylation near the N-terminus, following myristoylation, is among the predictable places for palmitoylation to occur provided there are one or more nearby cysteines. Navarro-Lérida et al. (2002) fused a myristoylation motif (MGCTLS) to GFP with a short intervening sequence containing cysteines at various locations. These authors found a preference for cysteine palmitoylation at positions 3, 9, 15 and (to a much lesser degree) 21 residues away from the N-terminal methionine, but intervening residues were not evaluated. Commonalities in the composition of amino acid residues surrounding palmitoylated cysteines have been noted among members of the family of yeast amino acid permeases [74].

As more palmitoylated proteins and specific palmitoyl-cysteines are discovered, the task of predicting which adjacent amino acids provide a favorable environment for palmitoylation becomes easier. Algorithms trained with data from identified palmitoyl-cysteines and adjacent amino acid residues are now able to provide predictions of the statistical likelihood that a cysteine of interest may be palmitoylated [75-78]. CSS-Palm 2.0, which was designed to predict potential palmitoylation sites, has been published [75]. The algorithm was trained to recognize potential palmitoyl-cysteines using a dataset of 263 experimentally determined palmitoylation sites from 109 distinct proteins. Interestingly, CSS-Palm 2.0 also successfully predicted most (~75%) of the same novel palmitoyl-cysteines in yeast proteins previously identified by Roth et al. [74], as well as palmitoyl-cysteines predicted by Roth et al. to be palmitoylated but not experimentally determined. This rate of success in both cases suggests that CSS-Palm 2.0 is more conservative at calling a site, potentially resulting in a greater rate of false negative results, but is reasonably accurate nonetheless. This algorithm should prove useful when prioritizing which cysteine(s), often among multiple potential cysteines of a candidate palmitoyl protein, to analyze experimentally.

Patterns of amino acid residues surrounding palmitoyl-cysteines have emerged from these analyses. A diagram of favored residues generated by an early version of CSS-Palm 2.0 (NBA-Palm) [76] shows that leucines and additional cysteines are more commonly observed around palmitoyl-cysteines. The subsequent versions of NBA-Palm used significantly improved predictive tests, but the rough sequence of preferred residues remains. An important aspect that cannot yet be considered when attempting to predict cysteine palmitoylation

**3.1. The physical properties of cysteines and thioester bonds**

The unique physical and biochemical nature of the thioester bond that links palmitate to cysteine residues is the basis for the design of many recent assays for palmitoylation. The cysteine residue is among the most nucleophilic entities in a cell [79] and is the most common site of palmitoylation. Other residues can be modified by palmitate, but their occurrence is relatively rare and the bond chemistries are different [2, 80-83]. Palmitoylation can also occur in other ways, for example, on an amine of an N-terminal cysteine as is the case with Hedgehog [2, 83, 84], a secreted signaling protein. An example of palmitate modifying the weaker –OH nucleophile of threonine occurs on the carboxyl terminus of a spider toxin [81]. The ε-amino group of lysine can also be modified by palmitate linked by an amide bond. This occurs in several secreted proteins including a bacterial toxin [80].

The reactivity of the thiolate anion of cysteine residues makes it a key component in the structure and function of many proteins by stabilizing higher order structures via disulfide bridges and post-translational modifications like nitrosylation, prenylation, and acylation [85-87]. The high degree of reactivity has also provided a well-characterized, indispensable target for modification by synthetic, thiol-reactive ligands, allowing capture and characterization of proteins [88]. An exceptionally useful application of such thiol-specific chemistry is isotope-coded affinity tags (ICAT) for mass spectrometric determination of relative protein or peptide abundance among two or more samples [89-91]. With these probes, changes in abundance of identified proteins or peptides are determined by changes in the ratio of heavy- to light-isotope-modified peptides from mixed samples. Combining ICAT technology with functional genomics methods like siRNA-mediated PAT-gene knockdown is one of several approaches that will allow us to identify substrates of PATs [37].
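The heavy-to-light ratio comparison described above can be illustrated with a minimal Python sketch. Everything specific here is an assumption for illustration only: the function names, the peptide names, the intensity values, and the two-fold cutoff are hypothetical and are not taken from the cited ICAT studies.

```python
import math


def icat_log2_ratio(heavy: float, light: float) -> float:
    """Return log2(heavy / light) for one isotope-labeled peptide pair."""
    if heavy <= 0 or light <= 0:
        raise ValueError("intensities must be positive")
    return math.log2(heavy / light)


def flag_candidates(peptides, min_abs_log2=1.0):
    """Return names of peptides whose |log2(H/L)| meets the cutoff.

    `peptides` is an iterable of (name, heavy_intensity, light_intensity).
    The default cutoff of 1.0 (a two-fold change) is an arbitrary choice.
    """
    flagged = []
    for name, heavy, light in peptides:
        if abs(icat_log2_ratio(heavy, light)) >= min_abs_log2:
            flagged.append(name)
    return flagged


# Hypothetical intensities (arbitrary units).
# Assumed labeling: heavy = PAT knockdown sample, light = control sample.
example = [
    ("peptide_A", 500.0, 2000.0),   # four-fold lower in the knockdown
    ("peptide_B", 1000.0, 1100.0),  # essentially unchanged
]

print(flag_candidates(example))
```

In this toy input only `peptide_A` passes the cutoff. In a real experiment the cutoff would be chosen from replicate variability rather than fixed in advance.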

In healthy cells the cytoplasm is generally a reducing environment, meaning that solvent-exposed cysteine side chains are not typically in disulfides and are thus available to engage in reactions with other molecules [92]. The reactivity of a free cysteine depends on its pKa, which is a function of the local environment surrounding the residue within the context of the whole protein. Unlike other residues with nucleophilic side chains (–OH or –NH2), thiol side chains undergo conjugation, redox, and exchange reactions [85]. Conjugation reactions (in addition to fatty acylation) include S-nitrosylation by nitric oxide (NO) and modification by reactive oxygen species (ROS) and reactive nitrogen species (RNS), forming bonds that are not susceptible to cleavage by hydroxylamine at neutral pH. Hydroxylamine is a reagent used to selectively remove thioester-linked palmitate [93]. Importantly, we know that hydroxylamine does not perturb disulfides [94], and that it cleaves thioesters efficiently and quantitatively [95].
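The dependence of cysteine reactivity on pKa can be made concrete with the Henderson-Hasselbalch relation, which gives the fraction of a thiol present as the reactive thiolate anion at a given pH. The sketch below is illustrative only: the function name and the example pKa values are assumptions chosen to show the trend, not measurements from the cited work.

```python
def thiolate_fraction(pka: float, ph: float = 7.4) -> float:
    """Fraction of a thiol in the deprotonated (thiolate) form.

    Derived from Henderson-Hasselbalch:
        fraction = 1 / (1 + 10 ** (pKa - pH))
    """
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


# Illustrative pKa values only: a lower pKa (for example, a cysteine in a
# local environment that stabilizes the anion) gives a larger thiolate
# fraction at the same pH.
for pka in (6.4, 7.4, 8.4):
    print(f"pKa {pka:.1f}: thiolate fraction at pH 7.4 = "
          f"{thiolate_fraction(pka):.3f}")
```

A cysteine whose pKa equals the surrounding pH is half deprotonated, and each pKa unit above the pH lowers the thiolate fraction roughly ten-fold. This is one way to see why the local protein environment matters so much for reactivity.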

In addition to the linkage of palmitate to cysteines, another thioester bond that occurs in cells is the transient association between ubiquitin and the E1, E2, and certain E3 ubiquitination enzymes [87, 96]. However, these thioester bonds are easily distinguished from the thioester bond that links palmitate to cysteines by their pKa; the pKa in the case of palmitoylation is near neutral pH (~7.4), whereas for the thioester in the ubiquitin system it is pH 10.5 or greater. This wide differential allows for a high degree of selectivity when using hydroxylamine to cleave palmitate from proteins. Based on the physical characteristics of ubiquitin-related cysteines, it is highly unlikely that they are ever in a position to be palmitoylated [97, 98].

Retinoic acid (RA) and RA-CoA have also been shown to be enzymatically attached to cysteines via a thioester bond that can be cleaved by hydroxylamine and reducing reagents such as βME at neutral pH. The reaction can be inhibited, but not fully, by myristate and palmitate, suggesting that RA competes for the same cysteines as palmitate [99-107]. There is some debate in the RA field about how it binds to proteins, particularly the nuclear RA receptors, to carry out its signaling functions. RA binding to a hydrophobic cleft is the favored mechanism; however, there are many effects of RA (e.g. [108, 109]) that are independent of RA-receptor binding, suggesting that cysteine modification may also have a place in the molecular mechanism of RA action.

**3.2. Mass spectrometric identification of acyl groups that modify cysteines via a thioester bond**

Lipid-modified thiols have been successfully identified using MALDI-TOF (matrix-assisted laser desorption ionization time-of-flight) mass spectrometry [110]. Using this method, direct information on the nature of the endogenous lipids on proteins or peptides (revealing interesting variability) can be obtained, whereas most other methods rely on surrogate markers for palmitate including thiol-reactive probes or radiolabeled palmitate. Using MALDI-TOF mass spectrometry, Marilyn Resh and colleagues found that the cysteine in the N-terminal Met-Gly-Cys of Src family kinases and two cysteines near the N-terminus of GAP43 are modified not only by palmitate but also (and to a lesser degree) by palmitoleate, stearate, or oleate [7, 8]. While palmitate appears to be the most common acyl group that forms a thioester bond to modify internal, cytoplasmic cysteines, it is clearly not the only one. The 16-carbon palmitate acyl group represents the longest chain synthesized by mammalian fatty acid synthase and is apparently the most abundant chain length present in some tissue types [111]. This relatively greater abundance may underlie the dominance of palmitate as the main acyl group to modify free thiols by S-acylation. The functional implications of incorporating lipids with shorter or longer acyl chains, and especially those with different degrees of saturation, may be that the proteins have different affinities for various lipid microdomains present in membranes. The specificity of PATs for chain lengths different than 16 carbons has not been rigorously defined. However, it is known that acyl groups with differing carbon chain lengths and degrees of saturation can also be incorporated [7, 112].

**3.3. PAT/Substrate recognition**

Determining the nature of PAT/substrate recognition remains one of the more important tasks to be undertaken. This is especially true for PATs encoded by genes that have been linked to disease. There are two general approaches to defining PAT/substrate relationships:

**1.** identification of the PAT with specificity for a known palmitoyl protein and

**2.** identification of an unknown substrate of an individual PAT.

The first of these has been the most common. With this forward approach, each one of the 23 PATs is independently co-overexpressed with a known palmitoyl protein; the cells expressing the pair are metabolically labeled with ³H-palmitate and the proteins analyzed by SDS-PAGE and fluorography. The incorporation of ³H-palmitate onto the substrate protein in one or more of the co-overexpressions at a level significantly above background suggests that a particular PAT is responsible for palmitoylating that known substrate. Similarly, the presumptive PAT and substrate proteins can be purified and combined with ³H-palmitoyl-CoA in a tube, allowed to react, and the incorporation of ³H-palmitate measured as above.

The current level of understanding of PAT/substrate recognition makes it unreasonable to assume that the more closely two PATs are related by sequence homology, the more likely they should palmitoylate a particular substrate. For this reason, assigning substrate status of a protein to a single PAT among a select group of tested, more closely-homologous PATs, to the exclusion of others because they are less homologous, may lead to erroneous exclusions. Similarly, we cannot yet assume that homology among residues surrounding palmitoyl cysteines of different proteins is an indication that they are palmitoylated by a particular PAT. The mechanism for molecular recognition is likely to be defined in part by the higher order structure (even quaternary as is the case with ERF2p and AKR1p) of the PATs and substrates. The reverse approach, defining unknown substrates of a single PAT, can occur without these same biases, as has been demonstrated in yeast and in human cells [37, 74].

**4. Novel methods to discover and identify PAT/substrate specificity**

The chemistry supporting novel assays to study palmitoylation and the reagents that are being incorporated into them have, for the most part, been known and available for years [88]. Most of the methods that are now being developed to study palmitoylation capitalize on many years of knowledge and development of cysteine-specific chemistries, developed mainly as methods to purify and/or specifically target proteins and peptides with various reagents. Many of the reagents that specifically label cysteines have been created as both affinity and fluorescent tags, the former for purification and structure determinations [88] and the latter as cellular reporters of protein abundance, subcellular distribution, protein conformation changes, the formation of the Golgi, and even the concentration of cellular analytes in specific subcellular domains. The following references provide a short list of some of the most clever uses of thiol chemistry [113-119]. Given the wealth of information on the unique chemistry of the palmitoyl thioester bond and the tools for capturing and characterizing cysteines in proteins, it is somewhat surprising that we are only now developing innovative assays to increase our understanding of palmitoylation. This recent increase is most likely tied to the dramatic increase in the utility of mass spectrometry as a proteomic tool. To provide a general frame of reference for the recent shift in the types of assays that are being developed, we will briefly discuss other assays that have been used successfully for a longer peri‐

Discovery of Selective and Potent Inhibitors of Palmitoylation

http://dx.doi.org/10.5772/52503
