**1. Introduction**

#### **1.1 Bioluminescence**

Bioluminescence is the production and emission of light by a living organism. Bioluminescence imaging was developed over the last decade as a molecular imaging tool for studying biological processes in small living laboratory animals. Bioluminescence-based optical imaging is highly sensitive, low-cost, and non-invasive, enabling real-time analysis of disease processes within the cell at a molecular level in living animals. Recent advances in protein complementation strategies have further expanded its applications to the quantitative monitoring of several sub-cellular processes, such as protein-protein interactions, protein dimerization, and protein folding. In this chapter, we provide a brief introduction to bioluminescence imaging technology and discuss its application to studying protein-protein interactions, protein dimerization, and protein folding by non-invasive imaging of living animals. These are some of the most important processes at the heart of the signal transduction network within cells.

Molecular imaging offers many unique opportunities to study biological processes in intact organisms. Bioluminescence imaging (BLI) is one of several molecular imaging strategies currently in use for studying different biological processes. It is based on the sensitive detection of visible light produced during the luciferase-mediated oxidation of the substrate luciferin in the presence of several co-factors. The luciferase enzyme can be expressed in cells as an indicator of a cellular process and used to image living animals, either by developing tumor xenografts or by developing transgenic animals that express the enzyme selectively in a particular tissue (under a tissue-specific promoter) or throughout the animal (under a constitutive promoter), to study different diseases. After injection of the substrate luciferin, the expressed luciferase can be imaged with a cooled charge-coupled device (CCD) camera. Several bioluminescence reporters with a wide range of emission wavelengths have been identified from insects and crustacean copepods **(Table 1)**. Some of these proteins have also been modified by *in vitro* manipulation, changing from a few to several amino acids, to yield considerably altered proteins with changed emission wavelengths, which improved their detection sensitivity, especially for *in vivo* imaging applications.

Bioluminescent Proteins: High Sensitive Optical Reporters for Imaging Protein-Protein Interactions and Protein Foldings in Living Animals

Bioluminescence from firefly luciferase, which emits at ~575 nm (with several red-shifted mutants), can be imaged at a depth of several centimeters within tissue, which allows at least organ-level resolution. This technology has been applied in several studies to monitor transgene expression, progression of infection, tumor growth and metastasis, tissue acceptance/rejection in transplantation, toxicology, viral infections, and gene therapy. BLI is simple to execute and enables monitoring throughout the course of disease, allowing localization and serial quantification of biological processes without sacrificing the experimental animal. This powerful technique can reduce the number of animals required for experimentation because multiple measurements can be made in the same animal over time, which has the added benefit of minimizing the effects of biological variation that arise when different groups of animals are handled as controls. The strengths of bioluminescence reporters are not limited to monitoring disease progress at the cellular level. The recent development of split-reporter technology has further extended their application to monitoring sub-cellular events such as protein-protein interactions and protein folding, which are the main focus of this chapter.


| Bioluminescent reporter | Physical property | Source | Emission wavelength | Substrate |
|---|---|---|---|---|
| Firefly luciferase | Non-secreted | *Photinus pyralis* | 575/610 nm | D-Luciferin, ATP/dATP |
| Beetle luciferase | Non-secreted | *Pyrearinus termitilluminans* | Red: 610 nm / Green: 540 nm | D-Luciferin |
| Renilla luciferase | Non-secreted | *Renilla reniformis* | 482 nm | Coelenterazine |
| Gaussia luciferase | Secreted | *Gaussia princeps* | 480 nm | Coelenterazine |
| Metridia luciferase | Secreted | *Metridia longa* | 480 nm | Coelenterazine |
| Vargula luciferase | Secreted | *Vargula hilgendorfii* | 478 nm | Vargula luciferin |
| Bacterial luciferase | Non-secreted | *Vibrio fischeri* | 482 nm | Fatty acids |

Table 1. Bioluminescent reporters currently in use for different biological applications, and their sources and properties

#### **1.2 Protein-protein interactions**

Cells are the fundamental working units of every living system and determine how a living organism functions. The complex cellular functions rely on several fundamental principles. Each cell has a nucleus that contains DNA (deoxyribonucleic acid) as its genetic material, which carries all the instructions needed to direct the cell's activities in the form of functional units called proteins. Cellular functioning therefore ultimately depends on the performance of different proteins. Some proteins, such as muscle proteins, act as building blocks, while others, such as enzymes, control the chemical reactions within cells. Protein-protein interactions are important determining factors in the regulation of many cellular processes. Signaling pathways regulating cellular proliferation, differentiation, and apoptosis are commonly mediated by protein-protein interactions as well as by reversible chemical modifications of proteins (e.g., phosphorylation, acetylation, methylation, and sumoylation), which normally control the sub-cellular trafficking and function of proteins. To understand these modifications, and the protein-protein interactions that are either assisted by or independent of them, several techniques have been developed and studied in intact cells and in cell extracts. The yeast two-hybrid system is one of the earliest of these techniques; it initially used the enzyme beta-galactosidase as a reporter protein and was later improved by adopting bioluminescent reporters for rapid measurement. The latter version is used extensively in screening for protein-protein interactions and for identifying small-molecule drugs that alter (inhibit or enhance) protein-protein interactions, which can be used as therapeutic agents for treating several diseases, including cancer. The major limitation of this system is that it can only study protein-protein interactions occurring in the nucleus; otherwise, the proteins under study must be trafficked into the nucleus. The readout of the yeast two-hybrid system is based on the amount of reporter protein produced through the transcriptional activation that accompanies a protein-protein interaction (see more details in section 3.2).
To circumvent this limitation, other techniques have been developed, including the split ubiquitin system, the Sos recruitment system, dihydrofolate reductase complementation, β-galactosidase complementation, β-lactamase complementation, the G protein fusion system, and, most recently, split-luciferase (firefly luciferase, click-beetle luciferase, Renilla luciferase, and Gaussia luciferase) and split-fluorescent protein (GFP and RFP) complementation systems. Of these, the split-luciferase complementation system provides a significant advantage over the other systems, particularly in measuring protein-protein interactions in cell lysates, intact cells, and cell implants in living animals by molecular imaging. Firefly luciferase complementation imaging is a robust and broadly applicable bioluminescence approach, with applications in both modification-dependent (phosphorylation, acetylation, methylation, and sumoylation) and modification-independent protein-protein interactions.

#### **1.3 Post genomic proteomic era**

We are in a post-genomic, proteomic era. The completion of the Human Genome Project has given us knowledge of the complete nucleotide sequence of the human genome, its arrangement in different chromosomes, and the number of functional genes present in a human cell. The information collected from the Human Genome Project, along with other bioinformatic tools, has led to several major new directions in science, including the characterization of RNAs (via transcriptional profiling), microRNAs, and proteins (proteomes). The Human Genome Project estimated the number of functional genes in a human cell to range from 30,000 to 40,000. The concept of one protein, one function can accommodate only a limited number of functions, and cannot explain why cells need vastly more proteins than the limited number of functional genes would produce. The management of additional cellular functions, including various house-keeping functions and other specialized functions, mainly depends on the functional organ or tissue type of which the cells are a part. It is logical, and even necessary, to postulate that multifunctional proteins within the cell, and/or various collaborative interactions between proteins, are needed as molecular machines to carry out the work within a cell. To illustrate, the proteome is much more dynamic and complex than the genome; it changes during development in response to external stimuli, and its proteins form large interaction networks through which they support and regulate each other. The genetic blueprint, the genome of human cells, is well known. However, the functions that the genome encodes, and the program through which proteins are produced from the genetic blueprint, are not well understood. New research is only beginning to uncover the incredibly rich diversity of protein structure, which is much more complex than that of DNA. One new direction has sought to isolate and structurally characterize all the proteins that exist in the cell (Skolnick et al., 2000; Tucker et al., 2001). Unlike DNA, proteins have a vast repertoire of structures to carry out their diversity of functions. Once the proteins are identified and characterized, a second major challenge is to find out how they assemble into the molecular machines that perform cellular functions. Identifying all of the protein-protein interactions is fundamental for understanding the cellular processes involved in virtually all biological interactions. The collection of protein-protein interactions can be visualized as a map, in which the proteins are the nodes and the interactions are the circuits that connect them. A protein-protein interaction network or map would then represent a search grid on which biological circuits are constructed (Tucker et al., 2001; Wills 2001).
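The map described above, with proteins as nodes and interactions as the connections between them, corresponds to an undirected graph. The following is a minimal sketch of that idea, not a method taken from this chapter: the protein names are hypothetical placeholders, and `build_interaction_map` and `partner_count` are illustrative helper names.

```python
from collections import defaultdict

# Hypothetical interaction pairs; placeholder names, not data from this chapter.
interactions = [
    ("ProteinA", "ProteinB"),
    ("ProteinB", "ProteinC"),
    ("ProteinC", "ProteinD"),
    ("ProteinB", "ProteinD"),
]


def build_interaction_map(pairs):
    """Return an undirected graph mapping each protein (node) to its partners."""
    graph = defaultdict(set)
    for a, b in pairs:
        graph[a].add(b)  # an interaction has no direction,
        graph[b].add(a)  # so it is recorded for both partners
    return dict(graph)


def partner_count(graph):
    """Number of distinct interaction partners for each protein."""
    return {protein: len(partners) for protein, partners in graph.items()}


interaction_map = build_interaction_map(interactions)
partner_counts = partner_count(interaction_map)

for protein in sorted(interaction_map):
    print(protein, sorted(interaction_map[protein]), partner_counts[protein])
```

Storing the partners of each protein as a set keeps duplicate reports of the same interaction from being counted twice, which matters when pairs come from several screens.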

Fig. 1. Schematic illustration of current molecular imaging strategies and their potential for providing biological information, such as anatomical details, physiological data, and metabolic status at the molecular level, for clinical applications in humans. None of the current strategies is uniquely superior in independently providing the different kinds of information needed for clinical decisions in diagnosis, staging, and treatment, especially in oncology, or in the diagnosis and treatment of several other diseases; each has its strengths and weaknesses.

#### **1.4 Complexity of protein interaction networks**

There are thousands of different proteins active in a cell at any time. Many of these proteins work as enzymes that catalyze the chemical reactions of metabolism, while others work as components of cellular machinery, such as the ribosomes that read genetic information and synthesize proteins. Still other proteins are involved in the regulation of gene expression. Many proteins play their functional roles only in specific cellular compartments, whereas others move from one compartment to another, acting as "signals". By directly interacting with one another, proteins continually influence each other's functions (Wills 2001). In addition, proteins are constantly produced and degraded in cells. The rates at which these processes occur depend on how much of each protein is already present and on how the proteins interact with each other and with other macromolecules, such as DNA and RNA, to regulate cellular mechanisms. One protein can speed up or slow down the rate of production of another by interacting with the DNA or RNA needed for making that particular protein. The interactions between different proteins that control different cellular functions are therefore interdependent. When a mutation causes the loss of one of these essential protein functions, it can significantly affect the function of many other proteins, even leading to cell death (Tucker et al., 2001). Clearly, the interactions between different proteins in a cell are much more complex than previously thought, and it is vital to understand their fundamental interlinked networks. Protein-protein interactions are important determining factors in the control of many cellular processes such as transcription, translation, cell division, signal transduction, and oncogenic transformation.
To modulate many of these cellular events, it is essential to delineate which proteins are involved, how they interact with one another, their precise roles in executing cellular functions, and the techniques and mechanisms needed to manipulate these interactions for novel drug development or treatment strategies relevant to particular diseases. Biochemical pathways and networks require many different systems of dynamic assembly and disassembly of proteins with other proteins and nucleic acids (Michnick 2001). Much of modern biological research is concerned with how, when, and where proteins interact with other proteins involved in biological processes in the intact cellular context. The completion of the Human Genome Project has added major impetus to research that can provide simple approaches for studying protein-protein interactions on a large scale in diseases, including cancer.

#### **1.5 Cellular signaling pathways**

52 Bioluminescence – Recent Advances in Oceanic Measurements and Laboratory Applications

The cellular regulatory mechanisms are interlinked. To understand the complex biological processes, and disease states at a molecular level, a systematic approach is necessary to illustrate signaling pathways. Efforts to elucidate the cellular mechanisms for different pathological conditions have significantly increased after the Human Genome Project. Each signaling pathway reacts to specific external stimuli that can be regulated by changes in proteins and chemicals. Recent advances in large-scale and high-throughput techniques, including functional genomics, proteomics, RNAi technology, and genomic-scale yeast twohybrid and protein complementation assays, have provided a tremendous amount of information on signaling pathways. To extract the biological significance from the vast data, it is necessary to develop an integrated environment for a formal and structured organization of the available information, in a format suitable for analysis with bioinformatics tools. To present a signaling pathway, a database must include information on 1) the molecules involved in signaling in response to each external stimulus, 2) which direction the signal is being conveyed, and 3) how the activities and sub-cellular localizations of molecules are changed by protein modifications and/or protein-protein interactions. Analyses of the first database containing such information should made it possible to further expand the database to understand the signaling results in processes such as proliferation, differentiation, and apoptosis, and to explicate how a network can be

Bioluminescent Proteins: High Sensitive Optical Reporters

(Anfinsen 1973; Levinthal 1969).

**(Table 2)**(Goetz et al., 2003).

for Imaging Protein-Protein Interactions and Protein Foldings in Living Animals 55

molecular foundation of a growing list of diseases in humans and animals. Proteins undergo several levels of structural alterations executed by active chaperon complexes (e.g., Hsp90, Hsp70) and indirectly by the inherent amino acid sequences, before they become a biologically active functional entity of a cell. There is significant supporting evidence that associates the misfolding of proteins with several cellular diseases, including cancers **(Table 2)**. Biologically representative *in vitro* and *in vivo* studies of these abnormal events are best suited to the discovery of molecular mechanisms to prevent or ameliorate such diseases. There is an active search for small molecules which assist refolding of misfolded proteins into their biological functional forms, as equal or at near equal levels of native forms, for the treatment of several biochemical disorders. However, thus far no current technique can be optimally extended to imaging assays in intact living subjects. The development of novel imaging techniques to quantitatively measure the level of protein misfolding in cells and in living animals, and also of small molecule mediated refolding, will be very useful for screening and pre-clinical evaluation of drugs which rectify or cure these diseases. Normally, the conformational changes in protein folding result in the close approximation of amino and carboxy termini in a great majority of native proteins, at their functionally active forms. The 'protein folding problem' has remained one of the more perplexing quandaries in fundamental biological research ever since the classic work of Anfinsen some four decades ago on the hydrophobic-collapse mechanism. How to predict the three-dimensional, biologically active, native structure of a protein from its primary sequence, and how a protein reaches this native structure from its denatured state are still unresolved questions. 
The intellectual conundrum of the folding pathway of proteins, underscored by the Levinthal paradox, has been addressed to some extent over the last twenty years by various proposed mechanisms for protein folding, including the framework model (diffusion-collision and nucleation mechanisms)

There is accumulating evidence that the conditions used for refolding proteins *in vitro* are only distantly related to those found *in vivo*, where the physiological environment in living cells exerts a profound influence on protein folding owing to the involvement of the intracellular macromolecular background, which also contains folding catalysts and molecular chaperones. Aside from the relevance of the protein-folding problem to deciphering fundamental processes in cell biology, it is becoming clear that dysfunctional protein folding represents the molecular foundation of a growing list of diseases in humans and animals. There is mounting interest in such diseases arising from protein misfolding and aggregation, including Alzheimer's disease, amyloidosis, Creutzfeldt-Jakob disease, cystic fibrosis and cancer, to name a few. Molecular chaperones are involved in the protection of cells against protein damage through their ability to hold, disaggregate, and refold damaged proteins or their ability to facilitate degradation of damaged proteins. Many of the proteins implicated in the pathogenesis of misfolding diseases escape the diverse chaperoning pathways that are in place to assist and assure the fidelity of correct protein folding. More biologically representative *in vivo* structural and functional studies of these abnormal events, carried out in the context of living cell environments, are likely best suited to the discovery of molecular mechanisms to prevent or ameliorate such diseases

composed of various signaling pathways in response to multiple external inputs. Signaling entities ranging from small molecules and proteins to protein states and protein complexes should be studied. It must be noted that these entities are not independent of one another. For instance, protein complexes are composed of proteins, and a protein binding to a small molecule can define a protein state. It is not surprising to find many gaps in the current knowledge about any particular signaling pathway. In order to organize such diverse yet incomplete information into a structured and coherent database, the use of a formal model is indispensable. Differing levels of abstraction are interrelated, so that essentially the same signaling event can be described in detail at multiple levels. As model systems that implement all the parameters become available, the sharing of models with integrated biological data will be essential to fill in the gaps in our current knowledge base.

#### **1.6 Complexity in studying protein interaction networks**

No methods are currently available to test the protein-protein interaction networks that operate within a cell without introducing a constructed system that mimics the function of the endogenous protein. It is to be expected that when an additional copy of an endogenous protein is introduced into a cell, on top of the counterpart already expressed there, it will have some direct physical effect on a number of other proteins. These new interactions may change the functional aspects of several other proteins. Such effects can be felt right across the protein interaction network, most often becoming less significant as the distance between the new protein and the other protein increases. It is also possible for genetically modified cells to produce a new protein that displays completely new patterns of protein interactions. This may not be evident until the cells find themselves in some unusual circumstances, when they may respond in a very different way from wild-type cells. Although the genetically engineered cells may appear to behave just like wild-type cells, this cannot be guaranteed under all circumstances (Becker et al., 1990; Beeckmans 1999; Bode and Willmitzer 1975). However, the techniques currently available for inserting new DNA into the chromosomes of cells have no specific control mechanism capable of directing the point of insertion in the organism's existing genome without significantly affecting the expression level of endogenous proteins. Of the gene delivery systems currently available, the adeno-associated virus is the only viral vector that can normally introduce and integrate a single copy of the transgene at a specific locus on human chromosome 19 (19q13.3-qter). Otherwise, it is customary to produce millions of cells with the new DNA inserted at essentially random positions in the hope of producing at least some "hits."
Screening is then conducted to find the cells that have survived the engineering process and also express the newly inserted gene. These survivors are then subjected to further screening to find those that behave most like the wild-type and yet possess the new, desired, engineered properties. It is generally assumed that any harm to an organism as a result of inserting a new gene will be observed as a change in gross characteristics of the organism (Stopeck et al., 1998).

#### **1.7 Biological importance in studying protein folding**

As discussed in the previous sections, proteins are cellular macromolecules with complex structural and functional properties. Dysfunctional protein folding represents the
molecular foundation of a growing list of diseases in humans and animals. Before they become biologically active functional entities of a cell, proteins undergo several levels of structural alteration, executed directly by active chaperone complexes (e.g., Hsp90, Hsp70) and indirectly by their inherent amino acid sequences. There is significant evidence associating the misfolding of proteins with several cellular diseases, including cancers **(Table 2)**. Biologically representative *in vitro* and *in vivo* studies of these abnormal events are best suited to the discovery of molecular mechanisms to prevent or ameliorate such diseases. There is an active search for small molecules that assist the refolding of misfolded proteins into their biologically functional forms, at levels equal or nearly equal to those of the native forms, for the treatment of several biochemical disorders. However, thus far no current technique can be optimally extended to imaging assays in intact living subjects. The development of novel imaging techniques to quantitatively measure the level of protein misfolding in cells and in living animals, and also of small-molecule-mediated refolding, will be very useful for the screening and pre-clinical evaluation of drugs that rectify or cure these diseases. Normally, the conformational changes in protein folding bring the amino and carboxy termini into close proximity in the great majority of native proteins in their functionally active forms. The 'protein folding problem' has remained one of the more perplexing quandaries in fundamental biological research ever since the classic work of Anfinsen some four decades ago on the hydrophobic-collapse mechanism. How to predict the three-dimensional, biologically active, native structure of a protein from its primary sequence, and how a protein reaches this native structure from its denatured state, are still unresolved questions.
The intellectual conundrum of the folding pathway of proteins, underscored by the Levinthal paradox, has been addressed to some extent over the last twenty years by various proposed mechanisms for protein folding, including the framework model (diffusion-collision and nucleation mechanisms) (Anfinsen 1973; Levinthal 1969).

There is accumulating evidence that the conditions used for refolding proteins *in vitro* are only distantly related to those found *in vivo*, where the physiological environment in living cells exerts a profound influence on protein folding owing to the involvement of the intracellular macromolecular background, which also contains folding catalysts and molecular chaperones. Aside from the relevance of the protein-folding problem to deciphering fundamental processes in cell biology, it is becoming clear that dysfunctional protein folding represents the molecular foundation of a growing list of diseases in humans and animals. There is mounting interest in such diseases arising from protein misfolding and aggregation, including Alzheimer's disease, amyloidosis, Creutzfeldt-Jakob disease, cystic fibrosis, and cancer, to name a few. Molecular chaperones protect cells against protein damage through their ability to hold, disaggregate, and refold damaged proteins, or to facilitate the degradation of damaged proteins. Many of the proteins implicated in the pathogenesis of misfolding diseases escape the diverse chaperoning pathways that are in place to assist and assure the fidelity of correct protein folding. More biologically representative *in vivo* structural and functional studies of these abnormal events, carried out in the context of living cell environments, are likely best suited to the discovery of molecular mechanisms to prevent or ameliorate such diseases **(Table 2)** (Goetz et al., 2003).

substrate or with induction of light waves, as well as the advancement of sensitive imaging instrumentation. Imaging techniques such as positron emission tomography (PET) and single-photon emission computed tomography (SPECT), which use radioactive tracers, and magnetic resonance imaging (MRI) are now widely used in the clinical observation of cancer pathology. Recently, combined techniques such as PET-CT and PET-MRI have been rapidly replacing conventional imaging methods. PET is an imaging technique that produces three-dimensional images of the functional processes of living subjects by capturing pairs of gamma rays emitted indirectly upon the injection of a positron-emitting radionuclide into the living body with a biomolecule.

PET was introduced by David E. Kuhl and Roy Edwards of the University of Pennsylvania in the late 1950s, and has been continually updated and modified to meet clinical and research needs, working either as an independent or as a combined device (Ter-Pogossian et al., 1975). It has made invaluable contributions to cancer diagnosis, treatment, and research by revealing tumor progression in both clinical and preclinical applications, especially in the diagnosis and detection of tumor metastasis. Advances in PET scanner devices and the introduction of novel radiotracers have fueled the progress of PET imaging. Fluorodeoxyglucose (18F-FDG), an analogue of glucose, is the most common radiotracer used for PET imaging, because it can reveal specific tissue metabolic activity. However, 18F-FDG is phosphorylated by hexokinase, and the phosphorylated product cannot be cleared in most tissues, which can result in intense radiolabeling of tissues with high glucose uptake (Burt et al., 2001). FDG-PET is widely used in clinical oncology for the diagnosis, staging, and post-treatment follow-up of tumors. Tissue-specific and molecule-specific radiotracers have also been introduced to monitor the expression level of structural and functional proteins (Torigian et al., 2007). Steroid receptors have been associated with the growth of breast tumors, and thus understanding the receptor status is essential for the treatment of breast cancer. Radiolabeled ligands and their analogues are in preclinical application for receptor imaging. 18F-fluoro-17β-estradiol (FES) has been used in PET imaging to examine the estrogen receptor status in different tissues of living subjects (Mintun et al., 1988). Bombesin, a peptide isolated from the frog *Bombina bombina*, binds to the gastrin-releasing peptide (GRP) receptor and has been implicated in breast cancer. This property has led to the development of radiolabeled bombesin for peptide receptor imaging in breast cancer diagnosis (Scopinaro et al., 2002).

Single Photon Emission Computed Tomography (SPECT) is another imaging technique that is similar to PET imaging. Unlike PET, however, the tracer used in SPECT emits gamma rays that can be measured directly. In SPECT, a gamma camera acquires 2-D views of the 3-D tracer distribution, and a 3-D data set is then generated with a computer-based tomographic reconstruction algorithm. MRI is another widely used imaging technique, but unlike PET and SPECT, MRI can be used to view the anatomy of living subjects as well as to generate data about the functional status of tissues (MacDonald et al., 2010). MRI is a well-established diagnostic method for detecting cancer. It has been widely used to detect breast cancer, as it offers the highest sensitivity and spatial resolution of all imaging modalities. Guinea et al. (2010) investigated and analyzed the possible relationship between the MRI features of breast cancer and its clinicopathological and biological factors such as estrogen and progesterone receptor status, and expression of p53, HER2, Ki67,


Table 2. Examples of putative protein misfolding-associated diseases and the proteins involved in them
