**1. Introduction**

Interactions between proteins and nucleic acids mediate a wide range of cellular processes, from the cell cycle to the maintenance of metabolic and physiological balance. These specific interactions are crucial for the control of DNA replication and DNA damage repair, regulation of transcription, RNA processing and maturation, nuclear transport, and translation.

The characterization of protein-nucleic acid interactions is essential not only for understanding the wide range of cellular processes they are involved in, but also for understanding the mechanisms underlying numerous diseases associated with the breakdown of regulatory systems. These include, but are far from being limited to, cell cycle disorders such as cancer and diseases caused by pathogenic agents that rely on or interfere with the host cell machinery. More recently, it has been hypothesized that many neurological disorders such as Alzheimer's, Huntington's, Parkinson's, and polyglutamine tract expansion diseases are a consequence, at least in part, of aberrant protein-DNA interactions that may alter normal patterns of gene expression (Jiménez, 2010).

The electrophoretic mobility shift assay (EMSA), also known as the gel retardation assay, is routinely used to detect protein-nucleic acid interactions. It was originally developed with the aim of quantifying interactions between DNA and proteins (Fried & Crothers, 1981; Garner & Revzin, 1981) and has since evolved to suit different purposes, including the detection and quantification of RNA-protein interactions. EMSA is most commonly used for qualitative assays, including the identification of nucleic acid-binding proteins and of the respective consensus DNA or RNA sequences. Under proper conditions, however, EMSA can also be used for quantitative purposes, including the determination of binding affinities, kinetics, and stoichiometry.
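As a rough illustration of the quantitative use mentioned above, the following Python sketch estimates an apparent dissociation constant from the intensities of the free and shifted bands in a titration series. It is not taken from the cited works and rests on several assumptions: a simple 1:1 binding model, protein in large excess over the labeled probe (so that free protein is approximately equal to total protein), band intensities proportional to the amount of probe, and no significant dissociation during electrophoresis. The function names and the band intensities are hypothetical and serve only to show the arithmetic.

```python
from statistics import median


def fraction_bound(bound_intensity, free_intensity):
    """Fraction of labeled probe found in the shifted (bound) band of one lane."""
    total = bound_intensity + free_intensity
    if total <= 0:
        raise ValueError("lane has no signal")
    return bound_intensity / total


def estimate_kd(protein_nM, bound, free):
    """Median per-lane estimate of an apparent Kd (same units as protein_nM).

    Assumes 1:1 binding with protein in excess: theta = P / (Kd + P),
    which rearranges to Kd = P * (1 - theta) / theta.
    Lanes with theta equal to 0 or 1 carry no usable information and are skipped.
    """
    estimates = []
    for p, b, f in zip(protein_nM, bound, free):
        theta = fraction_bound(b, f)
        if 0.0 < theta < 1.0:
            estimates.append(p * (1.0 - theta) / theta)
    if not estimates:
        raise ValueError("no lane with partial binding")
    return median(estimates)


# Hypothetical band intensities (arbitrary units), synthesized for Kd = 10 nM.
protein_nM = [2.5, 5.0, 10.0, 20.0, 40.0]
bound = [20.0, 100.0 / 3, 50.0, 200.0 / 3, 80.0]
free = [80.0, 200.0 / 3, 50.0, 100.0 / 3, 20.0]

kd = estimate_kd(protein_nM, bound, free)
print(f"apparent Kd ~ {kd:.1f} nM")
```

The median of per-lane estimates is used here only because it is simple and tolerant of a single aberrant lane; a nonlinear fit of the full binding isotherm would normally be preferred for real data.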

EMSA is commonly used in the characterization of transcription factors, the most intensely studied DNA-binding proteins and the second largest group of proteins in humans, after metabolic enzymes. Their purification and identification are crucial for understanding gene regulatory mechanisms. Transcription factors are sequence-specific DNA-binding proteins that are usually assembled in complexes formed prior to transcription initiation. They bind discrete and specific DNA sequences in the promoter region, functioning either as activators or repressors of expression of the targeted gene through protein-protein interactions (reviewed by Simicevic & Deplancke, 2010). Transcription factors play essential roles during development and differentiation. It is well established that disruption of the normal function of tissue-specific transcription factors, as a result of mutations, is often associated with a number of diseases, including most forms of cancer as well as neurological, hematological, and inflammatory diseases. Additionally, transcription factors are often found differentially expressed in different pathologies, suggesting an at least indirect involvement in the onset or progression of diseases. One of the most prominent examples of the involvement of transcription factors in the development and progression of diseases is perhaps the p53 protein. p53 is a transcription factor involved in the modulation of expression of several genes that regulate essential cellular processes such as cell proliferation, apoptosis, and DNA damage repair (reviewed by Puzio-Kuter, 2011). Mutations in p53 that cause loss of function have been reported in about 50% of all cancers. It is believed that this loss of function makes cancer cells more prone to the accumulation of mutations in other genes, thus facilitating and accelerating the formation of neoplasias (reviewed by Goh et al., 2011).

In our laboratory, research is mainly directed to the study of host-pathogen interactions during hepatitis delta virus (HDV) replication and infection. HDV is the smallest human pathogen so far identified and infects human hepatocytes already infected with the hepatitis B virus (HBV). Both viruses share the same envelope proteins, which are encoded by the HBV DNA genome. HDV is, thus, considered a satellite virus of HBV. The HDV genome consists of a single-stranded, circular RNA molecule of about 1700 nucleotides. This genome contains only one open reading frame from which two forms of the same protein, the so-called delta antigen, are derived by an editing mechanism catalyzed by cellular adenosine deaminase I. Both forms, small and large delta antigen, were shown to play crucial roles during virus replication: the small delta antigen is necessary for virus RNA accumulation and the large delta antigen plays an important role during envelope assembly (reviewed by Rizzetto, 2009). However, neither protein seems to display any known enzymatic activity. Accordingly, HDV is highly dependent on the host cell machinery for virus replication. It has been shown through EMSA that the small delta antigen binds *in vitro* to RNA and DNA without any specificity, which is in agreement with one of the roles attributed to the protein as a chaperone (Alves et al., 2010). Using different experimental approaches, it was possible to identify a number of cellular proteins that interact with HDV antigens or RNA (reviewed by Greco-Stewart & Pelchat, 2010). However, the precise role played by most host factors during the virus life cycle remains elusive. Furthermore, there is broad consensus among HDV researchers that many other cellular factors that interact with delta antigens or HDV RNA remain to be identified, and that it is crucial to find those that interact with HDV RNA, both for a better insight into its replication and as possible targets for new therapies.

In this chapter we will review the principles of EMSA and its advantages and limitations for the quantitative and qualitative analysis of protein-nucleic acid interactions. The key parameters influencing the quality of protein samples, binding to nucleic acids, complex migration in gels, and sensitivity of detection will be discussed. Finally, an overview of the principles, advantages and disadvantages of methods that are an alternative to gel retardation assays will be provided.

**2. Advantages and limitations**

Since its first publication in 1981, several improvements and variant techniques of EMSA have been reported. Originally described as a method to qualitatively detect protein-DNA interactions, gel retardation assays rapidly became one of the most popular methods to map interaction sequences and domains, not only in DNA-protein but in RNA-protein interactions as well. EMSA was also adapted to allow the determination of quantitative parameters including complex stoichiometry, binding kinetics and affinity.

Several features have made EMSA one of the most popular methods among researchers who study protein-nucleic acid interactions. The main advantages of EMSA when compared to other methods, as we will further discuss in the next sections, may be summarized as follows: (1) EMSA is a basic, easy to perform, and robust method able to accommodate a wide range of conditions; (2) EMSA is a sensitive method: using radioisotopes to label nucleic acids and autoradiography for detection, it is possible to use very low concentrations (0.1 nM or less) and small sample volumes (20 µL or less; Hellman & Fried, 2007). Non-radioactive labels, although less sensitive than radioisotopes, are often used as well and can be detected using fluorescence, chemiluminescence or immunohistochemical approaches. The wide variety of labels that can be used makes EMSA a very versatile method; (3) EMSA can also be used with a wide range of nucleic acid sizes and structures as well as a wide range of proteins, from small oligonucleotides to heavy transcription complexes; (4) Under the right conditions, a gel retardation assay can resolve the distribution of proteins between several nucleic acids within a single sample (Fried & Daugherty, 1998) or distinguish between complexes with different protein stoichiometry and/or binding site distribution (Fried & Crothers, 1981); (5) Finally, but no less important, it is possible to use both crude protein extracts and purified recombinant proteins, enabling the identification of new nucleic acid-interacting proteins or the characterization of specific proteins and their targets.

Despite its sensitivity, versatility and usually easy-to-perform protocols, EMSA has a number of limitations. Dissociation can occur during electrophoresis since samples are not at equilibrium during the run, thus preventing detection of the complexes. Additionally, complexes that are not stable in solution may be stable in the gel, requiring very short runs so that the observed pattern reflects what happens in solution. EMSA does not provide a straightforward measure of the molecular weights or identities of the proteins, as mobility in gels is influenced by several other factors. Also, EMSA does not directly provide information on the nucleic acid sequence the proteins are bound to. However, this problem can usually be overcome using footprinting approaches, as described further ahead. Kinetic studies using EMSA are limited since the time resolution of a regular EMSA protocol is set by the time required to mix the binding reaction and for the mixture to migrate into the gel. Only processes that have relaxation times longer than the interval required for solution handling are suitable for kinetic studies.

**3. How complexes migrate in gels**

In this section, we will start with a simple account of the characteristics of the electrophoretic mobility of nucleic acids alone, and afterwards we will discuss how the formation of protein-nucleic acid complexes alters these characteristics.
