**Abstract**

This chapter presents small viral proteins that orchestrate replication of the human immunodeficiency virus type-1 (HIV-1) and the human hepatitis B virus (HBV), two canonical examples of small human pathogens. The HIV-1 nucleocapsid protein (NC) and the C-terminal domain (CTD) of the HBV core protein (HBc) are essential structural components of the virus capsid that protect the viral genome; they also chaperone replication of the HIV-1 genomic RNA and the HBV DNA by reverse transcription, and later these proteins kick-start virus morphogenesis. HIV-1 NC and HBV CTD belong to the family of intrinsically disordered proteins (IDP), a property that enables a large number of molecular interactions. Although these viral proteins share little sequence homology, both are rich in basic amino acids and endowed with RNA-binding and chaperoning activities. Similar viral RNA-binding proteins (vRBP) are also encoded by other virus families, notably flaviviruses, hantaviruses, and coronaviruses. We discuss how these vRBPs function in light of the abundant RBP family, which plays key physiological roles via multiple interactions with non-coding RNA regulating immune defenses and cell stress. Moreover, these RBPs are flexible molecules allowing dynamic interactions with many RNA and protein partners in a semi-solid milieu favoring biochemical reactions.

**Keywords:** RBP, HIV, HBV, IDP, RNA chaperoning, molecular crowding

## **1. Forewords on viruses and RNA chaperones**

Viruses that replicate their genome by the process of reverse transcription (RTion) are common in animals, plants, algae, and fungi [1]. These so-called reverse-transcribing viruses have been classified into five different families, namely, *Caulimoviridae*, *Hepadnaviridae*, *Metaviridae*, *Pseudoviridae*, and *Retroviridae*, to which *Belpaoviridae* was recently added [2]. Among these widespread viruses, two are major human pathogens, the human immunodeficiency virus type 1 (HIV-1) and the human hepatitis B virus (HBV).

Retroviruses exist as infectious exogenous RNA viruses as well as endogenous retroelements (ERV) present at high copy numbers in the genome of vertebrates. Hepadnaviruses can also integrate their genome in the host genome but at a much lower rate [3].

Replication of the genome of these two classes of virus requires a reverse transcription step. For HIV-1, the genomic RNA of 9600 nt in length has a structure similar to cellular mRNAs, with a 5′ cap and a 3′ poly(A) tail, and contains 9 genes leading to the expression of 15 proteins. Retroviruses replicate their genome by a copy-and-paste mechanism, whereby the single-stranded positive-sense retroviral genomic RNA is converted into a double-stranded DNA by the virion reverse transcriptase (RT enzyme) [4]; this DNA is subsequently integrated into the host genome [5]. The integrated viral DNA, called the provirus, is expressed by the host transcription machinery to synthesize the full-length viral RNA (FL RNA), which after nuclear export to the cytoplasm is translated by the ribosomes to synthesize the major structural proteins and enzymes, the Gag and Gag-Pol precursors. Specific interactions of the genomic RNA with the Gag polyprotein precursor drive Gag polymerization and viral core assembly at the plasma membrane (PM) [6].

For hepadnaviruses, the small double-stranded DNA genome in a relaxed circular form (rcDNA) is targeted to the nucleus after virus infection, where it is converted into a covalently closed circular form (cccDNA) and expressed by the transcription machinery of the infected cell to synthesize the full-length RNA called pre-genomic RNA (pgRNA) [7]. Upon translation of the pgRNA, the newly made core protein and RT enzyme interact with the pgRNA to synthesize the dsDNA genome. The HBV genome has unique features, such as extensive overlapping of the genes (3200 nt with four coding sequences leading to the expression of seven proteins) and a pseudo-circular structure [8]. In addition, several of the HIV and HBV proteins were found to be multifunctional, notably the NC, TAT, and VIF proteins for HIV and the core protein (HBc) for HBV [9, 10].
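As a rough illustration of how compact these genomes are, the following minimal sketch divides the genome lengths quoted above by the number of expressed proteins. The metric `nt_per_protein` is a hypothetical, deliberately crude measure introduced here for illustration only; it ignores non-coding regions and the details of how the reading frames overlap.

```python
# Genome lengths and protein counts are the figures quoted in the text.
GENOMES = {
    "HIV-1": {"length_nt": 9600, "proteins": 15},
    "HBV": {"length_nt": 3200, "proteins": 7},
}


def nt_per_protein(length_nt: int, proteins: int) -> float:
    """Average genome length per expressed protein (illustrative metric only)."""
    return length_nt / proteins


for name, genome in GENOMES.items():
    print(f"{name}: about {nt_per_protein(**genome):.0f} nt per protein")
```

Under this crude measure, HBV uses fewer nucleotides per protein than HIV-1, which is in line with the extensive gene overlap described above.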

These two classes of viruses probably emerged during the early Paleozoic Era, some 450–520 million years ago, with a marine origin [11]. The HBVs seem to originate from non-enveloped progenitors called nackednaviruses present in fishes, some 400 million years ago [12].

In addition to an RNA/DNA-dependent DNA polymerase called reverse transcriptase, with an associated RNase H activity, these two classes of small viruses encode a core protein endowed with RNA-binding, unwinding, annealing, and matchmaker activities. This protein can also drive the formation of nucleoprotein complexes with a gel-like milieu favoring molecular crowding and biochemical reactions such as reverse transcription.

This chapter will briefly review the multiple roles of the core proteins drawing a parallel between the HIV-1 Gag and the HBV core. In fact these viral core proteins turn out to be much more than a structural component forming a cage enveloping the genome since they provide assistance to the RT-RNase H enzyme at all steps of viral DNA synthesis and then ensure stability of the newly made viral DNA.

Despite common functions in HIV and HBV morphogenesis and replication, the HBV core protein appears very different from Gag at the amino acid sequence level; a closer look at their activities and functions, however, reveals that these viral proteins are similar.

#### **2. The RNA folding problem and RNA chaperones**

The need for RNA chaperones comes from the RNA folding problem whereby RNA molecules have to find their native functional structure in an extremely wide landscape of structures [13]. In fact, RNA chaperones are as diverse and abundant as RNA molecules, coding and noncoding, from prokaryotes to eukaryotes [14]. Recent findings highlight the fact that RNA chaperones are disordered in nature, function in a disordered state, and do not require ATP as a source of energy to direct RNA folding [13]. Instead, RNA chaperones seem to exploit a mechanism of energy transfer during rapid on-off RNA-binding kinetics. A number of standard assays are used to monitor RNA chaperoning activity, notably binding, fraying, and annealing of complementary sequences; activation of hammerhead ribozyme-directed cleavage of an RNA substrate; and formation of dense nucleoprotein complexes. **Figure 1** illustrates assays aimed at describing the influence of NC on DNA strand transfers that occur during the process of reverse transcription resulting in the synthesis of cDNA.

#### **Figure 1.**

*Standard assays for monitoring the nucleic acid chaperone activity. These in vitro chaperoning assays summarize several properties of nucleic acid chaperone proteins, notably their ability to rapidly anneal complementary nucleic acid sequences (top panel) and favor formation of the most stable duplex, in physiological-like conditions. Bottom panel: R+ and R− sequences represent the 5′ end repeats of the HIV-1 genome of 96 nt in length. R− (mut) contains three mutated residues at its 3′ end in order to generate 3 nt mismatch upon annealing to R+; this was achieved by incubating R+ and R− mut at 66°C for 1 h. Next R− WT is added together with NC protein for 5 min at 30°C. The duplex and ss R− mut were resolved by native gel electrophoresis. Adapted from Darlix et al. [15].*


*Multiple Functions and Disordered Nature of Nucleocapsid Proteins of Retroviruses… DOI: http://dx.doi.org/10.5772/intechopen.90724*


*Viruses and Viral Infections in Developing Countries*


#### **3. The retroviral GAG polyprotein and its multiple roles**

The major structural proteins of retroviruses are encoded by Gag, which is formed of several modular domains, namely, Map17, Cap24, NCp7, and p6; in addition, there are two small peptides, p1 and p2, flanking NC in the Pr55 Gag precursor (**Figure 2**) [16]. The N-terminus is myristoylated, which, together with a row of basic residues within MA, targets Gag to the plasma membrane where assembly takes place [17].

#### **Figure 2.**

*Structural model of the HIV-1 Gag polyprotein precursor. Left, the different domains of HIV-1 Gag, matrix (Map17), capsid (Cap24), nucleocapsid (NCp7), and p6; two small peptides flanking the NC domain, P2/SP1 and P1/SP2. Right: 2D presentation of the complete Gag Pr55. Adapted from Sundquist and Krausslich [16].*

In infected cells, the full-length viral RNA is translated by the ribosome machinery to produce the Gag and Gag-Pol polyprotein precursors. The present model of assembly stipulates that newly made Gag molecules accumulate in the cytoplasm, probably in the vicinity of the translating polysomes [18], where they kick-start virus assembly (**Figure 3**). This is achieved through two types of interactions: (i) Gag-NC with the 5′ untranslated region (5′ UTR) of FL RNA [19] and (ii) the myristoylated matrix domain with phospholipids of the T-cell membrane [20]. These interactions target the Gag-RNA nucleoprotein complexes to the plasma membrane, causing Gag-oligomer formation; the nucleocapsid domain binds and selects the genomic RNA, causing its dimerization, and at the same time, together with the capsid domain, boosts Gag multimerization (**Figure 3**). These interactions of Gag with phospholipids and with RNA lead to virus assembly at the plasma membrane. Subsequently, virus maturation occurs during the budding process, together with the recruitment of the envelope glycoproteins by the matrix domain [21] (**Figure 3**).

Maturation is a complex process whereby the core of HIV-1 becomes conical and at the same time the genomic RNA dimer is condensed, thus leading to the formation of infectious particles [23]. However, most HIV-1 virions, and more generally most retroviral particles, are noninfectious. As a matter of fact, the ratio of infectious virus to noninfectious particles ranges from 1:10 to 1:10⁴ [24, 25]. Thus a majority of particles are noninfectious, most probably because of the loss of envelope proteins or degradation of the genomic RNA, or else because they correspond to defective-interfering particles (DIP), which can lead to an underestimation of virus infectivity [26].
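To make the scale of these ratios concrete, the short sketch below converts an infectious-to-noninfectious ratio of 1:N into the fraction of all particles that are infectious. It assumes the upper bound of the quoted range reads 1:10⁴; the function name `infectious_fraction` is introduced here for illustration only.

```python
def infectious_fraction(noninfectious_per_infectious: float) -> float:
    """Fraction of all particles that are infectious for a ratio of 1:N
    (one infectious particle per N noninfectious particles)."""
    return 1.0 / (1.0 + noninfectious_per_infectious)


# Bounds of the range quoted in the text (upper bound assumed to be 1:10^4).
for n in (10, 10**4):
    print(f"1:{n} -> {infectious_fraction(n):.4%} of particles are infectious")
```

Even at the favorable end of the range, fewer than one particle in ten is infectious.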

#### **Figure 3.**

*Schematic representation of virus morphogenesis and the roles of Gag-NC. The genomic RNA is exported from the nucleus and translated by the cell ribosome machinery giving rise to the production of the Gag and Gag-Pol precursors. Gag-NC binds the packaging signal at the 5′ end of the genomic RNA causing the formation of the viral nucleoprotein complex (vRNP) and dimerization of the genomic RNA. Such vRNP are targeted to the plasma membrane where they accumulate to form immature viral particles. Next, the Gag molecules are processed by the viral protease causing core condensation. Adapted from Muriaux and Darlix [22].*

#### **4. Characteristics of retroviral nucleocapsid proteins**

Retroviral nucleocapsid proteins are small basic proteins with either one (MuLV, a gammaretrovirus) or two CCHC zinc fingers (HIV and FIV, lentiviruses; RSV, an alpharetrovirus) (**Table 1**). The zinc fingers are structured upon Zn²⁺ binding (in red), while the flanking domains are disordered and basic. Therefore, these viral proteins are members of the large family of intrinsically disordered proteins (IDPs)/intrinsically disordered protein domains (IDPDs) [27–29]. Of note, all these NC proteins are endowed with RNA-binding and chaperoning activities, as shown using in vitro reconstituted systems [15, 30–32]. Another important characteristic is the ability of these NC proteins to cause the formation of nucleoprotein complexes capable of recruiting enzymes such as reverse transcriptase and integrase (IN) [33]. In this gel-like milieu, molecular crowding can take place, thus facilitating enzymatic reactions such as cDNA synthesis by RT and integration by IN. Along this line, NC protein interacts with RT, improving the fidelity of cDNA synthesis in several ways: (i) inhibition of self-primed initiation of cDNA synthesis, (ii) chaperoning the obligatory minus- and plus-strand transfers for the synthesis of the LTR flanking the viral DNA (**Figure 1**), and (iii) improving the processivity of RT as well as its excision repair activity, resulting in a much higher fidelity of viral DNA synthesis (**Figure 1** on chaperoning assays).

How is this achieved? According to Uversky, protein-RNA interfaces are most probably very large, with the concomitant involvement of basic, hydrophobic, and aromatic residues engaged, respectively, in ionic, hydrophobic, and intercalating interactions [34–36]. The interactions between NC and RT are poorly understood, but they appear to necessitate the RNA template as the scaffolding agent [37, 38].
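The two sequence features highlighted in this section, CCHC zinc fingers and basic flanking regions, can be looked for in a protein sequence with a few lines of code. The sketch below is illustrative only: it assumes the C-X2-C-X4-H-X4-C spacing commonly described for retroviral zinc knuckles (the spacing is not stated in this chapter), counts only lysine and arginine as basic residues, and uses a hypothetical toy peptide rather than a real NC sequence.

```python
import re

# Assumed spacing for a retroviral CCHC zinc knuckle: C-X2-C-X4-H-X4-C.
CCHC_PATTERN = re.compile(r"C.{2}C.{4}H.{4}C")
BASIC_RESIDUES = set("KR")  # lysine and arginine (histidine left out by choice)


def find_cchc(seq: str) -> list[tuple[int, str]]:
    """Return (0-based start index, matched motif) for each CCHC-like motif."""
    return [(m.start(), m.group()) for m in CCHC_PATTERN.finditer(seq)]


def basic_fraction(seq: str) -> float:
    """Fraction of residues in seq that are lysine or arginine."""
    return sum(aa in BASIC_RESIDUES for aa in seq) / len(seq)


# Hypothetical toy peptide: basic flanks around two CCHC-like motifs.
# This is NOT a real nucleocapsid sequence.
toy = "KRK" + "CAACAAAAHAAAAC" + "RKK" + "CGGCGGGGHGGGGC" + "KR"

print(find_cchc(toy))
print(f"basic fraction: {basic_fraction(toy):.2f}")
```

A pattern match of this kind only flags candidate motifs; it says nothing about whether a finger actually binds zinc or folds, which requires experimental evidence such as the in vitro systems cited above.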

