28 Current Issues in Molecular Virology - Viral Genetics and Biotechnological Applications

of a ubiquitous RNA 5'-cap. The 5'→3' direction of nucleotide triphosphate (NTP) polymerization during RNA synthesis creates a nascent mRNA molecule with a 5'-triphosphate moiety resulting from the initial NTP on the 5'-end. Through the processes involved in cap synthesis, the pppRNA structure is transformed into a basic, cap-0 RNA structure (m7GpppN). Further 2'-O-methylations of the first and second nucleotides of the RNA may occur.

In this chapter, a number of processes used by viruses to synthesize, acquire or mimic a 5' cap are explored to highlight the similarities and differences in the enzymatic mechanisms that lead to the maturation of a 5' cap on viral RNA and its importance in viral genome replication within a host cell.

**2. Description of the RNA cap structure**

To understand the importance of an RNA cap structure for viruses, it is crucial to first understand why this structure is essential to their eukaryotic hosts. Prokaryotic RNA transcription and protein translation are coupled due to the spatial proximity between DNA and ribosomes. In eukaryotic cells, however, newly synthesized RNA transcripts undergo several nuclear post-transcriptional modifications, known as RNA processing, before they are exported and translated in the cytoplasm. These eukaryotic pre-mRNA modifications include the addition of a cap structure at the 5'-end, the splicing out of introns, the editing of nucleobases and the addition of a poly(A) tail at the 3'-end. RNA capping is a co-transcriptional process that occurs when an RNA molecule is 20-30 nucleotides in length. The cap structure consists of a guanosine residue, harboring a methylation in the N-7 position, which is bound to the terminal 5'-end nucleotide with a peculiar 5'-5' triphosphate bridge (Fig. 1). This inverted link between the two nucleotides prevents RNA degradation by 5'-3' exonucleases. The second important feature of the cap structure is the presence of the methyl group on the guanosine, which confers a positive charge that plays an important role in its specific recognition by specialized proteins. The cap structure fulfills many roles which ultimately lead to mRNA translation. In the nucleus for instance, the cap structure of pre-mRNAs is recognized by the cap binding proteins (CBP20 and CBP80). This cap binding complex (CBC) protects mRNA from degradation and assists RNA transport from the nucleus to the cytoplasm. Once in the cytoplasm, ribosomes and translation factors must be recruited for translation of mRNAs into proteins. The eukaryotic translation initiation factor 4E (eIF4E) specifically binds to the RNA cap structure [1]. This association is mediated through stacking interactions between two aromatic residues of the eIF4E protein; the mRNA binding is further stabilized by specific hydrogen bonds between the positive charge of the 7-methylguanosine and an acidic residue [2]. Upon cap binding, eIF4E assembles with eIF4G (a scaffold protein) and eIF4A (an RNA helicase) into the eIF4F complex [3]. The scaffolding protein eIF4G recruits the small 40S ribosomal subunit through the eIF3 complex [4]. The translation initiation complex then scans the mRNA for the start codon before recruiting the larger subunit of the ribosome, and translation of the open reading frame (ORF) takes place [2]. Taken together, the roles fulfilled by the RNA cap structure are crucial for RNA stability and translation. Because of this, many

**3. Conventional and unconventional 5' RNA cap synthesis mechanism**

#### **3.1. Canonical cap synthesis by different viruses**

The importance of the cap structure in eukaryotic metabolism has resulted in an evolutionary pressure for viruses to adopt a similar cap structure. A series of enzymatic reactions is required to synthesize a cap structure at the 5'-end of RNA. The most pervasive enzymatic pathway, also termed "conventional capping", consists of three sequential enzymatic activities that are required to generate a functional 7-methylguanosine 5'-5'-triphosphate bridged cap structure. As a result of the directional 5' to 3' polymerization of nucleotide triphosphates (NTPs) during RNA synthesis, nascent RNAs bear a triphosphate moiety at their 5'-end (originating from the initial NTP). This 5'-triphosphate end of the RNA is first converted into a 5'-diphosphate end by hydrolysis of the terminal phosphate, or γ-phosphate, by an RNA triphosphatase (RTPase). This is followed by a two-step reaction catalyzed by an RNA guanylyltransferase (GTase). The enzyme first specifically binds and hydrolyzes a GTP molecule to form a covalent enzyme-GMP intermediate, and then transfers the GMP moiety onto the 5'-end of a diphosphorylated acceptor RNA (ppRNA) in the second step of the GTase reaction. Lastly, an RNA (guanine-N-7)-methyltransferase (N7MTase) uses S-adenosyl methionine (SAM) as a methyl group donor to methylate the guanosine residue of the cap structure at the N7 position. This sequence of enzymatic modifications yields the minimal RNA cap-0 structure (m7GpppN). Subsequent methylation of the 2'-hydroxyl group of the first few nucleotides of the RNA can be catalyzed by a (nucleoside-2'-O)-methyltransferase (2'OMTase), again using a SAM molecule as the methyl donor (Fig. 2). These further methylations of the cap-proximal nucleotides convert a cap-0 structure into a cap-1 (m7GpppNm) or cap-2 (m7GpppNmNm) structure.
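The reaction sequence described above can be followed step by step with a small Python sketch that treats the RNA 5'-end as a string in the chapter's notation (pppN, ppN, GpppN, m7GpppN, m7GpppNm). This is only a bookkeeping illustration: the function names (`rtpase`, `gtase`, `n7mtase`, `two_o_mtase`) and the string encoding are assumptions introduced here, and requiring the N7-methylated cap before 2'-O-methylation is a simplification that mirrors the conventional order rather than a claim about every enzyme.

```python
import re


def rtpase(end: str) -> str:
    """RNA triphosphatase: remove the gamma-phosphate (pppN... -> ppN...)."""
    if not end.startswith("ppp"):
        raise ValueError("RTPase expects a 5'-triphosphate end (ppp...)")
    return end[1:]


def gtase(end: str) -> str:
    """RNA guanylyltransferase: add GMP to a ppRNA (ppN... -> GpppN...)."""
    if not end.startswith("pp") or end.startswith("ppp"):
        raise ValueError("GTase expects a 5'-diphosphate end (pp...)")
    return "Gp" + end


def n7mtase(end: str) -> str:
    """N7-methyltransferase: methylate the cap guanine (GpppN -> m7GpppN)."""
    if not end.startswith("Gppp"):
        raise ValueError("N7MTase expects an unmethylated cap (Gppp...)")
    return "m7" + end


def two_o_mtase(end: str) -> str:
    """2'-O-methyltransferase: methylate the next unmethylated nucleotide."""
    prefix = "m7Gppp"
    if not end.startswith(prefix):
        raise ValueError("toy model: 2'OMTase expects a cap-0 end (m7Gppp...)")
    # Mark the first 'N' that does not already carry a 2'-O-methyl ('m').
    body, count = re.subn(r"N(?!m)", "Nm", end[len(prefix):], count=1)
    if count == 0:
        raise ValueError("no unmethylated cap-proximal nucleotide left")
    return prefix + body


if __name__ == "__main__":
    end = "pppNN"  # nascent transcript, first two nucleotides shown
    for enzyme in (rtpase, gtase, n7mtase, two_o_mtase, two_o_mtase):
        end = enzyme(end)
        print(f"{enzyme.__name__:>12}: {end}")
```

Each function refuses a substrate that is not in the expected state, which reflects the sequential nature of the pathway: the GTase needs the ppRNA produced by the RTPase, and the N7MTase needs the GpppN product of the GTase.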


RNA 5′-end Maturation: A Crucial Step in the Replication of Viral Genomes

http://dx.doi.org/10.5772/56166


**Figure 2. Canonical 5' RNA cap synthesis pathway.** RNA cap-1 structures are conventionally synthesized by the sequential γ-phosphate hydrolysis by an RTPase, GMP transfer by a GTase, N7-methylation by an N7MTase, and 2'-O-methylation by a 2'OMTase. The contribution of each substrate to the formation of the final 5' RNA cap structure is highlighted by a color code: pppRNA (black), GTP (blue) and SAM (green and red).

The conventional RNA 5' cap synthesis mechanism is used by a majority of viruses to acquire a cap structure. Most DNA viruses, together with the RNA viruses from the *Bornaviridae* and *Retroviridae* families, use the host RNA polymerase II (RNA Pol II) to transcribe their mRNAs. As a result, the majority of DNA virus transcripts are co-transcriptionally capped using the cellular capping apparatus. In contrast, many RNA viruses with a cytoplasmic replication cycle do not have access to the host RNA Pol II and have therefore evolved their own capping machinery. Over time, a wide diversity of enzyme structures and mechanisms of action have evolved to generate the same highly conserved RNA cap structure (Fig. 3). The following paragraphs describe the enzymes supporting the RTPase, GTase, N7MTase and 2'OMTase activities.

**Figure 3. Viral RNA 5'-end structure and maturation**. Nearly all mammalian viruses modify their RNA 5'-end through the covalent addition of a cap structure (majority) or a VPg protein (minority). Although the widely acquired RNA cap structure is chemically identical (m7GpppN), viruses have evolved a large variety of mechanisms to synthesize or acquire this crucial structure. The mechanisms of 5'-end maturation vary among orders and families of viruses as presented in the above schematic (for clarity, only a few viral families are presented). Within the *Flaviviridae* family, the Flaviviruses synthesize a typical cap structure, while the Hepaciviruses and Pestiviruses are the only representative mammalian viruses that harbor an unmodified 5'-triphosphate end. Their RNA 5' UTR instead folds into a highly structured three-dimensional conformation termed an IRES. Of note, *Retroviridae* RNAs harbor both a cap structure and an IRES, whereas *Picornaviridae* RNAs harbor both an IRES and a 5'-linked VPg protein. Adapted from Decroly et al. (2012).

#### **3.2. RNA triphosphatases**


The RTPase activity is the first of the three enzymatic reactions required to synthesize a cap structure. The RTPase hydrolyzes the γ-β-phosphoanhydride bond at the 5'-end of an RNA to yield an RNA 5'-diphosphate and inorganic phosphate (Pi). Viruses have evolved a wide variety of enzyme structures and mechanisms of action to fulfill the RTPase activity, a greater diversity than is seen with any other enzymatic capping activity. RTPases are classified as belonging to either the metal-dependent family or the metal-independent family, based on their cofactor requirements. As indicated by its name, the first family requires a divalent cation cofactor for its activity. This metal requirement is usually satisfied by Mg2+, although Mn2+ is also able to support the RTPase activity [5]. This family of enzymes also shares the ability to hydrolyze free NTPs, again in the presence of a metal cofactor [5, 6]. The lack of substrate specificity is speculated to be a result of the chemical similarity between an NTP and the RNA 5'-triphosphate end. The metal-dependent RTPase family is further subdivided into three distinct structural groups, namely the triphosphate tunnel metalloenzyme (TTM), histidine triad-like (HIT-like) and helicase-like RTPases (Fig. 4).


**Figure 4. Diversity in RNA triphosphatase structure and mechanism of action.** The RTPase activity can be catalyzed by various mechanisms of action, each associated with a characteristic structure. They are indicated as follows: TTM (blue), HIT-like (multimeric: dark pink, monomeric: red), helicase-like (cyan) and metal-independent (green). The location of the active site is indicated by an arrow, and examples of hosts and viruses utilizing those enzymes are given.

The TTM enzymes are found in chlorella virus, poxviruses, baculoviruses, mimiviruses and lower eukaryotes. All TTM RTPases adopt a characteristic fold in which eight antiparallel β-strands assemble into a tunnel scaffold surrounding the active site (Fig. 4). The interior of the tunnel is dominated by hydrophilic amino acid side chains oriented toward the center of the tunnel, creating a network of interactions for the triphosphate moiety of the substrate [7]. Glutamate residues within this amino acid network are also responsible for the coordination of the crucial cation cofactor [6]. The recognition of the RNA substrate, primarily through its triphosphate moiety, could explain the activity of the TTM RTPase against NTP substrates. Interestingly, this NTP hydrolysis is not supported by Mg2+, but is rather dependent on Mn2+ or Co2+ [6]. The coordinated metal ion, in conjunction with basic lysine and arginine residues, activates the γ-phosphate and stabilizes the pentacoordinate phosphorane transition state. A glutamate serves as a general base catalyst to activate the nucleophilic water for the attack on the γ-phosphorus according to a one-step in-line mechanism [8]. TTM RTPases have been acquired by large DNA viruses from their hosts [7]. Interestingly, modern *Poxviridae* infect higher eukaryotes that lack a TTM RTPase, which points to their evolution from viral ancestors that replicated in unicellular eukarya, from which they likely acquired a TTM RTPase.

The HIT-like RTPase is so far only represented by the NSP2 enzyme of rotaviruses (dsRNA viruses). The name of this family is based on the structural resemblance between the NSP2 C-terminal domain (CTD) and the ubiquitous cellular histidine triad nucleotidyl hydrolases (HIT). The NSP2 protein associates into an octamer to form a doughnut-shaped quaternary structure (Fig. 4) [9, 10]. RNA binding grooves are found at the surface of the doughnut, while the active site is buried deep in an electropositive cleft on each monomer. Despite its structural similarity with HIT, NSP2 appears to be catalytically distinct. The catalytic histidine triad requires a Mg2+ cofactor to hydrolyze the γ-β-phosphoanhydride bond and form a covalent phosphate-histidine intermediate [11]. The enzyme harbours similar catalytic rates toward both NTP and pppRNA substrates. Increased affinity for RNA, conferred by the RNA binding grooves, is speculated to stimulate RTPase activity over NTPase activity *in vivo* [10]. Despite the structural similarity with HIT, no evidence currently indicates that HIT-like RTPases evolved from their cellular counterparts; convergent evolution is more probable [9].


The helicase-like RTPases are found in a variety of ss(+) RNA viruses of the *flavivirus*, *coronavirus*, *potexvirus* and *alphavirus* genera and in the dsRNA viruses of the *Reoviridae* family. These enzymes are active NTPase-helicases and belong to the large helicase superfamilies SF1 and SF2. The NTPase activity fuels the energy-consuming strand displacement of the helicase activity. The common NTPase-RTPase catalytic site is located in a cleft formed at the junction of two RecA-like subdomains (Fig. 4). As with many nucleotide-binding proteins, the active site of helicase-like RTPases harbours both a Walker A and a Walker B motif [12, 13]. The Walker A motif (GxxxxGK(T/S)), or phosphate-binding loop (P-loop), is responsible for contacting the γ-phosphate through its highly conserved lysine. The aspartate of the Walker B motif (DExD) coordinates the crucial Mg2+, which stabilizes the γ- and β-phosphates, while the glutamate activates the water molecule for the hydrolysis reaction [14]. The addition of the RTPase activity to an NTPase-helicase ancestor appears to result from only a minor evolutionary progression, as the ancestor enzyme already displayed the key RTPase features, namely a nucleic acid binding domain, a triphosphate binding active site and a terminal phosphate hydrolysis activity.

The second family of RTPases is the metal-independent group. Viruses of higher eukaryotes that rely on the capping apparatus of the cell use the host metal-independent RTPase. Moreover, baculovirus also expresses such a metal-independent RTPase. Two striking differences between this enzyme family and the metal-dependent family are its cation-independent mechanism of action and its inability to hydrolyze free NTPs [15]. Metal-independent RTPases are members of the cysteine phosphatase superfamily, sharing their signature HCxxxxxR(S/T) P-loop motif located in a deep, positively charged pocket. The catalytic cysteine is located at the bottom of the triphosphate-binding cleft formed by the characteristic α/β-fold tertiary structure (Fig. 4) [15, 16]. The catalytic cycle fits a two-step phosphoryl-transfer reaction. First, the pppRNA γ-phosphate is attacked by the catalytic cysteine to form a covalent protein-cysteinyl-S-phosphate intermediate, which results in the release of the ppRNA product. Next, a water molecule attacks the phosphocysteine to expel the inorganic phosphate and regenerate the enzyme [15]. The metal-independent RTPase presumably evolved from the cysteine phosphatases ubiquitously found in higher eukaryotes and was later acquired by baculoviruses from their hosts. Interestingly, baculovirus also encodes a second, TTM-type RTPase fulfilling the same role. This unconventional carrying of two distinct enzymes with the same activity is speculated to be an evolutionary snapshot of an RTPase transition from the lower eukaryote TTM RTPase to the higher eukaryote metal-independent RTPase.
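The sequence signatures quoted in this section, Walker A (GxxxxGK(T/S)), Walker B (DExD) and the cysteine phosphatase P-loop (HCxxxxxR(S/T)), can be written as regular expressions, with each x read as "any residue". The sketch below is a minimal illustration under that assumption; the function name `find_motifs` and both peptide strings are made up for the example and are not real viral sequences. Short patterns such as DExD occur by chance in many proteins, so a hit is only a hint, not evidence of RTPase activity.

```python
from __future__ import annotations

import re

# Motifs as written in the text; '.' stands for the 'x' (any residue) positions.
MOTIFS = {
    "Walker A": re.compile(r"G.{4}GK[TS]"),
    "Walker B": re.compile(r"DE.D"),
    "Cys-phosphatase P-loop": re.compile(r"HC.{5}R[ST]"),
}


def find_motifs(protein: str) -> dict[str, list[tuple[int, str]]]:
    """Return, for each motif, a list of (1-based start, matched residues)."""
    seq = protein.upper()
    return {
        name: [(m.start() + 1, m.group()) for m in pattern.finditer(seq)]
        for name, pattern in MOTIFS.items()
    }


if __name__ == "__main__":
    # Hypothetical peptides constructed to contain the motifs.
    helicase_like = "MKTGAPGSGKTLLDEADAA"
    metal_independent = "AAHCGAGVGRSGT"
    for label, seq in [("helicase-like", helicase_like),
                       ("metal-independent", metal_independent)]:
        print(label, find_motifs(seq))
```

Note that `finditer` reports non-overlapping matches only, which is sufficient for a first scan of a sequence.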

#### **3.3. RNA guanylyltransferase**

The second step of the capping sequence is the GTase activity. GTase catalyzes the rate-limiting transfer of a GMP moiety from a GTP substrate to an acceptor ppRNA to yield an unmethylated cap structure (GpppN). GTases are members of the covalent nucleotidyltransferase superfamily, which also includes the ATP- and NAD+-dependent DNA ligases and the ATP-dependent RNA ligases [17]. The tertiary structure of this superfamily is composed of an N-terminal nucleotidyltransferase (NT) domain fused to a C-terminal oligonucleotide/oligosaccharide-binding fold (OB-fold) domain. These flexible proteins are able to undergo large conformational changes during their catalytic cycle. GTases share highly conserved structures and motifs, of which the hallmark KxDG(I/L) motif is present in nearly all GTases [18]. The catalytic cycle of the GTase is a complex two-step ping-pong reaction involving multiple conformational changes. First, a GTase in a conformation where the OB-fold domain is distant from the NT domain (open conformation) specifically binds a GTP molecule. This is followed by the closure of the OB-fold domain toward the NT domain (closed conformation), which is stabilized by interactions between the bound nucleotide and residues from both the NT and OB-fold domains. This conformational change also creates a Mg2+ cofactor binding site; thus the closed conformation represents the catalytically active form of the enzyme [19, 20]. Upon Mg2+ binding, the α-phosphate of the GTP is sandwiched between the catalytic lysine (from the KxDG motif) and the metal cofactor. Deprotonation of the lysine leads to the attack on the α-phosphate of the GTP to form an enzyme-(lysyl-N)-GMP intermediate (EpG), concomitant with the formation of a pyrophosphate molecule [20]. Following catalysis, interactions between the bound guanylate and the OB-fold domain are disrupted, leading to the reopening of the enzyme and the release of pyrophosphate. The reopening of the guanylylated enzyme allows for accommodation of the ppRNA, which is likely followed by the closure of the OB-fold domain. Closing of the OB-fold domain returns the enzyme to its catalytically active form, which promotes the transfer of the GMP to the acceptor RNA. A final reopening allows the unmethylated capped RNA to be released and the apo-protein to be regenerated (Fig. 5) [19]. The active sites of the GTases are highly conserved, potentially due to their fairly complex catalytic cycle. Most viruses encode GTases that are, with respect to the active site, nearly identical to their eukaryotic host GTase, favouring the hypothesis of ancestral viral acquisition of the host GTase.
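The ping-pong cycle described above alternates between open and closed conformations and passes through the covalent EpG intermediate. The following sketch tracks that sequence as a small state machine; it is a toy model under stated assumptions, not a kinetic or structural simulation. The class `GTase`, its method names and the event strings are hypothetical, and the closure of the enzyme after ppRNA binding is modelled as certain even though the text describes it only as likely.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class GTase:
    """Toy guanylyltransferase: conformation, bound ligand and EpG status."""

    conformation: str = "open"      # "open" or "closed"
    ligand: str | None = None       # currently bound molecule, if any
    guanylylated: bool = False      # True while the enzyme is the EpG intermediate
    log: list[str] = field(default_factory=list)

    def _record(self, event: str) -> None:
        self.log.append(f"{self.conformation:>6} | {event}")

    def guanylylate(self) -> None:
        """Step 1: E + GTP -> EpG + PPi."""
        if self.guanylylated or self.conformation != "open":
            raise RuntimeError("step 1 requires the open apo-enzyme")
        self.ligand = "GTP"
        self._record("GTP bound")
        self.conformation = "closed"
        self._record("OB-fold closes, Mg2+ site formed")
        self.guanylylated, self.ligand = True, "PPi"
        self._record("lysine attacks alpha-phosphate: EpG + PPi")
        self.conformation, self.ligand = "open", None
        self._record("enzyme reopens, PPi released")

    def transfer(self, rna_end: str) -> str:
        """Step 2: EpG + ppRNA -> E + GpppRNA; returns the capped 5'-end."""
        if not self.guanylylated:
            raise RuntimeError("step 2 requires the EpG intermediate")
        if not rna_end.startswith("pp") or rna_end.startswith("ppp"):
            raise ValueError("acceptor must be a 5'-diphosphate RNA (ppN...)")
        self.ligand = rna_end
        self._record("ppRNA bound")
        self.conformation = "closed"  # assumed; described as 'likely' in the text
        self._record("OB-fold closes")
        product = "Gp" + rna_end
        self.guanylylated, self.ligand = False, product
        self._record("GMP transferred to RNA")
        self.conformation, self.ligand = "open", None
        self._record("enzyme reopens, GpppRNA released")
        return product


if __name__ == "__main__":
    enzyme = GTase()
    enzyme.guanylylate()
    print(enzyme.transfer("ppN"))
    print("\n".join(enzyme.log))
```

Splitting the cycle into two methods makes the ping-pong character explicit: the first substrate (GTP) is processed and its by-product released before the second substrate (ppRNA) binds.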

The *flavirivus* GTases are also atypical. Their activities are found on the N-terminal portion of the RDRP-MTase peptide. They are structurally distinct from both the conventional and the *Reoviridae* GTase but they still mediate RNA guanylation through a two-step mechanism involving an EpG intermediate [21, 22]. The precise amino acid involved in the guanylateenzyme complex formation is also speculated to be a lysine, but a histidine or an arginine residue may also play this role. Progress in the field of atypical viral capping enzymes will

**Figure 5. Structural and mechanistic pathway used by GTase**. The apo-enzyme in open conformation (blue) binds a GTP substrate (gray), closes (red) and proceeds to hydrolysis thereby generating an enzyme-GMP (black) intermedi‐ ate. GTase reopening (blue) allows for RNA binding (orange), the enzyme either stays open or closed to allow GMP transfer onto the RNA. Finally, the open enzyme releases the GpppN product and the apo-enzyme is regenerated.

RNA 5′-end Maturation: A Crucial Step in the Replication of Viral Genomes

http://dx.doi.org/10.5772/56166

35

The third step of the RNA 5'-end cap synthesis is the methylation of the cap guanosine by a N7MTase. An N7MTase adds a methyl group to the guanine at the N7 position in order to convert the GpppN into a functional m7GpppN cap-0 structure. The conversion of S-Adenosyl

eventually shed light on those imprecisions.

**3.4. RNA methyltransferase**

While nearly all GTases are highly conserved, a few recently discovered viral GTases are different. Little is currently known about those atypical GTases lacking the catalytic KxDG motif. Some segmented dsRNA viruses of the *Reoviridae* family encode for a large multiprotein capsid harbouring nucleic acid maturation functions, including GTase activity. The *Reoviri‐ dae* GTase is structurally different from the conventional GTase. While they lack the conserved KxDG motif, they still maintain the capacity to form an enzyme-(lysyl-N)-GMP intermediate. RNA 5′-end Maturation: A Crucial Step in the Replication of Viral Genomes http://dx.doi.org/10.5772/56166 35

**Figure 5. Structural and mechanistic pathway used by GTase**. The apo-enzyme in open conformation (blue) binds a GTP substrate (gray), closes (red) and proceeds to hydrolysis thereby generating an enzyme-GMP (black) intermedi‐ ate. GTase reopening (blue) allows for RNA binding (orange), the enzyme either stays open or closed to allow GMP transfer onto the RNA. Finally, the open enzyme releases the GpppN product and the apo-enzyme is regenerated.

The *flavirivus* GTases are also atypical. Their activities are found on the N-terminal portion of the RDRP-MTase peptide. They are structurally distinct from both the conventional and the *Reoviridae* GTase but they still mediate RNA guanylation through a two-step mechanism involving an EpG intermediate [21, 22]. The precise amino acid involved in the guanylateenzyme complex formation is also speculated to be a lysine, but a histidine or an arginine residue may also play this role. Progress in the field of atypical viral capping enzymes will eventually shed light on those imprecisions.

#### **3.4. RNA methyltransferase**





The third step of the RNA 5'-end cap synthesis is the methylation of the cap guanosine by an N7MTase. The N7MTase adds a methyl group to the guanine at the N7 position, converting GpppN into a functional m7GpppN cap-0 structure. The methyl group is provided by the conversion of S-adenosylmethionine (SAM) into S-adenosylhomocysteine (SAH). N7MTases are members of the large SAM-dependent MTase family, whose members share low sequence identity but a structurally conserved SAM binding core. This SAM binding pocket, composed of a seven-stranded β-sheet flanked by six α-helices, ensures specific and proper positioning of the SAM molecule, while other structural determinants provide specificity for a range of methyl acceptors [23, 24]. For the N7MTase, those structural determinants are a positively charged RNA-accommodating groove and a GpppN binding pocket that forms extensive electrostatic interactions with the cap guanine, thereby ensuring specificity [25]. Despite a broad network of interactions with both substrates (GpppN and SAM), the N7MTase makes no direct contact with their reacting groups: the guanine N7 nitrogen (methyl acceptor) and the SAM CH3 (methyl donor). The methyl transfer is instead mediated by a direct in-line nucleophilic attack of the guanine N7 nitrogen on the SAM methyl moiety. N7MTases do not directly stabilize the transition state; rather, they optimize the proximity and spatial orientation of the reacting groups of both ligands. In addition, a favourable electrostatic environment further stimulates catalysis [25]. The degree of conservation among N7MTases is very high, and most viral and eukaryotic N7MTases differ only in their accessory domains. A rare exception is the poxvirus N7MTase, which appears to bind SAM in a slightly different conformation. Moreover, some poxviruses, such as vaccinia virus, have evolved a heterodimeric N7MTase. The vaccinia virus N7MTase D1, for example, relies on its association with the accessory protein D12 to be fully active [26]. The degree of conservation among N7MTases points toward a common eukaryotic ancestor acquired by viruses.


Lastly, some viruses infecting higher eukaryotes, such as *flavivirus*, *reovirus* and *poxvirus*, can further modify their RNA 5'-end through 2'-O-methylation in order to more accurately mimic their host mRNA modifications. This last modification is not required for viruses infecting lower eukaryotes, as their hosts harbour cap-0 mRNA. The 2'OMTase methylates the first-nucleotide 2'-hydroxyl group(s) of the RNA, converting m7GpppN (cap-0) into m7GpppNm (cap-1). The 2'OMTases are also members of the large SAM-dependent MTase family. When compared to the N7MTase, the 2'OMTase harbours an additional, highly conserved catalytic lysine-asparagine-lysine-glutamine tetrad [27]. These amino acids are not consecutive in the primary sequence, but they cluster together once the protein adopts its three-dimensional structure. The exact catalytic pathway is still controversial, but it relies on the conserved asparagine and arginine to lower the pKa of the catalytic lysine, which is responsible for the 2'-hydroxyl group activation. Two mechanisms have been proposed for this substrate activation. In the first, the lysine deprotonates the 2'-OH to form a nucleophilic 2'-oxyanion. In the second, the lysine forms a non-deprotonating hydrogen bond with the 2'-hydroxyl proton, which freezes the 2'-OH rotation at an angle where the 2'-oxygen lone pair is steered toward the SAM methyl group. In both cases, the nucleophilic 2'-oxygen attacks the electrophilic SAM methyl group according to an in-line SN2 mechanism [28-31]. The pentavalent methyl intermediate of the transition state is stabilized by the asparagine. Despite its structural homology with the N7MTase, the 2'OMTase thus uses a distinct mechanism of methyl transfer. Interestingly, some viruses, such as the *flavivirus*, have evolved both N7MTase and 2'OMTase activities within the same enzyme [22, 32]. These dual MTases share the same SAM binding site and accessory domain but not the same mechanism of methyl transfer. The classical N7MTase and 2'OMTase mechanisms are instead both present but independent. It is, for example, possible to abolish the 2'OMTase activity through disruption of the lysine-asparagine-lysine-glutamine tetrad while maintaining the N7MTase activity [22, 32]. It is important to note that the *flavivirus* dual MTase accomplishes a sequential methylation, starting with the N7 guanine methylation, followed by a repositioning of the cap structure and, finally, the 2'-hydroxyl methylation [22, 32]. This sequence is virus specific and can be inverted, as exemplified by the vesicular stomatitis virus (VSV), a member of the *Rhabdoviridae* family. VSV also encodes a dual MTase, but the 2'-O-methylation takes place first and is followed by the N7 methylation [33]. These dual MTases have likely evolved their second MTase activity out of their initial MTase fold.
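The virus-specific methylation order described above can be illustrated with a small sketch. This is a toy model, not a biochemical simulation: the `Cap` record and the function names `n7_mtase`, `two_o_mtase` and `methylate` are hypothetical, introduced here only to show that the *flavivirus* order (N7 first) and the VSV order (2'-O first) pass through different intermediates yet converge on the same cap-1 product.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Cap:
    """Toy description of a capped RNA 5'-end (hypothetical model)."""
    n7_methylated: bool = False         # methyl group on the guanine N7
    ribose_2o_methylated: bool = False  # methyl group on the first nucleotide 2'-OH

    def name(self) -> str:
        g = "m7G" if self.n7_methylated else "G"
        n = "Nm" if self.ribose_2o_methylated else "N"
        return f"{g}ppp{n}"


def n7_mtase(cap: Cap) -> Cap:
    """N7 methylation of the cap guanine."""
    return replace(cap, n7_methylated=True)


def two_o_mtase(cap: Cap) -> Cap:
    """2'-O-methylation of the first nucleotide."""
    return replace(cap, ribose_2o_methylated=True)


def methylate(cap, steps):
    """Apply the methylation steps in order and record each intermediate name."""
    names = [cap.name()]
    for step in steps:
        cap = step(cap)
        names.append(cap.name())
    return names


# Order reported in the text for the flavivirus dual MTase: N7 first, then 2'-O.
flavivirus_order = methylate(Cap(), [n7_mtase, two_o_mtase])
# Order reported in the text for the VSV dual MTase: 2'-O first, then N7.
vsv_order = methylate(Cap(), [two_o_mtase, n7_mtase])

print(" -> ".join(flavivirus_order))
print(" -> ".join(vsv_order))
```

Both routes end at m7GpppNm; only the intermediate differs (m7GpppN for the N7-first order, GpppNm for the 2'-O-first order).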

#### **3.5. Gene organization of viral capping enzymes**


To support viral replication and fitness, both the catalytic activity of the viral enzymes involved in RNA capping and their localization within the cell are crucial. Viral capping enzymes have to be recruited to the site of RNA synthesis. Recruitment of the capping enzymes can be mediated by protein-protein interactions with either the RNA polymerase or a scaffold protein. While recruitment of the three distinct enzymatic activities is required to synthesize a cap-0 structure, the available surface for protein interactions at the RNA synthesis site is limited. Viruses have evolved multiple solutions to overcome this problem, including the fusion of multiple enzymatic activities in the same polypeptide as well as protein-protein interactions between two capping enzymes to form a hetero-multimer (Fig. 6). A good example of protein-protein interaction is seen in *Paramecium bursaria Chlorella virus*, which encodes the RTPase, GTase and N7MTase activities on three different polypeptides [19, 34, 35]. The RTPase is likely to interact with the GTase, in a manner reminiscent of the lower eukaryotic capping machinery, in which Pol II co-transcriptionally recruits the RTPase-GTase heterodimer and the N7MTase separately [36, 37]. Alternatively, viruses such as *Baculovirus* and Infectious Spleen and Kidney Necrosis virus benefit from the fusion of the RTPase and GTase activities in a single polypeptide, which facilitates the recruitment of the capping apparatus to the viral RNA polymerase transcription site [38, 39]. In this instance, the organization of the viral capping enzymes is most analogous to that of higher eukaryotes, in which the RTPase and GTase enzymes are fused together. In higher eukaryotes, interaction with the GTase domain is solely responsible for RTPase-GTase recruitment to RNA Pol II, while the N7MTase is recruited separately [40].

The fusion of sequential enzymatic activities in the same multi-domain protein appears to be more robust than heterodimer formation. Because of this, selective pressures have driven the fusion of capping genes in a wide variety of viruses. *Alphavirus*, for example, encodes a single protein that is able to add an N7-methylated guanosine to a ppRNA, while the RTPase activity is located on a different polypeptide [41]. The *flavivirus* represents an even more striking example of gene organization optimization. The RTPase in this group shares a catalytic site with the NTPase/helicase (also implicated in RNA synthesis) on one protein, while the GTase and the dual MTase (N7MTase and 2'OMTase) are fused to the RNA-dependent RNA polymerase (RDRP) on a second protein. In this example, *flavivirus* managed to pack, within two polypeptides, six different enzymatic activities, all of which are involved in RNA synthesis and maturation [21, 22, 32].


**Figure 6. Gene organization of the canonical capping enzymes**. Schematic representation of the genetic organization of the canonical enzymatic activities required to synthesize a cap-0 structure. Examples of organisms associated with each gene organization are indicated on the left. The color code is representative of structural and mechanistic enzymatic conservation and is detailed at the bottom of the figure.

Some viruses have even evolved a highly efficient capping enzyme that fuses all three or four enzymatic functions required for cap synthesis into what can be described as an RNA-capping assembly line. *Mimivirus* and African swine fever virus encode a large, single protein harbouring the RTPase, GTase and N7MTase activities. This allows these viruses to efficiently modify their RNA to generate a cap-0 structure [7, 42]. The conventional cap synthesis pathway is a directional succession of enzymatic activities: RTPase→GTase→N7MTase. Interestingly, the order of the catalytic domains within the primary sequence of these triple-activity capping enzymes follows the required sequence of capping activities (NH2-RTPase-GTase-N7MTase-COOH). As a result, they not only co-localize all capping activities to the RNA 5'-end, but also optimize the progression of the RNA through the capping sequence. *Poxvirus*, typified by vaccinia virus (VV), provides another example of a multi-capping enzyme. The VV multi-capping enzyme, D1, possesses all three RTPase, GTase and N7MTase activities. The first two are constitutive, while the N7MTase requires association with the D12 stimulatory subunit. Together, this complex is able to modify an RNA 5'-end up to a cap-0 structure. It is also interesting to note that the structure of the D12 stimulatory subunit indicates that it used to be a 2'OMTase, but that function is now inactive. Instead, the 2'OMTase activity is carried out by the dedicated VP39 2'OMTase [23]. This raises the possibility of an ancestral *poxvirus* RNA-capping assembly line composed of a D1-D12-like complex that could process a 5'-triphosphate RNA into a cap-1 RNA. Such an enzymatic conveyor can currently be found in mammalian reovirus and bluetongue virus. These two viruses are members of the segmented dsRNA *Reoviridae* family and transcribe their plus-strand messenger RNA within an internal capsid particle containing the RDRP and the capping apparatus. A single protein packs together all four enzymatic activities required to synthesize a cap-1 structure (RTPase, GTase, N7MTase and 2'OMTase), although the putative RTPase activity is yet to be confirmed [43-45]. Once again, these activities are arranged in a directional layout that channels the mRNA through successive enzymatic modifications, converting its 5'-triphosphate end into a cap-1 end. Moreover, this RNA-capping assembly line is in direct contact with the polymerase, ensuring optimal recruitment of the nascent mRNA to the capping apparatus [46]. The λ2 and VP4 capping proteins from *reovirus* and bluetongue virus, respectively, differ slightly in their quaternary structure. *Reovirus* λ2, which is overall linearly shaped, associates into a pentamer to form a hollow cylinder, or turret, with each active site facing the interior of the cavity. This barrel is perpendicular to the spherical internal capsid particle and creates a channel for the nascent mRNA to exit the internal capsid particle while undergoing complete cap-1 mRNA capping [44]. It is interesting that a diversity of viruses, ranging from dsDNA viruses such as *Mimivirus*, African swine fever virus and *poxvirus* to segmented dsRNA viruses including members of the *Reoviridae* family, have evolved such a complex but highly effective RNA-capping assembly line. The convergent evolution of these systems highlights the critical importance of proper RNA capping for viral genome replication and overall viral fitness.
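The directional character of the conventional pathway can be sketched in a few lines of code. The model below is a deliberate simplification and an assumption of this illustration, not something taken from the cited work: each activity is reduced to a single (substrate, product) pair of 5'-end states, and the names `CANONICAL_STEPS` and `process` are hypothetical. It only shows that the product of each step is the substrate of the next, which is why a domain layout matching RTPase→GTase→N7MTase lets the RNA pass through the enzyme in a single sweep.

```python
# Toy model: every activity accepts exactly one 5'-end state and returns the next.
CANONICAL_STEPS = {
    "RTPase": ("pppN", "ppN"),        # removes the gamma-phosphate
    "GTase": ("ppN", "GpppN"),        # transfers GMP from GTP onto the ppRNA
    "N7MTase": ("GpppN", "m7GpppN"),  # methylates the guanine N7 (cap-0)
}


def process(five_prime_end, domain_order):
    """Pass an RNA 5'-end through the activities once, in the given order."""
    for enzyme in domain_order:
        substrate, product = CANONICAL_STEPS[enzyme]
        if five_prime_end != substrate:
            raise ValueError(f"{enzyme} cannot act on {five_prime_end}")
        five_prime_end = product
    return five_prime_end


# Layout matching the pathway (NH2-RTPase-GTase-N7MTase-COOH).
cap0 = process("pppN", ["RTPase", "GTase", "N7MTase"])
print(cap0)

# A hypothetical shuffled layout stalls in a single pass, because the
# GTase meets a 5'-triphosphate end it cannot use.
try:
    process("pppN", ["GTase", "RTPase", "N7MTase"])
    out_of_order_ok = True
except ValueError:
    out_of_order_ok = False
print(out_of_order_ok)
```

In a real cell the enzymes are not restricted to one pass, so the sketch illustrates the ordering logic only, not any kinetic advantage.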

#### **3.6. Unconventional 5' RNA cap synthesis mechanism evolved by different viruses**

The capacity to properly cap RNA confers a distinct advantage to many eukaryotic viruses. Consequently, the selective pressure to maintain this structure is high, which is reflected by the degree of conservation among the viral capping proteins. Interestingly, this selective pressure is not directed toward the capping proteins themselves (RTPase, GTase and N7MTase), but rather toward their final product, the cap structure. Because of this, many viruses have evolved diverse biosynthetic strategies, divergent from the canonical RTPase→GTase→N7MTase pathway, that allow them to synthesize or acquire the final cap structure. This cap structure is in every aspect identical to the canonically synthesized one; only the enzymatic pathway varies. Many virus families include members that use an unconventional 5' RNA cap synthesis pathway. As of today, three unconventional 5' RNA cap synthesis mechanisms have been described.

#### **3.7. The m7GTP RNA capping pathway**


The m7GTP RNA capping pathway, also termed the *alphavirus*-like pathway, is found in a number of (+)ssRNA viruses of the alphavirus (Semliki Forest virus and Sindbis virus), potexvirus (Bamboo mosaic virus), tobamovirus (Tobacco mosaic virus), *Togaviridae* (Rubella virus and Chikungunya virus) and *Hepeviridae* (Hepatitis E virus) families [5, 47]. These viruses encode a unique capping machinery capable of synthesizing a cap-0 structure in three sequential enzymatic reactions. The initial step is quite similar to the conventional capping mechanism: an RTPase (the nsP2 protein of Semliki Forest virus, for example) hydrolyzes the γ-β phosphoanhydride bond at the 5'-end of the RNA, yielding a ppRNA [48]. Next, a GTP molecule is methylated at position N7 by an atypical N7MTase (the nsP1 protein of Semliki Forest virus, for example). This m7GTP is then recognized as a substrate by an atypical GTase (also the nsP1 protein of Semliki Forest virus, for example). The reaction results in the formation of a characteristic m7GMP-enzyme covalent complex, with release of a pyrophosphate group. This m7GMP group is finally transferred onto the 5'-end of the acceptor ppRNA to yield a typical m7GpppN cap-0 structure [41, 49-52]. The overall capping reaction is therefore RTPase→atypical N7MTase→atypical GTase (Fig. 7). It is worth mentioning, however, that not only the order of the chemical modifications differs, but also the mechanisms of action of the proteins. The atypical N7MTase has fundamental similarities to the standard N7MTase, including the presence of a SAM binding domain, but its substrate recognition is vastly different. Atypical N7MTases are unable to methylate GpppN as the canonical N7MTase does; instead, they specifically methylate GTP (and GDP to some extent) [51]. The atypical GTases are mechanistically different from their canonical counterparts in that they lack the conserved KxDG motif and form their m7GMP-enzyme intermediate through a conserved histidine instead of a lysine [41]. These proteins have no activity with GTP, but specifically require m7GTP to form a covalently bound enzyme complex. Therefore, the conversion of GTP to m7GTP is necessary prior to the N7-methyl-guanylyltransferase activity [49].
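The substrate logic of this pathway can be summarized in a short sketch. It is a toy model built only from the specificities stated above, and all function names (`rtpase`, `atypical_n7mtase`, `atypical_gtase`) are hypothetical labels chosen for this illustration. For simplicity the model ignores the partial activity of the atypical N7MTase on GDP.

```python
def rtpase(rna_end):
    """Remove the gamma-phosphate: pppN -> ppN."""
    if rna_end != "pppN":
        raise ValueError("RTPase expects a 5'-triphosphate RNA end")
    return "ppN"


def atypical_n7mtase(nucleotide):
    """Methylate free GTP at N7 (simplification: GDP is not modelled)."""
    if nucleotide != "GTP":
        raise ValueError(f"atypical N7MTase does not methylate {nucleotide}")
    return "m7GTP"


def atypical_gtase(nucleotide, rna_end):
    """Transfer m7GMP from m7GTP onto a ppRNA acceptor."""
    if nucleotide != "m7GTP":
        raise ValueError(f"atypical GTase is inactive with {nucleotide}")
    if rna_end != "ppN":
        raise ValueError("acceptor must be a 5'-diphosphate RNA end")
    return "m7GpppN"


# RTPase -> atypical N7MTase -> atypical GTase
cap = atypical_gtase(atypical_n7mtase("GTP"), rtpase("pppN"))
print(cap)

# Unmethylated GTP is rejected, so methylation has to precede the transfer.
try:
    atypical_gtase("GTP", rtpase("pppN"))
    gtp_accepted = True
except ValueError:
    gtp_accepted = False
print(gtp_accepted)
```

The end product carries the same name as the canonical cap-0, which mirrors the point that only the route, not the final structure, differs.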

Among all known eukaryotes and viruses, the m7GTP RNA capping pathway is used only by (+)ssRNA viruses, which points toward a eukaryote-independent emergence of this unconventional cap synthesis mechanism. In addition, the conservation of this capping pathway among distantly related viruses infecting a broad spectrum of hosts, ranging from plants to animals, suggests an evolution from a common (+)ssRNA virus ancestor.


**Figure 7. Unconventional 5' RNA cap synthesis mechanisms.** The m7GTP capping pathway involves the hydrolysis of the RNA γ-phosphate by an RTPase, the methylation of a GTP by an N7MTase and the transfer of the m7GMP moiety of this m7GTP onto the diphosphorylated RNA. The GDP capping pathway is initiated by the hydrolysis of GTP to GDP by an NTPase. A PRNTase then removes the γ- and β-phosphates of the RNA to form a covalent enzyme-pRNA intermediate. The pRNA is then transferred onto the GDP. Further methylation by the 2'OMTase and N7MTase completes the cap-1 structure. The contribution of each substrate to the formation of the final 5' RNA cap structure is highlighted by a color code: pppRNA (black), GTP (blue) and SAM (green and red).

RNA 5′-end Maturation: A Crucial Step in the Replication of Viral Genomes

http://dx.doi.org/10.5772/56166


#### **3.8. The GDP RNA capping pathway**


40 Current Issues in Molecular Virology - Viral Genetics and Biotechnological Applications

The GDP RNA capping pathway, also termed the *Rhabdoviridae*-like pathway, is found in representatives of many (-)ssRNA virus families: the *Rhabdoviridae* (vesicular stomatitis virus (VSV) and Rabies virus), *Paramyxoviridae* (Human respiratory syncytial virus and Measles virus), *Bornaviridae* (bornavirus) and *Filoviridae* (Ebola virus and Marburg virus) [5, 47]. These viruses, exemplified by VSV, encode a large L protein harbouring the RNA-dependent RNA polymerase (RDRP) activity as well as the RNA capping activity. The latter relies on a sequence of four enzymatic activities that differ from those of the conventional pathway and generate a cap-1 structure. First, an NTPase activity hydrolyzes a GTP molecule into a GDP molecule. Then, an RNA:GDP polyribonucleotidyltransferase (PRNTase) catalyzes a two-step reaction. The L protein hydrolyzes the α-β phosphoanhydride bond of the pppRNA triphosphate moiety, releasing a molecule of pyrophosphate and creating a covalent enzyme-pRNA intermediate. The pRNA moiety is then transferred onto the GDP to form a GpppN-blocked RNA. In this case, only the α-phosphate originates from the RNA, whereas both the β- and γ-phosphates are contributed by the GDP. Finally, synthesis of the cap-1 structure is completed by two successive methylations: the first on the 2'-OH of the first nucleotide and the second on the N7 nitrogen of the guanine [33, 53-57].

When compared to the canonical capping reaction, this unconventional capping pathway reverses the phosphate contributions of the GTP and the RNA. The covalent enzyme-nucleotide monophosphate intermediate is formed with the RNA instead of the GTP, as an enzyme-pRNA complex instead of an enzyme-GMP complex. As in the conventional capping pathway, the diphosphate cosubstrate is first generated by hydrolysis of its triphosphate precursor, but this time it is GDP instead of ppRNA that is produced. The PRNTase mechanism of action is also distinct from that of the GTase in that the KxDG motif is replaced by an HR motif and the histidine, not a lysine, is responsible for the enzyme-pRNA phosphoamide bond [55, 56]. Both the N7MTase and 2'OMTase activities are also present on the L protein and share the same SAM binding site. The typical lysine-asparagine-lysine-glutamine tetrad is also predicted to be at the MTase active site. The 2'O position of the GpppN is methylated prior to the guanine N7 position, which is the opposite of the order observed in most canonical cap-1 methylation events [33, 53]. The overall GDP RNA capping sequence can be summarized as NTPase→PRNTase→2'OMTase→N7MTase (Fig. 7). It is very likely that an ancestral (+)ssRNA virus polymerase has evolved a PRNTase activity independently from its eukaryotic host. Both the N7MTase and 2'OMTase, however, have likely been acquired from a eukaryotic host.
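The reversed phosphate contribution described above is easy to lose track of, so the following sketch simply labels each phosphate of the cap triphosphate bridge by the substrate it came from. It is a toy bookkeeping model under stated assumptions: the function names, the list encoding of the bridge (read from the guanosine side to the RNA side) and the string notation for the methylation intermediates are all hypothetical and introduced here for illustration only.

```python
"""Toy bookkeeping of phosphate origins in the 5'-5' triphosphate bridge.

Function names and encodings are hypothetical, for illustration only.
"""
from typing import List


def canonical_bridge() -> List[str]:
    """Conventional pathway: GMP (from GTP) is transferred onto ppRNA."""
    rna = ["RNA"] * 3   # pppRNA: alpha-, beta- and gamma-phosphates
    rna = rna[:2]       # RTPase removes the gamma-phosphate -> ppRNA
    gtp = ["GTP"] * 3   # GTP: alpha-, beta- and gamma-phosphates
    gmp = gtp[:1]       # GTase keeps only the alpha-phosphate (enzyme-GMP)
    return gmp + rna    # listed from the guanosine side to the RNA side


def gdp_pathway_bridge() -> List[str]:
    """GDP pathway: pRNA (from pppRNA) is transferred onto GDP."""
    gtp = ["GTP"] * 3
    gdp = gtp[:2]       # NTPase: GTP -> GDP
    rna = ["RNA"] * 3
    prna = rna[:1]      # PRNTase: pppRNA -> enzyme-pRNA + pyrophosphate
    return gdp + prna   # listed from the guanosine side to the RNA side


def gdp_pathway_methylation() -> List[str]:
    """Methylation order in the GDP pathway: 2'-O first, then guanine N7."""
    cap = "GpppN"
    steps = [cap]
    cap = cap + "m"     # 2'-O-methylation of the first nucleotide
    steps.append(cap)
    cap = "m7" + cap    # N7 methylation of the guanine
    steps.append(cap)
    return steps


if __name__ == "__main__":
    print(canonical_bridge())         # ['GTP', 'RNA', 'RNA']
    print(gdp_pathway_bridge())       # ['GTP', 'GTP', 'RNA']
    print(gdp_pathway_methylation())  # ['GpppN', 'GpppNm', 'm7GpppNm']
```

In both pathways the final bridge carries three phosphates; only the split between the guanine nucleotide and the RNA changes, from one plus two in the conventional pathway to two plus one in the GDP pathway.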


#### **3.9. The RNA cap snatching**

Some viruses, unable to synthesize their own cap structures, have evolved a clever way to acquire this important entity: stealing it from their host. This method of cap acquisition, termed RNA cap snatching, is used by representatives of the *Orthomyxoviridae* (e.g. Influenza virus, Thogoto virus), the *Arenaviridae* (e.g. Lassa virus, Machupo virus) and the *Bunyaviridae* (e.g. Hantaan virus, La Crosse virus, Tomato Spotted Wilt virus) families [5, 58]. These (-)ssRNA viruses acquire their cap structure from their host's capped mRNAs: they bind the cap structure, cleave the RNA a few nucleotides downstream and finally use this short capped RNA to prime their RDRP [59]. The *Arenaviridae* and *Bunyaviridae* express a large monomeric polymerase, whereas the *Orthomyxoviridae* express a heterotrimeric polymerase (e.g. the PB1, PB2 and PA proteins of Influenza virus) harbouring all the activities required for cap snatching.

The PB2 protein of Influenza virus, the most studied cap snatching virus, specifically binds the host mRNA cap structure. The specificity of the binding is crucial and is mediated by the aromatic stacking of the methylated guanine coupled to a base-specific interaction with a conserved acidic residue [60]. While the mode of cap binding is similar between PB2 and other cap-binding proteins (e.g. eIF4E, nuclear cap binding complex, Vaccinia VP39), its overall fold is completely different [60]. Once the host mRNA is bound by the cap-binding PB2, the viral PA subunit cleaves the mRNA a few nucleotides downstream from the cap structure. The length of the primer RNA generated is virus-dependent: it typically ranges from 10-13 nucleotides for Influenza virus, but can be as short as 1-2 nucleotides, as seen in Thogoto virus [59, 61, 62]. The PA endonuclease domain shares a high homology with type II restriction enzymes, including the conserved active site (P)Dxn(D/E)xK signature motif [63]. The PA active site coordinates two Mn2+ cations and is believed to catalyze endonucleolytic cleavage through a common two-metal-dependent mechanism [61, 64]. The short capped oligomers are next used by the PB1 RDRP as primers to initiate the transcription of the viral mRNAs [58]. PB1 also specifically binds the 3'- and 5'-ends of the viral RNA (vRNA) through a ribonucleoprotein 1-like motif ((R/K)G(F/Y)(G/A)(F/Y)Vx(F/Y)) [65]. The vRNA serves as a template for the 3' elongation of the cellular 10-13 nucleotide capped primer. The overall cap snatching process results in the transcription of a chimeric full-length viral RNA with a 5'-extension of 10-13 cellular nucleotides and a cap-2 structure (Fig. 8).

Cap snatching enables viruses to acquire their host's cap structure, which not only promotes viral replication but also impairs cellular mRNA translation, as translation of decapped cellular mRNA is impeded and the mRNA is targeted for degradation. Another consequence of cap snatching is the dependency on a pool of host mRNA molecules to support viral replication. (-)ssRNA viruses that utilize cap snatching have evolved ways to maintain this precious pool of eukaryotic mRNA. First, the cap binding and endonuclease activities of the trimeric polymerase are only activated upon vRNA binding, which limits the waste of mRNA (induced by the cleavage and downstream degradation) when no vRNA is loaded on the RDRP [66]. Secondly, some nucleocapsid proteins, as first demonstrated for Hantavirus, are able to bind capped mRNA and protect it from degradation in the processing bodies (P-bodies) [67], thus converting the function of the P-bodies from mRNA decapping and decay into that of cellular cap storage foci. Cap snatching is only observed in segmented (-)ssRNA viruses; such a unique molecular mechanism supports the hypothesis of a common (-)ssRNA virus ancestor of today's cap snatching viruses, despite their tropism now ranging from plants to animals.

That the incredible diversity of RNA capping pathways, protein folds and enzymatic mechanisms of action evolved by viruses all leads to the synthesis of the same ubiquitous structure is a testimony to the importance of the cap structure for viral genome replication and global viral fitness.

**Figure 8. RNA cap snatching.** The viral polymerase complex is activated upon viral RNA binding (dark blue) and specifically binds the cellular mRNA cap structure (red) via its cap binding activity. The endonuclease activity cleaves the bound cellular mRNA 10-13 nucleotides downstream of the cap structure. This short capped oligomer is then used to prime the RDRP and initiate genome replication, resulting in a chimeric (red and light blue) RNA copy harbouring the host cap structure (in this example a cap-2 structure).

**4. Viral alternatives to cap structures**

Most viruses harbour a cap structure at the 5'-end of their RNA. Mutations preventing the proper capping of their RNA result in infection- or replication-deficient viruses. This is strong evidence of the crucial importance of the cap structure for viral RNA stability and translation. Yet
