280 Applications of RNA-Seq and Omics Strategies - From Microorganisms to Human Health

housekeeping ncRNA genes include ribosomal RNA (rRNA), transfer RNA (tRNA), and small nuclear RNA (snRNA), while examples of regulatory ncRNAs are microRNA (miRNA) and long non‐coding RNA (lncRNA) [3–5]. RNA complexity is further increased by numerous post‐transcriptional modifications, which alter the chemical structure of nucleotides without changing the nucleotide sequence. By analogy with epigenetics, which investigates modifications of DNA and histone proteins, the study of chemical modifications of RNA is called epitranscriptomics [6, 7]. More than 140 chemically distinct modified nucleotides have been identified in both mRNA and ncRNA, including *N*<sup>6</sup>‐methyladenosine (m<sup>6</sup>A), 5‐methylcytidine (m<sup>5</sup>C), pseudouridine (Ψ), adenosine (A) to inosine (I), and *N*<sup>1</sup>‐methyladenosine (m<sup>1</sup>A). These modifications have been identified mostly in the housekeeping ncRNAs [3, 4, 8]; however, chemical modifications have also been detected in mRNA and the regulatory ncRNAs [9–11]. Knowledge about the occurrence and function of RNA modifications at the transcriptome level nevertheless remains scarce. Recently, interest in RNA modifications and their functions has gained momentum, owing mainly to novel adaptations of next‐generation sequencing (NGS) and mass spectrometry technologies, which have allowed transcriptome‐wide detection of distinct RNA modifications [12, 13].

Accurate regulation of the transcriptome is critical for gene expression and its subsequent control of cellular functions, including metabolism, proliferation, differentiation, and development. Thus, alterations in transcriptome regulation can disrupt cellular functions and lead to disease. Recent studies have identified and functionally characterized several distinct types of chemical modification of RNA nucleotides in both protein‐coding RNAs and ncRNAs, further advancing the burgeoning field of epitranscriptomics.

In this chapter, we will first provide an overview of RNA modifications and then summarize several transcriptome‐wide RNA modification mapping techniques such as m<sup>6</sup>A‐seq, m<sup>5</sup>C‐seq, pseudouridine‐seq, and NAD captureSeq. Next, we will highlight novel insights into the potential functions of RNA modifications and their disease relevance as revealed and facilitated by epitranscriptomic profiling. Finally, we will offer our perspective on how the field will progress or evolve in the near future.

**2. An overview of post‐transcriptional modifications of RNA**

The process of mRNA maturation, involving 5ʹ‐capping, splicing, and polyadenylation, has been well studied [14]. However, the more subtle post‐transcriptional modifications studied in epitranscriptomics, also termed RNA epigenetics, are only now coming fully to light. The post‐transcriptional modifications found in RNA are often called marks because they mark a region of RNA that potentially contributes to the regulation of cellular processes, including gene expression, protein translation, and RNA stability. As with mRNA maturation, enzymes are required to catalyze the reactions that chemically modify RNA nucleotides. The most common post‐transcriptional RNA modification, Ψ, was also the first to be discovered [15]. Originally discovered in rRNA and tRNA, Ψ modifications are also present in mRNA [16, 17]. Site‐specific isomerization of uridine (U) to Ψ (5‐ribosyluracil) is catalyzed irreversibly by Ψ synthases. The family of Ψ synthases (PUS) comprises enzymes that function independently and enzymes that require H/ACA ribonucleoprotein complexes [18]. Compared to

**3. NGS‐based RNA modification techniques**

The first transcriptome‐wide, NGS‐based approach for mapping m<sup>6</sup>A modifications demonstrated the feasibility of identifying RNA modifications across the entire transcriptome and established the field of epitranscriptomics [6]. The most important features of NGS‐based techniques are that they can map modifications globally at single‐nucleotide resolution and that modified nucleotides are analyzed in the context of the surrounding gene sequence. These features ensure that nucleotide modifications are assigned accurately to the appropriate RNA and are not falsely attributed to homologous genes or RNA contaminants [6]. Several high‐throughput NGS‐based technologies, including RNA‐seq, have now been established to profile and quantify RNA modifications (m<sup>6</sup>A, m<sup>6</sup>Am, m<sup>5</sup>C, m<sup>1</sup>A, A‐to‐I, Ψ, and the NAD cap). These RNA‐seq‐based methodologies fall into two classes: immunoprecipitation‐based and chemical‐based methods. **Table 1** lists six representative NGS‐based methods for detecting RNA modifications.



Epitranscriptomics for Biomedical Discovery http://dx.doi.org/10.5772/intechopen.69033


**Figure 1.** Immunoprecipitation‐based strategies to detect RNA modifications. (A) m<sup>6</sup>A‐seq workflow: RNA immunoprecipitation is performed with an anti‐m<sup>6</sup>A antibody to enrich m<sup>6</sup>A‐modified RNAs, followed by cDNA library preparation and high‐throughput sequencing; the occurrence and consensus motif (RRACU) of global m<sup>6</sup>A modifications are then analyzed. (B) m<sup>1</sup>A‐ID‐seq workflow: RNA immunoprecipitation is carried out with an anti‐m<sup>1</sup>A antibody to enrich m<sup>1</sup>A‐modified RNAs, which are then subjected to either demethylase (−) or demethylase (+) treatment. Reverse transcription stops at the m<sup>1</sup>A site in the demethylase (−) group but extends through it in the demethylase (+) group. After NGS, m<sup>1</sup>A sites can be identified by comparing the data of the demethylase (−) group with those of the demethylase (+) group.

RNA immunoprecipitation (RIP)‐based methods use an RNA modification‐specific antibody or an enzyme‐specific antibody to capture modified RNA, followed by RNA‐seq. m<sup>6</sup>A‐seq [26], methylated RIP‐seq (MeRIP‐seq) [36], and m<sup>6</sup>A‐level and isoform‐characterization sequencing (m<sup>6</sup>A‐LAIC‐seq) [37] combine RNA‐seq with RIP specific for m<sup>6</sup>A methylation. **Figure 1A** displays a typical m<sup>6</sup>A‐seq workflow. RIP is performed with an anti‐m<sup>6</sup>A antibody to enrich m<sup>6</sup>A‐modified RNAs, followed by cDNA library preparation, high‐throughput sequencing, and finally analysis to identify the occurrence and consensus motif (RRACU) of global m<sup>6</sup>A modifications. A modified RIP approach, called m<sup>6</sup>A individual‐nucleotide‐resolution cross‐linking and immunoprecipitation (miCLIP), uses ultraviolet light‐induced antibody‐RNA cross‐linking to induce site‐specific mutations at m<sup>6</sup>A marks. These mutational signatures block reverse transcription and facilitate the detection of m<sup>6</sup>A marks at single‐nucleotide resolution [38]. As illustrated in **Figure 1B**, m<sup>1</sup>A‐ID‐seq, which combines m<sup>1</sup>A immunoprecipitation with the tendency of m<sup>1</sup>A residues to truncate reverse transcription products, has been applied successfully to the transcriptome‐wide characterization of m<sup>1</sup>A [39].
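The consensus‐motif step of the m<sup>6</sup>A‐seq analysis can be illustrated with a minimal Python sketch. It only scans a sequence for the RRACU motif named in the text (R being the IUPAC code for A or G); it is not the peak‐calling or motif‐discovery procedure of the cited studies, and the function name and example sequence are hypothetical.

```python
import re

# R is the IUPAC code for a purine (A or G), so RRACU expands to [AG][AG]ACU.
# The lookahead lets overlapping occurrences be reported as well.
RRACU = re.compile(r"(?=([AG][AG]ACU))")


def find_rracu_sites(rna: str) -> list[tuple[int, str]]:
    """Return (0-based index of the central A, matched 5-mer) for each RRACU hit."""
    rna = rna.upper().replace("T", "U")  # accept DNA-style read sequences too
    return [(m.start() + 2, m.group(1)) for m in RRACU.finditer(rna)]


if __name__ == "__main__":
    # Hypothetical sequence under an enriched peak, for illustration only.
    peak_sequence = "CCGGACUUUAAACUGAGACUAG"
    for position, kmer in find_rracu_sites(peak_sequence):
        print(position, kmer)
```

In practice such a scan would be restricted to regions enriched in the immunoprecipitated sample relative to input, since the motif alone also occurs at many unmodified sites.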

| Method | Modification | Strategies |
|---|---|---|
| m<sup>6</sup>A‐seq [26], MeRIP‐seq [36], m<sup>6</sup>A‐LAIC‐seq [37] | m<sup>6</sup>A, m<sup>6</sup>Am | Methyl‐RNA immunoprecipitation and UV cross‐linking |
| m<sup>1</sup>A‐ID‐seq [39] | m<sup>1</sup>A | Methyl‐RNA immunoprecipitation and the inherent ability of m<sup>1</sup>A to stall reverse transcription |
| Bisulfite sequencing [40] | m<sup>5</sup>C | Chemical conversion of modified nucleotides |
| ICE‐seq [42] | A‐to‐I editing | Cyanoethylation of RNA combined with reverse transcription |
| Pseudo‐seq [16], Ψ‐seq [17] | Ψ | Chemical modification to terminate reverse transcription in the pseudouridylated site |
| NAD captureSeq [43] | NAD | Chemoenzymatic capture |

**Table 1.** NGS‐based methods to profile transcriptome‐wide RNA modifications.

**Figure 2.** Chemical‐based strategies to detect RNA modification. (A) BS‐seq: Bisulfite selectively converts cytosine, but not m<sup>5</sup>C, into uracil; this is followed by reverse transcription and RNA‐seq. After comparison with the reference genome or a control, m<sup>5</sup>C residues are identified as cytosine, whereas unmethylated cytosines are read as thymine. (B) ICE‐seq: Acrylonitrile cyanoethylates inosine to *N*<sup>1</sup>‐cyanoethylinosine (ce<sup>1</sup>I). Reverse transcription incorporates cytidine opposite inosine but arrests at the ce<sup>1</sup>I site after cyanoethylation (CE) treatment. cDNA library preparation, sequencing, read mapping, and analysis then detect A‐to‐I sites. (C) Ψ‐seq: Treatment with the reagent CMC followed by incubation at alkaline pH hydrolyzes U‐CMC adducts, which are less stable than Ψ‐CMC adducts. Reverse transcription of the CMC‐treated sample stops at Ψ sites. Subsequent RNA‐seq and read mapping detect Ψ sites as positions with increased transcript termination in the CMC‐treated sample. (D) NAD captureSeq: The ADPRC enzyme catalyzes transglycosylation of NAD with pentynol, and the product is coupled to biotin azide by CuAAC. NAD‐capped RNA is captured on streptavidin beads, then prepared as a cDNA library and sequenced to identify NAD‐capped RNAs.

Chemical‐based methods rely on nucleotide misincorporation or conversion, or on the truncation of reverse transcription products. RNA bisulfite conversion followed by high‐throughput sequencing (BS‐seq, **Figure 2A**) is a chemical conversion method in which bisulfite treatment converts unmodified cytosine residues to uracil and leaves m<sup>5</sup>C residues unchanged. BS‐seq is the only method currently available for the detection of site‐specific endogenous m<sup>5</sup>C [40, 41]. Inosine chemical erasing (ICE) uses nucleotide switching to detect A‐to‐I modifications [42]. Inosine ribonucleotides are cyanoethylated with acrylonitrile to form *N*<sup>1</sup>‐cyanoethylinosine (ce<sup>1</sup>I). The newly formed *N*<sup>1</sup>‐cyanoethyl group of ce<sup>1</sup>I then inhibits Watson‐Crick base pairing of I with C. Thus, cyanoethylation of I blocks cDNA synthesis by preventing extension of the cDNA that would bear a cytosine (C) at the position corresponding to the editing site. In untreated RNA, by contrast, I pairs with C during reverse transcription and is therefore read as guanosine (G) [42] (**Figure 2B**).

To detect RNA pseudouridylation, several groups developed Pseudo‐seq (Ψ‐seq). RNA is treated with *N*‐cyclohexyl‐*N*ʹ‐β‐(4‐methylmorpholinium)ethylcarbodiimide (CMC), which binds covalently to U, G, and Ψ residues, and is then exposed to alkaline pH to remove the less stable U‐CMC and G‐CMC adducts. Reverse transcription pauses at the remaining intact *N*<sup>3</sup>‐CMC‐Ψ sites, allowing Ψ modifications to be mapped [16, 17] (**Figure 2C**). When mapped reads from CMC‐treated samples are compared with those from untreated controls, Ψ sites are detected as positions with an increased proportion of reads supporting reverse transcription termination.

NAD captureSeq (**Figure 2D**) relies on chemo‐enzymatic modification of the NAD that caps the 5ʹ end of RNA. In the first step, the transglycosylation of NAD is catalyzed by ADP‐ribosyl cyclase (ADPRC) from *Aplysia californica* in the presence of an alkynyl alcohol. In the second step, the modified NAD is biotinylated by copper‐catalyzed azide‐alkyne cycloaddition (CuAAC). In the third step, the biotin‐linked RNA is captured on streptavidin beads and processed further for cDNA library preparation and NGS. The NAD‐biotin‐captured sequences are then identified by comparison with control samples that were not subjected to the first step of chemo‐enzymatic biotinylation [43].
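The treated‐versus‐control comparison used for Ψ detection can be sketched in Python as follows. This is a simplified illustration, not the published Pseudo‐seq or Ψ‐seq pipeline: it assumes that per‐site counts of terminating reads and total coverage have already been extracted from the alignments, and the names `SiteCounts`, `call_candidate_sites`, `min_cov`, and `min_diff`, as well as the threshold values, are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class SiteCounts:
    stops: int     # reads whose cDNA terminates at this site (assumed precomputed)
    coverage: int  # all reads assigned to the site, terminated or read through


def stop_ratio(counts: SiteCounts) -> float:
    """Fraction of reads at a site that support reverse transcription termination."""
    return counts.stops / counts.coverage if counts.coverage else 0.0


def call_candidate_sites(treated, control, min_cov=20, min_diff=0.1):
    """Return (position, increase) for sites whose stop ratio rises after CMC treatment.

    min_cov and min_diff are illustrative thresholds, not values from the source.
    """
    hits = []
    for pos in sorted(treated.keys() & control.keys()):
        t, c = treated[pos], control[pos]
        if t.coverage < min_cov or c.coverage < min_cov:
            continue  # too few reads in either library to compare reliably
        diff = stop_ratio(t) - stop_ratio(c)
        if diff >= min_diff:
            hits.append((pos, round(diff, 3)))
    return hits


if __name__ == "__main__":
    # Toy counts for three transcript positions, for illustration only.
    treated = {101: SiteCounts(45, 100), 150: SiteCounts(6, 120), 203: SiteCounts(9, 10)}
    control = {101: SiteCounts(5, 100), 150: SiteCounts(5, 100), 203: SiteCounts(1, 10)}
    print(call_candidate_sites(treated, control))
```

The same difference‐of‐ratios logic applies to other termination‐based comparisons described in this section, such as the demethylase (−) versus demethylase (+) groups in m<sup>1</sup>A‐ID‐seq, although a real analysis would add replicates and a statistical test.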

In vitro and in vivo genetic depletion of the m<sup>6</sup>A writer, *Mettl3*, in both mouse and human, led to the absence of m<sup>6</sup>A modification within *Nanog* mRNA, which encodes a pluripotency factor. The absence of m<sup>6</sup>A marks extended Nanog expression throughout differentiation and inhibited embryonic stem cell exit from self‐renewal towards lineage differentiation [44]. m<sup>6</sup>A‐seq in mouse naïve embryonic stem cells (ESCs), 11‐day‐old embryoid bodies (EBs), and mouse embryonic fibroblasts (MEFs) revealed that m<sup>6</sup>A marks in naïve pluripotency‐promoting genes reduced the mRNA stability of key pluripotency‐promoting transcripts and facilitated differentiation [45]. These findings suggest that m<sup>6</sup>A modification provides the flexibility of the stem cell transcriptome required to differentiate into different lineages [44]. NANOG is also important in both the maintenance and specification of cancer stem cells, which can metastasize and form primary tumors. The exposure of breast cancer cells to hypoxia induced the expression of the eraser ALKBH5, which resulted in m<sup>6</sup>A demethylation in the 3ʹ UTR of *NANOG* mRNA and an increased half‐life of *NANOG* mRNA, thereby promoting the breast cancer stem cell (BCSC) phenotype [46]. The m<sup>6</sup>A reader YTHDF2 protects the 5ʹ UTR of stress‐induced transcripts from demethylation. Cap‐independent translation initiation was enhanced by 5ʹ UTR methylation [47].

m<sup>6</sup>A modification is critical for the regulation of HIV‐1 replication and HIV‐1's effect on the host immune system [48]. HIV‐1 viral infection induced m<sup>6</sup>A modification in both host and viral mRNAs. HIV‐1 coding, non‐coding, and splicing regulatory regions contained a total of 14 m<sup>6</sup>A methylation peaks. In addition, methylation of two highly conserved m<sup>6</sup>A target sites in the HIV‐1 rev response element (RRE) stem loop II region increased binding of the HIV‐1 rev protein to the RRE in vivo and enhanced nuclear export of HIV‐1 RNA [48]. The long non‐coding RNA X‐inactive specific transcript (XIST) regulates transcriptional silencing of genes on the X chromosome. XIST is heavily modified, with at least 78 m<sup>6</sup>A sites. Knockdown of METTL3 leads to decreased XIST m<sup>6</sup>A marks and impairs XIST‐mediated gene silencing [49].

Position 58 in the tRNA T‐loop commonly carries an m<sup>1</sup>A modification [50], as do position 9 of metazoan mitochondrial tRNAs [51] and eukaryotic rRNAs [52]. Initiator tRNA<sup>Met</sup> contains fully modified m<sup>1</sup>A58, which stabilizes its tertiary structure. Hypomodification of tRNA m<sup>1</sup>A58 affects the association with polysomes and the subsequent efficiency of translation [53, 54]. m<sup>1</sup>A modifications in tRNA function in response to environmental stress [55], whereas m<sup>1</sup>A‐modified rRNA regulates ribosome biogenesis [52]. m<sup>1</sup>A‐ID‐seq demonstrated that m<sup>1</sup>A methylation regulated the dynamic response to stimuli and identified 901 m<sup>1</sup>A peaks enriched within the 5ʹ UTR near the start codons of 600 distinct protein‐coding and non‐coding RNAs [39].

m<sup>5</sup>C sites have been detected in several eukaryotic tRNAs, rRNAs, and mRNAs. m<sup>5</sup>C marks stabilize the secondary structure of tRNA, alter aminoacylation and codon recognition [56], and regulate translational fidelity [57]. A low level of internal m<sup>5</sup>C was found in mRNA cap structures in mammalian and virus‐infected mammalian cells [58, 59]. BS‐seq identified 10,275 sites in protein‐coding and non‐coding RNAs [41]. m<sup>5</sup>C marks in mRNAs were enriched near argonaute‐binding sites within the 3ʹ UTR [41].

A‐to‐I editing sites are distributed throughout human mRNA, including exons, introns, and 5ʹ and 3ʹ UTRs [60]. Alu repeat elements contain the highest frequency of A‐to‐I editing sites
