is the p.Phe508del, whereas other mutations, located both in coding and non-coding regions, are rare or private. The p.Phe508del mutation induces aberrant protein folding, leading to endoplasmic reticulum (ER)-associated degradation, atypical intracellular trafficking and reduced stability of the CFTR protein at the apical membrane. Dysfunction or lack of the CFTR protein causes an obstructive lung disease characterized by impaired ion transport in the airway epithelium, accumulation of sticky mucus in the air space and chronic airway inflammation. Physiological *CFTR* expression is tightly controlled by transcriptional, post-transcriptional, translational and post-translational regulatory mechanisms, resulting in complex spatial and temporal expression patterns. Notwithstanding the importance of *CFTR* transcriptional regulation [1-3], *CFTR* expression can be modulated through other mechanisms. Indeed, epigenetic changes, such as DNA methylation or histone acetylation, also influence *CFTR* gene expression in different tissues [4-7]. Post-transcriptional controls also regulate its expression, for instance via the usage of upstream open reading frames (uORFs) encoded within the *CFTR* 5'UTR [8] and via the 3'UTR, which controls *CFTR* mRNA stability through AU-rich elements (AREs) [9]. An emerging area of research focuses on the role played by non-coding RNAs (ncRNAs), such as microRNAs (miRNAs), in *CFTR* gene expression. Since 2011, a few studies have shown the involvement of miRNAs in the physiological control of the complex spatio-temporal expression pattern of *CFTR* mRNA [10,11], including a recent work by our group [3]. Moreover, the implication of long non-coding RNAs (lncRNAs) and miRNAs in human diseases is well documented [12], including in inherited disorders [13] and lung diseases [14-17]. However, only a few studies, described below, have hitherto been carried out on the role of ncRNAs in CF.

This chapter is an overview of the findings about the role of ncRNAs in physiological *CFTR* gene expression and in CF.

**2. What are non-coding RNAs?**

Large expanses of the genome are transcribed into RNAs, but only a small portion of these RNAs encode proteins [18,19]. Many fundamental cellular processes rely on conserved ncRNAs, particularly on ribosomal RNAs (rRNAs), the ribosome RNA components that allow mRNA translation into proteins (Figure 1). Other roles are the transport of amino acids via transfer RNAs (tRNAs) and mRNA splicing through the involvement of small nucleolar RNAs (snoRNAs). miRNAs and their crucial role as key modulators of post-transcriptional gene regulation were discovered more than 20 years ago [20]. In the last few years, lncRNAs have been identified as new modulators of key biological processes [21-23, 18]. Currently, ncRNAs are divided into two classes based on their length: long ncRNAs (lncRNAs, >200 nt) and short ncRNAs (<200 nt), such as miRNAs, small nucleolar RNAs (snoRNAs) and PIWI-interacting RNAs (piRNAs) [24].

Ribosomal RNA (rRNA) and transfer RNA (tRNA) are the most represented ncRNAs in humans. Long non-coding RNAs (lnc or long ncRNA) are longer than 200 nt and are subdivided into five categories based on their genomic localization: pRNA (promoter-associated RNA), eRNA (enhancer-associated RNA), gsRNA (gene body-associated RNA), lincRNA (intergenic RNA) and NAT (Natural Antisense Transcript). Short non-coding RNAs (short ncRNAs) are smaller than 200 nt and are subdivided into four classes based on their size and function: siRNA (small interfering RNA), miRNA (microRNA), piRNA (PIWI-interacting RNA), snoRNA (small nucleolar RNA) and derived snoRNA (sdRNA).

**Figure 1.** Non-coding RNAs.

### **2.1. Long non-coding RNAs**

lncRNAs include all ncRNAs longer than 200 nt (except rRNA and tRNA). They constitute the bulk of the non-coding transcriptome [25].
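As a minimal sketch, the length-based definition above can be written as a small Python function. The function name, the returned labels and the handling of a transcript of exactly 200 nt (a case the text does not specify) are assumptions made for illustration only.

```python
# Illustrative only: names, labels and boundary handling are assumptions.
LENGTH_CUTOFF_NT = 200               # cutoff quoted in the text
EXCLUDED_TYPES = {"rRNA", "tRNA"}    # not counted as lncRNAs, whatever their length


def classify_ncrna(length_nt: int, rna_type: str = "") -> str:
    """Return a coarse ncRNA class from the transcript length in nucleotides."""
    if length_nt <= 0:
        raise ValueError("length_nt must be a positive integer")
    if rna_type in EXCLUDED_TYPES:
        return rna_type
    # Assumption: a transcript of exactly 200 nt is treated as short.
    return "lncRNA" if length_nt > LENGTH_CUTOFF_NT else "short ncRNA"


print(classify_ncrna(17000))         # a ~17 kb transcript
print(classify_ncrna(22))            # a miRNA-sized transcript
print(classify_ncrna(1900, "rRNA"))  # rRNA keeps its own label
```

The explicit exclusion set mirrors the "except rRNA and tRNA" clause, so a long rRNA is never reported as an lncRNA.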

### *2.1.1. lncRNA biogenesis*

It is thought that most lncRNAs originate within a 2-kb region surrounding the transcription start site (TSS) of protein-coding genes (65% of lncRNAs overlap with a promoter and are called pRNAs), map to enhancer regions (19%, named eRNAs), derive from antisense transcripts that overlap with annotated gene bodies (5%, called NATs), or are associated with the bodies of protein-coding genes (gsRNAs, gene body-associated lncRNAs) [26, 27]. The remaining lncRNAs originate from more distal (>2 kb) unannotated regions (11%) and are commonly referred to as long intervening or intergenic ncRNAs (lincRNAs) [28, 29] (Figure 2).

a-Promoter-associated RNAs (pRNA), b-Enhancer-associated RNAs (eRNA), c-Intronic and gene body-associated (sense) RNAs (gsRNA), d-Natural Antisense Transcripts (NAT), e-Long Intergenic RNAs (lincRNA). The main lncRNA functions are described in the lower part of the figure.

**Figure 2.** lncRNAs, from biogenesis to functions.
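The five location-based categories (pRNA, eRNA, gsRNA, NAT, lincRNA) can be illustrated with a toy decision rule. This is only a sketch: the function name, its parameters, the reading of the "2-kb region" as a window of 2,000 bp on either side of the TSS and the order in which the categories are tested are all assumptions, not the procedure used in the cited studies.

```python
# Toy classifier; the window size and the precedence of the tests are assumptions.
PROMOTER_WINDOW_BP = 2000  # assumed reading of the "2-kb region" around the TSS


def classify_lncrna(distance_to_tss_bp: int,
                    overlaps_enhancer: bool = False,
                    overlaps_gene_body: bool = False,
                    antisense: bool = False) -> str:
    """Assign one of the five location-based lncRNA categories."""
    # Assumed precedence: promoter, then enhancer, then gene body, then intergenic.
    if abs(distance_to_tss_bp) <= PROMOTER_WINDOW_BP:
        return "pRNA"
    if overlaps_enhancer:
        return "eRNA"
    if overlaps_gene_body:
        # Antisense gene-body transcripts are NATs, sense ones are gsRNAs.
        return "NAT" if antisense else "gsRNA"
    return "lincRNA"


print(classify_lncrna(-500))                                            # near a TSS
print(classify_lncrna(15000, overlaps_gene_body=True, antisense=True))  # antisense, gene body
print(classify_lncrna(80000))                                           # distal, unannotated
```

A real annotation pipeline would work from genomic coordinates and strand information rather than precomputed flags; the flags only keep the example short.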

The finding that a large number of lncRNAs arise from loci close to protein-coding genes is consistent with previous genome-wide analyses of lncRNAs [30]. Although all studies agree that the 5′ end of lncRNAs, like that of mRNAs, is capped by methylguanosine, their splicing status and their 3′-end processing have not been fully defined [26]. It is likely that splice site recognition occurs at low frequency at most lncRNA loci and that lncRNAs may be predominantly mono-exonic and non-polyadenylated [26]. Most lncRNAs are not translated [28] and their localization is predominantly nuclear [25].

### *2.1.2. lncRNA functions*

lncRNAs have regulatory functions in different biological processes (Figure 2). Many of their functions are related to their capacity to bind to RNA, DNA and proteins. The founding member of the lncRNA family is Xist (~17 kb). Xist originates from the silent X chromosome in female cells and coats this chromosome during the early stages of development to establish epigenetic X inactivation [31]. lncRNAs can be used as indicators of the transcriptional activity of a locus or a gene [19]. Their roles as scaffolds for nuclear processes, guides for ribonucleoprotein complexes or decoys have been described in the literature. Like miRNAs, they can act as activators or repressors of protein expression.

lncRNAs are considered to be more species, tissue and developmental stage-specific than mRNAs [32]. A growing number of studies show that lncRNA deregulation has a role in various diseases [33,34; for reviews: 35,36], including pulmonary disorders. Several reviews have discussed the role of lncRNAs and miRNAs in respiratory diseases [16,17,37] and a recent work reported lncRNA involvement in CF (detailed in section 4.2.1).

### **2.2. MicroRNAs**


### *2.2.1. miRNA localization and biogenesis*

Animal miRNAs derive from the nuclear genome. In humans, the majority of canonical miRNAs are encoded by introns of non-coding or coding transcripts, but some miRNAs are encoded by exonic regions (Figure 3). Often, several miRNA loci are in close proximity, thus constituting a polycistronic transcription unit [38]. Most miRNAs use their host gene transcripts as carriers, but separate transcription from internal promoters remains possible. Generally, miRNAs in the same cluster are co-transcribed. Most miRNA genes located in introns of protein-coding genes share the promoter of the host gene [39]. miRNA genes often have multiple transcription start sites [40]. miRNA loci in intergenic regions apparently have their own transcriptional regulatory elements, thus constituting independent transcription units.

**Figure 3.** Genomic locations of miRNAs. miRNA genes, isolated or in clusters, are located in intergenic (e.g. miR-494) or intragenic genome regions, including exons of non-coding (e.g. miR-155) or coding (e.g. miR-985) genes and introns of non-coding (e.g. the miR-15a~16-1 cluster) or coding (e.g. miR-126) genes.


miRNA biogenesis includes several steps. First, the gene coding for a given miRNA is transcribed by RNA polymerase II into a long primary transcript (pri-miRNA, ranging from 100 nt to several kilobases). Some miRNA genes, especially those located in Alu elements, are transcribed by RNA polymerase III [41]. When several miRNAs are in a cluster, pri-miRNAs can contain multiple miRNAs. Transcription factors, such as p53, MYC, C/EBP and FOXA, positively or negatively regulate miRNA transcription [3,42,43]. Epigenetic controls, such as DNA methylation and histone modifications, also contribute to miRNA gene regulation [44].

Several maturation steps are then necessary for miRNA processing. The long pri-miRNAs (typically over 1 kb) contain stem–loop structures in which mature miRNA sequences are embedded. The nuclear RNase III Drosha crops the stem–loop to release small hairpin-shaped RNAs of about 65 nt in length (pre-miRNAs) from the pri-miRNAs. To do this, Drosha, together with its cofactor DiGeorge Syndrome Critical Region 8 (DGCR8), forms the Microprocessor complex. As Drosha cleavage defines one end of the mature miRNA and thereby determines its specificity, it is important that the Microprocessor complex precisely recognizes and cleaves each pri-miRNA. Importantly, Drosha-mediated processing of intronic miRNAs does not affect splicing of the host pre-mRNA [45]. Multiple auxiliary factors could contribute to pri-miRNA maturation [46]. For example, three primary sequence determinants (the basal UG, CNNC and the apical GUG motifs) contribute to efficient processing of human pri-miRNAs. At least one of these three motifs is present in almost 80% of human miRNAs [46]. The splicing factor SRp20 (also called SRSF3) and the RNA helicase DDX17 bind to the CNNC motif and increase processing of human pri-miRNAs by Drosha. Moreover, the terminal loops of miRNA precursors are enriched in cis-elements that recruit regulatory proteins. For example, the splicing factors HNRPA1 and KSRP bind to the conserved terminal loops of some pri-miRNAs and facilitate Drosha-mediated processing [47-49].

Following Drosha processing, pre-miRNAs are exported to the cytoplasm, where they are cleaved by Dicer near the terminal loop, liberating a small RNA duplex. Dicer, like Drosha, belongs to a family of RNase III-type endonucleases that act specifically on double-stranded RNA.

RNA duplexes include two mature miRNAs: one derived from the 5' strand and the other from the 3' strand of the precursor (e.g. miR-27a-5p and miR-27a-3p). One strand, called the 'guide' (miRNA), is usually more biologically active than the other (the 'passenger', often referred to as miRNA\*). The passenger is normally degraded but, in some cases, it can be functional [50]. The mature miRNA strand is subsequently incorporated into the RNA-induced silencing complex (RISC, or miRISC for miRNA-containing RISC, or miRNP for microribonucleoprotein), where it directly binds to a member of the Argonaute (AGO) protein family (four AGO members, AGO1 to AGO4; AGO1 and AGO2 are the most frequently used).

To date, about 1,900 miRNAs (1,881 precursors and 2,588 mature miRNAs; GRCh38 human genome assembly) have been reported in the miRBase database (http://www.mirbase.org/). A substantial number of these miRNAs have dubious annotations and, for nearly one-third of miRNA loci, there is no convincing evidence concerning the production of authentic miRNAs (miRBase).

### *2.2.2. miRNA roles*


miRNAs are small ncRNAs that can act in the nucleus and in the cytoplasm [51] through binding to RNA, DNA and proteins. They play an important role in the negative regulation of gene expression by base-pairing to partially complementary sites on the target mRNAs, usually in the 3' UTR. Binding of a miRNA to its target mRNA, within the RISC, typically leads to translational repression and exonucleolytic mRNA decay. However, highly complementary targets can be cleaved endonucleolytically.

**Figure 4.** Main miRNA roles. a. Translation block by inhibiting cap and poly(A)-binding protein recognition. b. Elongation inhibition by slowing down elongation or ribosome 'drop-off'. c. Degradation by deadenylation and decapping. d. Proteolysis: degradation of a nascent peptide. e. mRNA storage in P-bodies that contain exonucleases, RNA helicases, decapping enzymes, DCP1/2, exosomes, deadenylases.
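The base-pairing of a miRNA to partially complementary sites in a 3' UTR can be illustrated with a short Python sketch. It relies on a common simplification that is not stated in the text above: only the "seed" of the miRNA (assumed here to be nucleotides 2 to 8, counted from the 5' end) is required to pair perfectly with the target. The function names and the sequences are hypothetical toy examples; real target prediction tools use additional criteria.

```python
# Toy seed-match search; the seed definition (nt 2-8) is an assumption.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA sequence (A, C, G, U only)."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def find_seed_sites(mirna: str, utr: str, seed_start: int = 2, seed_end: int = 8) -> list:
    """Return 0-based UTR positions that pair perfectly with the miRNA seed."""
    seed = mirna[seed_start - 1:seed_end]   # 1-based, inclusive coordinates
    site = reverse_complement(seed)         # sequence expected on the mRNA
    positions = []
    index = utr.find(site)
    while index != -1:
        positions.append(index)
        index = utr.find(site, index + 1)   # allow overlapping matches
    return positions


toy_mirna = "UAGCAGCACGUAAAUAUUGGCG"   # toy 22-nt sequence used for illustration
toy_utr = "AAAUGCUGCUAAA"              # toy 3' UTR fragment containing one site
print(find_seed_sites(toy_mirna, toy_utr))
```

Searching for the reverse complement of the seed, rather than the seed itself, reflects the antiparallel pairing between the miRNA and its target mRNA.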

Several miRNAs have a role in lung diseases, such as asthma, chronic obstructive pulmonary disease (COPD) and idiopathic pulmonary fibrosis (see Table 1). The studies reporting the involvement of miRNAs in CF are detailed in section 3.2.2.


**Table 1.** Examples of pulmonary diseases in which miRNAs have a role
