Endogenous Retroelements in Cancer: Molecular Roles and Clinical Approach

*Kang-Hoon Lee and Je-Yoel Cho*

## **Abstract**

Retroelements have long been regarded as "junk" DNA, although the Encyclopedia of DNA Elements (ENCODE) project has reported that most of the genome is functional. Since LINE1 (L1) and human endogenous retroviruses (HERVs) have been suspected of contributing to human cancers, their regulation and putative molecular functions have been investigated in diverse types of cancer. Their diagnostic, prognostic, and therapeutic potential has been repeatedly proposed on the basis of cancer-associated or cancer-specific properties, such as hypomethylation, increased transcript levels, and reverse transcriptase activity, as well as cancer-associated antigens. This chapter presents current knowledge on the various roles of retroelements during tumorigenesis and on their clinical use in cancer studies.

**Keywords:** retrotransposons, repetitive elements, tumorigenesis, cancer, LINE, HERV, retroelement

#### **1. Introduction**

In recent decades, the development of genomic analysis technology has played an important role in the study and treatment of various diseases [1, 2]. However, these studies have focused on protein-coding genes, which account for only about 1–2% of the entire genome, and the understanding of the remainder is still relatively limited. A retroelement (RE), also called a retrotransposon, is a class I transposable element that replicates itself via an RNA intermediate and reverse transcription. REs can be broadly classified into two types according to whether their genome structure includes long terminal repeat sequences (LTRs). Intact endogenous retroviruses (ERVs) retain LTRs at both ends of the genome, in contrast to long and short interspersed nuclear elements (LINEs and SINEs), which form the non-LTR group. LTR elements compose ~8% of the human genome, and most are known to be inactive due to accumulated mutations. Yet, interestingly, many are transcriptionally active [3]. The non-LTR group can be further divided into autonomous LINEs and nonautonomous SINEs, which depend on LINE-encoded proteins [4]. LINE1s (L1s), known as the only active autonomous REs, make up ~17% of the human genome. Intact L1s are ~6 kb long and encode two proteins, ORF1 and ORF2, which are essential for replication and reverse transcription [5]. There are about 145 full-length, functional L1 elements in the human genome. SINEs, by contrast, are nonautonomous retroelements ~300 bp in length without coding potential. Most SINEs are of the Alu type, of which there are over one million copies in the human genome [6].

The association between REs and cancer has been suggested since 1950. As the presence of a viral oncogene was unveiled and mouse mammary tumor virus (MMTV) became the accepted etiological agent of mammary tumors in mice, a possible carcinogenesis mechanism of ERVs was also revealed, raising hope for overcoming cancer [7, 8]. Many studies have reported the association of RE expression with various cancer types, including breast cancer, melanoma, and kidney cancer [9]. However, whether RE expression acts as a driver or a passenger in cancer remains controversial [10, 11]. It is a chicken-and-egg situation: cancer-associated RE expression can cause malignant cell transformation, and malignant cell transformation leads to global DNA hypomethylation, which in turn contributes to oncogenic RE expression [12–15]. In addition, the fact that most REs have lost their transposition activity due to accumulated mutations makes it difficult to evaluate the role of REs [16]. RE sequences, which occupy about half of the mammalian genome, have been known as "junk DNA," and, as the name suggests, little research has been done on them [17]. However, in certain areas, such as early embryogenesis, degenerative disease, and cancer, the expression of REs has been studied relatively well [18, 19]. In particular, several studies have been conducted to reveal the relationships among environmental stress, RE responses, and associated diseases [20, 21]. Although no direct relationship has been revealed yet, genome instability caused by activated REs is considered the main mechanism linking REs with disease [22]. However, the transposition rate of all REs combined is about 0.02 germline events per generation [23], which is too low to explain their various roles.
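
To give a sense of scale for the quoted rate of about 0.02 germline events per generation, the following minimal sketch converts it into an expected number of births per new insertion and into the probability that a birth carries at least one new insertion. The Poisson model and the function names are assumptions made for illustration; they are not taken from the cited study.

```python
import math

# Rate quoted in the text: ~0.02 germline retrotransposition events per generation.
RATE_PER_GENERATION = 0.02


def births_per_event(rate: float) -> float:
    """Average number of births needed to observe one new germline insertion."""
    return 1.0 / rate


def prob_at_least_one(rate: float) -> float:
    """Probability of one or more new insertions in a birth.

    Assumes (for illustration only) that events are Poisson distributed.
    """
    return 1.0 - math.exp(-rate)


if __name__ == "__main__":
    print(f"about one new insertion per {births_per_event(RATE_PER_GENERATION):.0f} births")
    print(f"P(at least one insertion) = {prob_at_least_one(RATE_PER_GENERATION):.4f}")
```

Under these assumptions, a new germline insertion is expected in roughly one of every 50 births, which illustrates why germline transposition alone is considered too rare to account for the roles discussed in this chapter.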

In this chapter, we focus on the functional mechanisms of REs in various cancers from development to metastasis and from diagnosis to cancer therapy.

#### **2. RE regulation in normal cells and abnormal reactivation and expansion in cancer**

Fortunately, except during the reprogramming process in early-stage germ cells, most REs are strongly silenced by diverse epigenomic controls, and their reactivation is inhibited at the molecular level [24, 25].

DNA methylation is a major epigenetic mechanism that contributes to retrotransposon silencing in both normal and cancer cells [26]. In early embryogenesis, genome-wide DNA methylation is established by DNA methyltransferase 3 (Dnmt3) and maintained by DNA methyltransferase 1 (Dnmt1) [27]. Parental methylation patterns are erased genome-wide and then re-established at imprinted loci and REs by Dnmt3, and these patterns are maintained by Dnmt1 in somatic cells [28–30]. The association between demethylation and RE expression was demonstrated by the finding that inactivation of DNMT3L, a non-catalytic homolog of DNMT3A/3B, causes reactivation of L1 and intracisternal A particle (IAP) elements in male germ cells and leads to meiotic arrest and male sterility [31–33].

In cancer cells, genome-wide DNA hypomethylation and the reactivation of REs, which may result in the loss of chromosomal stability and imprinting patterns, are well known [34]. Alterations of L1 methylation have been investigated in many types of cancer, including breast, colon, lung, ovarian, and prostate cancers [35–37]. In most cases, hypomethylation of the L1 promoter is associated with genome instability, aggressive histology, poor prognosis, and, in some cases, metastasis [38]. Interestingly, some abnormal features, such as chromosome 8 abnormalities, are also associated with L1 hypomethylation [39]. In addition, a moderate increase of Alu, attributed to its prevalent demethylation in cancer samples, was also observed in cancer samples with a hypomethylated L1 promoter [40]. Similarly, hypomethylation of HERVs has been reported in various cancer cells [9, 12, 41–44]. Hypomethylation of the HERV long terminal repeat (LTR), where the promoter is located, is associated with HERV overexpression in cancer [45]. Numerous HERV family members are expressed in cancer cell lines and primary tumor tissues. In a head and neck cancer study, tumor-specific methylation changes were found in the HERV-H, HERV-W, and HERV-K families [24, 46]. Similarly, hypomethylated CpGs resulting in high expression of HERV-K, HERV-W, and L1 were reported in ovarian cancer [47]. Moreover, hypomethylation of REs has been observed in specific stages or subtypes of cancer, such as during ovarian cancer progression and in the basal subtype of invasive ductal carcinoma of the breast [48, 49]. Remarkably, the expression of individual RE loci, such as HERV-K at 22q11.23 (H22q), HERV-H5, HERV-H48–1, and HERV-E4, has been highlighted in various cancers [46, 50, 51]. Their transcripts or viral proteins have been detected in sera from bladder, breast, liver, lung, ovarian, and prostate cancer patients [11].

*DOI: http://dx.doi.org/10.5772/intechopen.93370*

The last cellular epigenomic mechanism for silencing RE expression is histone modification [52]. In normal spermatogonia, one of the repressive histone marks, histone H3 lysine 9 dimethylation (H3K9me2), causes transcriptional repression and is sufficient to maintain L1 silencing in the absence of DNA methylation. Thus, the loss of H3K9me2 combined with the absence of DNA methylation may be the cause of L1 activation [53]. In a study of the association between histone modification and RE expression in cancer, two repressive histone modifications, H3K9me3 and H3K27me3, were more enriched at H22q, HERVK17, and L1 sequences in PC3 than in LNCaP prostate cancer cell lines, in which RE expression levels are low and high, respectively. By contrast, the active modification H3K4me3 was most enriched at the H22q LTR in LNCaP cells [54].

The expressed RE transcripts can eventually be knocked down by the PIWI system [55]. The Piwi-interacting RNA (piRNA) pathway is a well-studied mechanism that contributes to the silencing of REs in many animal germline cells [56, 57]. The piRNA system is a ribonucleoprotein complex consisting of a piRNA and a protein of the P-element-induced wimpy testis (PIWI) subfamily of Argonaute nucleases [58]. The piRNA recognizes RE sequences, and the PIWI protein destroys the RE transcripts [58, 59]. The piRNA system silences RE expression at both the transcriptional and posttranscriptional levels, by establishing repressive chromatin modifications and by cleaving RE transcripts, respectively [57, 60]. However, the role of piRNA in posttranscriptional regulation differs from that of miRNA, which acts by providing sequence specificity, because most piRNA sequences are not complementary to target gene transcripts. This suggests that piRNAs may be involved in epigenetic regulation rather than posttranscriptional regulation of mRNA [61]. Deficiency of the piRNA pathway causes overexpression of REs, a significantly compromised genome structure, and, invariably, germ cell death and sterility [58]. Aberrant expression of piRNAs has been reported in the development of cancer, including the proliferation, apoptosis, metastasis, and invasion of cancer cells [62]. Moreover, high expression of PIWI proteins has been documented in many cancer types, including gastric cancer, liver cancer, intestinal cancer, breast cancer, non-small cell lung cancer, bladder cancer, ovarian cancer, and melanoma, and is furthermore associated with the aggressiveness of sarcomas, gliomas, and leukemia [61, 63]. The roles of individual PIWI proteins have been investigated in cancer invasion, migration, proliferation, division, and survival [64]. PIWIL1 is known to induce epithelial-mesenchymal transition and to confer migration and invasion capacity on endometrial cancer cells [65].
PIWIL2 has been reported to act by increasing the expression of CDK2 and cyclin A in glioma and non-small cell lung cancer (NSCLC) cells [66]. PIWIL3 promotes cancer cell proliferation, migration, and invasion through the JAK2/STAT3 signaling pathway [67]. PIWIL4 can promote the division, migration, and survival of breast cancer cells by activating the TGF-β, MAPK/ERK, and FGF signaling pathways [68].
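
The recognition step described above, in which a piRNA guides a PIWI protein to an RE transcript, can be pictured as a search for sites where the transcript is complementary to the piRNA. The sketch below is a toy illustration only: the sequences are invented and much shorter than real piRNAs, the function names are hypothetical, and real target recognition involves rules beyond simple base pairing.

```python
from typing import List

# Watson-Crick pairing for RNA bases.
RNA_PAIRS = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA sequence."""
    return "".join(RNA_PAIRS[base] for base in reversed(rna))


def find_target_sites(pirna: str, transcript: str, max_mismatches: int = 0) -> List[int]:
    """Return 0-based start positions where the transcript can pair with the piRNA.

    A site is a transcript window equal to the piRNA's reverse complement,
    allowing up to `max_mismatches` mismatched positions.
    """
    target = reverse_complement(pirna)
    sites = []
    for start in range(len(transcript) - len(target) + 1):
        window = transcript[start:start + len(target)]
        mismatches = sum(a != b for a, b in zip(window, target))
        if mismatches <= max_mismatches:
            sites.append(start)
    return sites


if __name__ == "__main__":
    toy_pirna = "UGACCGUA"              # invented sequence
    toy_re_transcript = "GGAUACGGUCAUU"  # invented sequence containing one target site
    print(find_target_sites(toy_pirna, toy_re_transcript))
```

In this toy case the RE transcript contains a single fully complementary site, whereas an unrelated transcript would return an empty list, which mirrors the observation that most piRNAs do not pair with ordinary gene transcripts.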

*Methods in Molecular Medicine*

The apolipoprotein B mRNA editing catalytic polypeptide 3 (APOBEC3) proteins form a family of seven cytidine deaminases (APOBEC3A through APOBEC3H) with diverse activities against a variety of retroviruses and endogenous REs. However, their L1-suppressing activity correlates neither with antiviral activity against Vif-deficient HIV-1 and murine leukemia virus nor with patterns of subcellular localization [69, 70]. Thus, the inhibitory effect of APOBEC3 family members, specifically APOBEC3G, on L1 transposition might be due not to deaminase activity but to novel mechanism(s) [70].

Besides APOBEC3G, MOV10, SAMHD1, and ZAP have all been shown to inhibit L1 activity through diverse mechanisms [71]. MOV10 inhibits L1 mobility by interacting with the L1 ribonucleoprotein (RNP), resulting in L1 transcript degradation [72]. SAMHD1 inhibits L1 reverse transcriptase (RT) activity [73]. ZAP also restricts L1 activity through the loss of L1 transcripts and of ribonucleoprotein integrity [74].

Together, these mechanisms offer a general explanation for how various epigenomic modifications are directly associated with both genome-wide RE silencing and RE reactivation, the latter being much more common in diverse human cancers, where 4–100 de novo insertions per tumor have been found.

#### **3. Roles of RE expressed in cancers**

The genomic instability caused by de novo insertions of REs, which frequently occur in cancer, is the most widely accepted pathophysiological role of REs [75, 76]. However, this is a very limited explanation of the general functions of REs, because most REs have lost their ability to mobilize [16]. Although some retain their coding potential, they are tightly silenced by various mechanisms acting at the epigenomic, transcriptional, and posttranscriptional levels [77]. Thus, a more in-depth understanding of RE function is needed.

#### **3.1 The source of genome instability**

De novo insertions of REs, even in defective form, can both directly and indirectly affect the surrounding human genome sequences [78]. Some of these events occur at a frequency high enough to result in extensive rearrangement of the host genome sequence [16]. This happens not only through transposition followed by reintegration but also through homologous recombination between dispersed REs, resulting in large structural variations (SVs), including duplications, inversions, and deletions [79]. REs are also a source of small SVs, such as single-nucleotide variants (SNVs) and short indels, which are caused by template switching during the repair of replication errors [16]. SVs derived from the reactivation and expansion of REs, via either mobilization or homologous recombination, have been found in many cancers (~50%) [80, 81]. Particularly high enrichment was reported in certain types of cancer, such as esophageal cancer, colon cancer, and squamous cell lung cancer (>90%) [82]. Although this result indicates that somatic L1 insertions are very frequent in certain cancers, the majority of somatic RE integrations are known to be passenger mutations with little or no effect on cancer development [83].

Nevertheless, specific SV loci derived from somatic L1 insertions have also been identified as drivers in most cancer types, including colorectal, breast, lung, and liver cancers [84–88]. For example, disruption of the APC gene by an L1 insertion in colon cancer has been well studied [89]. Additionally, a recent study identified a driver SV caused by L1 insertion in liver cancer [90]. L1 integration into an intron of the ST18 gene disrupted a cis-regulatory repressor element, resulting in increased expression of the ST18 gene [84].

Several algorithms have also been developed for the sensitive and precise detection of SVs from whole-genome sequencing (WGS) and whole-exome sequencing (WES) data published by large international consortia, such as the International Cancer Genome Consortium (ICGC) and The Cancer Genome Atlas (TCGA), and driver SV events with remarkable functional consequences have been identified [82, 91]. Most SVs were generated by L1 (99%), followed by SINE-VNTR-Alu (SVA) and ERV [92]. Yet, few retrotranspositions of HERVs have been reported in human cancers [84, 93].
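
The summary figures quoted in this section (the share of tumors carrying somatic RE insertions, and the share of insertions attributed to each RE family) are simple tallies over per-tumor insertion calls. The following minimal sketch shows how such tallies could be computed. The tumor identifiers, cancer types, and insertion calls are invented for illustration and do not reproduce the cited percentages, and the function names are hypothetical.

```python
from collections import Counter

# Hypothetical cohort: tumor ID -> cancer type.
tumors = {"T1": "colon", "T2": "colon", "T3": "colon", "T4": "liver", "T5": "liver"}

# Hypothetical somatic insertion calls: (tumor ID, RE family).
insertions = [("T1", "L1"), ("T1", "L1"), ("T2", "Alu"), ("T4", "L1")]


def fraction_with_insertion(tumors, insertions):
    """Fraction of tumors per cancer type with at least one somatic RE insertion."""
    affected = {tumor_id for tumor_id, _ in insertions}
    totals = Counter(tumors.values())
    hits = Counter(tumors[tumor_id] for tumor_id in affected)
    return {cancer_type: hits[cancer_type] / totals[cancer_type] for cancer_type in totals}


def family_share(insertions):
    """Share of all insertion calls contributed by each RE family."""
    counts = Counter(family for _, family in insertions)
    total = sum(counts.values())
    return {family: n / total for family, n in counts.items()}


if __name__ == "__main__":
    print(fraction_with_insertion(tumors, insertions))
    print(family_share(insertions))
```

Counting each tumor once, regardless of how many insertions it carries, keeps the prevalence estimate separate from the per-tumor insertion burden, which can vary widely between tumors.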

#### **3.2 Epigenomic regulation and reactivation of REs in cancer**

Since 1993, when the methylation status of L1 in cancer cells was first measured by Thayer et al., L1 hypomethylation has been reported in many types of human cancer, including prostate, ovarian, head and neck, lung, thyroid, and breast cancer [94, 95]. However, some conflicting results showed no changes in L1 methylation levels in cancers including thyroid cancer, renal cancer, lymphoma, and leukemia [96]. This discrepancy may be due to differences in tumor histological type, because an association between L1 hypomethylation and clinical outcome has been demonstrated in melanoma patients. However, the mechanism by which L1 hypomethylation affects aggressive tumor behavior has not been fully investigated [49]. The most likely mechanism is the induction of DNA instability, which has been suspected to be the main role of REs [92]. A DNA methyltransferase 1 (Dnmt1) mutation caused substantial genome-wide hypomethylation in all tissue types and is also known to be associated with aggressive T cell lymphomas [97, 98]. Notably, the mutation was also accompanied by a high frequency of chromosome 15 trisomy, which suggests that DNA hypomethylation has a causal role in cancer by promoting genome instability [98]. Another possible mechanism is transcriptional dysregulation, which activates proto-oncogenes and REs that affect tumor aggressiveness [99]. MicroRNAs, which are closely related to the development of human cancer, can be increased by global DNA hypomethylation, contributing to the acquisition of tumor aggressiveness [100]. In addition, it is possible that the L1 methylation state itself exerts a biological effect. L1 is known to regulate the function of multiple genes by providing an alternative promoter and by contributing to noncoding RNA expression [101, 102]. Therefore, further studies are needed to explain the mechanisms by which L1 hypomethylation affects tumor behavior.
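
Studies of L1 hypomethylation typically compare a methylation level in tumor tissue with that in matched normal tissue. As a minimal sketch of that comparison, the code below averages percent methylation over a set of L1 promoter CpG sites and flags a tumor as hypomethylated when its level falls sufficiently below the normal level. The CpG values, the 10-percentage-point cutoff, and the function names are assumptions chosen for illustration, not values from the cited studies.

```python
from statistics import mean


def l1_methylation_level(cpg_percentages):
    """Average percent methylation over the assayed L1 promoter CpG sites."""
    return mean(cpg_percentages)


def is_hypomethylated(tumor_cpgs, normal_cpgs, min_drop=10.0):
    """Flag a tumor whose L1 methylation is at least `min_drop` points below normal.

    `min_drop` is an arbitrary illustrative cutoff, not a published threshold.
    """
    drop = l1_methylation_level(normal_cpgs) - l1_methylation_level(tumor_cpgs)
    return drop >= min_drop


if __name__ == "__main__":
    normal_cpgs = [72.0, 68.0, 75.0, 70.0]  # invented percent-methylation values
    tumor_cpgs = [55.0, 50.0, 61.0, 58.0]   # invented percent-methylation values
    print(l1_methylation_level(normal_cpgs), l1_methylation_level(tumor_cpgs))
    print(is_hypomethylated(tumor_cpgs, normal_cpgs))
```

Because studies differ in which CpG sites they assay and which cutoff they apply, differences of this kind are one plausible reason why reports on L1 hypomethylation do not always agree.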

#### **3.3 REs, the origin of cancer associated non-coding transcripts**

RNA sequencing using next-generation sequencing technology has provided a large amount of gene expression data for both normal and disease conditions, such as cancer [103]. Growing evidence suggests that REs in the intergenic regions of the human genome are sources of noncoding RNAs, including microRNAs (miRNAs) and long noncoding RNAs (lncRNAs) [104]. Notably, about 30% of human lncRNAs originate from REs, specifically HERVs. In addition, about 80% of lncRNAs contain RE-derived sequences within or near their transcription start sites [105]. Importantly, many lncRNAs have been reported to play crucial roles in a variety of fundamental cellular processes and diseases [106]. One recent study reported that a single-nucleotide polymorphism (SNP) in an L1-containing lncRNA sequence located in an intron of SLC7A2 leads to a decrease in its expression and results in a lethal encephalopathy phenotype [107]. Alu elements, which encode no functional proteins, are also frequently found at multiple locations in lncRNA sequences [108]. Recently, many studies have suggested that Alu sequences in lncRNAs can contribute to lncRNA function. For example, Alu-mediated CDKN1A/p21 transcriptional regulator (APTR) negatively regulates p21 expression by recruiting polycomb


through the loss of L1 transcripts and ribonucleoprotein integrity [74].

Thus, a more in-depth understanding of RE function is mandatory.

senger mutations with little or no effect on cancer development [83].

Nevertheless, specific SV loci derived from somatic L1 insertions have also been identified as drivers in most cancer types, including colorectal, breast, lung, and liver cancers [84–88]. For example, disruption of the APC gene by the insertion of L1 in colon cancer has been well studied [89]. Additionally, a recent study identified driver SV by L1 insertion in liver cancer [90]. L1 integration in the intron of the ST18 gene disrupted a cis-regulatory repressor element, resulting in increased

**84**

repressive proteins to the p21 promoter. The Alu sequence is crucial to the localization of APTR on the p21 promoter that regulates cell growth and proliferation [109].

Despite the limited contribution of L1 and Alu to lncRNAs, a close association between HERVs and ncRNAs was reported by Kelley and Rinn [110]. Hundreds of ncRNAs originate from HERV-H. For example, the lncRNA ROR, known to promote the progression of human cancers, is one of the ncRNAs promoted by a HERV-H element [111]. Moreover, the lncRNA produced by HERV-K11 directly binds to polypyrimidine tract-binding protein-associated splicing factor (PSF), whose function is to repress proto-oncogene transcription, thereby reversing the PSF-mediated repression and subsequently driving tumorigenesis [46, 112]. Other HERV-related lncRNAs with tumor-suppressive potential have also been identified among the intronic RNAs arising from ERV-9 [45]. Its antisense RNA at 3′-untranslated regions was reported to physically bind to key transcription factors for cell proliferation such as NF-Y, p53, and Sp1. This means that the HERV-related lncRNAs may function as decoy targets or traps for these transcription factors, resulting in the growth retardation of cancer cells [113].

Another role of RE transcripts related to human disease is to form a complex with the cytoplasmic cDNA of the reactivated RE transcripts, which triggers inflammatory pathway signaling [23]; for example, RE-derived cytosolic DNA accumulates in Aicardi-Goutières syndrome (AGS) [114]. IFNB1 expression is also inversely correlated with L1 retrotransposition in cancer cells [115]. Moreover, the study by Ishak et al. showed that mutation of the RB1 gene causes both genome-wide upregulation of L1 expression in somatic cells and increased susceptibility to leukemia [116]. Gasche et al. reported that IL-6 treatment of a cancer cell line induced genome-wide L1 promoter hypomethylation [117]. Altogether, the evidence indicates that REs modify an important aspect of human tumorigenesis.

#### **3.4 RE proteins associated with tumorigenesis**

ORF1 and ORF2 in L1 and GAG, POL, and ENV in HERV are proteins encoded by REs that are essential to complete the replication cycle, whereas Alu elements are RNA polymerase III-transcribed sequences without coding potential [118]. Most REs have lost their coding potential due to accumulated mutations; however, it is well known that hundreds of L1s can still produce the two essential proteins, ORF1 (p40, RNA binding protein) and ORF2 (p109, endonuclease and reverse transcriptase activities) [119, 120]. Additionally, although no infectious virus formed by HERVs has been reported, the expression and function of multiple proteins have been studied in various HERV families [46]. The most comprehensive studies have focused on envelope proteins (ENV) and their pathogenic properties. The transcripts encoding capsid and protease (GAG) and reverse transcriptase with RNase H domain and integrase (POL) ORFs have been detected in many cells and tissues from both diseased and healthy individuals [121]. Remarkably, HERV-W encodes an ENV protein known as ERVWE1 (Syncytin1), which has been co-opted by the human host to contribute functionally to placental biogenesis [122]. Similarly, Syncytin2, encoded by ERVFRD1, is known to have a key role in the implantation of human embryos [123]. Aberrant expression of HERV-W is known to be associated with various human diseases including cancer [122, 124, 125].

In cancer, an increase in retroviral protein expression was generally detected. Overexpression of L1 ORF1 protein was detected in more than 90% of breast, ovarian, and pancreatic cancers, followed by tubular gastrointestinal tract, lung, and prostate cancers (about 50%) [126, 127]. However, high L1 ORF1p expression depends on tumor origin, and it differs case by case even within a similar histological type of cancer. For example, L1 ORF1p is detected in lung adenocarcinoma at greatly varying levels (about 20% are very high, about 30% are moderate, and the rest are undetectable) [128]. Several antibodies targeting ORF2p have recently been produced, and thus, the overexpression of ORF2p was detected in many cancers. Although the functional effects of L1 proteins remain unclear in most cancer contexts, these data suggest that L1 proteins are potential cancer biomarkers for the diagnosis of cancer development or the prognosis of clinical outcomes [126, 129]. On the other hand, the HERV-K ENV protein has been identified in various cancer tissues, and several different mechanisms by which it associates with tumorigenesis have been proposed [130]. The melanocyte antigen HERV-K-MEL is expressed in about 85% of malignant melanocytes, whereas breast cancer, ovarian cancer, teratocarcinoma, sarcoma, and bladder cancer also express HERV-K ENV [131]. Other HERV families, HERV-E and ERV3, have also been detected in more than 30% of ovarian cancer patients and are higher in patients with lymph-node-positive breast cancer [11, 132]. Moreover, some antibodies against HERV-K have been detected in serum samples from patients with melanoma [133].

Although HERVs are known to be incompetent in transposition, studies have shown that their protein-coding potential can still promote neoplastic properties during tumorigenesis through diverse mechanisms [134]. The oncogenic role of HERV proteins is well investigated for NP9 and REC, which are accessory splice proteins of HERV-K [135]. The transcripts encoding these proteins are overexpressed in many tumors including breast cancers, and both are known to interact with the promyelocytic leukemia zinc finger (PLZF) tumor suppressor, which is a transcriptional repressor and epigenetic modulator implicated in cancer. The c-Myc proto-oncogene is one of the major targets of PLZF. Interaction of NP9 and REC with PLZF abrogates the transcriptional repression of the c-Myc gene promoter, which results in c-Myc overproduction [136]. In addition, the abnormal cell-to-cell fusion activity of HERV-W ENV proteins has been shown to possibly contribute to tumor development and metastasis [130]. Further studies to characterize the expression and molecular functions of these HERV proteins in cancers are needed.

**4. Implementation of REs for cancer diagnosis and prognosis**

**4.1 Structural variations (SVs) associated with REs in cancer**

Identification of somatic mutation hotspots associated with cancer is very important for functional analysis and diagnosis [137]. Several methods have been developed for the identification of somatic RE insertions in cancers (L1-seq, TIPseq, and ERVcaller), and many bioinformatics tools have been developed to discover somatic L1 insertions in silico using WGS or WES data [138, 139]. SVs caused by L1 insertion and associated with cancer have been well investigated in a couple of genes, such as the APC gene, a tumor suppressor associated with colorectal polyposis and colorectal cancer [89]. Mutation of TP53, a potential suppressor of L1, by L1 insertion has been observed frequently in tumors. In addition, L1 insertional mutation of MOV10, which is a key L1 suppressor, decreased the expression of MOV10 in tumors with high numbers of L1 insertions [140]. On the other hand, apart from cancer-associated SVs caused by RE insertion, genome variations in or around HERVs that might be associated with gene expression in cancer have also been identified. Chang et al. identified four HERVs with mutation hotspots that overlapped with exons of four human protein-coding genes: TNN (HERV-9/LTR12), OR4K15 (HERV-IP10F/LTR10F), ZNF99 (HERV-W/HERV17/LTR17), and KIR2DL1 (MST/MaLR). They also evaluated the effect of each non-synonymous SNV on the survival of kidney cancer patients. Furthermore, they identified 788 HERVs harboring significantly increased numbers of somatic single-nucleotide variations (SNVs) [141].
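As a rough illustration of how an element might be flagged as harboring a significantly increased number of somatic SNVs, the sketch below applies a one-sided Poisson test to the SNV count of each element against a genome-wide background rate. This is not the method used in [141]; the function names (`poisson_sf`, `hotspot_elements`), the background rate, the Bonferroni correction, and the example elements are all assumptions made for illustration.

```python
import math


def poisson_sf(k, mu):
    """Return P(X >= k) for X ~ Poisson(mu).

    Computed as 1 - P(X <= k - 1); adequate for a sketch, although
    very small tail probabilities lose precision with this approach.
    """
    if k <= 0:
        return 1.0
    term = math.exp(-mu)  # P(X = 0)
    cdf = term
    for i in range(1, k):
        term *= mu / i    # P(X = i) from P(X = i - 1)
        cdf += term
    return max(0.0, 1.0 - cdf)


def hotspot_elements(elements, background_rate, alpha=0.05):
    """Return names of elements with more SNVs than expected.

    elements: list of (name, length_bp, snv_count) tuples (hypothetical data)
    background_rate: assumed expected number of somatic SNVs per bp
    alpha: family-wise error rate, Bonferroni-corrected across elements
    """
    if not elements:
        return []
    threshold = alpha / len(elements)
    hits = []
    for name, length_bp, snv_count in elements:
        expected = background_rate * length_bp
        if poisson_sf(snv_count, expected) < threshold:
            hits.append(name)
    return hits


# Hypothetical elements: (name, length in bp, observed somatic SNVs)
example = [("HERV_A", 5000, 30), ("HERV_B", 5000, 6), ("HERV_C", 8000, 9)]
print(hotspot_elements(example, background_rate=1e-3))
```

In practice the background rate would need to account for local mutation rate, sequence context, and mappability of repetitive regions, all of which a constant per-bp rate ignores.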


