We are IntechOpen, the world's leading publisher of Open Access books Built by scientists, for scientists


## **Meet the editor**

Michael S.D. Kormann is currently an Assistant Professor for Translational Genomics and Gene Therapy at the University of Tübingen, Germany, and works at the University Children's Hospital - Section I - Pediatric Infectiology & Immunology. He obtained his PhD from Ludwig-Maximilians-University, Munich, Germany. Dr. Kormann serves as an Editor, Reviewer and Editorial Board member for many scientific journals and books. He is a member of several scientific societies, including German CF Association, and American Society of Gene and Cell Therapy. Dr. Kormann has a total of 22 publications with more than 800 citations. He has three patents and has won several awards and prizes for his work.

## Contents

#### **Preface**

#### **Section 1 Gene Delivery Technologies**

#### Chapter 1 **Next-Generation Therapeutics: mRNA as a Novel Therapeutic Option for Single-Gene Disorders** Tatjana Michel, Hans-Peter Wendel and Stefanie Krajewski

#### Chapter 2 **Gene Delivery Technologies for Efficient Genome Editing: Applications in Gene Therapy** Francisco Martin, Sabina Sánchez-Hernández, Alejandra Gutierrez-Guerrero and Karim Benabdellah

#### **Section 2 Gene Editing and Gene Correction Technologies**

#### Chapter 7 **Application of Genome Editing Technology to MicroRNA Research in Mammalians** Lei Yu, Jennifer Batara and Biao Lu

#### Chapter 8 **Emerging Gene Correction Strategies for Muscular Dystrophies: Scientific Progress and Regulatory Impact** Houria Bachtarzi and Tim Farries

### Preface

Treatment of diseases caused by dysfunctional or insufficiently expressed proteins can, in principle, be addressed either by administration of the functionally active protein itself or by its genetically encoded precursors, the corresponding gene (DNA) or its transcript (mRNA). Administration of the functional protein has been utilized for the successful treatment of a number of diseases, including hemophilia and alpha-1 antitrypsin deficiency. However, the mode of delivery remains a major obstacle to protein therapy, especially in non-systemic diseases, where administration via standard routes, such as oral, intravenous (i.v.) or intramuscular (i.m.) injection, is often ineffective, as the therapeutic protein is metabolized or cleared before it can enter the target tissue or target cells.

A promising alternative to gene addition and transcript replacement approaches is gene correction through editing of the endogenous gene. In this approach, a mutation is corrected in its endogenous chromosomal locus using specific endonucleases such as Zinc Finger Nucleases (ZFNs), TAL Effector Nucleases (TALENs) or CRISPR/Cas9 systems, together with a donor template. The endonuclease cuts precisely at the predetermined locus in the genome, creating a double-strand break, which initiates homologous recombination. During recombination, a manipulation or correction template will be favored over the homologous chromosome if present in excess. When a gene defect is corrected at the genomic level, issues such as inappropriate tissue specificity, timing, level and duration of expression can be completely avoided, because the targeted gene remains under natural, endogenous control.

Gene therapy technology today is more effective and precise than it has ever been. The progress of these tools and their applications is highlighted in this book.

Enjoy this fascinating insight into modern genetic engineering.

#### **Michael S.D. Kormann, PhD**

University Children's Hospital - Section I, Pediatric Infectiology & Immunology, Translational Genomics and Gene Therapy, Lothar-Meyer-Bau, Tübingen, Germany

**Gene Delivery Technologies**

## **Next-Generation Therapeutics: mRNA as a Novel Therapeutic Option for Single-Gene Disorders**

Tatjana Michel, Hans-Peter Wendel and Stefanie Krajewski

Additional information is available at the end of the chapter

http://dx.doi.org/10.5772/62243

#### **Abstract**

In single-gene disorders, such as α1-antitrypsin deficiency (AATD), hemophilia B (clotting factor IX deficiency), and lecithin-cholesterol acyltransferase deficiency (LCATD), a gene mutation causes missing or dysfunctional protein synthesis, which in turn can lead to serious complications for the patient affected. Furthermore, single-gene disorders are associated with severe, early-onset conditions and necessitate expensive lifelong care. Today, therapeutic treatment options remain limited, cost-intensive, or ineffective. Therefore, the novel mRNA-based therapeutic strategy for the treatment of single-gene disorders, which is based on the induction of *de novo* synthesis of the functional proteins, has extraordinary potential. After the delivery of the specific mRNA to the target cells, the desired protein is expressed by the cells' own translational machinery, and hence, a fully functional protein replaces the defective or missing protein. mRNA therapy provides an innovative, highly promising, and inexpensive therapeutic approach and will thus lead to new advances in the treatment of single-gene disorders.

**Keywords:** α1-antitrypsin deficiency, hemophilia B, lecithin-cholesterol acyltransferase deficiency, mRNA therapy, single-gene disorders, messenger RNA, next-generation therapeutics, gene mutation

#### **1. Introduction**

The human body is made up of trillions of cells, which have special, well-defined functions, such as the transportation of oxygen in the blood. Proteins carry out all these functions that are necessary for life. In most of the cells, the genetic information is encoded on the 23 pairs of chromosomes found in the nucleus. On each chromosome, the information for the production of a great range of different proteins is contained in genes made up of DNA. There are approximately 25,000 protein-encoding genes in the human genome [1].

© 2016 The Author(s). Licensee InTech. This chapter is distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Protein synthesis takes place in two major steps. First, the DNA is transcribed into mRNA in the nucleus; second, the mRNA is translated into the protein in the cytoplasm.
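The flow from DNA to mRNA to protein can be illustrated with a minimal Python sketch. It assumes that the input is the DNA coding (sense) strand, uses the standard genetic code, and ignores splicing, capping, and polyadenylation; the function names `transcribe` and `translate` are illustrative choices, not part of any established library.

```python
from itertools import product

# Standard genetic code in the conventional U, C, A, G order
# (first, second, third base); "*" marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def transcribe(coding_strand: str) -> str:
    """Return the mRNA that matches a DNA coding strand (T replaced by U)."""
    return coding_strand.upper().replace("T", "U")


def translate(mrna: str) -> str:
    """Translate from the first AUG up to (not including) the first stop codon."""
    start = mrna.find("AUG")
    if start == -1:
        return ""  # no start codon, so nothing is translated
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]  # KeyError on non-ACGU letters
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


mrna = transcribe("ATGGCCTGA")
print(mrna)             # AUGGCCUGA
print(translate(mrna))  # MA (methionine, alanine)
```

A mutation in the DNA input changes the mRNA and, depending on the codon affected, the resulting amino acid sequence, which is the basis of the single-gene disorders discussed in this chapter.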

A mutation in the DNA sequence of a single gene leads to dysfunction of that gene and thus to a single-gene disorder. As a result, the protein the gene codes for is either altered or missing, which can result in serious complications in the human body.

Over the last few years, it has become apparent that single-gene disorders are far more numerous than previously assumed, and more than 1800 single-gene disorders have been identified [2]. Moreover, single-gene disorders are associated with severe, early-onset conditions necessitating lifelong care [3].

#### **2. Examples of single-gene disorders**

#### **2.1. α1-Antitrypsin (AAT) deficiency (AATD)**

AAT is a serine protease inhibitor belonging to the serpin superfamily, and it is predominantly synthesized in the liver and released into the bloodstream. By inhibiting neutrophil elastase, which is released from activated neutrophil granulocytes and macrophages, AAT plays a pivotal role in the prevention of proteolytic damage to the host tissue in the presence of inflammatory processes. Physiological AAT plasma concentrations are within the range of 20 to 53 μmol/L (150–300 mg/dL).

Mutations in the SERPINA1 gene are associated with AATD, one of the most common metabolic diseases in Europe. AATD, first described in 1963 by Laurell and Eriksson, affects approximately 1 in 2000 to 1 in 5000 individuals [4, 5]. People suffering from AATD have AAT levels below 11 μmol/L (80 mg/dL) and are predisposed to severe lung diseases, such as chronic obstructive pulmonary disease (COPD), or liver diseases.
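The concentration thresholds quoted above can be written as a small lookup, shown here as a minimal Python sketch. It uses only the two values given in the text (a physiological range of 20 to 53 μmol/L and an AATD threshold of 11 μmol/L); the function name and the labels for values outside those ranges are illustrative assumptions, and the sketch is not a diagnostic rule.

```python
# Thresholds taken from the text, in micromoles per litre.
AATD_THRESHOLD_UMOL_L = 11.0
PHYSIOLOGICAL_RANGE_UMOL_L = (20.0, 53.0)


def classify_aat(plasma_aat_umol_l: float) -> str:
    """Map a plasma AAT concentration onto the ranges named in the text."""
    if plasma_aat_umol_l < 0:
        raise ValueError("concentration cannot be negative")
    low, high = PHYSIOLOGICAL_RANGE_UMOL_L
    if plasma_aat_umol_l < AATD_THRESHOLD_UMOL_L:
        return "deficient"
    if plasma_aat_umol_l < low:
        # The text does not name this band; the label is an assumption.
        return "intermediate"
    if plasma_aat_umol_l <= high:
        return "physiological"
    return "above physiological range"


for value in (8.0, 15.0, 30.0):
    print(value, classify_aat(value))
```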

#### **2.2. Hemophilia B**

Hemophilia B, also called Christmas disease, is the second most common form of hemophilia, affecting approximately 20% of those diagnosed with hemophilia. It is a bleeding disorder that is typically inherited and characterized by a lack of clotting activity of factor IX (FIX; Christmas factor), a blood clotting factor that is synthesized in hepatocytes and plays a crucial role in blood coagulation. Depending on the bleeding phenotype of hemophilia B, which is classified as mild, moderate, or severe, and the sufferer's overall health, symptoms vary from prolonged, sometimes excessive bleeding to serious bruising and joint pain [6].

#### **2.3. Lecithin-cholesterol acyltransferase (LCAT) deficiency (LCATD)**

The gene for LCAT is localized on chromosome 16 and primarily expressed in the liver. After secretion to the plasma, it is primarily found attached to circulating high-density lipoprotein (HDL) particles [7]. LCAT converts free cholesterol into cholesterol esters on the surface of HDL, thus removing cholesterol from the blood and tissues [8].

Mutations in the LCAT gene result in deficient or absent catalytic LCAT activity, leading to a reduction in the enzyme's ability to attach cholesterol to lipoproteins. Hence, deficiency leads to the accumulation of unesterified cholesterol in different tissues (e.g., in the cornea, erythrocytes, or kidneys) and may lead to corneal opacities, renal failure, or hemolytic anemia [9]. Due to the accumulation of cholesterol in the lining of the arteries, LCATD sufferers have an increased risk for premature atherosclerosis.

#### **2.4. Familial hypercholesterolemia (FH)**

FH is a genetic disease characterized by high low-density lipoprotein (LDL) cholesterol levels in the blood. It is caused by a defect in the gene for the LDL receptor (LDLR) that prevents the receptor from taking up LDL from the blood into the cell for metabolization. The symptoms in patients with FH range from harmless fatty skin deposits called xanthoma to life-threatening atherosclerotic vascular disease, which can culminate in myocardial infarction or stroke [10]. In patients suffering from the homozygous form of FH, such life-threatening complications occur in infancy. Statins are currently used for standard therapy, but their efficacy is controversial [11].

#### **2.5. Available treatment options**

Different treatment options exist depending on the genetic disorder. Treatment of AATD is currently achieved by aerosol or intravenous augmentation therapy with purified and pooled human plasma AAT protein [12]. Furthermore, the risks of proinflammatory stimuli to the lung need to be minimized by smoking cessation, the use of bronchodilators, etc. Augmentation therapy is, however, very expensive, because the AAT protein derived from a healthy donor needs to undergo a rigorous screening process before it is ready to be infused [13].

Patients suffering from hemophilia B are routinely treated with recombinant FIX concentrates, which have greatly reduced the mortality associated with hemophilia B. However, there are still significant drawbacks of this existing therapy, including the necessity of multiple weekly infusions for patients with severe hemophilia B as well as repeated bleeding despite prophylactic therapy, which can cause long-term damage in joints and other tissues.

The therapeutic up-regulation of LCAT function has gained interest in recent years, not only as an enzyme replacement therapy for LCATD syndromes but also as a potential therapeutic strategy for reducing atherosclerosis [14]. In 2013, the first case report was published highlighting the success of LCAT replacement therapy using a recombinant enzyme form in a 53-year-old patient.

By contrast, many gene therapy-based systems for the treatment of various genetic disorders have been developed and investigated during the last few years. Gene therapy promises the permanent expression of the functional protein after the incorporation of the corresponding gene into the host genome. However, as yet, gene therapy has not found wide clinical application, because, depending on the vectors used, it can be associated with risks for the patient, such as insertional mutagenesis, carcinogenic effects, immune responses, low gene-transfer efficiency, or protein misfolding [15].

#### **3. mRNA as a novel therapeutic option**

#### **3.1. mRNA as a therapeutic agent**

Alongside gene therapy, wherein genetic defects are corrected by the introduction of specific DNA sequences into the genome, mRNA-based therapy promises new advances in the treatment of single-gene disorders. Via the delivery of a specific *in vitro*-generated mRNA to the target cells, the expression of a desired protein can be induced. The idea of using specific mRNAs to produce a protein of interest instead of protein replacement via DNA gene therapy was described 25 years ago by Wolff and colleagues [16]. At that time, however, the stability of the mRNA was poor and the immunogenicity was too high for the therapy to be practical. In recent years, the administration of the mRNA as a therapeutic agent has shown enormous potential in the fields of disease treatment, regenerative medicine, and vaccination [17–21].

Especially for monogenic diseases, mRNA therapy can be a highly beneficial alternative to classical gene therapy. Because monogenic diseases result in defective or missing protein synthesis, protein replacement therapy is primarily used for these kinds of diseases. The mRNA can be easily produced in large amounts for the protein of interest in comparison to pDNA, and if the mRNA is used, there is no need to integrate promoter and terminator regions in the sequence [22].

Moreover, the mRNA-based therapeutic strategy has significant advantages:


Overall, this therapeutic strategy could be safely used in patients, and it is more cost-effective and easier to manipulate than gene therapy.

The standard *in vitro* procedure for mRNA generation begins with a plasmid that contains the coding sequence of the protein. First, this sequence is amplified using polymerase chain reaction (PCR). The generated DNA template contains all the important elements of the mRNA. Second, the amplified PCR product is used to generate the mRNA. For this purpose, *in vitro* transcription (IVT) is performed using the T7 or SP6 RNA polymerase, which synthesizes the mRNA. After the purification and quality control steps, the mRNA is ready to use (**Figure 1**).

**Figure 1.** Generation process of the mRNA.

The mRNA is a single-stranded molecule containing a poly(A)-tail at the end and a 5′-cap at the beginning. The coding sequence for the protein is marked by a start codon and a stop codon. The untranslated regions (UTRs) are in between the cap/tail and the coding sequence (**Figure 2**) [16].

**Figure 2.** Schematic overview of the general structure of an IVT mRNA.
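The layout of an IVT mRNA (cap, 5′-UTR, coding sequence, 3′-UTR, poly(A)-tail) can be represented as a simple data structure. The following Python sketch is illustrative only: the class name `IvtMrna`, its field names, and the example sequences are hypothetical, and the cap is stored as a label because it is a chemical structure rather than part of the nucleotide sequence.

```python
from dataclasses import dataclass

START_CODON = "AUG"
STOP_CODONS = {"UAA", "UAG", "UGA"}


@dataclass(frozen=True)
class IvtMrna:
    """Hypothetical container for the elements of an IVT mRNA."""
    five_prime_utr: str
    coding_sequence: str
    three_prime_utr: str
    poly_a_length: int
    cap: str = "5'-cap"  # label only; not part of the sequence string

    def __post_init__(self) -> None:
        cds = self.coding_sequence
        if len(cds) % 3 != 0:
            raise ValueError("coding sequence length must be a multiple of 3")
        if not cds.startswith(START_CODON):
            raise ValueError("coding sequence must begin with the start codon AUG")
        if cds[-3:] not in STOP_CODONS:
            raise ValueError("coding sequence must end with a stop codon")
        if self.poly_a_length < 0:
            raise ValueError("poly(A)-tail length cannot be negative")

    def sequence(self) -> str:
        """Nucleotide sequence from the 5' end to the 3' end, without the cap."""
        return (
            self.five_prime_utr
            + self.coding_sequence
            + self.three_prime_utr
            + "A" * self.poly_a_length
        )


# Toy example with made-up UTRs and a two-codon open reading frame.
toy = IvtMrna("GGGAAA", "AUGGCCUGA", "CCC", poly_a_length=5)
print(toy.sequence())  # GGGAAAAUGGCCUGACCCAAAAA
```

Validating the coding sequence at construction time reflects the statement that the coding region is delimited by a start codon and a stop codon.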

However, IVT mRNA is very sensitive to degradation by nucleases, which limits its suitability for transfections and therapeutic applications [21]. If the mRNA is to be applied as a therapeutic agent, special modifications are needed. The modifications should be nonmutagenic and should not interfere with the translation machinery of the cell [23].

The cap and the poly(A)-tail are important for improving the translation of the mRNA. The mRNA cap is responsible for recognition by and binding to the ribosomes and for the initiation of translation [24]. It was reported that part of the unmodified cap structure could be incorporated into the mRNA in the wrong orientation during IVT. In that case, the mRNA cannot be recognized by the cap-binding complex (CBC) and translation cannot proceed [25]. This can be prevented by using an anti-reverse cap analog (ARCA) during IVT. An ARCA contains a 5′-5′ triphosphate bridge, and the 3′-OH is replaced by OCH3, making it impossible for the cap to be incorporated in the wrong orientation [26]. The 5′-5′ bridge and the chemical modifications at the 3′-position lead to translatable mRNA of high quality and great translation efficiency [27, 28]. Furthermore, ARCA-capped mRNA is more resistant to hydrolases [29].

The poly(A)-tail, which binds to the poly(A)-binding protein (PABP), is also very important for the stability and translation of the mRNA [30, 31]. The PABP interacts with the CBC, giving the mRNA molecule a circular structure [15, 32]. This circular structure minimizes the contact surface for nucleases [33]. If the poly(A)-tail is shorter than 12 adenosines or removed completely, the cap structure is cleaved and the mRNA is degraded. Thus, the poly(A)-tail is very important for maintaining the cap structure and delaying the degradation of the mRNA [15, 34]. A poly(A)-tail of more than 60 adenosines increases the translation efficiency of the mRNA [35]. Therefore, the poly(A)-tail and the cap structure contribute to the stability of the mRNA [27, 36].

The 5′- and 3′-UTRs include specific regulatory sequence regions that modulate the stability and translation of the mRNA [16]. The 3′-UTR contains α- and β-globin sequences that enhance the stability and translation of the mRNA [16, 37, 38]. Furthermore, the 3′- and 5′-UTRs inhibit the decapping and degradation of the mRNA [16, 39, 40]. Conversely, if limited protein production is desired, faster mRNA degradation can be achieved by integrating AU-rich elements in the 3′-UTR [41].

The protein-coding region of the mRNA can be optimized in two ways. First, the use of optimized codons leads to improved translation of the sequence to the desired protein. Second, via the optimization of the base composition, endonucleolytic degradation can be reduced [16].
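The two poly(A)-tail lengths quoted in this section (fewer than 12 adenosines, more than 60 adenosines) can be turned into a simple check. The Python sketch below is a hedged illustration: the function names and the category labels are assumptions, the "intermediate" band is not characterized in the text, and the tail is measured naively as the run of A residues at the 3′ end, so a 3′-UTR that itself ends in A would be counted as part of the tail.

```python
# Length thresholds taken from the text (number of adenosines).
MIN_STABLE_TAIL = 12
ENHANCED_TRANSLATION_TAIL = 60


def poly_a_tail_length(mrna: str) -> int:
    """Count the uninterrupted run of A residues at the 3' end."""
    return len(mrna) - len(mrna.rstrip("A"))


def assess_poly_a_tail(mrna: str) -> str:
    """Classify an mRNA by poly(A)-tail length using the thresholds above."""
    tail = poly_a_tail_length(mrna)
    if tail < MIN_STABLE_TAIL:
        return "prone to decapping and degradation"
    if tail > ENHANCED_TRANSLATION_TAIL:
        return "enhanced translation efficiency"
    return "intermediate"  # label is an assumption, not from the text


body = "AUGGCCUGAC"  # toy transcript body ending in C
for n in (5, 30, 100):
    print(n, assess_poly_a_tail(body + "A" * n))
```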

#### **3.2. Potential immune response against IVT mRNA**

In contrast to the recombinant proteins used in substitution therapy, mRNA-translated proteins are produced autologously by the cell's own machinery. Usually, they undergo the correct modification and folding [16], but protein aggregation and immune reactions to the translated protein cannot be excluded. Therefore, it is necessary to investigate the activation of the immune system and the production of antibodies against IVT mRNA-translated proteins in clinical studies. Furthermore, the combination of the immune reaction to the translated mRNA protein and the foreign mRNA within the cells can lead to immunopathology in the cells or organs [16, 42].

Regarding protein replacement therapies, the activation of the immune system caused by mRNA administration might be disadvantageous. It is well known that foreign mRNAs as well as pDNAs activate the immune system via recognition through toll-like receptors (TLRs). TLRs recognize different pathogen patterns resulting in the expression of different cytokines. TLR3 and TLR7/8 are responsible for the recognition of different RNA types [43]. The mRNA and other RNA types, such as small interfering RNA (siRNA) or double-stranded RNA (dsRNA), are recognized by TLR3 [44–46], whereas TLR7/8 is activated by single-stranded RNA (ssRNA) [47]. In nonimmune cells, the recognition of the mRNA occurs through the retinoic acid-inducible gene I (RIG-I), which is activated by short RNA and dsRNA [48] and leads to interleukin-1β (IL-1β) production [49]. It is known that the IVT mRNA leads to a strong release of tumor necrosis factor-α (TNF-α) if the mRNA has no modifications [50]. Additionally, a strong type I interferon (type I IFN) response of the cells is induced upon contact with exogenous mRNAs. This is also induced by mRNAs that form secondary structures, such as loops or hairpins, or by mRNAs that bind to incompletely synthesized mRNA fragments or incompletely degraded DNA fragments [43, 51]. IFNs then activate the antiviral genes in the genome and lead to translational arrest and the degradation of the mRNA [43]. However, the immune reaction can be avoided if the mRNA is purified and if modified nucleotides are inserted during IVT [50, 52–54].

The incorporation of different modified nucleotides, such as pseudouridine (pseudo-U), 2-thiouridine (2-tU), 5-methyluridine (5-mU), 5-methylcytidine (5-mC), or *N*<sup>6</sup>-methyladenosine (N6-mA), can prevent the cellular immune response [50, 53]. These modified nucleotides incorporated in the mRNA help to avoid the activation of TLRs [50]. In particular, pseudo-U and 2-tU prevent the recognition of IVT mRNA by RIG-I [16, 54, 55]. To minimize the immune activation and optimize the translation efficiency of the mRNA to the protein, high-performance liquid chromatography (HPLC) purification should be applied to eliminate the dsRNA contamination that can still be present after the IVT process [56].

However, mRNA modification may also represent another risk. The naturally existing mRNA is degradable by the RNases in the extracellular space, but the modification of the IVT mRNA makes degradation more difficult [57]. Some of the modified nucleotides are associated with mitochondrial toxicities and hepatic failure and play a role in viral and tumor cell replication [58, 59]. Here too, further investigations and clinical trials are necessary to assess the risks and benefits of IVT mRNA modifications.

#### **3.3. mRNA delivery systems and specific targeting**

For clinical mRNA applications, a specific delivery system is needed, because otherwise the delivery of the mRNA to the target cells is unreliable and inefficient [27]. Therefore, the development and engineering of safe and effective delivery vectors for mRNA therapy is essential [60]. Viral and nonviral vectors can be used to bring the mRNA into the cells. For the direct translation of the mRNA to the protein, only positive-stranded viruses can be applied [61]. The genomic RNA of negative-stranded viruses is not infectious by itself and requires an RNA-dependent RNA polymerase to produce translatable mRNA [62]. However, viral vectors have some limitations; for example, they can be carcinogenic [63], they can activate immune responses [64], they can be difficult to produce [65], and they have a limited packaging capacity [66].

Nonviral vectors have lower immunogenicity, they can deliver large genetic molecules, and they are easier to produce [60, 67, 68]. Nonviral delivery systems can be subdivided into direct and indirect delivery systems. Direct delivery is possible via electroporation or gene guns. Electroporation is an early and efficient method of transporting mRNA into cells [69], whereby electrical pulses make the cell membrane permeable for the entry of the mRNA into the cytosol. This method does not induce immune activation, which may occur when mRNA carriers are used [27, 69]. Transfection using a gene gun requires heavy metal particles to get the nucleic acid into the cell [70]. This method allows the delivery of the mRNA to mammalian organs with minimal damage and leads to transient protein expression in the target tissues [70]. Self-assembled complexation of negatively charged mRNAs with positively charged liposomes or polymers into lipoplexes or polyplexes, respectively, is the most widely used method of bringing mRNA into cells [60]. The lipoplexes can be taken up by the cells in two different ways. The first way is by the endocytosis of the lipoplexes, whereby 98% of the lipoplexes enter the cell. The second way is through the fusion of the cell membrane and the lipoplexes, which results in the uptake of the remaining mRNA [71]. After the release of the mRNA into the cytosol, the protein translation can begin. The encapsulation of the mRNA into liposomes is a rapid, transient, and cell cycle-independent delivery method [27]. The translation of the mRNA to the protein can be measured 1 h after the transfection in nondividing cells [72].

For *in vivo* applications, the perfect mRNA delivery vector has to overcome various barriers:


For systemic delivery, the lipid and polymer complexes show protective properties against nucleases, whereby the mRNA is protected and the stability is increased [73]. However, liposome complexes sometimes interact with serum proteins. Together, they form aggregates or clots and are cleared rapidly [74]. The conjugation of the complexes with polyethylene glycol (PEG) helps to inhibit the nonspecific uptake and attachment to serum proteins [75].

Nanoscale delivery systems (10–200 nm) enhance the uptake efficiency and reduce the systemic toxicity [75]. The use of PEG coating of the liposomes increases the blood circulation time and avoids the detection of liposomes by immune cells [75, 76]. The liposomes have many advantages over many other delivery systems, such as low batch-to-batch variability, easy synthesis, biocompatibility, and scalability [75]. The liposome surface can be functionalized by conjugation to chemically reactive lipid head groups [77]. This property makes it possible to functionalize the surface with ligands and thus enhance targeted delivery [75, 78].

Different methods of application have been tested for *in vivo* delivery. Polyplex nanomicelles applied with hydrodynamic intravenous injection have been shown to effectively deliver the mRNA to the liver in mouse models. This method shows a strong protein expression in nearly 100% of liver cells [79]. Intramuscular or intraperitoneal injection of erythropoietin (EPO)-coding mRNA complexed with cationic lipids leads to significantly high levels of EPO *in vivo* [20, 80]. Intratracheal and intranasal applications of Foxp3 mRNA show protective properties against asthma in mice [81]. Furthermore, it has been shown that intradermal [82] and intranodal [83] applications of the mRNA in animal models resulted in immunization against tumors. Many other methods of applying the mRNA have been described in publications in recent years, and this field is developing rapidly [19, 84]. Furthermore, patient-centered applications have improved, especially for mRNA therapy. Nebulization with the Pari-Boy® is the standard method used to apply drugs to the lung. A study of the influence of nebulization on mRNA transfection efficiency shows that the nebulization of complexed mRNA does not affect transfection efficiency *in vitro* [85].

#### **3.4. mRNA applications**

Gene therapy allows the replacement of a defective gene through substitution and integration of the correct genetic code in the genome. The genome integration of this genetic code via viruses guarantees highly efficient gene replacement. However, undesirable effects, such as mutagenesis and innate immune response, may jeopardize the life or safety of the patient [27, 86]. Nonviral gene delivery is safer, but it is also associated with lower transfection efficiency due to insufficient nuclear transport. Furthermore, modifications of the pDNA, including adding a strong constitutive promoter to improve transcription efficiency, may lead to unexpected alterations in the genome [27, 86, 87].

Protein-substitution therapy is associated with adverse reactions, such as headache, dizziness, and nausea [88], and high costs [89]. Recombinant proteins have been expressed in different microorganisms [90–92], plant cells [93], and human cells [94–96], but they are linked to disadvantages such as nonglycosylation or incorrect glycosylation of the product as well as product contaminations with endotoxins. Plasma-derived proteins, which are purified from human donor blood, are limited and prone to contamination.

The mRNA is an alternative that overcomes the disadvantages of pDNA and direct protein substitution [15]. The mRNA does not need to enter the nucleus for transcription, because the mRNA is directly translated in the cytosol of the cells; thus, there is no risk of inserting exogenous DNA into the genome. Furthermore, the mRNA uses the cell's own translation machinery and requires no strong promoter. Effective mRNA transfer can also take place in nondividing cells, and immunogenicity can be overcome by different modifications of the mRNA molecule [27]. Moreover, in contrast to protein-substitution therapy, the expression of receptors and intracellular molecules can be induced with specific mRNAs (**Figure 3**).

**Figure 3.** The way from encapsulated mRNA to protein expression in the target cell.

Several studies describe that the use of different mRNAs leads to an increase in the respective protein *in vitro* and *in vivo*. The first application of the mRNA was performed in 1992 [97]. However, the broad application of IVT mRNA only became possible once modified nucleotides were used, which reduced immune reactions and increased mRNA translation efficiency [50, 56]. For example, the successful expression of surfactant protein B (SP-B) at therapeutically relevant levels was shown in a mouse model after the application of SP-B-encoding mRNA via direct administration to the lung. The results showed that the inflammation of the lung (and the otherwise inevitable respiratory failure and death) was prevented [20, 98]. Likewise, the expression of the regulatory T-cell transcription factor (FOXP3) led to the prevention of asthma in a mouse model. After the administration of FOXP3-coding mRNA to the lung, the expressed protein protected the lung from allergen-induced inflammation. This mRNA approach can be used as a preventive and therapeutic drug [81]. In a different study, the expression of the AAT protein was shown after the transfection of cells with AAT-encoding mRNA. The level of AAT was measurable in the cellular supernatant 48 h after transfection, and the functionality of the protein was proven. All these studies show that mRNA therapy also promises a novel therapeutic strategy in the treatment of single-gene diseases such as AATD [99].

In the case of FH, it is also possible to induce the expression of functional LDLR after the transfection and thus regulate LDL metabolism. In this case, functional LDLR expression can help to prevent secondary diseases, such as stroke and atherosclerotic plaques. Although the induction of the expression of proteins, such as LCAT and FIX, can be performed similarly to the approach of AATD-mRNA therapy, different organs are targeted. Regarding hemophilia B and LCATD, hepatocytes should be transfected, which can be achieved by intravenous injections of the complexed protein-encoding mRNA.

Depending on the application method and the genetic defect, it is important that the cell type of interest is targeted and transfected with IVT mRNA. Additionally, the dose-effect relationship could be a challenge *in vivo*, whereby the bioavailability and individual variations also play a pivotal role and potentially make individual dose adaptation necessary [16]. Overall, mRNA application promises an effective and low-cost therapeutic strategy with the potential to efficiently correct serious monogenic diseases.

#### **3.5. Summary**

Overall, the mRNA as a novel therapy for monogenic diseases has many advantages over the currently existing treatment options such as substitution therapy or gene therapy. Using the mRNA, the expression of nearly all proteins can be induced. In this way, not only extracellular proteins, but also proteins, which are important for the function of the cells, such as growth factors or receptors can be generated. Furthermore, the mRNA does not need to penetrate into the nucleus. The effect of the mRNA is transient, thereby enabling the precise control of protein expression. Depending on the introduced modifications, the half-life of the respective mRNA and thus a reduction or increase in mRNA stability can be determined. Developments in recent years, such as modification and purification methods, have made it possible to use the mRNA as a therapeutic agent, because it became possible to control the immune reaction, which is typically triggered by exogenous mRNA. Further developments increased the extracellular stability and made mRNA transfection of the cells with nonviral vectors possible and efficient. The mRNA can be produced in large batches and under good manufacturing practice (GMP) conditions without batch-to-batch variations. Thereby, the production of specific mRNA is significantly cheaper compared to the production of the corresponding protein for substitution therapy.

mRNA as a therapeutic agent could be a great help for patients suffering from monogenic diseases. The flexibility and variety of proteins that can be replaced by the cell's own translational machinery through the use of mRNA is nearly unlimited. This makes mRNA a unique therapeutic molecule, which will revolutionize therapeutic options for affected patients in the coming years.

#### **Author details**

Tatjana Michel, Hans-Peter Wendel and Stefanie Krajewski\*

\*Address all correspondence to: stefanie.krajewski@uni-tuebingen.de

Department of Thoracic, Cardiac and Vascular Surgery, University of Tuebingen, Tuebingen, Germany

#### **References**


fialuridine (FIAU), an investigational nucleoside analogue for chronic hepatitis B. *N Engl J Med* 1995, 333(17):1099–1105.


## **Gene Delivery Technologies for Efficient Genome Editing: Applications in Gene Therapy**

Francisco Martin, Sabina Sánchez-Hernández, Alejandra Gutierrez-Guerrero and Karim Benabdellah

Additional information is available at the end of the chapter

http://dx.doi.org/10.5772/62776

#### **Abstract**

Specific nucleases (SNs), including ZFNs, TALENs, and CRISPR (clustered regularly interspaced short palindromic repeats), are powerful tools for genome editing (GE). These tools have achieved efficient gene repair and gene disruption in human primary cells. However, their efficiency and safety must be improved before translation into the clinic. In particular, one of the main hurdles of GE technology is the delivery of the different components into the nucleus of target cells. Successful gene editing requires delivery of the SNs and/or the donor DNA into a large number of target cells in order to have a therapeutic benefit. In addition, the delivery must be nontoxic and the SNs must be innocuous to the target cells. In this chapter, we will summarize the different ways to deliver SNs and donor DNA.

**Keywords:** gene edition, gene therapy, delivery, viral vectors, specific nucleases

#### **1. Introduction**

Although genome editing (GE) technologies have been used for more than 30 years, their efficiency and specificity were too low for use in gene therapy (GT). However, the development of specific nucleases (SNs), which can enhance homology-directed repair (HDR) up to 10,000-fold and allow the generation of specific mutations, has opened new possibilities for the use of GE in GT applications. SNs can create specific double-strand breaks (DSBs) at target locations in the genome. These DSBs must be repaired by the cell's endogenous mechanisms, either by HDR, a high-fidelity repair process, or by nonhomologous end joining (NHEJ), an error-prone process that results in insertions or deletions (indels) at the cleavage site. These repair mechanisms are used as a platform for GE. Depending on the mechanism used by the cells to repair the DSBs, we can repair, insert, or delete DNA fragments in the genome of the target cells. There are four families of SNs in use: meganucleases (MNs) [1], zinc finger nucleases (ZFNs) [2], transcription activator-like effector nucleases (TALENs) [3], and the clustered regularly interspaced short palindromic repeats (CRISPR)/CRISPR-associated 9 (Cas9) system (CRISPR/Cas9) [4], also named RNA-guided nucleases (RGNs). MNs, ZFNs, and TALENs use the principles of DNA-protein recognition to target specific loci. However, the difficulties of protein design, synthesis, and validation blocked the widespread adoption of these engineered nucleases for routine use in many laboratories. The field is experiencing a new phase thanks to the development of the RGNs [4]. This two-component system can achieve specific cleavage at any target DNA location, guided by a small RNA molecule named gRNA [5].

© 2016 The Author(s). Licensee InTech. This chapter is distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

However, in spite of the great advances in SN design, the translation of GE technologies into the clinic still requires several refinements in terms of both specificity and efficiency [6]. The efficiency of a particular GE strategy largely relies on the efficiency of delivery of the SNs and/or the donor DNA. In this chapter, we will discuss the different tools available for the delivery of the different components required for GE.
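The RNA-guided targeting mentioned above can be illustrated with a short sketch. It assumes the *S. pyogenes* Cas9 convention of a 20-nt protospacer followed by an NGG protospacer-adjacent motif (PAM); the PAM requirement is not discussed in this chapter, and the function names and example sequence are hypothetical. The sketch only lists candidate sites on one strand and says nothing about off-target activity.

```python
import re

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]

def find_spcas9_sites(seq, protospacer_len=20):
    """List (start, protospacer, pam) for every 20-nt + NGG site on this strand.

    Assumption: SpCas9-style targeting (NGG PAM). Other nucleases use other PAMs.
    """
    seq = seq.upper()
    # A lookahead is used so that overlapping candidate sites are all reported.
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % protospacer_len)
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(seq)]

if __name__ == "__main__":
    # Hypothetical target region, used for illustration only.
    region = "TTGACCTGAAGCTGACCTGAAGTTCATCTGCACCACCGGCAAGCTGCC"
    for strand, s in (("+", region), ("-", reverse_complement(region))):
        for start, protospacer, pam in find_spcas9_sites(s):
            print(strand, start, protospacer, pam)
```

Scanning the reverse complement as well covers sites on the opposite strand; positions reported for the minus strand are relative to the reverse-complemented sequence.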

#### **2. Viral-based vectors**

GE using SNs and/or a donor DNA requires vector systems that efficiently deliver both components into the target cells. This is a relatively easy task for cells that are grown in the laboratory but is much more difficult when targeting primary human cells. In general, viral-based vectors are more efficient than non-viral vectors for most primary human cells. Since transient expression of GE components is preferred over stable expression, we will focus on episomal (non-integrative) viral-based vectors.

#### **2.1. Adeno-associated virus (AAV)-based vectors**

AAVs have been used for the delivery of ZFNs; however, the limited capacity of AAV vectors makes it difficult to use them for the delivery of Cas9 and TALENs. The first report showing the efficacy of AAV vectors to deliver ZFNs was published in 2012 [7, 8]. Soon afterwards, the group of Katherine High showed that systemic delivery of AAV-ZFNs and an AAV-corrective donor template enables production of high levels of human factor IX in a murine model of hemophilia B [9]. In another study, Weber et al. [10] reported that the administration of AAV-ZFNs targeting the HBV polymerase achieved an inhibition of HBV replication. In spite of its large size, several groups have attempted to deliver *S. pyogenes* Cas9 and its gRNA using AAV vectors [11]. However, oversized AAV vectors give inconsistent results [12]. As an alternative, different groups have delivered the gRNA and the Cas9 in separate AAVs with very promising results [13]. Indeed, these two-AAV vector systems have achieved correction of dystrophin expression in a mouse model of Duchenne muscular dystrophy (DMD) [14] and correction of a metabolic liver disease [15].

Recently, the development of a new RGN based on the smaller Cas9 from *S. aureus* [16] opened the possibility of generating AAV vectors harboring both Cas9 and gRNA expression cassettes. Taking advantage of this smaller Cas9, several groups have developed "All-In-One" AAV-RGN systems, obtaining very promising results in animal models of DMD [17, 18]. In these experiments, systemic delivery of AAV-Cas9-gRNA to DMD mice resulted in the expression of the dystrophin gene, with improvements in muscle biochemistry and enhancement of muscle force.
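The packaging argument behind the "All-In-One" design can be made concrete with a rough size budget. The sketch below is illustrative only: the packaging limit, the coding sequence lengths, and the sizes of the promoter, polyadenylation signal, and gRNA cassette are rounded assumptions that do not come from this chapter, and all names are hypothetical.

```python
# All sizes are rounded, illustrative assumptions (not values from the chapter).
AAV_CAPACITY_BP = 4700      # assumed approximate AAV packaging limit
NUCLEASE_CDS_BP = {
    "SpCas9": 4100,         # S. pyogenes Cas9 coding sequence, approximate
    "SaCas9": 3200,         # S. aureus Cas9 coding sequence, approximate
}
PROMOTER_BP = 500           # assumed compact promoter
POLYA_BP = 100              # assumed short polyadenylation signal
GRNA_CASSETTE_BP = 350      # assumed promoter + gRNA scaffold

def all_in_one_size(nuclease):
    """Total size of a single cassette carrying the nuclease and its gRNA."""
    return NUCLEASE_CDS_BP[nuclease] + PROMOTER_BP + POLYA_BP + GRNA_CASSETTE_BP

def fits_in_one_vector(nuclease, capacity_bp=AAV_CAPACITY_BP):
    """True if the all-in-one cassette stays within the assumed capacity."""
    return all_in_one_size(nuclease) <= capacity_bp

if __name__ == "__main__":
    for name in NUCLEASE_CDS_BP:
        print(name, all_in_one_size(name), fits_in_one_vector(name))
```

Under these assumed numbers, only the smaller nuclease leaves room for its gRNA cassette in a single vector, which is the motivation for either splitting the components across two AAVs or switching to the smaller Cas9.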

Besides the potential of AAV as a gene delivery tool, it has been reported that these vectors are able to enhance the HDR rate in mammalian cells up to 1000-fold [19]. In fact, the delivery of corrective donor DNA (without SNs) with AAV has achieved correction of mucopolysaccharidosis [20] and hereditary tyrosinemia [21] in mouse models. Scientists have taken advantage of this property and have combined SNs with delivery of donor DNA by AAV vectors. Wang and colleagues [22] combined electroporation of ZFN mRNA (see below) with donor delivery by AAV serotype 6, achieving efficient GE in HSPCs (up to 50% of CCR5-specific insertion) and T cells. Anguela et al. [9] and Sharma et al. [23] showed that systemic delivery of AAV-ZFNs and AAV-donor to adult mice can achieve high levels of human factors VIII and IX in murine models of hemophilia A and B. Recently, Yang et al. [15] corrected a metabolic liver disease using a similar strategy. Using AAV to deliver ZFNs and donor DNA, Sharma et al. [23] reported a general strategy for liver-directed protein replacement therapies that allows site-specific integration of therapeutic transgenes within the albumin gene. The authors achieved long-term expression of human factors VIII and IX as well as lysosomal enzymes in different animal models of hemophilia, Fabry and Gaucher disease, and Hurler and Hunter syndrome. An additional property of AAV that makes these viruses an ideal tool for delivery of donor DNA is their high specificity for the target locus [24].

#### **2.2. Adenovirus (AdV)-based vectors**

AdVs are highly attractive for viral delivery of SNs, and in particular of TALENs, due to their high cargo capacity, their ability to transduce dividing and nondividing cells, and their transient expression. In addition, similarly to AAV, AdVs were first used as a tool to deliver large donor DNA for homology-directed gene-targeting experiments (without the use of SNs). This approach has been used to correct the HPRT gene in mouse ES cells [25] and the LMNA gene in pluripotent stem cells [26]. The appearance of SNs prompted scientists to use AdVs not only for delivery of the donor DNA but also for the delivery of SNs. In this context, one of the major successes was achieved by the expression of ZFNs targeting the CCR5 locus in CD4+ T cells [27–29]. In fact, AdV delivery of ZFNs was the first GE strategy to be approved for use in clinical trials. The strategy aimed to knock down expression of CCR5 (a co-receptor for HIV-1 entry into cells) in T cells derived from HIV-1 patients [29]. Moreover, given the large cargo capacity of AdVs in comparison with AAV, AdVs have been used not only for the delivery of TALENs [30] and Cas9/gRNA [31, 32] but also as a source of donor DNA templates for homology-directed gene editing after site-specific chromosomal DSB formation by ZFNs, TALENs, and RGNs. Interestingly, it was found that protein-capped adenovirus genomes favored more specific GE as HDR templates compared to uncapped linear templates. However, the strong immune response elicited by these viruses may limit their potential in clinical settings [33].

#### **2.3. Lentiviral vectors (LVs) and integration-deficient lentiviral vectors (IDLVs)**

LVs have been successfully used for efficient transduction of most cell types, including hard-to-transfect primary differentiated cells (such as neurons, T cells, or macrophages) as well as multipotent (MSCs, HSCs) and pluripotent stem cells (hESCs and iPSCs). Due to the high efficiency of LVs, scientists have also used this platform for the transient delivery of transgenes by mutating the integrase protein to develop IDLVs [34]. The development of IDLVs opened the possibility of using these systems for GE in therapeutic settings. IDLVs have been used to deliver cDNAs expressing ZFNs, TALENs, and RGNs. However, only ZFN genes have been delivered with high efficiency using these systems [24, 35–37]. In one of the first demonstrations of IDLV efficacy for GE, Lombardo et al. [35] showed that IDLV delivery of ZFNs can achieve high levels of gene addition (over 50%) in several primary human cell types, including hematopoietic stem cells. IDLVs have also been very successful for GE of T cells [38, 39].

Delivery of RGNs and TALENs by IDLVs has been more challenging due to the larger size of Cas9 and the high recombination rates of TALENs [40]. Some authors have developed LVs with a mutated reverse transcriptase to deliver mRNA, thereby avoiding recombination. Using this system, the authors showed efficient CCR5 and TCR gene suppression in different cell lines [41].

IDLVs have also been adapted for the delivery of SN proteins instead of cDNA, providing efficient targeted gene disruption in several primary cells [42]. By co-packaging ZFN or TALEN proteins and donor RNA in lentiviral particles, the authors achieved homology-directed DNA insertion and gene correction.

The ability to deliver circular DNAs into the nuclei of target cells, including quiescent cells, makes IDLVs a very interesting tool to deliver donor DNAs [35–37]. Compared with AAV and AdV, IDLVs have the advantage of enhanced efficiency in some target cells, such as hematopoietic stem cells (HSCs), a very interesting target population for GT strategies. Using IDLVs to deliver donor DNA, Genovese et al. [43] managed to restore up to 6% of CD34+ cells from a SCID-X1 patient. The main advantage of using IDLVs for delivery of the donor DNA is the high efficacy; however, quite often (5–20%) the donor DNA integrates outside the target locus. These off-target integrations can have undesired side effects and need to be monitored in detail.

#### **3. Non-viral-based vectors**

As discussed in the previous section, the different viral vectors have different applications in GE, each with its own limitations. In this section, we will discuss the best non-viral gene transfer technologies that have been developed for *ex vivo* and *in vivo* delivery of GE tools, in particular for the delivery of SNs. These systems have often been combined with viral-based methods for the delivery of donor DNAs.

#### **3.1. Nucleofection**

Nucleofector™ Technology (Lonza), also named nucleofection, is an electroporation-based system that allows high transfection efficiencies with high cell viability in most cell types, including hHSCs, dendritic cells, and iPS cells. Nucleofection of SNs in the form of DNA, RNA, and proteins has been a successful approach for GE of primary human cells. This technique has been used to achieve therapeutic benefits by NHEJ and by HDR (often combined with viral-based methods for delivery of the donor DNA).

Examples of mRNA nucleofection for NHEJ-based GT can be found in the clinical trial for Duchenne muscular dystrophy patients. The protocol resulted in the deletion of a defective sequence, partially restoring the expression of dystrophin [44]. In a similar strategy, Poirot et al. [45] nucleofected TALEN mRNA into T cells, allowing highly efficient gene disruption of the alpha-beta T-cell receptor (TCR) and CD52 (a protein targeted by alemtuzumab). These cells did not mediate graft-versus-host reactions and were resistant to alemtuzumab, increasing the safety and efficacy of CAR T-cell immunotherapies [45]. The CCR5 T cell receptor (see above) was also targeted efficiently by nucleofection of TALEN mRNAs [46]. Recently, a clinical-scale protocol for gene disruption of the PD-1 gene in tumor-infiltrating lymphocytes (TILs) has been developed [47]. In this protocol, nucleofection of T cells with ZFN mRNA resulted in an 80% reduction in PD-1 surface expression in TILs.

The first experiments showing HDR in primary human cells used nucleofection of plasmids to deliver both SNs and donor DNA [48]. The authors showed specific modification (introduction of a new restriction enzyme site) of the IL2Rgamma gene in over 10% of primary human T cells. However, efficiencies in other target cells remained too low for therapeutic applications. Different groups have shown that combining mRNA nucleofection with delivery of donor DNA by IDLVs [43] or AAV [22] gave better results due to the ability of the viral vectors to improve the efficiency of HDR.

#### **3.2. Liposomes and cationic polymers**

Liposomes and cationic polymers (e.g., polyethylenimine, PEI) allow delivery of large DNA fragments due to the interactions between the cationic charge of the particles and the anionic charge of the cell membranes. Cationic liposomes have shown good efficacy for transfection of DNA expressing ZFNs [27, 48, 49], TALENs [50], and Cas9/gRNAs [50, 51].

In a GT application, HPV-targeted TALEN plasmids were used for *in vivo* delivery using TurboFect [50], a proprietary cationic polymer (ThermoFisher Scientific). Direct application of the TurboFect-Cas9-gRNA complex to the cervix of transgenic mice displaying HPV infection and cervical cancer reduced viral loads and tumor size [50]. Other reports have shown efficient *ex vivo* and *in vivo* delivery of Cas9 protein complexed with gRNA using liposomes [53, 54].

Some groups have delivered CRISPR/Cas9 as a protein complexed with a cationic liposome reagent. Zuri et al. [54] hypothesized that highly anionic proteins could be delivered by the same electrostatics-driven complexation used by cationic liposomal reagents. They showed that the Cas9 nuclease protein complexed with the polyanionic single guide RNA could be delivered efficiently and functionally into mammalian cells using cationic lipid formulations, and that it created indels approximately 10-fold more efficiently than plasmid transfection [54].

#### **3.3. SN proteins and cell-penetrating peptides (CPPs)**

It was soon observed that ZFNs have an intrinsic cell-penetrating activity due to the positive charge of the zinc finger domains [55]. In fact, direct delivery of ZFN proteins achieved up to 24% gene disruption of CCR5 in HEK and HDF cells and up to 8% in human T cells [55]. However, unlike ZFNs, TALENs are incapable of penetrating cellular membranes [56], and therefore cell-penetrating peptides (CPPs) are required to promote cellular uptake. CPPs, also known as protein transduction domains or membrane translocation sequences, are short cationic or amphipathic peptides of 5–40 amino acids that can traverse mammalian plasma membranes [57]. Ru et al. fused TALENs with the HIV-1 TAT protein (a CPP) and showed 3 and 5% CCR5 gene disruption in HeLa and hiPSC cells, respectively. TALENs have also been conjugated with poly-Arg9 peptides (R9-CPP) [58]. The conjugated R9-TALEN proteins were able to knock out the CCR5 and BMPR1A genes in HeLa and HEK293 cells.

The CRISPR/Cas9 system has also been delivered by fusion with CPPs in different cell types, such as fibroblasts and pluripotent stem cells. Delivery of Cas9 and gRNA conjugated with m9R and 9R CPPs, respectively, resulted in mutation frequencies ranging from 2.3 to 16% in several human cell types, including embryonic stem cells [59].
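The cationic character and the 5–40 residue length range that define many CPPs can be checked with a very crude calculation. The sketch below is a simplification under stated assumptions: net charge is estimated by counting Lys/Arg as +1 and Asp/Glu as -1 (histidine and the termini are ignored), the charge threshold is an arbitrary illustrative value, the TAT sequence is the commonly cited TAT(47–57) fragment rather than a sequence given in this chapter, and amphipathic CPPs are not modelled at all.

```python
POSITIVE = set("KR")   # lysine, arginine
NEGATIVE = set("DE")   # aspartate, glutamate

def net_charge(peptide):
    """Crude net charge near neutral pH: +1 per K/R, -1 per D/E."""
    peptide = peptide.upper()
    plus = sum(aa in POSITIVE for aa in peptide)
    minus = sum(aa in NEGATIVE for aa in peptide)
    return plus - minus

def looks_like_cationic_cpp(peptide, min_len=5, max_len=40, min_charge=4):
    """Heuristic filter; min_charge is a hypothetical threshold for illustration."""
    return min_len <= len(peptide) <= max_len and net_charge(peptide) >= min_charge

if __name__ == "__main__":
    candidates = {
        "R9": "R" * 9,              # poly-arginine, nine residues
        "TAT(47-57)": "YGRKKRRQRRR",  # commonly cited sequence (assumption)
        "acidic control": "DEDEDEDE",
    }
    for name, seq in candidates.items():
        print(name, len(seq), net_charge(seq), looks_like_cationic_cpp(seq))
```

A real design workflow would also consider hydrophobic moment, conjugation chemistry, and cargo charge, none of which this filter captures.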

#### **3.4. Nanoparticles**

Nanoparticles are particles between 1 and 100 nanometers in size generated by different strategies such as attrition, pyrolysis, or hydrothermal synthesis. The group of Marie E. Egan developed a new method for surface-modifying PLGA nanoparticles with cell-penetrating peptides [60] and subsequently developed PLGA/PBAE/MPG nanoparticles, achieving modification of 5% of the cells in the nasal epithelium and more than 1% in the lung. Using this system, the authors delivered PNAs and donor DNA molecules to correct the F508del *CFTR* mutation, achieving *in vitro* and *in vivo* gene correction an order of magnitude higher than previously achieved [61].

Other strategies combined triplex-forming peptide nucleic acids (PNAs), synthetic oligonucleotide analogs that are resistant to degradation, with nanoparticles [62]. PNAs also have the advantage that they can induce DNA repair upon sequence-specific triplex formation at targeted genomic sites. The direct delivery of PNAs and donor DNA by nanoparticles can mediate GE of human cells at frequencies of 0.05% in HSCs [63]. Other authors used biodegradable poly(lactic-co-glycolic acid) (PLGA) nanoparticles (NPs) encapsulating PNAs and donor DNAs to disrupt CCR5, achieving up to 1% gene disruption in HSCs and conferring HIV-1 resistance in a humanized mouse model [64].

#### **Acknowledgements**

This work was supported by Fondo de Investigaciones Sanitarias ISCIII (Spain) and Fondo Europeo de Desarrollo Regional (FEDER) from the European Union through the research grants PI12/01097, PI15/02015 and ISCIII Red de Terapia Celular TerCel RD12/0019/0006 to FM, by the Consejería de Economía, Innovación, Ciencia y Empleo, Junta de Andalucía‐FEDER/Fondo de Cohesion Europeo (FSE) de Andalucía through the research grants P09‐CTS‐04532, PI‐57069, PI‐0001/2009 and PAIDI‐Bio‐326 to FM and PI‐0160/2012 to K.B.

#### **Author details**

Francisco Martin\*, Sabina Sánchez-Hernández, Alejandra Gutierrez-Guerrero and Karim Benabdellah

\*Address all correspondence to: francisco.martin@genyo.es

Genomic Medicine Department, Gene & Cell Therapy Group, GENYO, Pfizer — University of Granada, Junta de Andalucía Centre for Genomics and Oncological Research, Granada, Spain

#### **References**


**Gene Editing and Gene Correction Technologies**

## **DNA Elements Tetris: A Strategy for Gene Correction**

Colette Bastie and Florence Rouleux-Bonnin

Additional information is available at the end of the chapter

http://dx.doi.org/10.5772/62382

#### **Abstract**

Transposable elements (TEs) are mobile genetic sequences that are able to move in the genome from one location to another. TEs were first regarded as junk or selfish DNA: although they comprise the largest molecular class within most metazoan genomes, they were thought to have no genomic function. Only with whole genome sequencing did new insights emerge about the origin, diversity, and impact of TEs on genome function. Thus, due to advances in molecular technology, TEs have been shown to create new regulatory sequence networks. Although most TEs present in the human genome today are silenced, particularly DNA transposons, this does not mean that these sequences are dead. In this review, we detail how DNA transposons can be harnessed to create a new tool for gene correction. DNA-based transposon vectors are derived from three models: Sleeping Beauty, piggyBac, and Tol2, which all work via a "cut-and-paste" mechanism in which the transposase enzyme alone is able to catalyze the transposition process, that is, the integration of the genes of interest into chromosomal DNA. Limitations and improvements of the systems are discussed, particularly the latest ways to target a specific integration site, showing that DNA transposon-derived systems and their engineering are powerful tools for gene correction.

**Keywords:** transposon, piggyBac, Sleeping Beauty, gene transfer, molecular engineering

#### **1. Introduction**

#### **1.1. Transposable elements (TEs) in the genome: a brief history from their discovery to their biotechnological use in gene transfer**

TEs, also described as "jumping genes," were first discovered in maize by Barbara McClintock in the 1940s. TEs are discrete pieces of DNA that are able to move from one site to another within one genome. This new concept, which suggested that the genome was not a final design but was rather able to evolve and rearrange, was first met with criticism. However, a large body of evidence has accumulated over the last 60 years, not only on the categorization and classification of TEs [1] but also on the understanding of their mechanisms. The ability to accurately identify and classify these sequences is critical to understanding their impact on host genomes. Pioneers such as Finnegan [2] classified TEs into two classes based on their mechanism of transposition (**Figure 1**). Class I elements transpose by reverse transcription using an RNA intermediate: they are named retrotransposons. Three kinds of enzyme, RNA polymerase, reverse transcriptase, and integrase, are used for transposition. Class II elements directly transpose from DNA to DNA: they are named DNA transposons, and just one enzyme, the transposase, is needed.

© 2016 The Author(s). Licensee InTech. This chapter is distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

**Figure 1.** Class I and class II transposable elements (TEs, in green). Class I transposon or RNA transposon: three enzymes are necessary to transpose (1: RNA polymerase, 2: reverse transcriptase and 3: integrase). This mechanism is called "copy-and-paste" and gives rise to two identical copies: one in the donor site and one in the target site. Class II transposon or DNA transposon: only one enzyme, the transposase, catalyzes the excision and the integration processes. The mechanism is named "cut-and-paste" and translocates the TE to the target site, leaving a free TE donor site. Inverted terminal repeats (ITRs) are drawn in red.
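The difference between the two mechanisms can be sketched on a toy genome. In the example below the genome is represented as an ordered list of named loci, which is a deliberate abstraction: enzymes, RNA intermediates, and target site sequences are not modelled, and the function names are hypothetical.

```python
def copy_and_paste(genome, donor_index, target_index):
    """Class I behaviour: the donor copy stays and a new copy is inserted.

    target_index is the position in the original genome before which
    the new copy is placed.
    """
    new_genome = list(genome)
    new_genome.insert(target_index, genome[donor_index])
    return new_genome

def cut_and_paste(genome, donor_index, target_index):
    """Class II behaviour: the element is excised and reinserted elsewhere."""
    new_genome = list(genome)
    element = new_genome.pop(donor_index)
    # Removing the element shifts every later position by one.
    if target_index > donor_index:
        target_index -= 1
    new_genome.insert(target_index, element)
    return new_genome

if __name__ == "__main__":
    genome = ["geneA", "TE", "geneB", "geneC"]
    print(copy_and_paste(genome, 1, 3))  # copy number of the TE increases
    print(cut_and_paste(genome, 1, 3))   # copy number of the TE is unchanged
```

The copy number of the element grows only in the copy-and-paste case, which mirrors the "two identical copies" versus "free donor site" outcomes described in the caption.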

Piégu et al. [1] clearly detailed the necessity to update this classification. TEs are widely distributed in prokaryotic and eukaryotic genomes and represent a variable fraction, ranging from 8% in chicken to 85% in maize. After an initial phase of sudden episodic bursts, the invasion step, TEs proliferate and accumulate mutations. Finally, transposition is tolerated by the genome at a reduced rate. Some TE insertions contribute new genes, exons, or regulatory regions. These processes have been called exaptation [3] and domestication [4]. However, for a significant amount of time, TEs were primarily considered as "junk or selfish DNA" that played no significant role in genome evolution [5]. The modern-day view of TEs is that they can generate genomic instability and reconfigure gene expression networks in both germline and somatic cells. This comprehensive view came with significant advances in sequencing technologies and the development of bioinformatics tools. One of the most unexpected insights is that almost half of our DNA is derived from TEs and 75% of our genome is transcribed (ENCODE project [6]). Therefore, as an integral part of the genome, the dynamic presence of TEs is a major force that naturally reshapes genomes. Several researchers have found examples of concordant timing between bursts of transposition or massive extinction and speciation events. For example, Lynch et al. [7] noticed how transposons transformed the uterine regulatory landscape during the evolution of mammalian pregnancy, and Britten [8] reviewed the importance of Alu inserts for brain growth. Thus, TEs are "spam" coming from the dark ages, and nowadays a small proportion of retroelements (<0.05%) remains able to transpose in humans [9]. However, there is no evidence that any DNA transposon family has been active in the human genome since the later phase of the primate radiation, 37 million years ago [10]. The last active DNA transposons were from the hAT superfamily, the Tc1/*mariner*, and the piggyBac families. This suggests that three sources of transposase were silenced during the same evolutionary period. As previously discussed, although transposons have been silenced, this does not mean that they are dead sequences for the genome, and they constitute new regulatory networks.

Thus, DNA TEs present distinguishing features that make them attractive as gene transfer tools. Indeed, they are not infectious, as they can only mobilize DNA within a single genome, and they are ubiquitous. From the natural architecture of DNA transposons, a secure and easy system has been designed (**Figure 2**).

**Figure 2.** From natural transposon to engineered pseudo-transposon. a) In the natural transposon, the transposase ORF (green rectangle) is delineated by the two ITRs (red arrows). b) In engineered pseudo-transposon, the transposase ORF is replaced by the cassette of the gene of interest. Transposase should therefore be delivered in parallel either in DNA, mRNA or protein form.

Briefly, the transposon is naturally delineated by two inverted terminal repeats (ITRs) framing the unique transposase open reading frame (ORF). The transposase recognizes the ITRs and catalyzes the excision and integration processes (**Figure 1**). After engineering, the transposase ORF is replaced by the gene of interest cassette and the enzyme is supplied independently (**Figure 2**). The transposase is then able to integrate any gene of interest, without cross-mobilization between transposon families, as the ITR sequences are highly specific for each transposase. From this general design of the transposon tool, numerous technological aspects have been explored, finally resulting in an attractive integrative gene transfer system to modify the human genome.

#### **2. Transposon-based strategies**

Various transposon-based strategies are available to obtain efficient transgene integration while maintaining safety and cell integrity. First, the efficacy of the integration process depends on the transposase used. Second, it depends on the way the transposase and the transgene are delivered. Some strategies use only one plasmid carrying both the transposase expression cassette and the transgene construct. Other strategies rely on one helper molecule carrying the transposase in DNA, mRNA, or protein form and one donor plasmid that carries the gene of interest delineated by two ITRs.

#### **2.1. Different types of transposase**

For genome engineering, two strategies have been developed: find a transposase in another species that works in humans, or create a new one, given that no DNA transposons are currently active in mammalian genomes. After the identification of efficient transposases for gene correction, their activities have been dissected and optimized.

#### *2.1.1. The three musketeers*

For decades, three main transposases have been developed with the aim of gene correction: Sleeping Beauty (SB), piggyBac (PB), and Tol2. In 1997, the SB transposase was artificially reconstructed from partial ancestral copies of a transposase gene identified in salmonid *Salmo* sp. [11]. The Tol2 and piggyBac transposases have been found to be active in their natural hosts. The piggyBac transposase was isolated from the cabbage looper moth *Trichoplusia ni*, and the developed tool is active in human and mouse cells [12]. Tol2 was isolated from the Japanese medaka fish *Oryzias latipes* [13]. It is active in vertebrate cells, including zebrafish, chicken, mouse, and human cells.

Following their discovery, various optimizations were carried out to increase their transposition efficiency. The development of the SB100x transposase [14], characterized by a 100-fold greater efficacy than the natural SB, stands as an important step in transposase optimization. Similarly, in 2011, a hyperactive piggyBac transposase was found with 17- and 9-fold increases in excision and integration, respectively [15], and a codon-optimized PB (mPB) was also developed [16]. Following this, the efficacy of this hyperactive PB (hyPB or 7PB) was compared to SB100x by luciferase expression *in vivo*. Mice injected with m7pB had 10 times greater luciferase expression than those injected with SB100x [17]. Currently, no optimization studies have been carried out on the Tol2 enzyme since it is highly sensitive to molecular engineering [1].

#### *2.1.2. Transposases confer specific properties to the system*

Naturally, each transposase governs the integration of the pseudo-transposon using its own target site. The integration site for the SB transposon is TA, whereas it is TTAA for the PB transposon and an 8-bp target site duplication for the Tol2 transposon. After integration, these target sites are duplicated on either side of the newly integrated pseudo-transposon. Besides this specific transposition signature, the SB, PB, and Tol2 transposases confer specific properties to the system, such as cargo size capacity, overproduction inhibition (OPI), and reversibility with or without footprint.
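The target site rule and the resulting duplication can be sketched as follows. This is a minimal string-level illustration, assuming exact TA (SB) and TTAA (PB) motifs on the given strand; Tol2 is left out because the text describes only the length of its duplication, not a fixed motif. Function names and the example sequences are hypothetical.

```python
TARGET_SITES = {"SB": "TA", "PB": "TTAA"}  # motifs as stated in the text

def find_target_sites(genome, transposon):
    """Return the start positions of every candidate integration motif."""
    motif = TARGET_SITES[transposon]
    genome = genome.upper()
    return [i for i in range(len(genome) - len(motif) + 1)
            if genome[i:i + len(motif)] == motif]

def integrate(genome, position, cargo, transposon):
    """Insert cargo at a target site so that the motif flanks it on both sides."""
    motif = TARGET_SITES[transposon]
    if genome[position:position + len(motif)].upper() != motif:
        raise ValueError("position is not a valid target site")
    left = genome[:position + len(motif)]   # ends with the original motif
    right = genome[position:]               # starts with the duplicated motif
    return left + cargo + right

if __name__ == "__main__":
    genome = "GGCTTAACCG"                 # hypothetical locus
    cargo = "xxxx"                         # stands for ITR-transgene-ITR
    for name in TARGET_SITES:
        for pos in find_target_sites(genome, name):
            print(name, pos, integrate(genome, pos, cargo, name))
```

Real integration site preferences also depend on chromatin context and local DNA structure, which a motif search cannot capture.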

#### *2.1.2.1. Cargo size capacity*

The distance between the ITRs delineates the transgene cassette and defines the cargo size capacity. The greater this distance, the less efficient the transposase is at excision and integration. However, the constant optimization of the enzymes has considerably improved the efficacy of the system.

The SB transposase initially allowed the transposition of transposons of only up to 10 kb [18]; beyond this size, transposition was abolished. In 2014, Turchiano et al. [19] proposed a change in its configuration that permits the use of SB transposons of up to 18 kb, but with reduced efficiency. To date, the PB transposon offers the highest cargo size capacity, with naturally high activity with 14.3-kb transgenes [12]. The hyPB transposase allows transposition of transgenes of up to 100 kb in mouse ES cells [20]. In contrast, Tol2 shows no decrease in transposition efficacy for transposons of up to 10 kb [21], and its activity has been demonstrated up to 66 kb [22]. However, few studies have directly compared the transposition efficacy of the transposases in an identical system [23].

Raising cargo size capacity opens new perspectives in gene correction. For example, muscular dystrophy is caused by mutations in the dystrophin gene. Adding the full-length dystrophin cDNA, which is 11 kb long, has proven difficult using viral gene transfer. Recently, the full-length dystrophin cDNA has been successfully integrated in mesoangioblasts from a dystrophic dog model using the PB transposon tool [24].

#### *2.1.2.2. Overproduction inhibition*

As previously discussed, the transposase is supplied independently of the pseudo-transposon, and the ratio between the enzyme and the pseudo-transposon must be carefully established. On the one hand, transposases act by creating double-stranded breaks, so the amount of transposase used must be as low as possible to avoid genotoxicity. On the other hand, enough transposase is needed to achieve a high transposition rate. Unexpectedly, increasing the amount of transposase does not always result in more transposition activity. Indeed, although at low levels the transposition rate increases with the amount of transposase up to a maximum value, it is inhibited above that value. This phenomenon is called OPI and depends on the studied model and the type of transposase [25]. In other cases, the transposition rate saturates without decreasing, and a plateau is observed. OPI has long been well documented for the SB transposase [26]. However, for the PB and Tol2 transposases, OPI is not as clear. For example, the PB transposase showed an OPI phenomenon in HeLa cells [16], but a stabilization of the activity was demonstrated in HEK293 [27] or mouse ES cells [28]. Similarly, for the Tol2 transposase, either OPI or stabilization has been observed [16,21]. The molecular mechanism of this phenomenon is still not clearly established. Numerous hypotheses have been proposed and are reviewed in Ref. [25].
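The distinction between OPI and a plateau can be made concrete with a small Python sketch that labels a dose-response series. This is only an illustrative heuristic: the function name `classify_dose_response`, the 10% `tolerance`, and the numbers in the usage lines are assumptions made for the example, not values taken from the cited studies.

```python
def classify_dose_response(rates, tolerance=0.1):
    """Label transposition rates measured at increasing transposase doses.

    rates: transposition rates, ordered from lowest to highest dose.
    tolerance: hypothetical relative drop below the peak that is still
               treated as a plateau (an assumption, not a published cutoff).
    """
    if len(rates) < 2:
        raise ValueError("at least two dose points are required")
    peak = max(rates)
    peak_index = rates.index(peak)
    if peak_index == len(rates) - 1:
        # Maximum reached at the highest dose tested: no call can be made.
        return "increasing"
    if rates[-1] < peak * (1 - tolerance):
        # The rate falls back after the peak: overproduction inhibition.
        return "OPI"
    # The rate stays close to its maximum: saturation without decrease.
    return "plateau"


# Made-up numbers, for illustration only
opi_like = classify_dose_response([2, 6, 9, 3])
plateau_like = classify_dose_response([2, 6, 9, 9])
```

A series labelled "increasing" simply means that the dose range tested was too narrow to tell the two behaviours apart.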

#### *2.1.2.3. Integration is reversible*

In some conditions, the desired integration needs to be reversed. The transposase can then be re-added with the aim of excising the pseudo-transposon from its chromosomal location. The excision of SB pseudo-transposons leaves a footprint signature, creating a 5-bp insertion [29]. Tol2 transposase excisions have been less investigated, but they could leave a short insertion or deletion [30]. In contrast, PB transposases have the particularity of carrying out this excision without leaving a footprint in the genomic sequence. This property has been extensively exploited in induced pluripotent stem cell (iPSC) generation [31–33]. For added safety, it is possible to use an engineered PB transposase in which the integration efficacy is abolished while its excision property is conserved [34].
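The footprint-free behaviour of PB can be sketched on plain strings: integration duplicates the target site on both sides of the cargo, and a seamless excision collapses the two copies back into one. The function names and sequences below are hypothetical, and the model ignores ITRs and any real enzymology; it only illustrates the bookkeeping described above.

```python
def integrate(genome, pos, cargo, site="TTAA"):
    """Insert cargo at the target site starting at pos, duplicating the site."""
    end = pos + len(site)
    if genome[pos:end] != site:
        raise ValueError("no target site at this position")
    # Result: ...site + cargo + site... (target-site duplication)
    return genome[:end] + cargo + site + genome[end:]


def excise_seamless(genome, cargo, site="TTAA"):
    """Remove the cargo and one copy of the duplicated site (no footprint)."""
    flanked = site + cargo + site
    if flanked not in genome:
        raise ValueError("cargo flanked by duplicated sites not found")
    return genome.replace(flanked, site, 1)


# Hypothetical sequences; lowercase marks the cargo for readability
original = "GGTTAACC"
integrated = integrate(original, 2, "nnnn")
restored = excise_seamless(integrated, "nnnn")
```

In this toy model, an excision that leaves a footprint would correspond to a `restored` string that differs from `original` by a few extra or missing bases.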

#### **2.2. Design of the coupled pseudo-transposon/transposase architecture**

Besides the intrinsic particularities of the transposases, the cellular delivery system is crucial. In a first system, called the "cis" configuration, a single plasmid carries both the transposase and the gene of interest. The second system, termed the "trans" configuration, is based on the principle of separately supplying the gene of interest on one plasmid, the "donor" plasmid, and the transposase as a "helper" plasmid, mRNA, or protein.

#### *2.2.1. "Cis" versus "trans" configurations*

In the cis configuration, only one plasmid needs to be prepared. This confers easier manipulation and high efficacy, but three drawbacks need to be overcome. First, the pseudo-transposon/transposase ratio is fixed, conferring less flexibility to the system. Second, the plasmid backbone could be integrated. Third, the transposase gene could be integrated as well. Even though the pseudo-transposon/transposase ratio is fixed, working on promoters has brought flexibility. Indeed, Mikkelsen et al. [35] compared the efficiency of their helper-independent SB vector with 11 different promoters driving the transposase gene and observed the OPI phenomenon with the strongest promoter.

In the "trans" configuration, two molecules are used, one carrying the gene of interest and one supplying the transposase as DNA, RNA, or protein. The trans configuration naturally offers more flexibility than the cis one. On the one hand, this approach makes it possible to modulate the molecular ratio between the transposase and the pseudo-transposon. On the other hand, it makes it possible to introduce several independent pseudo-transposons, as in inducible systems [36]. Only one constraint has been detailed: transposases catalyze integration more efficiently with a circular donor plasmid than with a linear one [37].

**Table 1.** Different configurations to deliver transposase and pseudo-transposon and their consequences. Transposase molecules are in green whatever the molecule type. The pseudo-transposon molecule is drawn in blue. GOI, gene of interest; p(A), polyadenylation signal; Prom, promoter; Tnpase, transposase; ITR, inverted terminal repeats.

#### *2.2.2. Risks and solutions associated with each strategy*

#### *2.2.2.1. Risk of linearized backbone integration*

After excision of the gene of interest, the linearized backbone is more prone to be integrated by a nontransposition process [38], whatever the configuration used (cis or trans). This undesired integration raises the problem of the presence of bacterial sequences such as a resistance gene or a bacterial replication origin. It has been correlated with the amount of transfected transposase [38] and with the size of the transgene [39]. To avoid this, Wilson's team suggested using a suicide gene in the plasmid backbone [40] or screening cells for green fluorescent protein (GFP) expressed from the donor plasmid backbone [38]. Other authors suggested using DNA minicircles [41]. Interestingly, they also observed an increased efficacy with DNA minicircles compared to a standard plasmid for the same transgene size in several cell lines. However, keeping only the pseudo-transposon as a linearized donor showed no efficacy with the SB transposase [42] and a low one with the PB transposase [37].

#### *2.2.2.2. Risk of transposase gene integration*

The presence of the transposase gene within the plasmid creates a risk of its own integration and, consequently, a risk of sustained transposase expression. The consequence could be saltatory remobilization of the integrated transgene [43]. To limit the effect of sustained transposase expression, a self-inactivated transposase gene has been obtained by including either the promoter [44,45] or the polyadenylation signal [46] between the ITRs (**Table 1**). In primary human T cells, an active SB transposase ORF was identified in only one clone out of 94, but a bulk analysis showed up to 0.047 integrated transposase copies per cell [50]. This has still not been evaluated for the PB and Tol2 transposases. Nevertheless, it is possible to completely abolish its integration by introducing the transposase as mRNA or protein (**Table 1**) [51]. The mRNA and protein forms allow a one-shot transposition process, thanks to the time-restricted transposase expression.

For example, mRNA transposase expression peaked at 18 h after transfection [58]. Galla et al. [52] demonstrated less cell mortality with the mRNA transposase than with an integrative form. Bire et al. [51] showed that the mRNA transposase gave less double-stranded break formation and fewer integrated transgene copies. Moreover, no integration of the transposase mRNA has been detected [51]. These considerations have been confirmed *in vivo* [53], as detailed at the end of this chapter.

Using the transposase protein also offers a short window of expression. Cai et al. [55] recently used the transposase protein associated with a viral polyprotein. They observed a high number of transgene-expressing cells, with a low number of integrated transgene copies per genome. To limit the use of viral particles, a recombinant transposase protein was fused with a cell-penetrating peptide (CPP) [56], or the transposase was delivered with a free CPP [57]. For now, no *in vivo* evaluations have been reported in the literature.

#### **3. Editing the genome: the final step after a long journey through the cell**

Genome editing includes all methods aimed at modifying the genome by introducing new DNA sequences or by correcting existing genomic sequences. The journey begins with the ability to enter the cell, evade the immune response, and, after crossing the nuclear barrier, integrate the gene of interest into the genome.

#### **3.1. Cross the cellular membrane and escape immune response**

As free DNA delivery did not show efficient results, both transposase and pseudo-transposon need to be driven into the cell using different gene delivery strategies, either using a carrier (viral particles or chemical agent) or using a physical method. According to the method selected, it is important to consider all parameters of cellular defense against the entry of the foreign DNA.

#### *3.1.1. Viral hybrid systems*

The viral-transposon hybrid systems take advantage of the natural properties of viral proteins to enter into the cell. For example, as early as 2006, a hybrid HSV amplicon-SB transposase vector was used in a central nervous system development study [59]. Since that time, several studies have been developed on hybrid transposase systems (reviewed in Refs. [60,61]) that use adenovirus [62–64], adeno-associated virus [65], baculovirus [66], or nonintegrative lentivirus [67,68] particles.

#### *3.1.2. Chemical agents*

Chemical agents have been developed with the aim of condensing DNA and thereby avoiding any virus-derived systems. However, their ability to escape the immune response has turned out to be more controversial than expected [69]. Indeed, these nanovehicles enter the cell essentially *via* the endosomal pathway [70,71] and therefore expose foreign DNA to the endosomal Toll-like receptors. Among all available chemical carriers, the polyethylenimine (PEI) polymers appear to be the most used in transposon systems. Indeed, PEI improves endosomal escape through the "proton sponge" mechanism. For example, in 2009, Kang et al. [72] used the PB transposase-based system with PEI as a transfection reagent for ovarian cancer treatment in a mouse model. Further examples have been reported both *in vitro* [73] and *in vivo* [36,74].

#### *3.1.3. Physical gene transfer*

Finally, plasmid DNA can be delivered by physical methods. In this case, plasmid trafficking does not go through the endosome and thereby escapes Toll-like receptors. One such method, electroporation, has turned out to be highly efficient for transfecting otherwise hard-to-transfect cells such as dendritic cells and human hematopoietic or embryonic cells [75–77]. Depending on the cell type used, the results may be inconsistent. Ley et al. [73] compared transposition efficiency in PEI-transfected versus electroporated mesoangioblasts and were not able to obtain efficient long-term expression in muscle after *in vivo* electroporation.

Other physical methods have therefore been developed. For example, ultrasound-targeted microbubble destruction (UTMD) results in pore formation in the cell membrane after the application of ultrasonic waves. Recently, two *in vivo* studies have been carried out with clinical perspectives [78,79]. In parallel to UTMD, hydrodynamic (HD) injection has been applied to transfer the clotting factor VIII [80]. However, there are proinflammatory consequences that lead to a loss of transgene expression. To circumvent this drawback, Doherty et al. [81] suggested inducing transient transgene repression, thereby preventing the priming of transgene-specific T cells.

#### **3.2. Cross the nuclear barrier and transgene integration**

#### *3.2.1. The transposase is driven to the nucleus*

For an efficient transposition, the transposase needs to be localized into the nucleus at the same time as the pseudo-transposon DNA.

The transposases contain a nuclear localization signal, driving them to the nucleus [82]. An engineered PB transposase has been developed to increase its localization within the nucleoli by adding a nucleolus-predominant (NP) signal peptide from the HIV-1 TAT protein [83]. With this NP-mPB, a three- to fourfold increase in PB transposition rate was observed in both murine and human cells.

From the pseudo-transposon point of view, nuclear targeting is also essential. Thus, DNA nuclear targeting sequences (DTS) might be added to the plasmid backbone. These DTS consist, for example, of a 72-bp sequence from the SV40 enhancer and act as a sequence driver [84].

#### *3.2.2. Integration profile of the gene of interest*

All transposon systems have less integration bias than viruses, as previously described [85–88]. However, it is important to note that there are some differences among transposon systems [89]. The SB transposase is known to allow the most random integration [90], with approximately 35% into RefSeq sequences. It has been reported that SB transposition has an affinity for the heterochromatin topology [91]. In contrast, the Tol2 and PB transposases are not considered to allow random integration. Indeed, the PB transposase shows a bias towards integration of the transgene into CpG islands and transcriptional start sites, with approximately 49% into RefSeq sequences [16,27], and the Tol2 transposase presents a strong bias for intergenic regions [92].
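Figures such as "35% into RefSeq sequences" are obtained by intersecting mapped integration sites with gene annotations. A minimal Python sketch of that calculation is given below; the function name `fraction_in_genes`, the data layout, and the toy coordinates are assumptions made for illustration and do not reproduce the pipelines of the cited studies.

```python
def fraction_in_genes(sites, genes):
    """Fraction of integration sites that fall inside an annotated gene.

    sites: list of (chromosome, position) tuples.
    genes: dict mapping chromosome -> list of (start, end) intervals,
           half-open, i.e. start <= position < end.
    """
    if not sites:
        raise ValueError("no integration sites provided")
    hits = sum(
        any(start <= pos < end for start, end in genes.get(chrom, []))
        for chrom, pos in sites
    )
    return hits / len(sites)


# Toy annotation and toy integration sites (hypothetical values)
toy_genes = {"chr1": [(100, 200)], "chr2": [(0, 50)]}
toy_sites = [("chr1", 150), ("chr1", 500), ("chr2", 20), ("chr3", 5)]
in_gene_fraction = fraction_in_genes(toy_sites, toy_genes)
```

The linear scan is adequate for a handful of sites; genome-scale data sets would normally call for an interval index.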

Interestingly, this global integration profile could be affected by various parameters, such as the transposase variant [93] or the cell type [94].

In addition, it is important to note that, for now, studies have been essentially established in *in vitro* models and no predictions could be drawn regarding the *in vivo* integration profile. Indeed, after *in vivo* UTMD transfection, the pseudo-transposon showed a significant bias of transgene integration into chromosome 14 [49], but no bias was observed in their *in vitro* control.

#### **4. Side effect of the transgene integration system**

The newly integrated foreign DNA is considered as an invader by the cell. This leads to postintegrative transgene silencing. Conversely, the transgene copy might also influence surrounding sequences according to the integration site. To counter these mutual side effects, numerous strategies have been developed.

#### **4.1. Communication mechanisms between the transgene and the genome**

During their evolution, transposons have been silenced by at least chromatin condensation and RNA interference (RNAi) induction.

Transcriptional regulation includes DNA CpG methylation and histone modifications. It has been confirmed that transgene expression can be restored by a demethylating agent such as 5-aza-2'-deoxycytidine or by a histone deacetylase inhibitor such as trichostatin A [95]. However, it is easier to avoid the induction of gene silencing upstream. In this respect, a methylated pseudo-transposon plasmid showed a higher transposition rate than an unmethylated one with the SB transposase [96]. Curiously, when the SB, PB, and Tol2 transposase systems are directly compared, the integrated transgene is less silenced if integrated by the PB transposase [97].

The role of RNAi in posttranscriptional silencing of exogenous DNA transposons remains unclear. One study demonstrated that, in the absence of an efficient cellular RNAi system, by establishing p19 protein knockdown cells, the number of colonies is increased [98]. Nonetheless, the mechanism is still not elucidated.

Besides the host-to-transgene effect, a transgene-to-host effect, driving perturbations in sequences surrounding the transgene by DNA methylation modulation, has been highlighted [99]. A further study investigated the expression levels of host genes neighboring the SB transposon and underlined variations depending on the chromosomal location of the transgene [100]. Therefore, solutions allowing a complete isolation of the transgene should be developed.

#### **4.2. Overcoming the host regulation for a sustained expression**

In gene correction, maintaining the expression level of the transgene and limiting host genome perturbations are crucial for having an efficient therapeutic effect.

#### *4.2.1. Matrix attachment region (MAR)*

The human MAR elements are natural elements of the eukaryotic genome, which mediate the structural organization of the chromatin domains. When included in a transposon plasmid, they do not affect the number of transposed transgene copies but rather increase the transgene expression per integrated copy [101]. Moreover, when the MAR element is included in the transposase vector, an increased transposition efficacy has been observed [102].

#### *4.2.2. Insulators*

Insulators are short DNA sequences that are naturally present in the genome and act as genetic boundary elements. In a recent study, four different insulators (cHS4, D4Z4, CTCF, and CTF/NF1) were compared, showing that D4Z4 and CTF/NF1 had insulator functions when combined with transposition [51]. The protective effect of the cHS4 insulator has been demonstrated by a strong reduction in the activation of a nearby promoter [103] and by a prolonged fluorescent marker expression [104,105]. Some equivalent studies corroborated this role in clinically relevant cells such as primary hematopoietic CD34+ cells [106]. Moreover, cHS4 insulators abolished the RNAi pathway effects regulating transposon-derived transgene expression by epigenetic silencing [98]. Nevertheless, for an optimal boundary effect of insulators, it is necessary to consider the model used. Indeed, the increase in pseudo-transposon size caused by the insulator or steric hindrance of transposase action [103] could also influence transgene expression.

#### **5. Going further**

For many years, researchers have provided elements for a better understanding of the mechanisms of transposon systems and have proposed solutions for their optimal use. Here, we recall promising leads for further work in this area: targeting a specific site within the genome and targeting a specific tissue at the body scale.

#### **5.1. Targeting a specific site within the genome**

Replacing a defective gene or introducing a gene of interest into a completely safe, predetermined, specific genomic site is the ideal approach for gene correction. This potential locus could be defined by numerous criteria determined by its position relative to genes, miRNAs, transcription units, or ultraconserved regions. All of these aspects have been recently reviewed [107].

#### *5.1.1. Transposon targeting strategies*

The SB, PB, and Tol2 transposases have short integration target sites: TA, TTAA, and 8-bp sequences, respectively. Thus, transposon-derived systems should be optimized by combining the transposase with a system able to target a specific DNA sequence, such as a DNA-binding domain (DBD). The first strategy uses a fusion protein containing both the transposase and a DBD. In the second method, a fusion protein is constructed between a DBD and a protein that is able to specifically recruit the transposase. To date, only one protein, named N-57, is known to be able to interact with the SB transposase [108]. Finally, another solution is based on a fusion protein between two DBDs, one recognizing a genomic sequence and one specific to a sequence inserted within the pseudo-transposon plasmid. Few parameters of this third approach have been explored in a mammalian model [108]. Considerations of these three strategies have been recently reviewed [109], and we herein detail only chimeric transposases.

The proof-of-concept has been demonstrated by studying intraplasmic integration using the PB transposase fused to the Gal4 domain [110]. However, the system proved to be more restrictive than expected both in the conservation of the transposition activity and in the ability to restrict integration to the targeted locus. The transposition activity might therefore be affected by the DBD fusion. Indeed, the Gal4 DBD (a zinc finger domain, ZF) has been tested in fusion with the Tol2, SB11, and PB transposases. The number of chromosomal integrations of the transposon is abolished with Gal4-Tol2 and Gal4-SB11, but no loss of efficiency was observed for the Gal4-PB transposase [111]. Some studies have been carried out to analyze the parameters of this loss of activity, such as the sequence surrounding the targeted site [108], the orientation of the fusion [112], or the choice of the linker [113]. DBD types have also been evaluated for their ability to avoid off-target integration. With the Gal4-PB transposase, 23% of transposition events occurred within 0.8 kb of a Gal4 site compared to 5% for the native transposase [114]. To improve the targeting, artificial ZFs have been created by assembling six ZF domains into a polydactyl protein capable of targeting a unique sequence of 18 bp [115]. For example, sequence targeting with these artificial ZFs allowed 44.3% of integration events near the CHK2-ZF site [116]. By comparison, when the Sp1 ZF, which preferentially binds CG-rich motifs, is fused with the PB transposase, integration increased near CpG islands (25.7% versus 10.5% with the native PB transposase) but without modification regarding integration into RefSeq genes [117].
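The percentages quoted above are easier to compare as fold enrichments over the native transposase. The short Python sketch below does this arithmetic on the figures given in the text; the helper name `fold_enrichment` is hypothetical, and the two ratios are not directly comparable with each other because they were measured against different genomic features (a 0.8-kb window around Gal4 sites versus CpG islands).

```python
def fold_enrichment(targeted_pct, native_pct):
    """Ratio of on-target integration with a chimeric versus native transposase."""
    if native_pct <= 0:
        raise ValueError("native percentage must be positive")
    return targeted_pct / native_pct


# Percentages as reported in the text
gal4_pb = fold_enrichment(23.0, 5.0)    # Gal4-PB versus native PB [114]
sp1_pb = fold_enrichment(25.7, 10.5)    # Sp1-PB versus native PB [117]
print(f"Gal4-PB: {gal4_pb:.1f}-fold, Sp1-PB: {sp1_pb:.2f}-fold")
```

On these numbers, the Gal4 fusion gives roughly a 4.6-fold enrichment and the Sp1 fusion roughly a 2.4-fold enrichment, which still leaves most integration events outside the targeted regions.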

#### *5.1.2. Other systems allowing targeted integration*

In 2011, the discovery of the CRISPR/Cas9 system revolutionized gene transfer because of its ability to direct the transgene to its physiological site, but no studies have directly compared the efficiency of transposon- and CRISPR/Cas9-based systems. It has been proposed that this system arose from casposons in the evolutionary tree. Casposons are mobile cryptic sequences present in Archaea and bacteria, and two independent studies described this superfamily of mobile elements by linking transposon and CRISPR/Cas systems [118,119].

Recently, a combined approach was developed, in which the correction is carried out gene by gene (CRISPR/Cas9 role) and temporarily needed sequences are then removed from the genome (transposase role). This method has been applied for gene correction of β-thalassemia [120] and to create iPSC with a deletion in the CCR5 gene [121].

#### **5.2. Targeting a specific tissue at the organism scale**

For *in vivo* application of gene correction, it is important to express the transgene of interest only in the organ, tissue, or cell types in which the transgene expression is required. The design of the transgene vector is essential and might contain specific elements such as tissue-specific promoter or regulatory sequences. The second option is to deliver the system only in the specific cells.

#### *5.2.1. Design of the transgene vector for in vivo applications*

In an ideal gene transfer, the transgene is expressed under the same conditions as in the physiological state. Indeed, overexpression of the transgene or expression in a nontarget cell could increase cytotoxicity, induce its clearance by the immune system, and increase its gene silencing (reviewed in Ref. [122]). With this aim, vectors have been designed in such a way that promoters or regulatory sequences restrict the expression of the gene of interest to the cells of interest. Tissue-specific promoters control gene expression in a tissue-dependent manner or according to the developmental stage of the cells. In plasmid design, several approaches are available, such as using a promoter regulating an endogenous gene expressed in one type of cell (minimal promoter) or combining numerous enhancers with a minimal promoter.

In the first case, the transposon is under a native promoter. For example, endothelin-1 [123] allows a decreased GFP expression in a nonendothelial cell line while maintaining the expression level in endothelial cell lines. When the targeted cell type is the final point of a differentiation lineage, it seems essential to have the expression of the therapeutic protein only in the differentiated state, such as promoters capable of restricting β-globin expression in differentiated erythroid cells from transfected proerythroid cells [124]. In cancer therapy, a study based on the SB transposition showed that the HSV-TK transgene driven by a telomerase reverse transcriptase promoter increased death rate in cancer cell lines compared to fibroblast cell lines [125].

The second approach is based on constructions containing a minimal promoter with specific enhancers. For example, the SB transposon system has been used for the introduction of the telomerase gene driven by a combination of the transthyretin (TTR) gene promoter/enhancer, the human alcohol dehydrogenase gene promoter, and the SV40 enhancer [126]. The authors observed an induced transcriptional activity only in hepatocytes. In an *in vivo* study, the authors developed a TTR minimal promoter coupled to a hepatocyte-specific cis-regulatory module, driving the clotting factor IX for correction of hemophilia B [127]. This promoter has also been combined with a PB transposon-mediated gene transfer and confirmed the high efficiency of the transgene construct [128].

#### *5.2.2. Limiting the ectopic integrations by tissue targeting*

For improvement of tissue targeting, two major routes have been developed, either administration of *ex vivo* premodified cells of interest or direct delivery of the integrative system, containing the transgene, to the whole organism.

#### *5.2.2.1. Administration route for ex vivo modified cells*

The delivery of premodified cells to a patient has been extensively carried out in adoptive cell transfer of immune cells expressing an artificial T-cell receptor (TCR) designed to target an antigen. Briefly, T cells are removed from a patient and transformed to express the artificial TCR (also named chimeric antigen receptor or CAR). After amplification, modified T cells are intravenously readministered to the organism. In the field of transposon technology, this approach has been used in several applications. For example, a human epidermal growth factor receptor 2-specific CAR was introduced into cytotoxic T cells, thanks to the PB transposase [129]. More recently, T lymphocytes were modified to express the CD19-CAR transgene, and after 7 days of coculture, CAR T cells eradicated all CD19+ tumor cells *in vitro* [130]. Less frequently, the Tol2 transposase has also been used for the integration of a CD19-CAR into T cells [131]. However, production of CD19-CAR T cells usually uses the SB transposase, and clinical trials are currently under way [132]. The authors detailed their protocol for manufacturing clinical-grade CD19-specific T cells [76].

It is also possible to reimplant modified cells *in situ* after their encapsulation. To this end, Fjord-Larsen et al. [133] developed a model in which a new clinical-grade cell line expresses a high level of nerve growth factor after striatum implantation.

The administration of already modified cells increases the safety of the transfer system. However, applications are, for now, restricted to cells that are easy to collect from and reimplant into a patient. For less accessible tissues or organs, targeting more often relies on direct administration of the transgene.

#### *5.2.2.2. Administration route for transposon DNA system*

The administration of the therapeutic gene, associated with the transposase, needs a delivery method able to drive them into the organ or tissue of interest. To this end, two strategies have been developed. The first one takes advantage of the properties of specific administration routes, whereas the second one uses vehicles bearing receptors capable of specific recognition of the target tissue.

It has been demonstrated that gene delivery methods do not distribute equally among the different organs. For example, HD injection is known to target the liver at 95%, as detailed by Bell et al. [134]. In agreement, Herweijer and Wolff [135] showed that transgene expression was also found in other organs such as the heart, spleen, and kidneys at levels approximately 100-fold lower than in the liver. This liver-targeting route has been applied in gene correction, and in 2007, Aronovich et al. described a model of correction in mucopolysaccharidosis mice by SB-mediated transposition of the α-L-iduronidase (IDUA) transgene [136]. They reported a persistent expression of IDUA in plasma for almost 10 weeks after injection. In cancer therapy, liver metastasis of colorectal cancer was reduced after antiangiogenic genes were integrated by the SB transposase [137].

As a complement, the DNA transposon can also be administered after complexation with a targeting vehicle. After an intravenous administration, Kren et al. [47] highlighted a hepatocyte-specific integration of the transgene when condensed with coated nanocapsules. By comparison, the transgene complexed with PEI showed an expression in the lung, not observed after HD injection [138]. More specifically, within the lung, the polyplexes are addressed to pneumocytes and no transgene expression was detected within the conducting airways [139].

Coupling a specific administration route with nanocapsules is a promising way forward. To this end, the UTMD gene delivery method allows site-specific delivery of transposons. Briefly, the transgene is intravenously injected and cell penetration occurs at the targeted organ by acoustic cavitation [49]. This approach has been used for the transposition of the Nkx2.2 transcription factor to the pancreas by the PB system [78] or for the transposition of the thymosin β4 gene, or the glucagon-like peptide-1 gene, to the heart [79,140].

In gene correction, targeting the tissue of interest is essential for reflecting physiological conditions. Compared to viral transduction, the transposon systems are more customizable and numerous possibilities are available to users. Depending on the tissue to target, it is possible to adjust simultaneously the promoter, the administration route, and the presence of targeting molecules.

#### **6. Therapy applications of transposase tools**

The technological aspects previously discussed offer a suitable transposon toolbox for gene correction. First, transposon-based systems allow transgene integration in a large range of clinically relevant target cells, including hematopoietic stem cells [141], mesenchymal stromal cells [142], iPSC [143], and lymphoid T cells [131]. Transposon-mediated correction could therefore be used in a wide range of applications, such as treatment of inherited disorders, cancer, and tissue degeneration (**Table 2**).



ADSC, adipose-derived mesenchymal stem cells; eGFP, enhanced GFP; eNOS, endothelial nitric oxide synthase; FA-C, Fanconi anemia complementation group C; Fah, fumaryl-acetoacetate hydrolase; FVIII, clotting factor, factor VIII; hAAT, human α1-antitrypsin; HDDPC, primary human deciduous tooth dental pulp cells; HER2, human epidermal growth factor receptor 2; hGUSB, β-glucuronidase; hIDO, human indoleamine-2,3-dioxygenase; hIDUA, human α-L-iduronidase; HO-1, heme oxygenase-1; HSV-tk, herpes simplex virus thymidine kinase; hTERT, human telomerase reverse transcriptase; htt, huntingtin; hUGT1A1, human uridine diphosphoglucuronate glucuronosyltransferase 1A1; IFNγ, interferon γ; IGF-1R, insulin-like growth factor-1 receptor; IHK, antisickling globin; IL-11, interleukin-11; IPE, iris pigment epithelial cells; KLF4, Krüppel-like factor 4; LAMB3, laminin B3 subunit of laminin 5; Nkx2.2, NK-type homeodomain transcription factor; OCT4, octamer-binding transcription factor 4; PEDF, pigment epithelium-derived factor; RPE, retinal pigment epithelial cells; sFlt-1, soluble fms-like tyrosine kinase-1; SOX2, SRY (sex-determining region Y) box 2; statin-AE, angiostatin-endostatin fusion gene; STZ, streptozotocin; TB4, thymosin β4; Tnpase, transposase; TRAIL, TNF-related apoptosis-inducing ligand.

**Table 2.** Application fields of transposon-based gene correction.

#### **7. Conclusion**

Transposons have naturally shaped genomes since the first forms of life. Scientists have taken advantage of their properties with the aim of constantly improving the safety of this nonviral tool for gene transfer. Together with other integrative systems derived from casposons, such as CRISPR/Cas9, we now have complementary tools at our disposal for reshaping the genome. The latest discoveries have opened new horizons, but a long road is still ahead.

#### **Acknowledgements**

This work has been supported by La Ligue Contre le Cancer.

#### **Author details**

Colette Bastie and Florence Rouleux-Bonnin\*

\*Address all correspondence to: florence.bonnin@univ-tours.fr

LNOX GICC UMR CNRS 7292, Department of Medicine, Bâtiment Dutrochet, 10 boulevard Tonnellé, 37032 TOURS, France

#### **References**


receptor-expressing T cells using piggyBac gene transfer and patient-derived materials. Cytotherapy. 2015;17:1251–67.


## **Gene Correction Technology and Its Impact on Viral Research and Therapy**

Guan-Huei Lee and Myo Myint Aung

Additional information is available at the end of the chapter

http://dx.doi.org/10.5772/62231

#### **Abstract**

Aims

1. To explain why gene correction technology is a useful tool for studying chronic/ latent viral infections.

2. To explain how gene editing technology may facilitate or restrict virus replication and impact on future therapy.

3. To cite specific examples of how gene correction technology is being applied to target specific viruses, including HIV, hepatitis B virus (HBV), herpes simplex virus (HSV), and other viruses.

#### Methodology

We attempted to identify all scientific publications, including basic science, translational research, and any clinical trials, involving DNA correction technology [zinc finger endonuclease (ZFN), transcription activator-like endonucleases (TALENs), and CRISPR/Cas-based systems] and persistent viral infections [including but not limited to HIV, hepatitis B, C, and D viruses, herpes viruses, cytomegalovirus, Epstein–Barr virus (EBV), human papillomavirus (HPV), measles virus, and varicella-zoster virus] published on or before December 31, 2015. We conducted searches of MEDLINE, Cochrane Central Register, and EMBASE. The identified papers have been summarized and organized into relevant sections within the chapter.

#### Conclusion

Sequence-specific DNA endonucleases target and destroy DNA viruses, with early work describing the use of ZFNs, TALENs, or a third type of endonuclease, called a homing endonuclease (HE), to target HBV, HPV, and HSV-1 with varying degrees of success. The new CRISPR/Cas9 systems allow virologists not only to screen for host genes that affect the replication of pathogenic human viruses but also to derive human cell lines that are genetically engineered to either facilitate or suppress viral replication. Scientists now widely use adeno-associated virus (AAV)-based vectors to directly target chronic viruses that infect discrete organs/tissues in the human body, such as HBV and HSV, although the safety of such delivery systems is still a concern. The eventual goal is to eradicate or disable the entire population of latent viral DNA genomes within the infected cells, resulting in a permanent cure for these viral infections, which has so far remained elusive.

© 2016 The Author(s). Licensee InTech. This chapter is distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

**Keywords:** antiviral gene therapy, CRISPR/Cas, gene editing, sequence-specific DNA cleavage, TALENs, zinc finger

#### **1. Introduction**

Sequence-specific DNA endonucleases were first identified in the 1960s as enzymes that restrict the ability of DNA bacteriophages to grow in particular bacterial isolates or species. However, it was many years later that DNA editing technology for the selective modification of large viral or cellular DNA genomes was developed, with the discovery of the zinc finger endonuclease (ZFN) [1]. The ZFN was originally developed as an artificial restriction endonuclease to replace or complement restriction enzymes, and engineered nucleases have since become a versatile and indispensable tool in research as well as in biotechnology. However, each zinc finger is not necessarily highly specific for the 3-bp target that it is designed to bind, and zinc fingers specific for all possible 3-bp target sequences have not yet been derived [2].

To address some of the deficiencies of ZFN, transcription activator-like endonucleases (TALENs) were developed based on a distinct modular DNA-binding motif. TALENs are made up of multiple ~34 amino acid domains derived from a DNA-binding protein found in the pathogenic plant bacterium *Xanthomonas*, each of which recognizes a single DNA base pair with higher specificity [3, 4]. Although more specific than ZFNs, TALENs require a rather laborious exercise in genetic engineering and are quite large, which limits the ability to express them using viral vectors.

The newly developed CRISPR/Cas-based systems, or RNA-guided engineered nucleases (RGENs), unlike ZFNs and TALENs that use protein motifs for DNA sequence recognition, depend on RNA-DNA recognition. These are both highly specific and allow facile retargeting to new genomic loci [5, 6] and may be superseding the two older technologies. A key step forward in making the CRISPR/Cas9 system more user-friendly was the demonstration that the crRNA and tracrRNA could be linked by an artificial loop sequence to generate a fully functional single guide RNA (sgRNA) [5, 6].

To date, the ever-increasing list of organisms with genomes that have been modified successfully using engineered nucleases includes mosquitoes, crickets, silkworms, pigs, cows, rabbits, and nonhuman primates, among others [7, 8]. The ability to genetically manipulate human pluripotent stem cells and somatic cells using engineered nucleases opens new opportunities to develop novel therapies for patients with various genetic and acquired diseases. Genetic defect correction is now possible in cultured cells from patients with a number of genetic diseases, including sickle cell disease [9], cystic fibrosis [10], Down syndrome [11], Duchenne muscular dystrophy [12], α1-antitrypsin deficiency [13], dystrophic epidermolysis bullosa [14], and chronic granulomatous disease [15]. Engineered nucleases are also being applied to infectious diseases, particularly persistent viral infections, which are among the most attractive targets in both ex vivo and in vivo studies and are discussed in the rest of the chapter. Despite all these promising advances, we should also be aware that these DNA editing technologies have been shown to cut at off-target sites with mutagenic consequences. Therefore, issues such as efficacy, specificity, and delivery are likely to drive selection of reagents for particular purposes. Human therapeutic applications of these technologies will ultimately depend on risk-versus-benefit analysis [8].

#### **2. Main text**

#### **2.1. Why is DNA editing technology important in chronic viral disease research?**

Chronic viral infection underlies a wide variety of medically important diseases that either follow directly from primary infection or may require months, years, or even decades to develop. Diseases caused by persistent virus infections include AIDS, AIDS-related complexes, chronic hepatitis, subacute sclerosing panencephalitis (chronic measles encephalitis), chronic papovavirus encephalitis (progressive multifocal leukoencephalopathy), several herpes virus–induced diseases, and some neoplasias (see **Table 1**). Pathogens with worldwide impact, such as HIV, hepatitis B virus (HBV), and a number of herpes viruses remain uncontrolled. The pathogenic mechanisms by which these viruses cause diseases include disorders of biochemical, cellular, immune, and physiologic processes. The chronic nature of these infections limits the number of antiviral options and increases the risk of developing drug-resistant strains in the host. Some recent studies suggest that chronic viral infection also contributes to certain cancers as well as to diabetes and atherosclerosis. Ongoing studies are rapidly advancing our understanding of many persistent infections. Viruses have evolved a wide variety of strategies by which they maintain long-term infection of populations, individuals, and tissue cultures.

Most of our current understanding of host immunity and viral virulence comes from studies on the progression of acute infection. When a virus enters the host, there is an initial nonequilibrium phase of acute infection, in which viral and immune strategies compete for dominance. A transition point will be reached in a survivor, at which the infection either is cleared or becomes chronic. This transition point may be reached very early in infection for viruses that can establish a latent infection, in which case the infection is permanent regardless of the course of acute infection. If recovery occurs, the immune system must reset by clearing the antigen and reestablishing immune homeostasis. If the balance shifts toward chronic infection, a new set of viral and host strategies interact to define a metastable equilibrium in which viral replication is held in check, but the virus is not cleared [16].

Whereas acute viral infection represents a nonequilibrium process, chronic viral infection is a process in dynamic and metastable equilibrium. During acute infection, both the host and the virus change continuously until infection is resolved, kills the host, or becomes chronic. Certain genes in a virus or in the immune system function during acute but not chronic infection. The failure of these immune system genes to function effectively or the overly effective evasion of immunity by the virus may eventually lead to multiorgan failure and death. In contrast, during chronic infection, viral and host genes balance each other [16]. The mechanisms by which viruses persist without causing overt disease, despite the impressive immune armamentarium of the host, remain unclear, although the events leading to the establishment of chronicity have been described.

Two events are fundamental to the establishment of chronic viral infection. First, the virus must evade sterilizing immunity (the complete elimination of a virus). Second, the immune system must adjust to the continuous presence of viral antigen-driven inflammatory responses to limit viral replication to an acceptable level without untoward damage to permanently infected tissues. If the immune system cannot eliminate the virus, unrestrained immune attack on virus antigen-bearing cells causes tissue injury. Thus, down-regulation of inflammation during chronic viral infections can result in decreased tissue damage, at least for noncytopathic viruses. Immunopathology can be severe in human chronic infections caused by HBV and hepatitis C virus (HCV) [46], which may remain noncytopathic for many decades initially. It is important to realize that viruses which rely on a living but chronically infected host for their own survival must carefully avoid mechanisms that overwhelm immunity and kill their hosts.

Viruses have evolved highly effective strategies for establishing chronic infection despite the presence of an active host antiviral immune response. There are three general strategies for chronic viral infection: continuous replication, latency and reactivation, and invasion of the genome followed by vertical spread from generation to generation. Individual viruses usually rely mostly on one strategy, but viruses can use more than one mechanism. For example, HIV effectively uses both continuous replication and establishment of latency [47–49], a dangerous combination. The differences between these strategies have profound implications for designing new ways to prevent or control harmful chronic viral infections.

The mechanisms of viral persistence are not completely understood, but some common factors are known [50]. Immune modulation is one of them. Many of these viruses manage to avoid the specific and nonspecific immune defenses in several different ways. These include (1) limitation of recognition molecules on infected cells; (2) altered lymphocyte and macrophage functions, including the modified production of cytokines and general immunosuppression [e.g., HIV-1 and -2, Epstein–Barr virus (EBV), and HBV]; (3) infection in immunologically privileged anatomic sites [e.g., herpes simplex virus (HSV) and varicella-zoster virus (VZV) in the central nervous system]; (4) compromised nonspecific defenses (e.g., suppression of interferon production); and (5) immune tolerance (e.g., HBV).

Another common mechanism is the modulation of viral gene expression. Such examples include (1) down-regulation of some viral genes by viral or cellular regulatory gene products [e.g., HIV and human papillomaviruses (HPVs)], (2) specific latency-associated proteins (e.g., EBNA-1), and (3) synthesis of latency-associated transcripts (LATs; e.g., HSV-1 and -2) as well as viral variants (e.g., HIV and measles) [50].

Developing a successful cure for persistent or chronic viral infections remains a major challenge because of their ability to evade or suppress the immune system, their ability to incorporate viral sequences into the host genome, and their long inactive latent phases, which make targeting of active biological processes nearly impossible. However, recent concerted efforts and improvements in drug design, leading to multiple new drugs that successfully cure HCV, demonstrate that the complete clearance of chronic viral infections is an attainable goal. A particularly tantalizing application of programmable nucleases is the potential to directly correct genetic mutations in affected tissues and cells to treat diseases that are refractory to traditional therapies. A number of approaches targeting specific disease-causing viral infections will be discussed in the following sections.




**Table 1.** List of known human pathogenic chronic viral infections and their respective prevalence and clinical disease [16].

#### **2.2. Defining the cellular factors that facilitate or restrict virus replication**

The use of sequence-specific DNA endonucleases to target and destroy DNA viruses was attempted almost as soon as the technology was invented [51], with earlier work describing the use of ZFNs, TALENs, or a homing endonuclease (HE) to target HBV, HPV, and HSV-1.

The newer DNA editing systems not only impact virology by allowing researchers to screen for human genes that affect the replication of pathogenic human viruses but are also extremely useful in generating human cell lines that lack gene products that may facilitate or restrict specific virus replication. Adeno-associated virus (AAV)-based vectors offer the possibility of directly targeting DNA viruses that infect specific organs or tissues in the human body (e.g., HSV and HBV) to target and destroy the entire population of viral DNA genomes. Safety concerns about such delivery systems, such as off-target genomic mutations, remain [52].

Viruses have very compact genomes and therefore rely on the host cell for many activities required to support their replication cycle. Ironically, viruses as pathogens put selective pressure on their host organisms, thus selecting for cellular restriction factors that can limit viral replication. One of the best studied viruses in this context is HIV, which requires a wide range of human cofactors for replication in CD4+ T cells, several of which are lacking in other mammalian species, such as mice. HIV is targeted by a wide range of human restriction factors and dedicates a substantial portion of its coding capacity to the neutralization of these restriction factors. For example, the Vif protein blocks the activity of the host APOBEC3 family of restriction factors, which otherwise interfere with the production of the HIV-1 provirus, whereas the HIV-1 Vpu protein neutralizes the cellular restriction factor tetherin, which blocks the release of progeny virions [53].

The identity of cellular factors that either facilitate or restrict viral replication is central to the search for novel targets for antiviral drug development. Researchers have been using RNA interference (RNAi) libraries to knock down the expression of human genes systematically to identify factors required for virus replication. Although this approach has led to some interesting insights, it is also clear that RNAi screens for viral cofactors in different laboratories have often led to very different lists of cellular proteins with this potential activity. This may be due to the use of different cell systems, different assays for viral replication, different RNAi reagents, and sometimes incomplete knockdown of target gene expression. As a result, most of the viral cofactors and restriction factors identified so far have required confirmation using biochemical approaches (e.g., by identification of cellular factors that specifically bind to a viral protein) or genetic approaches (e.g., by complementation of a human and/or animal cell line that lacks a particular cofactor or restriction factor) [52, 53].

RGEN systems appear highly suitable for use in screens for viral cofactors or restriction factors; indeed, several screens for cellular factors involved in cell transformation have been published [54–56]. This could now be extended to analysis of the replicative potential of viruses in a 96 or 384-well plate format, possibly in the form of comprehensive CRISPR/Cas9-generated libraries of cellular clones available for specific human cell lines. Each of these lines will lack a functioning gene that is dispensable for cell viability in a tissue culture setting. These clones should allow the reproducible and almost complete identification of cellular factors that either help or hinder virus replication in that human cell line in vitro. Factors required for human cell viability would be missed, but it seems likely that almost all host innate immune factors involved in restricting virus replication would not be essential. Essential factors required for host cell viability must also be essential for virus replication and clearly would not provide potential targets for antiviral drug development. A summary of studies using DNA editing technology to define the host factors that facilitate or restrict virus replication is listed in **Table 2**.



**Table 2.** Summary of studies using DNA editing technology to define the host factors that facilitate or restrict virus replication.
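The logic of such an arrayed knockout screen can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the gene names, readout values, the `non-targeting` control label, and the twofold threshold are all hypothetical and are not taken from any of the cited studies. A knockout that lowers the viral readout is flagged as a candidate cofactor, and one that raises it as a candidate restriction factor.

```python
from math import log2
from statistics import median


def classify_knockouts(readouts, control_key="non-targeting", threshold=1.0):
    """Classify knockout clones by their effect on a viral replication readout.

    readouts: dict mapping a knockout label to a list of replicate readouts
              (e.g. reporter signal per well); values here are hypothetical.
    threshold: absolute log2 fold change needed to call an effect (assumed).
    """
    baseline = median(readouts[control_key])
    calls = {}
    for gene, values in readouts.items():
        if gene == control_key:
            continue
        lfc = log2(median(values) / baseline)
        if lfc <= -threshold:
            # Losing the gene reduces replication: the virus may depend on it.
            calls[gene] = ("candidate cofactor", lfc)
        elif lfc >= threshold:
            # Losing the gene increases replication: it may restrict the virus.
            calls[gene] = ("candidate restriction factor", lfc)
        else:
            calls[gene] = ("no clear effect", lfc)
    return calls


# Toy data: invented readouts for three hypothetical knockout clones.
toy_readouts = {
    "non-targeting": [100, 110, 90],
    "GENE_A": [20, 25, 30],
    "GENE_B": [400, 380, 420],
    "GENE_C": [95, 105, 100],
}

for gene, (call, lfc) in classify_knockouts(toy_readouts).items():
    print(f"{gene}: {call} (log2 fold change {lfc:+.2f})")
```

A real screen would also need replicate-aware statistics and a viability control, because clones lacking genes required for cell survival drop out of the library, as noted above.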

#### **2.3. Using engineered nucleases for gene therapy applications in specific viral pathogens**

#### *2.3.1. Human immunodeficiency virus (HIV)*

In the last two decades, the availability of highly active antiretroviral therapy (HAART), which can reduce HIV replication to undetectable levels and prolong survival, has greatly improved the prognosis of infected patients. However, HIV-1 persists as a latent infection in a small number of resting CD4+ memory T cells [47]. In these long-lived cells, intact integrated HIV-1 proviruses persist in a transcriptionally silent state that is refractory to both drugs and host immune responses. These memory T cells can nevertheless be reactivated by an appropriate recall antigen, resulting in the induction of a productive viral replication cycle [47, 49]. If this occurs after drug treatment has been stopped, HIV-1 will rapidly spread through the available CD4+ T cells and rekindle the same level of virus replication that was seen before antiviral drug treatment. Other disadvantages of long-term therapy include high cost, problems with patient compliance, side effects, and the emergence of drug resistance [84]. Therefore, there is a need to develop a more effective "cure" for HIV infection.

The approaches to purge the pool of latently infected cells have focused on two strategies. Some have attempted to activate latent HIV-1 proviruses using drugs, including histone deacetylase inhibitors and protein kinase C agonists [85]. However, so far, this strategy has not been proven able to activate HIV-1 in a high percentage of latently infected cells.

An alternative strategy would be to directly target and destroy latent proviruses using HIV-1 specific CRISPR/Cas combinations. Early attempts have shown that this is feasible and that latent proviruses can be excised from the host cell genome, and then destroyed, by cleavage in the HIV-1 long terminal repeat regions [86, 87]. In principle, the HIV-1 provirus is a perfect target for CRISPR/Cas, as there is only a single proviral copy in the infected cell, and in the presence of antiviral drugs, no spread of the virus is possible. The problem, however, is that latently HIV-1-infected T cells are scattered throughout the body and infecting all of these seems currently to be an insurmountable problem, especially as T cells are poor targets for AAV infection. This contrasts with HBV, HSV, and HPV, all of which are tightly localized in known tissues in the body that can be readily targeted by AAV [52]. Therefore, in the absence of novel vector delivery systems that can target latently HIV-1-infected cells throughout the body, HIV-1 is likely to remain a technically challenging target for elimination by CRISPR/Cas in vivo.

CCR5, which encodes a coreceptor for HIV entry [88, 89], has been a popular target for developing a new generation of HIV therapy for several reasons. First, its disruption seemed likely to increase the survival of CD4 T cells; persons homozygous for a naturally occurring 32 bp deletion (delta32/delta32) in CCR5 are known to be resistant to HIV infection [90]. CD4 T cells from such persons are highly resistant to infection in vitro [91]. HIV-infected persons who are heterozygous for CCR5 delta32 have slower progression to full-blown AIDS [92, 93]. The effectiveness of blocking or inhibiting CCR5 with the use of small interfering RNA (siRNA) and other small-molecule inhibitors has been shown in humans [94]. Finally, there is the remarkable success story of HIV eradication in the so-called "Berlin Patient". This HIV-positive patient with lymphoma, who had been transplanted with bone marrow from a CCR5-Δ32 homozygous donor, became cured with no measurable virus (undetectable HIV RNA and proviral DNA in the blood, bone marrow, and rectal mucosa) even 5 years after transplantation, showing the potential benefits of CCR5 disruption [95, 96]. Although the mechanism responsible for the apparent cure associated with this procedure remains to be established, acquired CCR5 deficiency is one possibility [97]. Due to the low frequency of CCR5-Δ32 homozygotes in the general population and the difficulties of identifying suitable donors, alternative methods to artificially disrupt CCR5 are being sought [65]; in particular, gene editing methods have attracted a lot of attention as a potential therapy for HIV, as they allow permanent disruption of the selected gene(s).

This approach recently led to a phase I clinical study published in the *New England Journal of Medicine* [64]. The researchers enrolled 12 patients in an open-label, nonrandomized, uncontrolled study of a single dose of ZFN-modified autologous CD4 T cells. Six of these 12 patients underwent an interruption in antiretroviral treatment 4 weeks after the infusion of 10 billion autologous CD4 T cells. Between 11% and 28% of these CD4 T cells were genetically modified by ZFN. The primary outcome was safety in terms of treatment-related adverse events. Secondary outcomes included measures of immune reconstitution and HIV resistance.

The only serious adverse event associated with infusion of the ZFN-modified CD4 T cells was attributed to a transfusion reaction. The median concentration of CCR5-modified CD4 T cells at 1 week was 250 cells/mm³. This represented approximately 8.8% of circulating peripheral blood mononuclear cells and 13.9% of circulating CD4 T cells. These modified T cells had an estimated mean half-life of 48 weeks. During treatment interruption and the resultant viremia, the decline in circulating CCR5-modified cells (-1.81 cells/day) was significantly less than the decline in unmodified cells (-7.25 cells/day; *p* = 0.02; see **Figure 1**) [64]. HIV RNA became undetectable in one patient. The blood level of HIV RNA decreased in the majority of the study subjects. This provides some of the first evidence that CCR5-modified autologous CD4 T-cell infusions are safe and opens the door for applying DNA editing technologies to the treatment of human chronic viral diseases, although many obstacles remain.

**Figure 1.** CCR5-modified CD4 T cells during treatment interruption [64].
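The kinetics quoted above can be made concrete with a short calculation. The Python sketch below assumes simple first-order (exponential) decay from the 1-week median of 250 cells/mm³ with the reported 48-week half-life. This is an illustrative simplification, not the model fitted in the trial [64], and the function and variable names are hypothetical.

```python
def remaining_cells(initial, half_life_weeks, weeks_elapsed):
    """Cells remaining after weeks_elapsed, assuming exponential decay.

    The first-order decay model is an assumption made for illustration only.
    """
    return initial * 0.5 ** (weeks_elapsed / half_life_weeks)


INITIAL_CELLS_PER_MM3 = 250.0  # median CCR5-modified CD4 T cells at 1 week [64]
HALF_LIFE_WEEKS = 48.0         # estimated mean half-life [64]

for weeks in (12, 24, 48, 96):
    cells = remaining_cells(INITIAL_CELLS_PER_MM3, HALF_LIFE_WEEKS, weeks)
    print(f"after {weeks:>2} weeks: about {cells:.0f} cells/mm3")

# Ratio of the daily declines reported during treatment interruption:
# unmodified cells (-7.25 cells/day) versus CCR5-modified cells (-1.81 cells/day).
decline_ratio = 7.25 / 1.81
print(f"unmodified cells declined about {decline_ratio:.1f} times faster")
```

Under these assumptions, half of the modified cells would still be present 48 weeks after the first measurement, and the unmodified cells were lost roughly four times faster than the modified ones during viremia.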

#### *2.3.2. Hepatitis B virus (HBV)*

HBV remains a major public health problem, with more than 300 million people chronically infected worldwide [98]. These individuals have an approximately 25% risk of dying from the consequences of HBV infection, including hepatocellular carcinoma (HCC) and cirrhosis, and approximately 800,000 individuals are thought to die each year due to HBV. An effective vaccine for HBV is available, but it does not help individuals with preexisting infection, is not fully effective at preventing vertical transmission, and is not given in a timely manner to all children in resource-limited regions. HBV polymerase can be effectively inhibited by nucleoside-based antiviral agents (lamivudine, adefovir, telbivudine, entecavir, tenofovir), but this does not cure the infection because of the extraordinary stability of the viral episomal cccDNA intermediate [99], which persists and produces new viral particles as soon as the antiviral agents are stopped.

Current research in HBV therapy focuses on finding new targets on the viral life cycle (see **Figure 2**) and trying to overcome immune tolerance through immunotherapy, with limited success. It is clear that a complete "cure" cannot be achieved unless the new therapy is able to remove or destabilize the HBV cccDNA, and genome editing is one of the most promising approaches. Some of the challenges that need to be overcome include toxicity, development of viral resistance, specificity, ensuring therapeutic effect of sufficient duration, and hepatocyte-targeted delivery. Significant progress has been made to overcome these obstacles.

**Figure 2.** Replicative cycle of HBV and the respective targets of the current and experimental therapeutic agents (modified from Phyo et al. [100]).

Zimmerman et al. [101] were the first to advance a gene editing approach to countering HBV replication. The researchers used duck HBV (DHBV) as a model and designed six different zinc finger proteins (ZFPs) to target the DHBV enhancer sequences, which control the transcription of the core and surface sequences. Marked reduction in viral pgRNA and total viral RNA was observed. ZFPs significantly reduced viral core and surface protein production with no obvious cytotoxicity. As the ZFPs did not cause target DNA mutation or durable epigenetic changes, this inhibition was not lasting. Later, Cradick et al. [71] also demonstrated the effectiveness of using ZFNs to specifically cleave HBV episomal DNA. The team engineered nine pairs of HBV-specific ZFNs and cotransfected each of these ZFN pairs plus an HBV genome target plasmid into a hepatoma cell line. Targeted cleavage of viral DNA was demonstrated. The cleaved fragments were misrepaired in a manner that could potentially inactivate HBV. Further, cotransfection with ZFN pair 6 decreased HBV pregenomic viral RNA levels by almost 30%. However, it should be noted that the study did not clearly demonstrate that the HBV cccDNA was modified by the ZFNs.

The X gene is thought to play a major role in the development of HCC [102]. Zhao et al. [103] designed ZFPs to inhibit the expression of integrated sequences of the X gene. An artificial transcription factor (ATF) was synthesized to target a sequence in the enhancer I region, which is upstream of the X promoter. The ATF comprised a DNA-binding domain of a ZFP that was linked to a KRAB repressor domain. X repression was demonstrated on a luciferase reporter assay. Another study by Weber et al. [72] aimed to prevent viral reactivation by targeting three HBV protein-coding sequences with ZFNs in HepAD38 cells, a tet-regulated cell line. AAV vectors containing sequences encoding the ZFNs were used for gene delivery. Site-specific mutagenesis with low cytotoxicity was confirmed for two of three engineered ZFNs. Inhibition of HBV replication and virion production over a period of 14 days could be achieved after a single treatment with ZFN targeting the viral polymerase gene.

Bloom et al. [73] first applied TALENs to disrupt hepatitis B replication in a cell line and a mouse model. HBV-specific TALENs were generated, targeting conserved sequences in the surface, core, and pol open reading frames (ORFs). The TALENs targeting the surface and core ORFs were the most effective. The viral cccDNA was isolated using Hirt's extraction and plasmid-safe DNase. A T7 endonuclease I (T7E1) assay was used to verify targeted mutation in cultured cells. The TALEN targeting the surface ORF reduced HBsAg by more than 90%, and circulating viral particle equivalents were diminished by approximately 70% by the S and C TALENs. T7E1 assays and deep sequencing confirmed the targeted disruption. Chen et al. [74] confirmed the successful targeting and inactivation of HBV genomic sequences by TALENs. The researchers also showed significant knockdown in markers of viral replication. Interestingly, when the TALENs were used in combination with interferon-α, synergistic antiviral effects were observed. Although these results are promising, a limitation of using mice to simulate HBV replication in vivo is that these animals do not produce HBV cccDNA.

Some key studies employing CRISPR/Cas9 systems recently demonstrated the utility of RGEN cleavage of HBV DNA [75–78]. Lin et al. [75] designed eight HBV-targeting sgRNAs to target different conserved regions of the HBV genome. A significant decrease in the production of viral proteins was observed. Cotransfection with more than one sgRNA-encoding sequence augmented antiviral efficacy. This effect was corroborated by an increase in indels at the targeted sites. However, the efficacy against HBV cccDNA was not evaluated. Seeger and Sohn [76] investigated the targeted disruption of HBV cccDNA and confirmed the efficient cleavage of viral sequences with all five of their sgRNA constructs. An approximately eightfold reduction of HBcAg expression was achieved in HBV-infected HepG2-NTCP cells. The cells were transduced using recombinant lentiviral vectors. Targeted deletions ranging from a single nucleotide up to 2.3 kb were produced. This demonstrated the potential of CRISPR/Cas to target and excise host-integrated HBV genomes. Another similar study by Kennedy et al. [77] also showed suppression of HBV replication by lentiviral vector-delivered Cas9 and sgRNA sequences. Dong et al. confirmed the efficacy of sgRNA-Cas9 against HBV. In addition, they demonstrated the disruption of artificial cccDNA in a murine hydrodynamic model [78] that was based on the use of engineered recombinant cccDNA precursor plasmid (rcccDNA) [104].
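As a rough illustration of how candidate sgRNA target sites are located, the Python sketch below scans both strands of a DNA sequence for 20-nt protospacers immediately followed by an NGG protospacer-adjacent motif (PAM), the motif recognized by the commonly used *Streptococcus pyogenes* Cas9. It is not the design procedure used in the cited studies [75–78]: the input sequence is an invented toy string rather than HBV sequence, the function names are hypothetical, and real sgRNA design also weighs sequence conservation and off-target risk.

```python
import re

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_protospacers(seq, spacer_len=20):
    """List candidate protospacers followed by an NGG PAM on both strands.

    'start' is the 0-based offset on the strand being scanned, so minus-strand
    positions refer to the reverse-complemented sequence.
    """
    seq = seq.upper()
    # Lookahead keeps matches zero-width so overlapping sites are all reported.
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % spacer_len)
    hits = []
    for strand, strand_seq in (("+", seq), ("-", reverse_complement(seq))):
        for match in pattern.finditer(strand_seq):
            hits.append({
                "strand": strand,
                "start": match.start(),
                "protospacer": match.group(1),
                "pam": match.group(2),
            })
    return hits


# Invented toy sequence for demonstration only (not a viral genome).
toy_sequence = "ATGCGTACCGGATTACGATCGTAGCTAGGCTAGCTAAGGCCTTAGCATCGATCGGATCC"

for hit in find_protospacers(toy_sequence):
    print(hit["strand"], hit["start"], hit["protospacer"], hit["pam"])
```

Candidates found this way would still need to be filtered for conservation across viral genotypes and for similarity to the human genome before being considered as guides.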

These studies suggest that HBV-specific Cas9/sgRNA combinations can block HBV replication and eliminate the cccDNA pool if they can be effectively delivered to hepatocytes. At the moment, AAV may be the best carrier for this task, as several AAV serotypes are naturally hepatotropic and even more highly hepatotropic AAV vectors have recently been isolated by "shuffling" AAV sequences in vivo [105]. The next step is to examine whether AAV-delivered Cas9/sgRNA combinations can effectively cure HBV in the humanized, immunodeficient mouse liver model system.

#### *2.3.3. Herpes simplex virus (HSV)*

HSV-1 infects approximately 70% of the U.S. population, and about one third of affected individuals suffer from recurrent, primarily oral, cold sores. HSV-1 most commonly first infects the oral mucosal epithelium, leading to a local productive infection, and then undergoes retrograde transport to the trigeminal ganglia, where it establishes a latent infection in a small number of sensory neurons, which persists after the initial, productive infection is cleared by the host immune response [106]. During latency, the HSV-1 DNA genome is maintained as a nuclear episome, with 1 to ~50 copies per latently infected neuron. At this point, the only region of the genome that is actively transcribed encodes the LAT, which is processed to give rise to a single long noncoding RNA of 2.1 kb, as well as eight virally encoded miRNAs, which together are thought to regulate exit from latency [107]. Because no viral proteins are made, there is no immune recognition of latently infected cells. Occasionally, one or more latently infected neurons are activated to produce infectious virions that migrate down the axons of the reactivating neuron to the original site of infection, where they reestablish a transient productive infection that can lead to the formation of cold sores. Although often no more than an irritation, HSV-1 infections can also lead to serious morbidity, and HSV-1 keratitis represents the most common form of infectious blindness in the West [108]. Infection of the central nervous system may also cause fatal encephalitis. A closely related virus, HSV-2, which is found in approximately one fifth of the U.S. population, has a similar replication cycle but generally is sexually transmitted and infects the genital mucosa. Latency is established in sensory neurons of the sacral ganglia, and reactivation can lead to genital ulcers. Again, serious morbidity is rare but does occur in some individuals, and neonatal HSV-2 infections acquired during vaginal delivery can be fatal [106].

Although there are several drugs that can treat productive HSV-1 or -2 infections, generally by targeting viral DNA synthesis, latent HSV genomes are entirely refractory to current treatment regimens and it remains impossible to cure these infections. What is clearly needed is an approach that directly targets HSV-1 or HSV-2 episomal DNA for cleavage and elimination from latently infected neurons. AAV-delivered HSV-specific engineered endonucleases appear ideal for this purpose. Aubert et al. [79] introduced DNA double-stranded breaks in an HSV latency model using the engineered HE HSV1m5, which targets a sequence in the HSV-1 gene UL19, encoding the viral protein VP5. Coexpression of the 3′ exonuclease Trex2 with HEs increased the mutagenesis frequencies by up to sixfold. Following HSV1m5/Trex2 delivery with AAV vectors, the target site within latent HSV genomes was mutated. There was no detectable cell toxicity, and viral production by latently infected cells after reactivation was significantly decreased. Exposure to histone deacetylase inhibitors before HSV1m5/Trex2 treatment increased mutagenesis frequencies of latent HSV genomes by another twofold to fivefold. This indicates that chromatin modification may be a useful adjunct to gene-targeting methods. Using CRISPR/Cas9-mediated genome engineering to create single- and double-knockout (KO) cell lines, Turner et al. [62] discovered that the Torsin activator LULL1 is required for efficient growth of HSV-1, whereas Johnson et al. [61] reported that the interferon-γ-inducible factor 16 (IFI16) restricts HSV-1 replication by accumulating on the HSV-1 genome, repressing HSV-1 gene expression, and modulating histone modifications. Given the tight localization of HSV-1 and -2 to the trigeminal and sacral ganglia, respectively, the low level of viral DNA genomes present in these cells, and the ability to efficiently transduce sensory neurons with AAV8-based vectors, HSV seems like an ideal viral candidate for cure using RGENs.

#### *2.3.4. Human papillomavirus (HPV)*

HPV infection, although normally innocuous, can also give rise to warts on the skin or genitalia [31]. Most HPV variants replicate as episomes in the basal epithelial layer of the skin, where the virus expresses exclusively nonstructural proteins. When the infected precursor epithelial cell migrates toward the surface of the epidermis and undergoes differentiation into a keratinocyte, the productive HPV replication cycle is activated leading to the release of infectious HPV virions (**Figure 3**) [31].

**Figure 3.** HPV penetrating the basal layer and released at the epithelial surface (Medscape [109]).

Although most HPVs are nonpathogenic, there are a small number of high-risk HPV serotypes, especially HPV-16 and -18, which together cause approximately 70% of all cervical cancers. In most HPV-induced cancers, the HPV episome is found clonally integrated into the cell genome in a manner that destroys or deletes the viral E2 gene (E for early) [31]. The role of the E2 protein is to bind to the HPV origin of replication, where it functions to ensure the distribution of HPV episomes to both daughter cells after cell division, and E2 also acts to regulate HPV early gene transcription. One key activity of E2 is to limit the expression of the HPV oncogenes E6 and E7, and disruption of E2 during integration into the host cell genome can lead to high constitutive levels of E6 and E7 expression [110]. E6 functions to bind and destabilize the p53 tumor suppressor [111], whereas E7 similarly binds and destabilizes the Rb tumor suppressor [112], and these two functions play a critical role in the maintenance of HPV-transformed cells. Cancers associated with HPV infection include cervical carcinoma, which is almost always HPV positive, as well as a fraction of head and neck (H&N) carcinoma and anal cancer, all of which are related to sexual transmission of HPV. Novel treatment for chemoresistant HPV-positive tumors will be important for treating recurrent disease. Of note, almost all HPV-positive H&N and anal cancers are HPV-16 positive, thus restricting the required sequence range for engineered endonuclease-based therapy.

Both HPV E6 and E7 proteins play a crucial role in HPV tumorigenesis by blocking the action of p53 and Rb, respectively [112]. Consistent with this idea, the inactivation of the E6 gene in the HPV-18-positive cervical carcinoma cell line HeLa or the HPV-16-positive cell line SiHa using Spy CRISPR/Cas has been found to result in the induction of p53 expression followed by the expression of downstream targets of this cellular transcription factor, including the CDK inhibitor p21 and several activators of apoptosis, leading to cell cycle arrest and cell death [68, 82]. Similarly, disruption of the E7 gene using CRISPR/Cas results in the increased expression of Rb, formation of Rb/E2F heterodimers, and then the induction of cellular genes that induce senescence and cell death [68, 113]. Mino et al. [81] improved the design of a ZFN-based hybrid nuclease by adopting a single-chain FokI dimer (scFokI) and demonstrated that the resulting nuclease, AZP-scFokI, inhibited HPV-18 DNA replication more efficiently in transient replication assays using mammalian cells. By linker-mediated PCR analysis, the researchers confirmed that AZP-scFokI cleaved an HPV-18 ori plasmid around its binding site in mammalian cells. These studies suggest that targeted endonucleases specific for HPV E6 and/or E7 have the potential to serve as a novel, highly specific, and effective therapy for chemoresistant HPV-16-induced anal and H&N tumors.

#### *2.3.5. Other viruses*

A number of other chronic virus infections are associated with serious human diseases, including EBV, cytomegalovirus, HCV, Kaposi's sarcoma-associated herpesvirus (KSHV), human T-cell leukemia virus type 1 (HTLV-1), and Merkel cell polyomavirus (MCPyV) (see **Tables 2** and **3**). Of these, perhaps the most relevant in relation to DNA editing technology is EBV. EBV is the etiologic agent of several cancers, including an epithelial cell tumor called nasopharyngeal carcinoma (NPC), which is highly prevalent in southern China and Southeast Asia [40, 52]. In NPC cells, EBV is found in a form of viral latency that nevertheless involves the expression of several viral nonstructural proteins and microRNAs [114]. EBV-positive NPCs share a number of characteristics with HPV-16-positive H&N cancers, and as in the latter case, the continued presence and transcription of the viral (in this case, EBV) genome is thought to be essential for tumor survival. Wang and Quake [70] used the CRISPR/Cas9 system for antiviral therapy in human cells, specifically targeting the EBV genomes of latent viral infections. Patient-derived Burkitt's lymphoma cells with latent EBV infection showed significant proliferation arrest and a decrease in viral load after treatment with a specific CRISPR/Cas9 vector targeting the viral genome. It seems likely that NPC cells would be excellent targets for transduction in vivo using Sau Cas9/sgRNA-based AAV vectors specific for the EBV genome.

Progressive multifocal leukoencephalopathy (PML) is a fatal demyelinating disease of the central nervous system caused by reactivation of the human polyomavirus JC (JCV). JCV replicates in oligodendrocytes, the myelin-producing cells in the brain. Previously a rare disease seen in patients with lymphoproliferative and myeloproliferative disorders, PML is now seen more frequently in HIV-1-positive/AIDS patients and patients undergoing immunomodulatory therapy for rheumatological/autoimmune disorders [83]. At this time, there is no cure for PML, and in most cases, disease progression leads to death within 2 years. The JCV genome is a small circular double-stranded DNA that includes coding sequences for the viral early protein, T-antigen, which is critical for directing viral reactivation and lytic infection. Wollebo et al. [83] applied the CRISPR/Cas9 system to introduce mutations in the viral genome in order to inactivate the gene encoding T-antigen and inhibit viral replication. Genetic and functional studies showed that transient or conditional expression of Cas9 and gRNAs specifically targets the N-terminal region of T-antigen. The mutation introduced interferes with the expression and function of the viral protein, suppressing viral replication in vitro [83]. No off-target effects of the JCV-specific CRISPR/Cas9 editing apparatus were observed. These studies provide the first evidence for the use of a gene editing strategy as a promising tool for the elimination of the JCV genome and a potential cure for PML.

HTLV-1, which causes adult T-cell leukemia (ATL) in humans, establishes a lifelong latent infection. Current therapies are not very effective against HTLV-1-associated disorders. In a proof-of-concept study, Tanaka et al. [80] developed a targeted endonuclease based on zinc finger nuclease (ZFN) that specifically recognized a conserved region of HTLV-1 long terminal repeat (LTR) and introduced it into various HTLV-1-positive human T-cell lines, including HTLV-1-transformed and ATL-derived cell lines [80]. ZFN disrupted the promoter function of HTLV-1 LTR and specifically killed HTLV-1-infected cells. The researchers showed the first evidence of the removal of the proviral genome from HTLV-1-infected cells. The therapeutic effect of ZFN was confirmed in an in vivo model of ATL. This strategy may form the basis of a therapy that can eradicate HTLV-1 infection, and similar approaches can be used to target other malignancy-associated viruses.



**Table 3.** Summary of studies using DNA editing technology to inactivate or mutate DNA virus genomes.

#### **3. Conclusion**

The development of effective gene editing technologies has the potential to lead to the global identification of almost all cellular factors that regulate virus replication in culture, leading to a wealth of new insights into viral molecular biology and producing numerous potential targets for antiviral drug development. If such screens can indeed identify cellular factors that are required for virus replication but entirely dispensable for the health of the adult target organism, then it might be possible to also treat viral infections, including RNA virus infections, via the localized ablation of a cellular gene, as described above for direct targeting of DNA virus genomes. A similar approach has previously been reported in which ZFNs were used to inactivate the HIV-1 coreceptor CCR5 in order to prevent the infection of CD4+ T cells: hematopoietic stem cells were transduced, and the production of HIV-1-resistant CD4+ human T cells was then analyzed in engrafted immunodeficient mice [115, 116]. Challenges remain in applying engineered nucleases in the clinical setting, including the immunogenicity of transduced proteins with repeated application, delivery of the genes into the correct tissue and cell types, side effects of off-target mutagenesis, limitations in the design of the nuclease target sites, and safety and ethical issues, including tumorigenicity, undesired integration of nucleases or donor templates, and germline transmission of the modified genome. One interesting and novel approach is to deliver engineered nucleases using modified mRNA, which is nonintegrating and provides a transient pulse of protein expression, as an alternative to traditional viral vectors. A team of researchers recently applied this nuclease-encoding, chemically modified (nec) mRNA to deliver site-specific nucleases in a transgenic mouse model of SP-B deficiency, resulting in successful site-specific genome editing in vivo [117, 118]. Although several technical challenges and uncertainties remain, the promise of using gene correction technology to study and treat chronic viral infections is tremendous. Further advances in understanding and improvements in technology will open the next era of therapy against currently difficult-to-treat viral diseases.

#### **Author details**

Guan-Huei Lee1,2\* and Myo Myint Aung1


#### **References**


## **Gene Editing in Adult Hematopoietic Stem Cells**

Sergio López-Manzaneda, Sara Fañanas-Baquero, Virginia Nieto-Romero, Francisco-Jose Roman-Rodríguez, Maria Fernandez-Garcia, Maria J. Pino-Barrio, Fatima Rodriguez-Fornes, Begoña Diez-Cabezas, Maria Garcia-Bravo, Susana Navarro, Oscar Quintana-Bustamante and Jose C. Segovia

Additional information is available at the end of the chapter

http://dx.doi.org/10.5772/62383

#### **Abstract**

In recent years, important developments have allowed the scientific community to modify the genome precisely and accurately. The first proof of concept appeared with the design and use of engineered zinc-finger nucleases (ZFNs), which was expanded later with the discovery and engineering of meganucleases and transcription activator-like effector nucleases (TALENs) and finally democratized and made easily available to the whole scientific community with the discovery of the clustered regularly interspaced short palindromic repeats (CRISPR)/Cas9 nuclease combination technology. The availability of these tools has allowed precise gene editing, such as the knockout of a specific gene or the correction of a defective gene by means of homologous recombination (HR), taking advantage of the endogenous cell repair machinery. This process was already known and used but was inefficient; its efficiency has been increased more than 100-fold by adding these specific nucleases to the process. Apart from the proper design of the nucleases to recognize and cut the selected site in the cell genome, two main goals need to be adequately addressed to optimize their function: the delivery of the tools into the desired cells and the selection of those cells in which the gene editing process has occurred correctly. Both steps can be easily solved when the source of cells is extensive or can be expanded and manipulated *in vitro* extensively, such as immortalized cell lines or pluripotent stem cells (embryonic stem cells and induced pluripotent stem cells). However, both steps are critical in the case of primary cells, such as the hematopoietic stem cells (HSCs). HSCs are a rare cell population present in the bone marrow (BM) of higher mammals and are responsible for the maintenance and replenishment of all hematopoietic cells for the lifespan of the animal by means of two fundamental properties: self-renewal and multipotency. The HSC population is thus the ideal target for the correction of hematopoietic genetic diseases and also for the knockout of the responsible genes to model those hematopoietic diseases *in vitro* and *in vivo*. This rare population cannot be expanded, and its *in vitro* manipulation and culture negatively affect its fundamental properties of self-renewal and multipotency. These factors challenge the application of gene editing to HSCs. Important efforts are now ongoing to optimize the protocols of gene delivery and selection for HSCs. This chapter reviews and discusses how researchers are trying to solve these problems, the attempts that are ongoing, and the potential application of the technology to patients affected by hematopoietic genetic diseases.

© 2016 The Author(s). Licensee InTech. This chapter is distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

**Keywords:** hematopoietic stem cells, hematopoiesis, genetic disease, gene therapy, gene editing, nucleases, zinc finger, TALEN, meganucleases, CRISPR/Cas9

#### **1. Introduction**

Adult hematopoietic stem cells (HSCs) are a rare population of cells that are present in the bone marrow (BM) and are responsible for the generation of all mature blood cells, including erythrocytes, platelets, and immune cells (1). This population represents less than 0.1% of the total BM and is the only one capable of self-renewal, being responsible for the maintenance of lifelong hematopoiesis. Since their definition at the beginning of the last century (2), extensive knowledge of this population has accumulated, making it probably the best-known adult stem cell. HSCs can be phenotypically and functionally characterized and can be purified and even manipulated *in vitro*. Once transplanted back into the animal or human body, HSCs are capable of regenerating the whole hematopoietic system of the organism (1,3). These capabilities established the possibility of BM transplantation as a therapeutic strategy for the treatment of hematopoietic diseases, either inherited or acquired. BM transplantation can be autologous, when the HSCs come from the same person, or allogeneic, when they come from a healthy donor. Allogeneic BM transplant, which provides healthy HSCs, is the only curative treatment for oncogenic diseases and for inherited (genetic) diseases. However, the immune reaction of the transplanted cells against the recipient [graft-versus-host disease (GVHD)] and the limited availability of compatible BM donors dramatically reduce its applicability and increase the severe adverse effects of the treatment (4). To definitively solve this problem, the genetic modification of autologous HSCs to repair the genetic defect or to provide them with new capabilities, the so-called gene therapy, appeared as an additional therapeutic option that avoids GVHD and provides repaired/cured cells.

The idea of gene therapy emerged in the middle of the 20th century, when the use of modified viruses as vectors to introduce the desired genetic material into the desired cells was already mentioned and demonstrated as a proof-of-concept (5). Since then, gene therapy has been on a roller coaster, with tremendous improvements and successes but also with serious problems that have been dramatic in some instances (6). Gene therapy is nowadays established as a real therapeutic option for a number of inherited hematopoietic diseases, such as immunodeficiencies or hemoglobinopathies (7). The use of self-inactivating lentiviral vectors, the improvement of the *in vitro* culture conditions of human HSCs, and better knowledge of HSC engraftment in the recipient have contributed to this reality (8). Nevertheless, HSC gene therapy still has some drawbacks that need to be addressed to make this strategy safer. The most important one is potential insertional mutagenesis, which could lead to malignancy because the insertion site of the genetic material cannot be controlled with the currently available tools (8).

Precise gene editing using the cellular gene repair machinery appeared as the next step to follow. Although it was widely used in the generation of animal models, mainly fish and mouse, its application as a therapeutic option in humans was far from being considered because of its low efficacy and the difficulty of selecting those cells in which the desired gene editing had taken place. The discovery of enzymes and biological systems able to cut at one precise and specific sequence of the genome (nucleases and nuclease systems) and activate the endogenous DNA repair pathways has clearly brought gene editing closer to being considered a potential therapeutic approach in the near future. These tools were first tested for their therapeutic potential for hematopoietic genetic diseases in human induced pluripotent stem cells (hiPSCs) (9), and the principal bottleneck in these experiments was the generation of functional HSCs from the edited hiPSCs. Thus, an increasing number of laboratories, including ours, are now adapting the gene editing protocols to adult HSCs. Delivering the gene editing tools, improving protocols to maintain HSCs *in vitro* so that they can be properly edited, and improving systems to select properly edited cells are important barriers that need to be overcome to definitively bring gene editing-based gene therapy to the clinic. Some of the steps that are now being taken are explained below.

#### **2. Hematopoiesis and hematopoietic stem cells**

The hematopoietic system is responsible for the production and maturation of the different components of the blood in a process called hematopoiesis. Blood is one of the most active regenerative tissues, with approximately 1 trillion mature cells arising daily in adult human BM (3). Hematopoiesis is a highly hierarchically organized process in which all types of mature blood cells are generated from a small subpopulation of undifferentiated cells located in the BM. The hematopoietic system is divided into two main lineages: lymphoid and myeloid. The lymphoid lineage gives rise to T cells, B cells, and natural killer (NK) cells, whereas the myeloid lineage forms granulocytes, monocytes, erythrocytes, and megakaryocytes/platelets (1). Different organs and tissues are involved in the production and homeostasis of the hematopoietic system, such as blood, BM, liver, spleen, thymus, and lymph nodes.

HSCs are the most primitive cells in the hematopoietic system and support all of hematopoiesis. They are characterized by their capability to self-renew and to produce all types of mature functional blood cells. These cells persist throughout adult life, maintaining hematopoiesis (1). Hematopoietic committed progenitor cells, the next step in the hematopoietic differentiation process, arise from the HSCs. These cells have a higher proliferative activity than HSCs, a limited self-renewal capacity, and the ability to differentiate into a limited number of mature cell types. Finally, mature cells complete the pyramid structure of hematopoiesis. These cells are characterized by a recognizable morphology, full functional capabilities, and a low or null capacity for proliferation and self-renewal (10).

The study of human HSCs began with the identification of colony-forming progenitors using *in vitro* CFU-C assays (11–14). Nowadays, human HSC capacity is usually tested in xenogeneic transplants, in which human HSCs are transplanted into immunodeficient mouse strains that have a permissive microenvironment for human HSC engraftment (15). The first humanized mouse model was the severe combined immune-deficient (SCID) mouse lacking B and T cells (16,17). Then, different mouse models were developed based on their ability to efficiently support high levels of human engraftment of all lineages (18–20). The CD34 marker was identified as a relevant indicator of human hematopoietic progenitors, which include HSCs (21,22). Nevertheless, because of the variability in their repopulation ability, several subpopulations were identified in xenograft models: some have only a transient and limited ability to support the hematopoietic system, such as short-term HSCs (ST-HSCs), multipotent progenitors (MPP), and committed progenitors, whereas long-term HSCs (LT-HSCs) are able to maintain durable hematopoiesis (23,24).

Only 1 in 10⁶ cells in human BM is a transplantable HSC (25); thus, the scarcity of HSCs has made it necessary to purify them from the heterogeneous bulk population to allow their handling and manipulation. The purification of HSCs requires the detection of several cell surface markers whose expression is gained or lost during the differentiation process. In mice, HSCs do not express lineage markers (myeloid or lymphoid), so they are called lineage negative cells. In addition, HSCs express high levels of the stem cell antigen 1 (Sca-1) and c-Kit (CD117) markers (**Figure 1**). These cells are called LSK cells (lineage negative, Sca-1 positive, c-Kit positive) (1,26,27). As in mice, the absence of all lineage markers (Lin⁻) characterizes human HSCs (28). As previously mentioned, CD34, expressed on 0.5% to 5% of all blood cells, was the first marker found that identifies human LT-HSCs and more differentiated progenitors (29). Later on, other features of LT-HSCs were described, such as the expression of CD90 (Thy1) (30) or the lack of expression of CD45RA and CD38 (which are expressed in more differentiated progenitors) (31–33). Moreover, integrin CD49f (involved in cell attachment to the extracellular matrix) was shown to be expressed in LT-HSCs (34). With respect to ST-HSCs or MPP, the loss of Thy1 and CD49f expression was proposed to be a key signature of the transition to a more differentiated state (3). Subsequent studies have classified the different populations depending on alternative markers, such as the expression of aldehyde dehydrogenase (ALDH), and other surface markers, such as CD117 or CD133 (35).
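The marker combinations described above can be summarized as a simple decision rule. The sketch below is only an illustration of the text: it assumes each marker has already been reduced to a positive/negative call (real flow cytometry gating works on fluorescence intensities and thresholds), it treats unlisted markers as negative, and the function names and category labels are hypothetical.

```python
def classify_human_cell(markers):
    """Classify a human cell from a dict of marker name -> bool (True = expressed).

    'Lin' stands for any lineage marker. Markers missing from the dict
    are treated as not expressed, which is a simplifying assumption.
    """
    def positive(name):
        return bool(markers.get(name, False))

    if positive("Lin") or not positive("CD34"):
        return "lineage-committed or CD34-negative cell"
    if positive("CD38") or positive("CD45RA"):
        return "more differentiated progenitor"
    if positive("CD90") and positive("CD49f"):
        return "LT-HSC"
    if not positive("CD90") and not positive("CD49f"):
        return "ST-HSC/MPP"
    return "unclassified"


def is_mouse_lsk(markers):
    """Mouse LSK phenotype: lineage negative, Sca-1 positive, c-Kit positive."""
    return (not markers.get("Lin", False)
            and bool(markers.get("Sca-1", False))
            and bool(markers.get("c-Kit", False)))


print(classify_human_cell({"CD34": True, "CD90": True, "CD49f": True}))
print(classify_human_cell({"CD34": True}))
print(is_mouse_lsk({"Sca-1": True, "c-Kit": True}))
```

Checking the exclusion markers (Lin, CD38, CD45RA) before the positive markers follows the order of the text, in which lineage negativity and CD34 expression define the progenitor compartment and the remaining markers refine it.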

#### **3. Gene correction of hematopoietic stem cells by addition strategies and their drawbacks**

Gene therapy can be achieved by the delivery of genetic material into cells affected by a disease, and it can be accomplished by the addition, substitution, or alteration of the related genes (8). These modifications can be carried out by *in vivo* direct infusion of a vector carrying the ectopic therapeutic gene or by *ex vivo* manipulation of the patient's cells and their reinfusion into the patient. Gene therapy has become a feasible and attractive alternative therapeutic strategy for several diseases, from inherited hematopoietic, immune, and nervous diseases, including primary immunodeficiencies, leukodystrophies, thalassemias, hemophilias, and retinal dystrophies, to cancer and other malignancies (36). In the case of blood-related diseases, HSCs have long been the preferred target for *ex vivo* gene therapy because it is feasible to isolate them, genetically correct them *ex vivo*, and reinfuse them intravenously (37,38). The correction of HSCs is a promising therapeutic approach that can provide steady and stable expression of the transgene, restoring the function of the malfunctioning gene in the patient. Stable expression of the transgene can be obtained by addition gene therapy using vector-mediated transgene insertion (**Table 1**).

**Figure 1.** Human and mouse hematopoiesis. Similarities in terms of antigen differentiation markers.




**Table 1.** Addition gene therapy attempts and their outcome.

Classically, viral vectors, based on γ-retroviruses (γ-RV) or lentiviruses (LV), have been widely used for this purpose because they ensure high transduction efficiency and long-term expres‐ sion of the transgene. γ-RVs were widely used in the first gene therapy trials that started at the end of the 1990s (39). In diseases affecting the hematopoietic system, these vectors were first used for the treatment of two inmunodeficiencies: adenosine deaminase deficiency (ADA) and X-linked severe immunodeficiency or γ-chain immunodeficiency (X-SCID1) (40–42). In the case of ADA, γ-RV-transduced autologous stem cells have been used to treat 38 patients in three independent studies (39). Overall survival resulted in 100% at 3.5 years follow-up, and from these, 20 patients no longer required enzyme replacement therapy. *Ex vivo* gene transfer in ADA patients, by means of γ-RV, demonstrated to be efficient, as the engraftment of corrected cells was maintained over time and resulted in the improvement of the cellular and humoral immune response despite the mild conditioning applied (39). Similarly, the treatment of X-SCID1 was initiated in two different clinical trials. Similar to ADA gene therapy trials, the gene complementation of γ-chain with γ-RV was clearly efficient (42). Seventeen of the 20 treated patients showed a full or nearly full correction of the T-cell defect (39). However, whereas, in ADA patients treated by gene therapy with γ-RV, no adverse effects related with the genotoxicity of the procedure were observed, in X-SCID1 patients, 5 of the 20 treated patients suffered T-cell leukemia after 2.5 to 5 years after the treatment due to insertional mutagenesis (39). Viral insertion produces insertional mutagenesis and can induce the activation of neighboring genes, as proto-oncogenes, through enhancer/promoter sequences present in the retroviral long terminal repeats (LTRs). 
Both γ-RV and LV contain LTRs that are present at both the 5′ and 3′ ends of retroviral RNA genome and are required for the integration of the provirus into the host genome. Similar issues arose sometimes after the treatment of other immunodeficiencies as the Wiscott-Aldrich syndrome, in which restored expression of the WASP gene was achieved, but 4 of 10 treated patients were reported to suffer from leukemia or chronic granulomatosis disease (CGD) in which the appearance of myelodys‐ plastic syndrome in 3 patients also proved to be the result of the up-regulation of genes such as *MECOM* and *PRDM16* in CGD patients (43–47). These severe adverse effects limited the application of gene therapy to patients. A strong effort of the scientific community was required to increase the safety characteristics of the vectors. The inactivation of the U3 region in the 3′ end of the viral genome, leading to self-inactivating (SIN-LTRs) vectors (48), and the use of more physiological and/or specific promoters (49) were characteristics implemented in the new improved integrative vectors. Moreover, the analysis of insertion patterns demon‐ strated that γ-RV preferentially integrate near the transcriptional start sites and promoter regions (50), whereas this in LV occurs along transcriptionally active genes (51–53). Even so, some studies have reported that LV produce aberrant spliced transcript and deregulated expression in the integrated genes, although this phenomenon has not been reported in patients (54–56). Therefore, the SIN LTRs and the integration-site preferences of LV have been shown to substantially mitigate the insertional genotoxicity, making LV a reliable therapeutic option. In fact, a significant number of phase I/II clinical trials are currently being conducted with LV, which have reported remarkable efficacy and safety in immunodeficiencies, βthalassemia, adrenoleukodystrophy, or metachromatic leukodystrophy (36). 
All these HSC-based gene therapy clinical trials have reported stable, high-level reconstitution of hematopoiesis in the treated patients. In addition, no adverse events related to LV have been reported, although the overall follow-up of some of these trials is still limited.

Nonetheless, the potential risks already observed with γ-RV could still be present. Future trials should achieve efficient delivery of the new genetic information to the target cells without altering or disrupting the host cell genome, while preserving the characteristics of the transduced cells. Two strategies are currently being explored in depth: the use of LV that do not require integration, and the precise control of the site at which the genetic material is inserted into the host genome (gene editing). Important efforts are being invested in this second approach, which should overcome the majority of the potential adverse effects described above. However, additional challenges still need to be overcome.

#### **4. Gene editing strategies in HSCs**

Gene editing describes the new technology able to accurately modify genes, either by knocking them out or by inserting or substituting specific sequences in a precise way. This technology is based on promoting the action of endogenous DNA repair pathways at the target sequence without altering any other site in the genome. The main approach used for gene targeting is based on the natural HR DNA repair mechanism of the cell. Homologous recombination (HR) is an accurate DNA repair mechanism that uses the sister chromatid as a template to repair double-stranded breaks (DSBs). HR-based gene therapy strategies started in the 1980s using DNA donors flanked by long arms with homology to the target locus (57). However, the probability of inducing the specific insertion at the target site was low, around 10⁻⁶ (58), making this procedure feasible only for basic research. Several strategies have been pursued to improve this low efficiency. The most important one is the development of engineered nucleases able to induce DSBs at the specific target site, with the consequent activation of HR and gene editing at a precise place in the genome.
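To give a feel for why a targeting frequency of around one in a million restricts the classical approach to basic research, the following minimal Python sketch computes how many correctly targeted cells can be expected from a treated population. The function names and the population sizes are hypothetical, and the calculation assumes that each cell is targeted independently with the same probability.

```python
import math

def expected_targeted_cells(n_cells, hr_frequency=1e-6):
    """Expected number of correctly targeted cells (independent events assumed)."""
    return n_cells * hr_frequency

def prob_at_least_one(n_cells, hr_frequency=1e-6):
    """Probability that at least one cell in the population is correctly targeted."""
    # 1 - (1 - p)^n, written with log1p/expm1 for numerical stability at small p.
    return -math.expm1(n_cells * math.log1p(-hr_frequency))

# Hypothetical population sizes, chosen only for illustration.
for n in (10**5, 10**6, 10**7):
    print(n, expected_targeted_cells(n), round(prob_at_least_one(n), 3))
```

Under these assumptions, even large populations yield only a handful of targeted cells, which is workable when clones can be selected and expanded but not when a large fraction of primary cells must be corrected.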

Although the transfer of designed nucleases together with a donor template can increase the efficiency, HR is not as frequent as other DSB repair mechanisms, such as non-homologous end-joining (NHEJ). NHEJ is an error-prone mechanism that joins the two DNA ends originated by the DSB without the use of any template, so insertions and/or deletions (INDELs) can be introduced at the target site. HR is more frequent during the S and G2 phases of the cell cycle, so it takes place more often when cells are proliferating, and its permissiveness varies between cell types (59).

There are different applications based on these two DNA repair mechanisms. If NHEJ takes place, the generated INDELs could have a particular application for the stable disruption of specific target genes (59). When HR takes place, three strategies of gene editing can be envisaged:


#### **5. Gene editing tools applicable to the gene editing of hematopoietic stem cells**

Different systems have been developed to direct the genetic modification to a specific locus. These gene editing tools have been inspired by natural systems that evolved over millions of years to modify DNA, in which the capability of generating DSBs has been selected for different purposes, including DNA repair and the cell response against foreign DNA from pathogens. The proteins that generate DSBs act by the specific hydrolysis of the DNA and are known as nucleases. These nucleases have been described in all branches of life. As these proteins have been described, they have been rapidly applied as tools for gene editing. Four different types of nucleases have been most widely applied to generate DSBs: homing endonucleases (also called meganucleases), zinc-finger nucleases (ZFNs), transcription activator-like effector nucleases (TALENs), and clustered regularly interspaced short palindromic repeats (CRISPR)/Cas9.

#### **5.1. Meganucleases**

Meganucleases tend to be small proteins (<40 kDa) encoded by mobile inteins or introns. These families of proteins have evolved to target long sequences (14–40 bp) in a specific way (60,61). In nature, meganucleases propagate their own introns by cleaving cognate intronless alleles and thereby triggering HR. Because of this ability to move their genetic sequence to a defined site, meganucleases are also called homing endonucleases. Five families of meganucleases have been described: GIY-YIG, His-Cys box, HNH, LAGLIDADG, and PD-(D/E)XK. The largest family is LAGLIDADG, named after the consensus sequence of its catalytic motif. It is a diverse group of enzymes present in fungal and protozoan mitochondria, plant and algal chloroplasts, bacteria, and *Archaea*. Members of this family carry either one or two copies of the LAGLIDADG motif and accordingly act as homodimers or as monomers, respectively. However, meganucleases are limited as gene editing tools by the need to find large palindromic DNA targets (62). Moreover, their design requires substantial bioinformatics, difficult protein engineering, and *in vitro* testing.

#### **5.2. ZFNs**

In 1996, Chandrasegaran et al. started to use hybrid nucleases that combined the cleavage domain of the *Fok*I nuclease with the specificity of Cys2His2 zinc-finger protein domains, making it possible to cut DNA specifically at a sequence of more than 18 bp (63,64). *Fok*I generates a DSB only when dimerized, so to use it as a gene editing tool, two monomers must be designed and delivered separately. Designing the two monomers in inverted orientation improved the nuclease efficiency (65). The domain that binds to the DNA is formed by tandem repeats of Cys2His2 motifs, each of which adopts a ββα fold stabilized by a Zn²⁺ ion. The α-helix of each motif recognizes a specific DNA triplet. Between 3 and 6 of these motifs, which are separated by some spacer nucleotides, are modularly assembled in each monomer. Between the recognition sites of the two monomers remains a small nontargeted region called the "spacer," over which the *Fok*I dimer assembles (66–68). To avoid some off-target events already described (69,70), monomers carrying two different *Fok*I domains, active only when forming heterodimers, have been designed (71,72). Although different ZFN open-source libraries have been developed, the design remains complicated for nonspecialists (73,74). Nevertheless, ZFNs were the first nucleases described and are the only ones already in clinical trials (https://goo.gl/M0ZWoB).
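The modular arithmetic behind ZFN specificity (one DNA triplet per zinc finger, 3 to 6 fingers per monomer, two monomers per active nuclease) can be sketched in a few lines of Python. The function names are hypothetical, and the genome size of about 3.1 × 10⁹ bp and the assumption of uniformly random base composition on a single strand are simplifications introduced here for illustration only.

```python
def zfn_pair_recognition_bp(fingers_left, fingers_right, bp_per_finger=3):
    """Total number of base pairs recognized by a pair of ZFN monomers."""
    for n in (fingers_left, fingers_right):
        if not 3 <= n <= 6:
            raise ValueError("each monomer is assumed to carry 3 to 6 zinc fingers")
    return (fingers_left + fingers_right) * bp_per_finger

def expected_random_matches(site_bp, genome_bp=3.1e9):
    """Expected chance occurrences of one site in a random genome (rough estimate)."""
    return genome_bp / 4 ** site_bp

for fingers in (3, 4, 6):
    site = zfn_pair_recognition_bp(fingers, fingers)
    print(fingers, site, expected_random_matches(site))
```

Under these simplified assumptions, a pair of three-finger monomers already recognizes a combined site long enough that a chance match elsewhere in the genome is expected less than once, which is consistent with the emphasis on target sequences of 18 bp or more.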

#### **5.3. TALEN**

TALEN technology follows the same rationale as ZFN: the *Fok*I domain, used for its nonspecific cleavage capability, is fused to a domain that confers DNA-binding specificity. As in the case of ZFN, the *Fok*I region requires dimerization to be active, so this tool also requires the use of two monomers (65,75). In TALEN, the specificity is provided by protein domains from TALE. TALE are bacterial proteins, first described in *Xanthomonas*, capable of modifying gene transcription in the plants that these bacteria parasitize (76). The DNA-binding domain is formed by 10 to 30 repetitions of 30 residues. The 12th and 13th positions, the so-called repeat-variable di-residues (RVDs), are the polymorphic positions responsible for the specificity (77). Each RVD specifically recognizes one base pair (77–80). Now that the TALE-to-DNA code is known (79), design has become a relatively easy task (81). Moreover, due to the long DNA sequences they recognize, TALENs show the lowest off-target cleavage. On the other hand, their highly repetitive structure makes these sequences hard to package into viral vectors because of their tendency to recombine (82).
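Because each RVD reads one base, designing a TALE array amounts to a simple lookup. The minimal Python sketch below illustrates this; the mapping shown (NI for A, HD for C, NN for G, NG for T) is the commonly cited RVD code and is an assumption added here, since the chapter does not list it, and the function names are hypothetical. Real designs also need to account for RVDs with relaxed specificity (NN, for example, is also reported to bind A) and for sequence context.

```python
# Commonly cited RVD-to-base associations (assumption: not listed in this chapter).
RVD_TO_BASE = {"NI": "A", "HD": "C", "NN": "G", "NG": "T"}
BASE_TO_RVD = {base: rvd for rvd, base in RVD_TO_BASE.items()}

def rvds_for_target(dna):
    """Return the list of RVDs for a TALE array designed against `dna`."""
    try:
        return [BASE_TO_RVD[base] for base in dna.upper()]
    except KeyError as err:
        raise ValueError(f"unsupported base: {err.args[0]}") from None

def target_for_rvds(rvds):
    """Return the DNA sequence that a given RVD array is expected to read."""
    return "".join(RVD_TO_BASE[rvd] for rvd in rvds)

# Hypothetical 8-bp target, far shorter than a real TALEN half-site.
array = rvds_for_target("ACGTTGCA")
print(array)
print(target_for_rvds(array))
```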

#### **5.4. CRISPR/Cas9**

The CRISPR/Cas9 system is the latest nuclease to appear in the gene editing tool arena. CRISPR/Cas9 is part of the adaptive immune system of bacteria and *Archaea*. In nature, this system cleaves foreign DNA to counter cellular threats such as viral attacks (83,84). Up to 10 CRISPR/Cas9 systems have been described, but the type II CRISPR/Cas9 is the one most used in genetic engineering to generate DSBs for research.

This strategy is based on three elements: the Cas9 nuclease, the trans-activating CRISPR RNA (tracrRNA), and the CRISPR RNA (crRNA). Cas9 is a nuclease that produces DSBs thanks to its two catalytic domains, RuvC and HNH. Cas9 recognizes the tertiary RNA structures that the tracrRNA/crRNA duplex forms when bound to the DNA. The tracrRNA is a fixed structure that links Cas9, the catalytic element, to the crRNA, the element that confers specificity for the target site. The crRNA is a 20-nt-long RNA strand complementary to the target region. The only restriction of the system is that the target sequence must be followed immediately downstream by the 3-bp-long protospacer adjacent motif (PAM), 5′-NGG-3′ (72). The PAM is frequent along the DNA, so a crRNA can be designed against almost every DNA region. To facilitate its use, the crRNA and tracrRNA have been combined into a single molecule called single guide RNA (gRNA) (85). Thus, only two elements must be delivered into the cells, the Cas9 and the gRNA, instead of three. Although the matter is controversial, some studies have reported a higher off-target activity than with other nucleases (86). The CRISPR/Cas9 system is developing fast, and new Cas proteins that recognize different PAM sequences, are smaller, or are even able to generate staggered cuts have been described over the last year (87–89). Moreover, modifications of the Cas9 protein, as with ZFNs and TALEs, have allowed it not only to generate DSBs but also to function as a DNA activator, DNA repressor, or DNA tracer (90).
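The targeting rule described above, a 20-nucleotide target followed directly by a 5′-NGG-3′ PAM, lends itself to a simple sequence scan. The following Python sketch is a minimal illustration under stated assumptions: the function name `find_spcas9_sites` and the example sequence are hypothetical, and only the given strand is scanned, whereas practical guide design would also scan the reverse complement and evaluate potential off-target sites.

```python
import re

def find_spcas9_sites(seq, protospacer_len=20):
    """Return (start, protospacer, pam) for every target followed by an NGG PAM.

    Only the given strand is scanned (illustrative simplification).
    """
    seq = seq.upper()
    # A lookahead is used so that overlapping candidate sites are all reported.
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % protospacer_len)
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(seq)]

# Hypothetical 30-nt sequence used only to exercise the function.
example = "ATGCATGCATGCATGCATGCATGCAGGTTT"
for start, protospacer, pam in find_spcas9_sites(example):
    print(start, protospacer, pam)
```

Each reported protospacer corresponds to the 20-nt portion that the crRNA (or the gRNA) would be designed to match.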

#### **6. Delivery of gene editing tools to HSCs: the main challenge**

One of the most important steps to consider when performing a gene editing strategy is the delivery of the gene editing tools. In fact, nowadays this is probably the main bottleneck in gene therapy in general and definitely in the gene editing of HSCs (91).

The main objective of delivering the nucleases effectively into cells is to promote a hit-and-run activity of the enzyme: a short-term boost of expression that generates a specific DSB within a short period of time, avoiding the toxicity and off-target activity associated with long-term expression of the nuclease. When performing a gene addition/gene correction strategy, we should also consider the delivery of the DNA molecule carrying the desired sequences to be integrated into the cell genome, the so-called donor DNA molecule. Taking all these facts into account, the delivery of gene editing tools can be classified into two different groups: viral and nonviral transfection methods (**Figure 2**).

Concerning the nonviral transfection methods, the gene editing tools can be delivered as plasmid DNA, *in vitro* transcribed mRNA, or purified proteins. The delivery of plasmid DNA is a widely used standard technique (92,93), as this molecule is easily engineered and easy to handle. Both gene editing tools and donor sequences can be transfected in this way. The donor sequence can be a linear donor sequence with less than 50 bp of homology (94) or a single-stranded DNA oligonucleotide (95), especially when inducing mutations or corrections at the target site. Because both plasmid DNA and the cell surface are negatively charged, the uptake of DNA into the cells is restricted, and the delivery needs to be supported by complexing the DNA with chemicals such as calcium phosphate, cationic lipids, or cationic polymers (92) or by subjecting the cells to an electric field that opens pores in the nuclear and/or plasma membrane of the cells (electroporation). Another important factor to take into account is the target cell. In our case, the HSC is a hard-to-transfect cell. The only technique currently available to introduce these tools into this cell type is nucleofection. Nucleofection is an electroporation-based transfection method that enables the transfer of nucleic acids into difficult-to-transfect cells by applying a combination of electrical parameters with cell type-specific reagents. However, this process produces high cellular toxicity and even cell death, depending on the plasmid size and the total amount of transfected DNA. Moreover, the delivery of plasmid DNA could lead to its random insertion into the genome of treated cells (96). The risk of random plasmid integration mainly depends on the plasmid quality, but it is also influenced by specific properties of the targeted cells, including the prevalence of DSBs and the status of their DNA repair pathways. 
Plasmid integration, although an unusual event, represents an obstacle for gene targeting approaches, as mentioned above for the more conventional retroviral vector delivery systems.

Alternative platforms to deliver nucleases have been explored, such as the transfection of *in vitro*-transcribed mRNA (97,98). The main advantage is that mRNA establishes a short-term boost of enzyme activity with low toxicity and no risk of integration (99). Thus, *in vitro*-transcribed mRNAs that encode ZFNs, TALENs, or Cas9 and gRNAs are the preferred forms in which these nucleases are nucleofected into HSCs. Another possibility is to deliver nucleases as proteins; this approach presents the same advantages as mRNA delivery, as proteins also act in a hit-and-run fashion. Genetic fusion of recombinant proteins to positively supercharged moieties favors their uptake by cellular internalization mechanisms (100,101). However, the generation of high yields of soluble and active nucleases containing protein transduction domains (PTDs) is difficult (102). An alternative strategy consists of the chemical conjugation of nucleases to PTDs for receptor-mediated endocytosis (103). Interestingly, due to the positive charge of their Cys2-His2 zinc-finger motifs, ZFNs display an intrinsic cell-penetrating capacity, which can lead to targeted mutagenesis (102). Other delivery options under investigation include protein transfection procedures such as electroporation (104). This has been successfully applied in the case of CRISPR/Cas9 nucleases, where HSCs have been efficiently edited by the direct nucleofection of the Cas9 protein/gRNA ribonucleoprotein complex (105). Chemical transfection agents are also being investigated for the delivery of nucleases as proteins (106).

Apart from nucleofection, nonintegrating viral vectors, such as integrase-deficient LV vectors (IDLVs), adenovirus vectors (AdVs), and adeno-associated virus vectors (AAVs), also serve as a robust means of delivering the genes encoding programmable nucleases both *in vitro* and *in vivo* in cell types that are sensitive to transfection, such as HSCs (107). The choice of vector depends on the size of the transgene to deliver and on the capability of the virus to transduce quiescent or dividing cells.

AdVs are double-stranded DNA viruses capable of packaging up to 37 kb of DNA. However, the transduction capacity of these vectors is highly dependent on the cellular type. Owing to the large cargo size that can be accommodated in adenoviruses, they are an attractive platform for the delivery of programmable nucleases and donor DNA. They have been used in some gene targeting approaches to deliver nucleases in both fibroblasts and CD4+ T cells (108,109), but unfortunately they are not able to efficiently transduce HSCs.

AAVs are nonenveloped, single-stranded DNA viruses that replicate only in the presence of a helper virus, such as adenoviruses or herpes simplex viruses. Nowadays, they are the most widely used delivery system for *in vivo* gene therapy due to their nonpathogenic behavior and their capability to overcome the host's preexisting immunity. The major disadvantage of AAVs is their low packaging capacity, which is incompatible with some therapeutic gene sizes. ZFNs are the most compact nucleases, and sequences encoding ZFNs can be packaged into these vectors (110). In the case of the CRISPR/Cas9 tool, only one of the required elements can be packaged in a single AAV. The *Streptococcus pyogenes* Cas9 can be packaged into an AAV, but the gRNA sequence must be delivered in a different vector. An alternative that circumvents the limited packaging size of the AAV is the use of the *Staphylococcus aureus* miniCas9, which fits into an AAV together with the gRNA sequence. However, the proper definition of the target sites for this miniCas9 is still being improved (87).
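The packaging constraint can be made concrete with a rough size budget. In the Python sketch below, all sizes are approximate, illustrative assumptions that do not come from this chapter: an AAV cargo limit of about 4.7 kb, coding sequences of about 4.1 kb for *S. pyogenes* Cas9 and about 3.2 kb for *S. aureus* Cas9, and hypothetical allowances for a promoter plus polyadenylation signal and for a gRNA expression cassette. The function name is likewise hypothetical.

```python
AAV_CAPACITY_BP = 4700  # assumed approximate packaging limit

# Rough, illustrative sizes in bp (assumptions, not values from this chapter).
COMPONENT_BP = {
    "SpCas9": 4100,
    "SaCas9": 3200,
    "promoter_polyA": 500,
    "gRNA_cassette": 400,
}

def fits_in_aav(components, capacity_bp=AAV_CAPACITY_BP):
    """Return (total size in bp, whether the cargo fits within the capacity)."""
    total = sum(COMPONENT_BP[name] for name in components)
    return total, total <= capacity_bp

for cargo in (
    ["SpCas9", "promoter_polyA"],
    ["SpCas9", "promoter_polyA", "gRNA_cassette"],
    ["SaCas9", "promoter_polyA", "gRNA_cassette"],
):
    print(cargo, fits_in_aav(cargo))
```

With these assumed figures, the larger Cas9 leaves no room for the gRNA cassette in the same vector, while the smaller one does, which matches the qualitative picture given in the text.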

IDLVs are currently the preferred tool for transferring genes into HSCs, as they possess certain advantageous attributes compared to other vectors [reviewed in (38)]. For example, the preintegration complex of LV is actively translocated into the nucleus and thereby facilitates efficient transduction of a variety of nondividing cells. In contrast, other viral vectors depend on the dissolution of the nuclear membrane during mitosis to deliver their cargo into the target cell nucleus. Consequently, efficient transduction of HSCs can be achieved with IDLVs after a shorter incubation time *in vitro*, preserving to some extent the physiological nature of HSCs and their engraftment potential. IDLVs have successfully delivered ZFNs and homologous donor DNA (111). Unfortunately, IDLVs are incompatible with TALENs because highly homologous TALE repeats often lead to unwanted recombination in cells (112). Moreover, the production of IDLVs is still the major drawback. The high viral titers needed to efficiently transduce HSCs are difficult to obtain, limiting their potential use in many instances.

#### **7. Evidence of gene editing in HSCs and future attempts**

Since gene editing technology first came into use, its applicability to HSCs has been explored. Both blood disease modeling and new gene therapy approaches have arisen from the basics of gene editing. The development of genome editing technologies based on programmable nucleases has substantially improved the ability to make precise changes in the genomes of cells compared to small-molecule therapies. However, the advancement of gene editing in hematopoiesis cannot be understood without its combination with cell reprogramming technology, especially iPSCs. The correction of patient-specific iPSCs by gene editing has facilitated the spread of this technology, allowing tools and techniques to be optimized on this difficult but easily expandable cell type and then applied to HSCs, which are more difficult to culture and to transfect/transduce. In turn, the current clinical trials aimed at correcting monogenic blood diseases with viral vectors (γ-RV and LV) have contributed useful knowledge about these diseases, the different delivery systems, and their therapeutic potential (Table 2).

The first evidence of gene editing in human hematopoietic progenitors was described in 2002 by Dr. Davis's group in sickle cell disease (SCD) HSCs (113). SCD is characterized by a single-point mutation in the seventh codon of the β-globin gene, which means that the site-specific correction of the mutation would allow for the production of normal hematopoietic cells. By DNA microinjection, chimeric oligonucleotides designed to direct a site-specific nucleotide exchange in the human β-globin gene were introduced into CD34+ and Lin- CD38- cells (purified from umbilical cord blood from healthy donors). Conversion rates of 10% to 15% were confirmed in Lin- CD38- cells. These levels of correction are enough to correct the phenotype of the patients in this case, as 10% normal hematopoietic cells are sufficient to render patients free of SCD. However, human engraftment of edited HSCs in immunodeficient animals was not tested. The efficacy of gene editing in CD34+ cells was later improved by the use of sequence-specific endonucleases. Dr. Naldini's group was able to identify gene editing in up to 80% of hematopoietic progenitors after transducing cord blood CD34+ cells with an IDLV carrying a puromycin-resistance gene together with a ZFN targeting the CCR5 locus and selecting them with puromycin (114). The first therapeutic application of gene editing was described 7 years later by the same group. Human HSCs were gene edited at the *IL2RG* locus, which is mutated in SCID-X1 patients, by an IDLV delivering a donor template to perform a knock-in strategy in combination with the electroporation of ZFN mRNAs against the same locus (115). Since the donor templates carried the GFP reporter cassette alone or together with the therapeutic *IL2RG* cDNA, the targeted human hematopoietic cells could be followed by GFP expression. Shortly after transplantation, GFP could be detected in up to 95% of the immunodeficient mice transplanted with the gene-edited human progenitors. 
However, the most significant achievement was the detection of the targeted human hematopoietic cells in 46% of the transplanted mice in the long term, which meant that the most primitive HSCs, the LT-HSCs, had been targeted and which opens up a real potential clinical use of gene editing in HSCs. More importantly, IL2RG functionality was restored in T and NK cells derived from the hematopoietic progenitors edited at the *IL2RG* locus. This was the first demonstration of the potential clinical use of gene editing to treat monogenic blood diseases.

Although the use of large donors to correct patients' hematopoietic progenitors through knock-in or safe harbor approaches has yielded good preclinical data, its overall efficacy is still too low to be applied in clinical trials. Alternatively, the use of single-stranded oligonucleotides (ssODNs) for HR has also been explored for gene correction because of their high efficacy and low toxicity. As proof of concept, gene correction experiments using β-globin ssODNs as donors to correct β-thalassemia were performed. Intracellular delivery was achieved using biodegradable nanoparticles loaded with triplex-forming peptide nucleic acids (PNAs). Although only low levels of efficacy were reached (<1%), *in vivo* analysis showed engraftment of edited cells in NOD-SCID IL2rγ-null mice (116).

Recently, another approach was tested for the introduction of a mutation able to correct SCD (117). ZFNs were designed to specifically cleave at the β-globin locus with no relevant off-target effects and were introduced into CD34+ cells (isolated from umbilical cord or peripheral blood) by electroporation. The simultaneous delivery of a homologous donor template, as an IDLV or a DNA oligonucleotide, allowed high levels of gene modification (10–20%) and engraftment in immunocompromised NSG mice.

The delivery of nucleases using AAV vectors has also proved feasible for gene editing of human HSCs. Their low toxicity and relatively easy manufacture, together with the ability to achieve high recombination frequencies with small homology regions (200 bp), make them practical genetic instruments (118). Recent studies showed that AAV serotype 6 was efficient for the delivery of donor DNAs into human CD34+ hematopoietic progenitor stem cells (HPSCs) and can induce homology-directed repair (HDR) when used in combination with different nucleases (119). When AAV6 was used to deliver a GFP cassette flanked by arms homologous to the CCR5 or AAVS1 locus, followed by electroporation with ZFN mRNA, HDR-mediated insertion at the CCR5 or AAVS1 locus was found in 17% to 26% of cells. Studies in immunodeficient mice showed that modified cells persisted in secondary transplant recipient animals, meaning that the edited cells contained long-term HSCs.

Using a different gene editing approach, β-thalassemia correction has been achieved in a preclinical study without the need for a donor template (120). This clever approach was based on the knockout of the *BCL11A* gene, using ZFNs to generate INDELs in this gene. *BCL11A* regulates the transition from fetal globin (γ-globin) to adult globin (β-globin). Therefore, the disruption of the *BCL11A* gene was able to ameliorate β-thalassemia by increasing γ-globin and reducing mutated β-globin levels.

The transfer of programmable nucleases into HSCs has also been developed as an antiviral therapy, by deleting genes encoding receptors essential for viral entry into the host cells and thus preventing infection. The definitive cure of HIV or the achievement of efficient protection against this virus constitutes one of the biggest goals of modern medicine. Gene therapy can bring us closer to this purpose through the modification of *CCR5*, an HIV-1 coreceptor involved in the entry of viral particles into hematopoietic cells. The introduction by electroporation of specifically designed ZFNs against the *CCR5* gene attained 17% allele disruption. These ZFN-treated HPSCs were capable of engrafting in NOD-SCID IL2rγ-null mice and were able to produce a polyclonal multilineage progeny in which *CCR5* disruption was maintained (121). Moreover, when mice transplanted with ZFN-modified cells were challenged with HIV-1, the animals were able to maintain the levels of CD4+ T cells throughout their tissues and showed lower HIV-1 viral loads than control animals engrafted with nonedited HPSCs. Along the same lines, other strategies, such as the introduction of CCR5-specific ZFNs or CCR5-MegaTALEN mRNAs into HPSCs using adenoviral vectors and electroporation, respectively, have demonstrated the potential of this approach (118,122).

Genome editing in HSCs has also been used to model cancer generation and development. For example, CD34+ cells purified from human cord blood were capable of initiating leukemia in response to endogenously activated *MLL-AF9* or *MLL-ENL* oncogenes generated by TALEN-mediated knock-in (123,124). These cells displayed altered *in vitro* growth potential and induced acute leukemias following transplantation into immunocompromised mice at a mean latency of 16 weeks. Additionally, the phenotype, morphology, and molecular features of the induced leukemias were similar to those of patient leukemic blasts. Consequently, this strategy provides an experimental platform for prospective studies of mixed lineage leukemia (MLL). Similarly, the CRISPR/Cas9 system is also applicable to the generation of some cancer models. Translocations resembling those described in acute myeloid leukemia and Ewing's sarcoma (*RUNX1/ETO* and *EWS/FLI*, respectively) were generated in human cell lines and primary cells. The Cas9 nuclease and the specific gRNA were introduced as plasmid DNA by electroporation, reaching 4% efficiency in the generation of cancer-like translocations (125).

The application of gene editing to correct or study blood diseases is a growing field, where a rising number of publications exploring all the potential of this precise gene modification technology are being generated rapidly. We should be ready for the burst of gene editing in hematopoiesis.

#### **8. Gene-edited HSCs go to the clinic**

A notable number of phase I/II gene therapy clinical trials have reported evidence of efficacy and safety for the treatment of various genetic diseases of the blood, including primary immunodeficiencies, leukodystrophies, thalassemia, and hemophilia. However, some difficulties are acting as a bottleneck in the clinical application of gene editing tools. First, the delivery of genetic tools into hematopoietic progenitors must be improved. Second, an advance in culture conditions is needed, given that gene-corrected cells must be present in large enough quantities to revert the condition and transmit the modification to their progeny. To this end, some compounds capable of preserving the "stemness" of CD34+ HPSCs, such as SR-1 or UM171, are currently being explored, with promising results (126).

Up to now, only one clinical trial worldwide is open that applies gene editing to human HSCs. The trial is addressing the safety of the transplantation of hematopoietic progenitors, gene edited for *CCR5* disruption using specific ZFNs, into HIV-1-infected patients. These patients have undetectable virus but suboptimal CD4+ cell levels. The phase I trial is ongoing and will provide the first-in-human data on the use of gene editing tools (https://goo.gl/M0ZWoB).

#### **9. Alternative nonhematopoietic sources of HSCs and their editing**

Gene-edited autologous HSC transplantation (HSCT) approaches remain elusive because of the difficulty of culturing and expanding long-term repopulating HSCs *ex vivo*, as these protocols still require long periods of time to achieve the delivery of the correcting tools and the gene editing itself. An alternative therapeutic strategy is the use of patient-derived induced pluripotent stem cells (iPSCs) in which disease-causing mutations are corrected by gene targeting. The iPSC-based approach provides an unlimited source of subject-derived cells that bypasses the main limitation of HSC-based protocols, as iPSCs possess properties of self-renewal and pluripotency that are similar to those of embryonic stem cells (127–129). Moreover, the availability of HSCs is compromised in some situations, and in this context, easily accessible cell sources, such as skin fibroblasts (130), keratinocytes (131), or peripheral blood mononuclear cells (132), constitute an advantage for gene editing protocols. These corrected iPSCs could then be differentiated into the desired hematopoietic cell type for transplantation into patients to treat the disease (133). Some of the most relevant attempts following this strategy are described below.

Regarding erythroid lineage defects, efforts have focused on the β-globin alterations that cause β-thalassemia and SCD. Most studies have reprogrammed patient-derived cells by transduction of reprogramming factors, in the case of skin fibroblasts (134–137), or by nucleofection, in the case of amniotic fluid-derived cells (138,139). The gene editing strategy selected was a knock-in approach, and in the case of β-thalassemia patient-derived iPSCs, the delivery of TALENs (138,139) or CRISPR/Cas9 (135,139,140) by electroporation achieved specific gene targeting at the human hemoglobin subunit β (*HBB*) locus, with variable HR rates. More precisely, in a direct comparison of CRISPR/Cas9 and TALEN, the former induced DSBs with greater efficiency, but the latter mediated a higher homologous gene targeting efficiency (139). In all cases, the expression of *HBB* in colonies formed in clonogenic progenitor cell assays was comparable to that of the endogenous β-globin allele, but the main challenge was the complete erythroid differentiation of the edited iPSCs, as embryonic and fetal but not adult hemoglobin was expressed (135,139). Notably, Wang et al. reported that iPSCs reprogrammed from patient-derived fibroblasts, corrected without the assistance of nucleases, and then differentiated to hematopoietic progenitors could differentiate *in vivo* and produce human β-globin in irradiated SCID mice, although the efficacy of correction was lower than 1% (134).

In the case of SCD, patient-derived fibroblasts were reprogrammed to iPSCs and subsequently edited by nucleofection of ZFNs in a knock-in strategy. Recombination efficacy ranged from 1% to 10%, and as in the thalassemia studies, the vast majority of hemoglobin expression was of the fetal type (136,137).

TALEN-assisted gene editing, in a knock-in strategy, of iPSCs derived from peripheral blood mononuclear cells of pyruvate kinase deficiency (PKD) patients yielded an editing efficacy of approximately 10%. High numbers of erythroid cells with a normalized ATP level and normal metabolic function were obtained, providing an approach in which cells can undergo up to 2×10⁴-fold expansion to correct metabolic erythroid diseases (9).

In the case of primary immunodeficiencies, access to BM and thymocyte samples from untreated patients with SCID is challenging, as these conditions are rare and infants typically present with life-threatening infections and require urgent HSCT to survive (141). Different SCIDs have been addressed by gene editing of patient-derived iPSCs. Chang et al. demonstrated that locus-specific correction of the human JAK3 mutation by CRISPR/Cas9-enhanced gene targeting of iPSCs derived from skin keratinocytes of SCID patients restored the differentiation potential of early T-cell progenitors. The strategy used was knock-in, and efficacies ranged from 6% to 73% of the analyzed clones when using one gRNA and from 0 to 100% of G418-resistant clones when using two guides. Corrected progenitors were capable of producing NK cells and mature T-cell populations expressing a broad TCR repertoire as well as all hematopoietic lineages. However, after transplantation in NSG mice, differentiation to all hematopoietic lineages was not detectable (141).

Human mesenchymal cells from SCID-X1 patients were reprogrammed to iPSCs and afterwards corrected by TALEN-assisted HR in the *IL-2Rg* locus, also following a knock-in strategy, with an efficiency of 2.6% (142). The authors used a selection-free approach, as cells bearing a functional γ-chain show a positive selective advantage (143). The gene editing resulted in the rescue of T-cell precursors and mature NK cells after long-term differentiation *in vitro* (142). In ADA-SCID patients, the protocol was improved to perform reprogramming and gene targeting together in a one-step procedure that required only a single electroporation, in an attempt to avoid keeping cells in culture for several months, which is not compatible with the needs of patients for whom urgent medical intervention is imperative. This procedure yielded 5% corrected iPSCs from skin biopsies of ADA-SCID patients after CRISPR/Cas9-assisted gene editing using a single-stranded corrective oligonucleotide (ssODN) as the repair template, and the expression of corrected ADA mRNA was confirmed (144).

Some studies have addressed X-linked chronic granulomatous disease (X-CGD), in which the selected gene editing strategy was the directed insertion of therapeutic cDNA into the safe harbor locus *AAVS1*. The choice of the promoter driving expression of the therapeutic gene seems to be quite important in CGD, as inappropriate expression of the transgene in other cell types, such as the stem cell compartment, could lead to exacerbated reactive oxygen species (ROS) production and engraftment failure (43,145). Merling et al. developed a platform that combines the production of iPSCs derived from patients with the five genetic forms of CGD with ZFN-assisted targeting of the *AAVS1* safe harbor, resulting in neutrophils and macrophages differentiated from the corrected iPSCs with restored ROS production and antimicrobial function (146).

TALEN-assisted specific gene correction of the *RUNX1* gene in iPSCs derived from dermal fibroblasts from a patient with familial platelet disorder resulted in the restoration of megakaryopoiesis and subsequent maturation of the corrected cells (147). Even in a DNA repair deficiency syndrome such as Fanconi anemia, with defects in homology-directed DNA repair, the wild-type cDNA of *FANCA* was targeted to *AAVS1*, assisted by ZFNs with an efficiency of gene targeting up to 4%, rendering disease-free hematopoietic progenitors (108).

Direct reprogramming, meaning the generation of hematopoietic cells or HSCs directly from nonhematopoietic cells, has also been attempted, with only modest or even irreproducible results (148,149). No data are available on the combination of gene correction with hematopoietic direct reprogramming.

Taking all these data into account, an iPSC-based approach would provide an unlimited source of patient-derived corrected cells from which hematopoietic cells could be derived continuously, and this approach could complement existing treatments, for example, through infusions of *in vitro*-derived autologous T cells to stabilize patients after HSCT (150). However, inadequate differentiation into the desired cell type (the generation of true HSCs being the major barrier) and poor engraftment of corrected cells remain the main pitfalls for the translation of this technology to the clinic.

**Figure 2.** Potential strategies to follow in an HSC gene editing clinical protocol.

#### **10. Overall remarks**

Although conventional gene therapy is becoming an alternative therapeutic option, in some instances being considered before allogeneic BM transplantation, the development of new tools and strategies to overcome the potential risks associated with current ones remains an established challenge in the gene therapy field. Moreover, the precise correction of the mutation or mutations present in the patient's cells is the primary objective of any researcher committed to gene therapy. In this respect, gene editing is clearly an option to be explored and optimized. The gene editing field is constantly growing and changing. In the future, both hereditary and acquired diseases may benefit from gene editing strategies coupled with hematopoietic cell transplantation. As conventional gene therapy of monogenic disease is the first and most successful clinical therapy involving genetic modification, the partnership between gene editing and hematopoietic progenitor transplantation will be one of the most widely applicable and effective gene therapies for these types of hematopoietic diseases. Without any doubt, this marks the beginning of a new era in cell and gene therapy and brings us closer to the aim of accurate site-specific gene editing in patients.



**Table 2.** Present results in the gene editing of hematopoietic stem cells.

#### **Author details**

Sergio López-Manzaneda, Sara Fañanas-Baquero, Virginia Nieto-Romero, Francisco-Jose Roman-Rodríguez, Maria Fernandez-Garcia, Maria J. Pino-Barrio, Fatima Rodriguez-Fornes, Begoña Diez-Cabezas, Maria Garcia-Bravo, Susana Navarro, Oscar Quintana-Bustamante and Jose C. Segovia

Hematopoietic Innovative Therapies Division, Centro de Investigaciones Energéticas, Medioambientales y Tecnológicas (CIEMAT) – Centro de Investigación Biomédica en Red de Enfermedades Raras (CIBERER). Madrid, Spain

Advanced Therapies Unit, Instituto de Investigación Sanitaria-Fundación Jiménez Díaz (IIS-FJD, UAM), Madrid, Spain

#### **References**


kinase deficiency patient-specific induced pluripotent stem cells. Stem Cell Rep. 2015;5:1053–66.


## **Insulin Gene Therapy for Type 1 Diabetes Mellitus: Unique Challenges Require Innovative Solutions**

Andrew M Handorf, Hans W Sollinger and Tausif Alam

Additional information is available at the end of the chapter

http://dx.doi.org/10.5772/62657

#### **Abstract**

Type 1 diabetes mellitus (T1DM) is a disease characterized by chronically elevated blood glucose levels that results from the autoimmune destruction of the insulin-producing β cells of the pancreas. While treatment options exist, they all possess serious limitations. Insulin gene therapy provides a promising alternative aimed at replacing insulin production in native non-β cells. For insulin gene therapy applications to be successful in treating T1DM, a glucose-sensitive organ must be targeted for insulin expression, insulin production must be responsive to ever-changing blood glucose levels, and insulin expression must persist long term. In addition, the amount of insulin production is critical, as too little insulin would lead to poor glucose regulation and too much insulin would induce hypoglycemia, a potentially life-threatening state. Together, insulin gene therapy provides challenges that are absent with other gene therapy applications. In this chapter, we examine the challenges of insulin gene therapy and discuss how the two key components of insulin gene therapy—the insulin expression cassette and the delivery vehicle—can be tailored for the successful treatment of T1DM.

**Keywords:** gene therapy, type 1 diabetes mellitus, lentivirus, insulin, β cells

#### **1. Introduction**

Type 1 diabetes mellitus (T1DM) is an autoimmune disorder whereby the β cells of the pancreas are destroyed. Under physiological conditions, β cells synthesize and secrete insulin in response to changes in blood glucose levels. Insulin, in turn, acts on other cells to promote glucose uptake from the blood and thus lower blood glucose levels to normal. Without sufficient numbers of functional β cells, insulin production becomes inadequate and unable to promptly restore normal blood glucose levels. Over time, chronically elevated blood glucose levels, termed hyperglycemia, cause numerous secondary complications, ultimately leading to widespread tissue and organ damage, as well as increased mortality. According to the National Diabetes Statistics Report, 29.1 million people, or 9.3% of the US population, have diabetes [1], with T1DM accounting for roughly 5% of all diagnosed cases in the adult population. The total economic cost of diabetes is estimated to be \$245 billion per year, with T1DM costing an estimated \$8–14 billion per year. Thus, diabetes poses a significant burden to our society. Unfortunately, no cure exists for T1DM.

© 2016 The Author(s). Licensee InTech. This chapter is distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

#### **1.1. Current treatment options**

While there is currently no cure, several therapies exist to better control blood glucose levels. The most common form of therapy involves use of synthetic insulin, which typically requires multiple insulin injections per day. Unfortunately, this option is cumbersome and unable to restore perfect glucose control. As a result, exogenous insulin therapy delays the onset and reduces the severity of secondary complications but is unable to prevent them [2]. In addition, exogenous insulin therapy can cause hypoglycemia, a potentially life-threatening condition characterized by dangerously low blood glucose levels. The insufficiencies of exogenous insulin therapy arise from the fact that most insulin regimens are unable to mimic normal β cell secretion in response to continually fluctuating blood glucose levels. In an effort to improve upon exogenous insulin therapy and better imitate normal physiology, specialized insulin pumps, dubbed artificial pancreata, have been developed to deliver insulin when needed by continuously monitoring blood glucose levels [3, 4]. Artificial pancreata are able to improve glycemic control, but the implanted glucose sensors inevitably accumulate serum proteins that can compromise the accuracy of glucose measurements and consequently affect the precision of insulin delivery, limiting their long-term effectiveness. Ultimately, exogenous insulin therapies provide a suboptimal therapy for T1DM. Conversely, tighter glucose control can be attained through whole-pancreas transplantation [5], but this therapy is severely hampered by a shortage of donor organs and further complicated by the need for lifelong immunosuppression. Transplantation of pancreatic islets was anticipated to minimize the impact of donor shortage, as islets from one donor could be expanded *ex vivo* to a quantity sufficient for multiple recipients [6], but successes equivalent to those observed with whole-pancreas transplantation have yet to be obtained [7]. Hence, there is clearly a need for effective and broadly applicable treatment options for T1DM.

Gene therapy-based treatment options have emerged as a promising alternative. Gene therapy refers to any technique aimed at using genes to treat or prevent diseases, whether it be through delivery of a functional gene to replace a defective one or through knockdown of dysfunctional genes using gene silencing technologies. For T1DM, three primary gene therapy approaches have been explored to prevent or treat the disease. First, researchers have attempted to prevent the autoimmune destruction of β cells by modifying the immune system. Second, researchers have attempted to generate surrogate β cells from native non-β cells to replace the function of those cells lost from autoimmune destruction. Third, researchers have taken a more straightforward approach and simply attempted to replace the primary function of β cells, insulin production, to treat T1DM. Not surprisingly, each of these approaches has its own advantages and disadvantages, which we will outline in the following sections.

#### **1.2. Prevention of autoimmune β cell destruction**

The most logical therapy for T1DM would be to take preemptive measures and prevent the development of autoimmunity that causes β cell destruction in susceptible individuals. Not surprisingly, many researchers have pursued this strategy as a potential alternative to current treatment options, and to do so, various genetic modifications to both β cells and immune cells have been tested in experimental models of T1DM. First, researchers have sought to induce immune tolerance so that β cell antigens are no longer recognized as foreign. French and colleagues sought to induce immune tolerance by driving intrathymic expression of proinsulin under the control of the major histocompatibility complex (MHC) class II promoter and found that it prevented the onset of T1DM. Similarly, Tian and colleagues transduced autologous bone marrow hematopoietic stem cells of non-obese diabetic (NOD) mice with diabetes-resistant MHC class II I-Aβ chain molecules to examine whether this could prevent the development of diabetes. Indeed, they found that expression of this diabetes-resistant allele prevented the development of autoreactive T cells by intrathymic deletion and protected the mice from developing insulitis and diabetes [8]. Second, groups have aimed to modulate the immune system through cytokine-based approaches. For example, gene transfer of transforming growth factor-β and calcitonin gene-related peptide has been shown to prevent the onset of diabetes in NOD mice [9, 10]. Lastly, groups have attempted to modify residual β cells so that they can resist autoimmune destruction, an event that generally occurs through apoptosis. Liu *et al.* overexpressed the antiapoptotic gene bcl2 in β cells to increase the survival of β cells without affecting their function [11].

Although these preventative options have shown promise, they are hampered by several limitations: (1) These strategies rely on the early detection of diabetes. This is difficult because individuals often do not become symptomatic until they have already lost greater than 80% of their β cells. Thus, efforts to protect the remaining β cells would still leave the patient hyperglycemic. (2) T1DM is a multifactorial disease, making it nearly impossible to accurately predict who from the general population will succumb to the disease [12]. Thus, early intervention can be risky and perhaps even accelerate the progression of the disease. (3) The immune system is highly evolved and its complexities are not well understood. Further, innumerable functional redundancies exist that allow the immune system to compensate for the loss of any single factor or pathway. At our current level of understanding, it seems unlikely that selectively targeting a specific component of the immune system could prevent the autoimmune destruction of remaining β cells. As a result, other gene therapy strategies have been explored.

#### **1.3. Reprogramming non-β cells into β cells**

The goal of reprogramming non-β cells into surrogate β cells is to create replacement cells that are as similar to native β cells as possible, including the ability to not only synthesize insulin but also store it within secretory granules and secrete it instantaneously upon elevations in blood glucose levels. The most common way that researchers have done so is by overexpressing β cell-specific transcription factors in non-β cells. Transcription factors are DNA-binding proteins that modulate the rate of transcription of specific genes in a cell type-specific manner. During development, transcription factors play a critical role in executing differentiation programs by driving the expression of cell type-specific genes, and during adulthood, transcription factors are important for maintaining the differentiated status of somatic cells. For β cells, the transcription factors PDX1, NeuroD1, Neurog3, Pax4, Pax6, Nkx6.1, Nkx2.2, and MafA are among the many transcription factors ultimately responsible for directing and/or maintaining the β cell fate.

With this knowledge in mind, researchers have attempted to overexpress these transcription factors in a variety of cell types with the hope of reprogramming them into β cells, including pancreatic exocrine cells [13, 14], keratinocytes [15], hepatic oval stem cells [16], adipose-derived stem cells [17], and hepatocytes. Of these, hepatocytes have been most commonly targeted because they are closely related developmentally to β cells and easy to target. PDX1, which regulates early pancreas morphogenesis during development and controls glucose-dependent insulin expression in β cells, has been shown to be indispensable for the conversion of non-β cells into β cells. For example, Ferber *et al*. expressed PDX1 in the livers of diabetic mice and observed insulin expression and secretion, as well as restoration of normoglycemia [18]. Expression of NeuroD1 has also been shown to be a potent inducer of insulin expression in both primary duct cells and hepatocytes [19, 20]. However, ectopic expression of Neurog3 or Nkx6.1 alone, although both are also associated with β cell development, was unable to generate surrogate β cells. For example, despite the ability of Neurog3 to activate the persistent expression of NeuroD1, the use of Neurog3 in β cell engineering is not sufficient to generate surrogate β cells [21–25]. However, co-delivery of Neurog3 with PDX1 and MafA was successful in converting pancreatic exocrine cells into surrogate β cells [25]. Similarly, Gefen-Halevi *et al*. overexpressed Nkx6.1 in liver cells but found that it alone was unable to induce insulin expression. However, when co-expressed with PDX1, it promoted reprogramming of liver cells to β cells capable of glucose-stimulated insulin secretion [26]. Together, these studies validate the utility of transcription factor-mediated production of surrogate β cells in animal models and underscore its potential for treatment of T1DM.

Despite these positive findings, the long-term success of reprogramming strategies will rely on the ability of newly formed β cells to evade preexisting autoimmunity. This is particularly relevant given the abundance and diversity of previously identified autoantigens involved in the autoimmune destruction of native β cells [99]. It is expected that the more similar a surrogate β cell is to a native β cell, the more likely it will be targeted by the host's immune system and destroyed. For instance, while Ferber *et al*. found that hepatic expression of PDX1 was able to correct diabetic hyperglycemia, they also found that the mice developed hepatitis and were prone to autoimmunity against the newly formed β cells [18]. While other studies in NOD mice, a mouse model of autoimmune diabetes, provide hope that autoimmunity could be avoided through reprogramming strategies [27], these studies must be carried out for longer periods of time to assess true efficacy. Ultimately, it is likely that these strategies will require either lifelong immunosuppression or selective immunomodulation to ensure the survival of the surrogate β cells.

#### **1.4. Insulin gene therapy**

Given the autoimmune nature of T1DM, it may actually be unfavorable to produce surrogate cells that closely resemble native β cells. Since the primary function of β cells is to synthesize and secrete insulin, many groups have taken a humbler approach and simply aimed to restore insulin production in non-β cells, a field known as insulin gene therapy. Although insulin is also an autoantigen known to result in native β cell destruction [28], the theoretical risk of recurring autoimmunity is greatly reduced. Of course, it should be emphasized that β cells are extremely precise in their ability to secrete insulin upon demand, so simply expressing insulin in a non-β cell would likely not be an effective means of treating T1DM. For instance, an individual would not want to be producing high levels of insulin at all times because this would cause hypoglycemic episodes. Thus, to understand whether insulin gene therapy is indeed a viable option to treat T1DM, a clear understanding of the intricacies and aptitude of β cells must first be attained. In the following sections, we will outline what makes a β cell so adept at controlling glycemia and cover what criteria must be met for insulin gene therapy approaches to be successful. We will then present an overview of the field of insulin gene therapy to treat T1DM, with an emphasis on the unique challenges not found in other gene therapy applications. More specifically, we will go into detail about the choice of cell type to target *in vivo*, as most cell types are limited in their ability to adequately sense changing blood glucose levels. We will then discuss key features used within insulin expression cassettes to meet the specific needs of the field, with a particular emphasis on glucose-inducible response elements (GIREs), before discussing the advantages and disadvantages of various viral vectors as they pertain to the field of insulin gene therapy. Lastly, we will present future directions necessary to make insulin gene therapy a clinical reality.

#### **2. Insulin gene therapy**

#### **2.1. Features of the β cell**

In order to determine whether insulin gene therapy possesses the necessary elegance to be a viable treatment option, it is important to understand the key features of β cells that make them so adept at controlling glycemia and the minimum requirements that absolutely must be met to replace their function. Native β cells possess several important features that together allow them to precisely control blood glucose levels. Specifically, the β cell has the ability to regulate insulin production at the transcriptional, post-transcriptional, translational, and post-translational levels, as well as the ability to store and secrete insulin in a highly regulated fashion in response to glucose. While no individual feature is sufficient in itself to control glycemia, nor specific to β cells, the combination of features makes it a remarkably competent cell for its task.

First and most importantly, β cells have the ability to sense and quickly respond to small changes in circulating glucose levels over a broad range of physiological concentrations (2–20 mM) through concentration-dependent entry and metabolism of glucose. They do so through the activity of glucose transporter-2 (GLUT2) and glucokinase. GLUT2 is a transmembrane protein that enables glucose transport across cell membranes, whereas glucokinase is an enzyme that phosphorylates glucose to initiate its intracellular metabolism. GLUT2 and glucokinase have been dubbed the "glucose sensors" of β cells because they enable β cells to sense glucose over a very broad range of concentrations. They are able to do so due to their high Km for glucose (~17 and 8 mM, respectively), which allows their activity to vary substantially based on glucose availability [29].
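The link between a high Km and a wide sensing range can be illustrated with simple Michaelis-Menten kinetics. The sketch below is an approximation only: it assumes hyperbolic kinetics for both proteins (glucokinase is known to show mildly cooperative behavior, which is ignored here), uses the Km values quoted above, and adds, purely for contrast, a hypothetical high-affinity enzyme with an assumed Km of 0.1 mM. All function and variable names are illustrative.

```python
def fractional_rate(glucose_mm: float, km_mm: float) -> float:
    """Michaelis-Menten fractional velocity, v/Vmax = S / (Km + S)."""
    if glucose_mm < 0 or km_mm <= 0:
        raise ValueError("glucose must be >= 0 and Km must be > 0")
    return glucose_mm / (km_mm + glucose_mm)


# Km values in mM. GLUT2 and glucokinase are taken from the text; the third
# entry is an ASSUMED value for a hypothetical high-affinity enzyme.
KM_MM = {
    "GLUT2": 17.0,
    "glucokinase": 8.0,
    "low-Km enzyme (assumed)": 0.1,
}


def response_table(glucose_levels_mm=(2.0, 5.0, 10.0, 20.0)):
    """Return {protein: [v/Vmax at each glucose concentration]}."""
    return {
        name: [fractional_rate(s, km) for s in glucose_levels_mm]
        for name, km in KM_MM.items()
    }


if __name__ == "__main__":
    for name, rates in response_table().items():
        print(f"{name:>24}: " + "  ".join(f"{r:.2f}" for r in rates))
```

Under these assumptions, glucokinase activity rises from about 20% to about 71% of its maximum between 2 and 20 mM glucose, whereas the hypothetical low-Km enzyme is already above 95% of its maximum at 2 mM and therefore cannot report further increases in glucose.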

β cells also have the ability to secrete insulin in a precisely regulated fashion in response to elevations in blood glucose levels. β cells do so through their capacity to (1) synthesize and store large quantities of insulin within secretory vesicles and (2) generate secondary stimulus-secretion coupling signals to activate nearly instantaneous insulin vesicle exocytosis. Remarkably, β cells can secrete substantial quantities of insulin within a minute after exposure to elevated glucose. This is a consequence of their metabolic capabilities; β cells possess a unique combination of metabolic enzymes that ultimately leads to the generation of signals capable of altering insulin secretion in response to glucose metabolism. Namely, β cells, but not other mammalian cell types, have negligible lactate dehydrogenase activity while displaying increased pyruvate carboxylase activity, which directs pyruvate almost entirely toward mitochondria for subsequent metabolism via the tricarboxylic acid cycle and oxidative phosphorylation [30]. In so doing, β cells increase their ATP/ADP ratio and close ATP-sensitive K+ channels, a key stimulus leading to insulin vesicle exocytosis.

Furthermore, β cells have remarkable control of insulin biosynthesis at the transcriptional, post-transcriptional, translational, and post-translational levels, with each level of control being regulated by glucose availability. At the transcriptional level, increased glucose levels lead to upregulated insulin expression through the enhanced activity of the transcription factors PDX1 and MafA [31, 32]. At the post-transcriptional and translational levels, β cells have glucose-responsive mechanisms to modulate insulin mRNA stability and the rate of translation, respectively. This glucose-dependent regulation is primarily governed by an RNA-binding protein known as polypyrimidine tract-binding protein (PTB). Association of PTB with a pyrimidine-rich stretch in the 3′ untranslated region (UTR) of the preproinsulin mRNA has been shown to be responsible for glucose-dependent changes in its stability and rate of translation [33, 34]. In fact, the half-life of preproinsulin mRNA has been shown to increase nearly 3-fold as a result of PTB association [35, 36]. At the post-translational level, β cells possess specific enzymes that allow them to process proinsulin, a precursor form, into fully active insulin. Proinsulin conversion involves cleavage at two pairs of basic amino acids to release the C-peptide and is mediated by the β cell endoproteases PC1/3 and PC2 and the exopeptidase carboxypeptidase-H [37, 38]. Altogether, β cells have a variety of glucose-dependent mechanisms to control insulin output.

β cells further refine insulin biosynthesis and secretion in response to other circulating metabolites. First, the gut produces the peptide hormones glucose-dependent insulinotropic polypeptide (GIP) and glucagon-like peptide-1, which bind specific receptors found predominantly on β cells to bolster insulin production [39]. Second, specific amino acids and free fatty acids can serve as insulin secretagogues. While some amino acids and free fatty acids can actively promote insulin secretion, most of them simply have a role in amplifying the stimulatory effects of glucose [40]. Third, metabolic stress can induce neuronal signals that influence insulin output [41]. Together, these inputs regulate insulin production and secretion from β cells to more precisely maintain glucose levels within a normal range.

#### **2.2. Requirements of insulin gene therapy**

Combining all the remarkable features of the β cell, the end product is a cell with impressive glucose sensitivity, the ability to control insulin biosynthesis at several levels in a glucose-dependent fashion, and the ability to store, process, and secrete insulin almost instantaneously in response to glucose. The β cell is truly impressive. With this knowledge in mind, it would be easy to argue that insulin gene therapy lacks the sophistication necessary to adequately treat T1DM, especially given its relative simplicity. However, it is important to keep in mind that the most commonly used treatment for T1DM currently involves repeated injections of synthetic insulin. While this treatment option holds very little sophistication or biomimicry, it has still proven effective enough to remain a viable option since it was first employed as a medication in 1922. With that being said, any treatment that could provide better glycemic control while averting the cumbersome nature of synthetic insulin therapy would be a noteworthy improvement. So, beyond the ability to produce insulin alone (a feat that could likely be achieved in any cell type using a strong constitutive promoter), what other features of β cells are absolutely critical for the success of insulin gene therapy applications?

Given that glucose is the primary stimulus leading to the production of insulin and, in turn, to its own clearance, the impressive glucose sensitivity conferred by GLUT2 and glucokinase is a necessary feature that must be present in an insulin-producing surrogate β cell. Without these proteins, a surrogate β cell would not be able to precisely sense glucose concentration and control insulin output over a broad range of circulating glucose concentrations. In addition to glucose sensitivity, it is also important that the insulin-producing surrogate cell has the ability to respond to changing blood glucose levels by modulating insulin output. In so doing, insulin output could be adjusted in a physiologically relevant manner to better control glycemia. It is likewise important for an insulin-producing cell to have the ability to process proinsulin into insulin, given that proinsulin has less than 10% of the biological activity of fully processed insulin [42]. Lastly, it would be ideal for an insulin-producing cell to have the ability to package, store, and secrete insulin almost instantaneously in response to elevated blood glucose levels. However, it is debatable whether this last feature is indeed critical to the success of insulin gene therapy. Diabetes is such a devastating disease because of the secondary complications that arise as a consequence of sustained hyperglycemia, not repeated episodes of transient hyperglycemia. Thus, as long as the insulin-producing surrogate cell has the ability to produce sufficient quantities of insulin within a reasonable time frame in a glucose-responsive fashion, the difference between near-instantaneous insulin secretion and secretion with a few minutes' delay may be less critical. The quantity of insulin secreted must simply be large enough to correct hyperglycemia, but not so large as to cause hypoglycemia.

Thus, a long-lasting treatment for T1DM using insulin gene therapy could be achieved, but several criteria must be considered. First, the appropriate cells must be targeted for insulin production. At a minimum, these cells would need to express the glucose sensors, GLUT2 and glucokinase. Second, insulin transgene expression must be responsive to fluctuating blood glucose levels, being upregulated during hyperglycemia and downregulated during euglycemia. There are a variety of mechanisms available to endow an insulin-producing surrogate β cell with this ability. Third, there must be some mechanism in place for the target cell to process proinsulin into mature insulin. Lastly, an appropriate gene correction tool must be utilized to safely and effectively drive long-term insulin expression.

While meeting these criteria poses several challenges, treating T1DM through insulin gene therapy presents additional challenges not associated with other gene therapy strategies. First, insulin has a relatively short half-life compared to many other proteins being used for gene therapy applications. For instance, the circulating half-life of coagulation factor IX, which is deficient in hemophilia B patients, is estimated to be around 18–25 hours [43], whereas the circulating half-life of insulin has been estimated to be around 4–6 minutes. To compensate for the short circulating half-life of insulin, it is necessary to produce large amounts of insulin either by developing a highly effective gene expression system or by transducing a larger number of cells than in other gene therapy applications. This makes the choice of gene delivery system a critical one, as the delivery vehicle must be produced in great abundance and transduce cells efficiently *in vivo*. Second, basal insulin production must be kept low during fasting periods and upregulated only when blood glucose levels become elevated. Other gene therapy applications need only to deliver the therapeutic protein of interest constitutively at low levels to correct the clinical manifestation, owing to greater protein stability and the particular function of the protein. If insulin were expressed constitutively at low levels to satisfy basal metabolic activity, there would be long periods of postprandial hyperglycemia, and if the level of constitutive insulin production were increased to effectively control postprandial glucose levels, there would be a very high possibility of hypoglycemia during fasting periods. Thus, insulin expression cannot be driven by a strong constitutive promoter; it instead must be responsive to fluctuating blood glucose levels. This makes the design of the insulin expression cassette extremely important. It also creates a much narrower therapeutic window for dosing than in other gene therapy applications. In the following sections, we will discuss these factors in depth.
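The half-life argument above can be made concrete with a rough calculation. The sketch below is illustrative only: it assumes simple first-order clearance at steady state and an equal target molar concentration and distribution volume for both proteins, none of which is stated in the text, and the function name is hypothetical.

```python
import math


def relative_production_rate(half_life_min: float, reference_half_life_min: float) -> float:
    """Fold-change in steady-state production needed relative to a reference protein.

    Assumes first-order elimination (k = ln 2 / t_half), so the production
    rate required to hold a fixed concentration scales with k.
    """
    k = math.log(2) / half_life_min
    k_ref = math.log(2) / reference_half_life_min
    return k / k_ref


# Half-lives quoted in the text, converted to minutes.
insulin_min = (4.0, 6.0)
factor_ix_min = (18.0 * 60, 25.0 * 60)

low = relative_production_rate(insulin_min[1], factor_ix_min[0])   # 6 min vs 18 h
high = relative_production_rate(insulin_min[0], factor_ix_min[1])  # 4 min vs 25 h
print(f"insulin needs roughly {low:.0f}- to {high:.0f}-fold faster production")
```

Under these simplifying assumptions, the required production rate is simply the inverse ratio of the two half-lives, which is why a protein cleared in minutes demands far more expressing cells or a far stronger cassette than one cleared over a day.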

#### **2.3. Target cells for insulin gene therapy**

At a minimum, the target cells chosen for insulin gene therapy would need to express GLUT2 and glucokinase and have innate mechanisms for glucose-responsiveness, thus giving them the ability to sense and respond to continually fluctuating blood glucose levels. Without them, insulin secretion from surrogate β cells would be far less precise. Besides β cells, the only cells that express both GLUT2 and glucokinase are hepatocytes and cells of the hypothalamus and small intestine. Thus, these cells serve as a nice starting point for targeting of insulin transgene expression. It should be noted, however, that a variety of other cells have been targeted for insulin expression, including skeletal myocytes, fibroblasts, and mesenchymal stem cells. An interesting example is the targeting of skeletal myocytes. Skeletal myocytes do not express GLUT2 or glucokinase. Instead, they express GLUT4 and hexokinase, which each have a higher affinity (i.e. lower Km) for glucose. To endow skeletal myocytes with enhanced glucose sensitivity, Callejas *et al*. co-expressed insulin with glucokinase. Whereas insulin alone was insufficient to adequately treat T1DM in dogs using skeletal myocytes as surrogate β cells, coexpression of glucokinase with insulin was able to normalize fasting hyperglycemia and accelerate glucose disposal after oral challenge [44]. These findings emphasize the importance of glucose sensitivity in treatment of T1DM.

Intestinal K cells provide a particularly promising cell type for insulin gene therapy applications because they not only express GLUT2 and glucokinase, but they also possess the proinsulin processing enzymes and have the machinery for regulated insulin secretion. Cheung and colleagues exploited these unique advantages to generate insulin-producing surrogate β cells from K cells. To do so, they generated transgenic mice expressing human insulin under control of the GIP promoter, a K cell-specific promoter believed to be regulated by glucose [45]. They found that the GIP promoter was able to target insulin expression to K cells specifically, and the transgenic expression of insulin was effective at promoting normal fasting glucose levels and efficient glucose clearance in response to an oral glucose challenge for up to three months after mice were rendered diabetic with streptozotocin (STZ). However, while their findings hold promise, the translation of this strategy to the clinic will rely on their ability to address the following concerns: (1) The gut is one of the primary hubs for the immune system, and given that insulin itself is an autoantigen responsible for native β cell destruction [46], intestinal K cells may be particularly susceptible to recurring autoimmune attack; long-term protection from autoimmunity must be demonstrated. (2) More importantly, the GIP promoter is not only regulated by glucose intake but also by other sources, most notably fats. Thus, patients receiving this treatment would, at the very least, need to maintain a very strict diet to avoid potentially fatal consequences, like hypoglycemia. Further, another study found that glucose alone does not even regulate insulin secretion when controlled by the GIP promoter [47]. Regulation of the GIP promoter must be more thoroughly examined before moving toward human clinical trials.

The most commonly chosen target cells for insulin gene therapy applications, and the ones we have chosen, are hepatocytes. Although hepatocytes do not have the machinery to store insulin within secretory vesicles and secrete it in a regulated fashion, they express GLUT2 and glucokinase and possess a robust capacity to synthesize and secrete proteins constitutively. In addition, hepatocytes are attractive targets for insulin expression because they (1) are closely related to β cells developmentally, (2) play a very important role in glucose homeostasis, and (3) are relatively easy to target. As a result, the liver has been the most commonly targeted organ for *in vivo* production of insulin-producing surrogate β cells and will be the focus for the remainder of this chapter.

#### **2.4. Expression cassette design**

After choosing an appropriate cell type for insulin production, there are several considerations that must be taken into account when designing the insulin expression cassette. Perhaps the first decision that must be made is which promoter to use to drive insulin expression. One of the most commonly used promoters within the field of gene therapy is the cytomegalovirus (CMV) promoter. The CMV promoter is a viral promoter from the human cytomegalovirus that drives strong, constitutive transgene expression in mammalian cells. While the CMV promoter has been used quite frequently to drive insulin expression for treatment of T1DM, there is one fundamental reason why these studies could never be translated to the clinic: Insulin expression must be responsive to glucose, being upregulated when blood glucose levels rise and downregulated to low levels during fasting periods. The CMV promoter would drive consistent, high-level expression of insulin even during fasting periods, which would ultimately cause blood glucose levels to fall dangerously low. Furthermore, even if regulatory elements were added to the expression cassette to endow glucose-responsiveness to insulin expression, the CMV promoter is so strong that it would override these elements. Thus, a weaker promoter must be used to maintain low levels of insulin production during fasting periods if insulin gene therapy is to be successful in treating T1DM.

Weaker tissue-specific promoters have been employed for hepatic insulin gene therapy applications to not only reduce the potential for hypoglycemia but also to improve targeting to the tissue of choice. For instance, several groups have used liver-specific promoters that activate insulin transgene expression in hepatocytes but remain inactive in other cell types. Interestingly, some liver-specific promoters are inherently glucose-responsive, making them a great choice for insulin gene therapy applications. For instance, Chen *et al*. used the glucose-6-phosphatase (G6Pase) promoter to drive insulin expression and found that elevated glucose concentrations enhanced promoter activity [48]. They also found that insulin strongly inhibited G6Pase promoter activity under low glucose conditions, creating a system with feedback inhibition [49]. The group then delivered the insulin gene to the liver of STZ-induced diabetic rats under the control of the G6Pase promoter and found that *ad libitum* hyperglycemia was significantly reduced, glucose utilization was accelerated after glucose challenge, and fasting glucose levels were within a normal range without hypoglycemia. Similarly, Burkhardt *et al*. used the liver-specific GLUT2 promoter to drive insulin gene expression in a glucose-inducible but insulin-repressive fashion and found an improvement in diabetic hyperglycemia [50]. However, it is worth noting that the activity of the wild-type insulin promoter used by native β cells is actually enhanced by insulin, creating a feed-forward system to amplify insulin expression.

To generate a more physiologically mimetic system driving insulin expression, Hsu and coworkers used the rat insulin-1 promoter, creating a system that is activated by both glucose and insulin, similar to native β cells [51]. In so doing, they were successful at driving insulin secretion from Huh7 hepatoma cells *in vitro* in response to glucose. They were also able to augment insulin expression *in vivo* in response to glucose and theophylline, a phosphodiesterase inhibitor that raises intracellular cAMP, and ameliorate hyperglycemia in STZ-induced diabetic mice. However, they did not test whether insulin activated transgene expression using this promoter, and it is unclear how this promoter was active in hepatocytes, which do not ordinarily express the β cell-specific transcription factors necessary to upregulate insulin expression. Regardless, hepatocytes do not possess enough of the β cell-specific regulatory mechanisms to safely express insulin in a feed-forward manner with respect to insulin. Thus, for the sake of hepatic insulin gene therapy, it would be simpler to create a system that is unresponsive to insulin altogether.

Even if a liver-specific promoter does not possess glucose responsiveness, it can be endowed with sensitivity to glucose through incorporation of GIREs. GIREs are glucose-responsive DNA sequences found in the promoter region of several genes encoding lipogenic enzymes, like L-pyruvate kinase (L-PK), S14, fatty acid synthase, and acetyl-CoA carboxylase [52]. GIREs are composed of two 6-bp motifs known as E boxes, with a consensus sequence of CACGTGnnnnnCACGTG (**Figure 1**). E boxes are generally recognized by transcription factors harboring basic helix-loop-helix/leucine zipper DNA-binding domains [53]. A specific transcription factor, dubbed carbohydrate response element-binding protein (ChREBP), has been found in great abundance in the liver, as well as the small intestine and adipose tissue, the most active sites of *de novo* lipogenesis [54].
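The consensus described above (two CACGTG E boxes separated by 5 bp) lends itself to a simple sequence scan. The following is a minimal sketch, assuming an exact match to the consensus; naturally occurring elements may deviate from it, and both the function name and the example fragment are hypothetical and invented purely for illustration.

```python
import re

# Consensus quoted in the text: two CACGTG E boxes separated by any 5 bp.
# A lookahead is used so that overlapping candidate sites are all reported.
GIRE_PATTERN = re.compile(r"(?=(CACGTG[ACGT]{5}CACGTG))")


def find_gire_sites(sequence: str) -> list[tuple[int, str]]:
    """Return (0-based start, matched 17-mer) for each perfect-consensus GIRE."""
    seq = sequence.upper()
    return [(m.start(), m.group(1)) for m in GIRE_PATTERN.finditer(seq)]


# Hypothetical promoter fragment, not taken from any real gene.
fragment = "ttgaCACGTGaccggCACGTGtcaa"
print(find_gire_sites(fragment))
```

A scan of this kind only flags candidate sites on the given strand; whether a candidate actually confers glucose responsiveness would still have to be established experimentally, as in the studies discussed in this section.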

**Figure 1.** Glucose-inducible response elements (GIREs) and their transcriptional activators.

Incorporation of GIREs enables glucose-responsive control of gene transcription. GIREs are composed of two 6-bp motifs with a consensus sequence of CACGTG separated by 5 bp. A tetramer of ChREBP-Mlx binds each GIRE to amplify gene transcription in response to elevated glucose levels. GIREs have been identified in liver-specific genes like L-pyruvate kinase (L-PK), acetyl-CoA carboxylase (ACC), and fatty acid synthase (FAS).

Thule and colleagues leveraged GIREs to endow glucose-responsiveness to the liver-specific insulin-like growth factor binding protein-1 (IGFBP1) promoter. The IGFBP1 promoter is repressed by insulin, creating a feedback inhibition loop on insulin expression, but it is not inherently influenced by changes in glucose concentrations. To generate glucose-responsive insulin expression, they incorporated GIREs from the L-PK gene directly upstream of the IGFBP1 promoter and found that, depending on the number of GIREs incorporated, a 1.6- to 6.4-fold increase in promoter activity could be produced in response to elevated glucose concentrations in primary hepatocytes *in vitro* [55]. In addition, Thule and Liu used a recombinant adenovirus to deliver their glucose-responsive, insulin-repressive insulin construct into STZ-induced diabetic rats and found that it was able to produce near-normal glycemia and weight gain without inducing lethal hypoglycemia [56].

In our lab, we used the liver-specific albumin promoter, which is neither glucose- nor insulin-responsive, and inserted the GIREs from the S14 gene upstream of the albumin promoter to create a system that is unresponsive to insulin but activated by elevated glucose levels (**Figure 2**) [57]. To test the effect that the S14 GIREs have on glucose-responsive insulin output from the albumin promoter, we first generated insulin expression cassettes containing one to five GIREs. Interestingly, we found that the degree of glucose induction of insulin output increased as the number of GIREs was increased up to three, after which there was only a marginal enhancement in insulin output. We observed a 9-fold increase in insulin output from primary hepatocytes between low and high glucose conditions when three GIREs were incorporated upstream of the albumin promoter (**Figure 3**). When we delivered this insulin expression cassette into the livers of STZ-induced diabetic rats through direct injection, we found that fasting blood glucose levels were reduced to normal, blood glucose levels of diabetic rats fed *ad libitum* were significantly reduced, and glucose clearance was significantly accelerated during an intraperitoneal glucose tolerance test. However, these effects only lasted for roughly a month, owing to the use of adenoviral vectors to deliver our insulin expression cassette.


**Figure 2.** Elements of the insulin expression cassette—TA1.

TA1 is a 2.1-kb insulin expression cassette containing elements that drive insulin expression in both a liver-specific and glucose-responsive fashion. The albumin promoter is largely responsible for restricting insulin expression to hepatocytes, while the α-fetoprotein transcriptional enhancer, vascular endothelial growth factor (VEGF) translational enhancer, and albumin 3′-UTR also promote liver specificity. Glucose responsiveness is primarily driven by three copies of GIREs, although the albumin 3′-UTR also promotes glucose-responsive insulin biosynthesis.

**Figure 3.** Glucose induction of insulin expression and the effect of hepatocyte-specific enhancer elements on overall insulin output.

The insulin expression cassettes, TA0 and TA1, differ in the presence of the α-fetoprotein transcriptional enhancer and the albumin 3′-UTR. Inclusion of these elements greatly increases overall insulin output, while both constructs display similar glucose responsiveness.

To improve our insulin expression cassette, we investigated whether inclusion of various liver-specific enhancer elements could further enhance insulin production. Specifically, we incorporated a transcriptional enhancer from the human α-fetoprotein gene, an intron from the human growth hormone gene previously shown to improve mRNA processing efficiency, a translational enhancer from the vascular endothelial growth factor (VEGF) gene that functions as an internal ribosomal entry site, and the 3′-UTR from the human albumin gene that also contains an intron to improve mRNA processing (**Figure 3**). The ability of each element to enhance glucose-inducible insulin expression was first examined by transducing primary rat hepatocytes *in vitro* using an adenovirus. After testing constructs containing different combinations of these elements, we ultimately found that insulin expression cassettes incorporating the α-fetoprotein transcriptional enhancer, VEGF translational enhancer, and albumin 3′-UTR led to increased insulin production *in vitro*. Specifically, the VEGF translational enhancer alone led to a 4- to 6-fold increase in insulin output at both low and high glucose concentrations. Incorporation of the transcriptional enhancer and 3′-UTR led to another 5- to 8-fold increase in insulin output (**Figure 3**). Together, these modifications to the insulin expression cassette resulted in a 20- to 50-fold increase in insulin output *in vitro* compared to the original constructs, thus allowing us to more efficiently drive insulin expression, a particularly important factor when less efficient delivery vehicles must be used for gene therapy.
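The overall 20- to 50-fold figure is roughly what one would expect if the two reported enhancements combine multiplicatively. The short sketch below works through that arithmetic; the assumption that the effects are independent and multiplicative is ours, made only for illustration, and the function name is hypothetical.

```python
def combined_fold_range(*fold_ranges: tuple[float, float]) -> tuple[float, float]:
    """Multiply (low, high) fold-change ranges, assuming independent effects."""
    low, high = 1.0, 1.0
    for range_low, range_high in fold_ranges:
        low *= range_low
        high *= range_high
    return low, high


# Fold increases reported in the text.
vegf_translational_enhancer = (4, 6)
afp_enhancer_plus_albumin_utr = (5, 8)

print(combined_fold_range(vegf_translational_enhancer, afp_enhancer_plus_albumin_utr))
# (20.0, 48.0)
```

The product, 20- to 48-fold, agrees closely with the 20- to 50-fold increase reported above.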

We confirmed the utility of this improved insulin construct *in vivo* by delivering our insulin expression cassette in the form of minicircle DNA, which can be readily produced in large quantities. Upon intravenous injection of this minicircle DNA into STZ-induced diabetic rats, we observed a DNA dose-dependent correction in hyperglycemia in both fasted rats and rats fed *ad libitum*. In addition, we observed a full restoration in the rate of weight gain in STZ-induced diabetic rats comparable to that of healthy, non-diabetic rats, and intraperitoneal glucose tolerance tests demonstrated glucose-inducible increases in insulin production capable of correcting hyperglycemia within 45 minutes. A single injection of minicircle DNA led to normalization of serum levels of albumin, triglycerides, cholesterol, aspartate transaminase, alanine aminotransferase, and alkaline phosphatase, thus demonstrating restoration of healthy liver function. Further, there were no signs of hepatic inflammation, underscoring the safety of hepatic insulin gene therapy. Together, we were able to create a treatment for T1DM possessing glucose-responsive insulin production (due to the natural expression of GLUT2 and glucokinase from hepatocytes, and the presence of GIREs in the insulin expression cassette) that is capable of fully correcting diabetic hyperglycemia.

Another novel feature of our insulin expression cassette is the presence of the albumin 3′-UTR. As mentioned previously, the albumin 3′-UTR contains an intron that improves mRNA processing. In addition, it contains two pyrimidine-rich stretches known to bind polypyrimidine tract-binding protein (PTB) [58]. PTB is a ubiquitously expressed mRNA-binding protein that serves as a common mediator of mRNA stability. As mentioned previously, PTB binding sequences are also found in the 3′-UTR of the preproinsulin gene. This is a particularly important feature for hepatic insulin gene therapy applications, as the half-life of preproinsulin mRNA has been reported to be less than 6 hours in hepatocytes, much shorter than the 29–77 hours found in β cells. Perhaps even more importantly, the presence of PTB binding sites would also confer glucose-responsive control of preproinsulin mRNA translation in hepatocytes [34]. Thus, the presence of the albumin 3′-UTR endows hepatocyte-derived surrogate β cells with improved mRNA processing and stability, as well as glucose-responsive control of translation.

One final consideration when designing an expression cassette for insulin gene therapy is the preproinsulin sequence used. A mature insulin molecule is composed of two polypeptide chains, the A and B chains, linked together by two disulfide bonds. However, the insulin gene produces a single preproinsulin polypeptide in which the A and B chains are separated by a connecting peptide, known as the C-peptide, that is flanked by two pairs of basic amino acids; the polypeptide also carries a 24-residue signal peptide. The signal peptide is removed as preproinsulin is translocated into the rough endoplasmic reticulum, forming proinsulin. Proinsulin undergoes further maturation within secretory granules through the action of prohormone convertases PC1/3 and PC2, as well as carboxypeptidase-H. These enzymes are co-packaged with proinsulin in secretory granules and together act to remove C-peptide and produce mature insulin. However, prohormone convertases PC1/3 and PC2 are only found in β cells and other cells with the regulated secretory pathway, like pituitary cells and intestinal K cells. Thus, for insulin gene therapy applications to be successful, it is important to maintain proinsulin processing, even if researchers choose to target cell types that do not have the regulated secretory pathway, like hepatocytes. In these instances, modifications can be made to the preproinsulin sequence to bypass the necessity of PC1/3 and PC2. The most commonly used modification is incorporation of furin cleavage sites [59, 60]. Furin is a ubiquitously expressed endoprotease that can efficiently cleave proteins at paired basic amino acid sites. Through incorporation of furin cleavage sites, any cell of the body can produce fully functional insulin.
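To illustrate why the native junctions need modification, the sketch below scans short junction fragments for paired basic residues and for the minimal furin recognition motif Arg-X-(Lys/Arg)-Arg. The fragments spanning the native human B-chain/C-peptide and C-peptide/A-chain junctions are shown alongside hypothetical substituted variants; these variants are illustrative assumptions and are not claimed to be the constructs used in the cited studies. Function and variable names are likewise hypothetical.

```python
import re

DIBASIC = re.compile(r"[KR]{2}")          # any pair of basic residues
FURIN_MINIMAL = re.compile(r"R.[KR]R")    # minimal furin motif R-X-(K/R)-R


def has_dibasic_site(peptide: str) -> bool:
    """True if the peptide contains two adjacent basic residues (K or R)."""
    return DIBASIC.search(peptide.upper()) is not None


def has_furin_site(peptide: str) -> bool:
    """True if the peptide contains the minimal furin motif R-X-(K/R)-R."""
    return FURIN_MINIMAL.search(peptide.upper()) is not None


junctions = {
    # Native human proinsulin junction fragments.
    "native B-C": "TPKTRREAED",
    "native C-A": "GSLQKRGIVE",
    # Hypothetical variants with an Arg introduced four residues upstream
    # of the cleavage point (illustration only).
    "variant B-C": "TPRTKREAED",
    "variant C-A": "GSRQKRGIVE",
}

for name, peptide in junctions.items():
    print(f"{name}: dibasic={has_dibasic_site(peptide)}, furin={has_furin_site(peptide)}")
```

In this sketch the native fragments contain dibasic pairs but lack an arginine at the fourth position upstream of the cleavage point, so they do not match the minimal furin motif, whereas the substituted variants do.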

Further modifications can be made to the preproinsulin sequence to enhance bioactivity or production for insulin gene therapy applications. First, the preproinsulin sequence can be mutated to alter the stability of the resulting insulin molecules. The most prevalently used mutation is the B10 mutation—a naturally occurring mutation where the histidine residue at position 10 on the B chain is replaced by aspartic acid [59, 61]. This mutation results in enhanced stability and the accumulation of 10- to 100-fold more mature insulin than wild-type insulin. Other mutations have been found to result in highly potent insulin analogues, including HisA8, ArgA8, and GluB10 [61].

Another way the preproinsulin sequence can be modified is through codon optimization. A codon is a series of three nucleotides that encodes a specific amino acid. There are 64 different codons but only 20 different amino acids, which means that many amino acids are encoded by multiple codons. It is generally acknowledged that different organisms have codon preferences as a result of the composition of their respective tRNA pools. In other words, specific codons are preferred by specific organisms because the corresponding tRNAs are present in greater abundance. It is thought that cDNA sequences with optimized codons will be translated faster and more accurately, thus improving translational efficiency and production of the transgene product. For gene therapy applications, this has been shown to improve the potency of treatment. For instance, Cantore *et al*. observed a 2- to 3-fold increase in potency of their factor IX treatment for hemophilia B in dogs upon codon optimization [62]. Codon-optimized versions of human insulin have also been shown to achieve better glycemic control in diabetic dogs due to increases in insulin production [44].
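The principle of codon optimization can be sketched as a synonymous recoding step: each codon is swapped for a preferred codon encoding the same amino acid, leaving the protein unchanged. In the minimal example below, the preferred-codon table is an illustrative placeholder and not a validated human codon usage table, and real optimization tools also weigh factors such as GC content and mRNA structure. The input fragment is a hypothetical coding sequence for the first five residues of the insulin A chain (GIVEQ).

```python
from itertools import product

# Standard genetic code, built in TCAG order (third base varies fastest).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

# Illustrative preferred codons (assumption, for demonstration only).
PREFERRED = {
    "A": "GCC", "R": "AGA", "N": "AAC", "D": "GAC", "C": "TGC",
    "Q": "CAG", "E": "GAG", "G": "GGC", "H": "CAC", "I": "ATC",
    "L": "CTG", "K": "AAG", "M": "ATG", "F": "TTC", "P": "CCC",
    "S": "AGC", "T": "ACC", "W": "TGG", "Y": "TAC", "V": "GTG",
    "*": "TGA",
}


def translate(dna: str) -> str:
    """Translate an in-frame DNA sequence into a one-letter protein string."""
    dna = dna.upper()
    if len(dna) % 3:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def recode(dna: str, preferred: dict[str, str] = PREFERRED) -> str:
    """Replace every codon with the preferred synonymous codon."""
    return "".join(preferred[aa] for aa in translate(dna))


original = "GGTATTGTTGAACAA"  # hypothetical fragment encoding G-I-V-E-Q
optimized = recode(original)
print(optimized, translate(original) == translate(optimized))
```

Because only synonymous substitutions are made, the encoded peptide is identical before and after recoding; any gain in output would come from translation efficiency rather than from a change in the protein itself.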

In summary, there is some flexibility in the design of the expression cassette for insulin gene therapy applications; the relatively small size of the preproinsulin gene is advantageous for the design of an expression cassette and its subsequent delivery to target cells. Regulatory elements capable of improving cell type specificity, overall insulin output, and glucose responsiveness can be employed to yield insulin expression with greater precision. In addition, the preproinsulin sequence itself can be modified to improve production and functionality. Once a sufficient level of control has been attained over insulin expression, it then becomes a matter of delivering the expression cassette to target cells in a safe and efficient manner.

#### **2.5. Delivery vehicles for insulin gene therapy**

When choosing a delivery vehicle for insulin gene therapy applications, two important considerations must be taken into account. First, the delivery vehicle must be able to promote long-term insulin expression. This is important because an antibody response from the initial treatment will reduce the efficacy of subsequent treatments. Thus, repeated administration of most delivery vehicles is largely ineffective, especially if it occurs more than two weeks after the initial treatment. Second, it must be possible to affordably and reliably produce the gene delivery vehicle in the large quantities needed for gene therapy. This is particularly important in the field of insulin gene therapy because the insulin molecule has a relatively short circulating half-life, estimated at around 4–6 minutes [63]. As a result, a greater number of cells must be targeted for insulin expression than in other gene therapy applications.

Many gene delivery vehicles exist and can be broadly grouped into two categories: viral and non-viral. Non-viral methods have the advantages of being safer and inducing less of an immune response. However, non-viral methods typically only drive transgene expression transiently, as most are incapable of supporting chromosomal integration. In addition, they tend to deliver genes inefficiently *in vivo*. Regardless, we explored the potential of delivering the insulin gene in the form of minicircle DNA to validate the *in vivo* efficacy of our expression cassette. We chose to use minicircle DNA because it can be produced in large quantities and contains no bacterial or viral elements, improving its likelihood of evading the immune system. We found that delivering our insulin expression cassette as a minicircle was able to correct diabetic hyperglycemia in a dose-dependent fashion in STZ-induced diabetic rats (**Figure 4**). While the effects of this treatment only persisted for about a month, the reduced immunogenicity of minicircle DNA allowed for repeated administration, although the second injection was not nearly as effective as the first (**Figure 5**) [64].

**Figure 4.** Dose-dependent effect of insulin minicircle DNA treatment on hyperglycemia in rats.

STZ-induced diabetic rats were intravenously injected with the indicated dose of TA1m minicircle DNA and measured for both fasting (A) and *ad libitum* (B) blood glucose levels. There was a dose-dependent correction of diabetic hyperglycemia that persisted for at least 10 days.

**Figure 5.** Effect of repeated administration of insulin minicircle DNA on fasting hyperglycemia in diabetic rats.

STZ-induced diabetic rats were intravenously injected with 0.8 μg/g body weight of TA1m via tail vein, and blood glucose measurements were made after a 4-hour fast. TA1m was able to correct diabetic hyperglycemia for nearly a month before the effects began to diminish. A second TA1m injection (0.8 μg/g body weight) was made on the 26th day and was able to re-correct fasting blood glucose levels, thus demonstrating that a substantial humoral response had not been mounted against the minicircle DNA.

We have also explored the use of viral delivery vehicles, as viruses are highly evolved and proficient at delivering genetic information into target cells. Of course, viral vectors will inevitably elicit an immune response, with some viruses invoking a greater immune response than others. Several viral vectors have been used for insulin gene therapy, each possessing its own unique features. Refer to **Table 1** for an overview of the features inherent to various viral vectors.


**Table 1.** Features of various viral vectors for gene therapy applications.

Adenovirus, adeno-associated virus, oncoretrovirus, and lentivirus are the most commonly used delivery vehicles for gene therapy applications. Each viral vector possesses its own unique features that affect its suitability for insulin gene therapy applications.

Adenoviruses were among the first viral vectors used in the field of insulin gene therapy due to their ability to be produced in very high titers and transduce non-dividing cells with high efficiency. These features allow researchers to transduce a large number of cells *in vivo* and overcome the lack of insulin protein stability. However, adenoviral vectors are unable to integrate their genetic cargo into the host's genome and thus provide only transient gene expression [65, 66]. As a proof-of-principle, we initially used adenoviral vectors to establish the efficacy of our insulin expression cassette *in vivo*. Indeed, we were successful at correcting fasting blood glucose levels and improving *ad libitum* glucose levels in diabetic rats (**Figure 6**). As expected, however, the observed reduction in blood glucose levels only persisted for about a month. This observation is in agreement with other studies using adenoviral vectors to deliver the insulin gene into diabetic animals, which likewise noted an improvement of hyperglycemia for only a month [67]. Unfortunately, unlike minicircle DNA, a humoral response is elicited by the first treatment, precluding repeated administration of adenovirus-based treatments. Thus, while adenoviral vectors are extremely efficient gene delivery tools, they are not well suited as a long-term therapeutic tool for treatment of T1DM, at least in their initial form.

**Figure 6.** Effect of TA1 on hyperglycemia in diabetic rats when delivered via adenovirus.

STZ-induced diabetic rats were injected with 2 × 10¹⁰ adenoviral pfu/rat, and both fasting (A) and *ad libitum* (B) blood glucose measurements were made. Treatment with adenoviral vectors carrying the TA1 insulin expression cassette was able to fully correct diabetic hyperglycemia after an overnight fast and partially correct *ad libitum* blood glucose levels for around one month.

More recently, researchers have generated newer "gutted" adenoviral vectors that have been stripped of all viral coding sequences, greatly reducing their immunogenicity [68, 69]. This is an important advancement in the field of gene therapy because adenoviral vectors have proven so immunogenic in past human clinical trials that their administration led to the death of a patient in 1999, temporarily halting progress in the field of gene therapy [70, 71]. Immunogenicity is undoubtedly a very large concern with adenoviral vectors, so any improvement is useful. However, it seems unlikely that immunogenicity will ever be completely eliminated from adenoviral vectors, or any viral vector for that matter. Further advancements have now made it possible to improve upon the innate capabilities of adenoviral vectors by adding the potential to integrate their genetic cargo into a host's genome and drive long-term transgene expression. This advancement resulted from the merging of two technologies, where chromosomal integration is mediated by the Sleeping Beauty (SB) transposon system and efficient gene delivery is accomplished by the gutted adenoviral vectors. DNA transposons translocate from one DNA site to another through a simple cut-and-paste process, enabling the integration of defined DNA sequences into mammalian genomes [72]. To achieve stable transgene expression using this system, two separate adenoviral vectors must be administered and co-transduce a single cell. The first vector is the transposon donor vector, which contains a transposon encoding the transgene of interest. The second vector encodes the SB transposase, which mediates relocation. This system has been used successfully to enable persistent phenotypic correction of hemophilia B in dogs [72] and holds great potential for the treatment of other diseases that require persistent gene expression.

To combat issues related to immunogenicity and short-term expression, other groups have employed adeno-associated viruses (AAVs). AAVs are able to transduce both dividing and non-dividing cells. In dividing cells, AAVs are able to integrate transgenes into the host's genome at a specific site on chromosome 19 [73]. Within non-dividing cells, the AAV genetic cargo remains largely episomal, as the chromatin is less accessible. AAVs are less immunogenic than adenoviral vectors and are reported to cause relatively long-term transgene expression in non-dividing cells. The primary disadvantage of using AAVs is that their packaging capacity is less than 5 kb, limiting the use of larger expression cassettes with greater complexity for regulated expression. However, the preproinsulin gene is quite small, so even when the gene is accompanied by multiple regulatory elements in a complex expression cassette, the maximum size limitation is unlikely to become an issue for insulin gene therapy applications. Indeed, AAVs have been used to successfully drive insulin expression within non-β cells. Park *et al*. used AAV to deliver insulin under control of the CMV promoter into STZ-induced diabetic rats and found improved glucose tolerance (at 2 g/kg) comparable to that of non-diabetic control rats [74]. Additionally, they observed a less pronounced immune response using AAV when compared to the same treatment using adenoviral vectors.

Retroviral vectors are another widely used gene delivery vehicle, owing to their ability to integrate their genetic cargo into a host's genome and attain sustained gene expression. However, retroviral vectors are greatly limited by their inability to integrate their cargo into the chromosomes of non-dividing cells, a problem that severely hinders their utility in insulin gene therapy applications. In cases where retroviral vectors have been used to deliver insulin, hepatocyte proliferation must first be stimulated [75]. This, of course, greatly limits the translation of retroviral vectors for treatment of T1DM. Retroviral vectors have also been shown to have a preference to integrate their genetic cargo in close proximity to the transcriptional regulatory sequences of proto-oncogenes, as observed in 1999 following the treatment of nine severe-combined immunodeficiency patients [76]. Insertional mutagenesis led to the development of leukemia in four of the nine patients, ultimately halting the field of clinical gene therapy temporarily. All in all, retroviral vectors do not possess favorable features for insulin gene therapy applications.

Lentiviral vectors are a type of retrovirus that provides two key advantages over other retroviruses: (1) they are able to integrate their genetic cargo into the genome of both dividing and non-dividing cells and (2) they have less preference to integrate near regulatory sequences of proto-oncogenes, reducing the risk of insertional mutagenesis [77, 78]. Their ability to transduce non-dividing cells is critical for gene therapy strategies, as most cells of the body are either non-dividing or slowly dividing. An additional advantage of lentiviral vectors is that they do not elicit a strong immune response. Unfortunately, lentiviral vectors have two major pitfalls that limit their widespread translation to human clinical trials: (1) they are difficult to produce in high titer [79] and (2) the efficiency of lentiviral transduction *in vivo* is significantly lower than that of other vectors, especially adenoviral vectors. Given the relative instability of insulin, the need to transduce a larger number of cells than in other gene therapy applications, and the fact that, unlike in other diseases, a partial correction of hyperglycemia is not sufficient to adequately treat T1DM, these pitfalls pose some limitations to the use of lentiviral vectors for insulin gene therapy applications. Nonetheless, lentiviral vectors offer long-term transgene expression with reduced immunogenicity and an improved biosafety profile and are thus a viable candidate as a therapeutic tool for treatment of T1DM.

To combat issues related to lentivirus infectivity, researchers have modified lentiviral vectors to improve their *in vivo* efficacy. For example, Naldini *et al*. have previously shown that inclusion of viral protein R (a viral protein present in native HIV-1 particles) within synthetic lentiviral particles is critical for hepatocyte transduction [80]. Separately, Schaffer *et al*. have generated vesicular stomatitis virus envelope proteins that show improved serum resistance [81], which could improve lentiviral efficacy *in vivo*. A combination of modifications will hopefully yield a more potent lentiviral particle *in vivo*. Interestingly, Ren and colleagues used lentiviral vectors to deliver insulin to the livers of STZ-induced diabetic rats [82] and NOD mice [83] and observed long-term correction of diabetic hyperglycemia with no evidence of impaired liver function, intrahepatic inflammation, or recurring autoimmunity against the newly formed insulin-producing cells. While their work supports the use of lentiviral vectors for insulin gene therapy applications, it should be noted that insulin expression was driven by the CMV promoter and displayed no responsiveness to circulating glucose levels.

#### **3. Concluding remarks**

Overall, insulin gene therapy provides a promising alternative to current treatments for T1DM. Although this treatment option will inevitably lack the full sophistication of native β cells, it would certainly improve upon current treatment options. Insulin gene therapy opens the possibility of a one-time treatment that can provide long-term correction of diabetic hyperglycemia through physiologically relevant mechanisms, such as glucose-dependent alterations in insulin transcription and translation. Further, the simplicity of the treatment should yield reproducible results with excellent success rates and additionally help newly formed insulin-producing surrogate β cells evade recurring autoimmunity. However, several hurdles must still be overcome.

In order for the treatment to be a viable option, long-term insulin expression must be achieved to avoid repeated injection. However, the viral vectors currently used to drive long-term transgene expression, like lentivirus, are generally difficult to produce in high titer and limited by their transduction efficiency. As a result, most successful long-term efforts in the field have employed the CMV promoter, which can only drive strong constitutive expression of insulin. In order to take those research efforts to the next level, weaker tissue-specific promoters must be used that drive low basal levels of insulin expression during fasting periods and substantially upregulated insulin expression upon increases in glucose availability. To date, this has yet to be achieved. We are currently exploring several viral vectors for their capacity to deliver our insulin expression cassette, which has elements to drive liver-specific, glucose-responsive insulin expression, at a therapeutic level. In so doing, we hope to produce an affordable, long-term treatment option for patients with T1DM.

#### **Author details**

Andrew M Handorf, Hans W Sollinger\* and Tausif Alam

\*Address all correspondence to: hans@surgery.wisc.edu

University of Wisconsin-Madison, Madison, WI, USA

#### **References**


in insulin-dependent diabetes mellitus. The New England Journal of Medicine. 1993;329(14):977–86. PubMed PMID: 8366922.


of insulin mRNA stability. The Journal of Biological Chemistry. 1985;260(25):13590–4. PubMed PMID: 3902821.


## **Application of Genome Editing Technology to MicroRNA Research in Mammalians**

Lei Yu, Jennifer Batara and Biao Lu

Additional information is available at the end of the chapter

http://dx.doi.org/10.5772/

#### **Abstract**

Targeted nucleases have recently emerged as a powerful genome editing tool. The ability to introduce targeted, desired changes into the mammalian genome makes them an invaluable tool to unravel the functions of miRNAs in biology and disease. In combination with a homologous donor vector, targeted nucleases can achieve high efficiency and precision, enabling bi-allelic ablation of miRNA in cultured somatic cells. Here we review the structure and function of miRNA as well as the unique implementation of genome editing technology in modifying miRNA sequences in mammals. This chapter discusses the four mainstay genome editing technologies: meganuclease, zinc finger nuclease (ZFN), transcription activator-like effector nuclease (TALEN) and clustered regularly interspaced short palindromic repeat-associated nuclease Cas9 (CRISPR-Cas9), focusing on TALEN.

**Keywords:** Genome editing, microRNA, TALEN, ZFN, CRISPR-Cas9, Meganuclease, Non-homologous end joining, Homologous recombination

#### **1. Introduction**

Recent achievements in genome editing have produced progress that was unimaginable a few decades ago. This new technology provides tools for introducing fast and precise alterations into the genome with unprecedented efficiency and specificity. For instance, a group of targeted nucleases has been successfully used to modify genomic sequences in a wide variety of cells and organisms, including mammals.[1-4] This modification is realized by combining conventional gene knockout technology with genome editing, empowering researchers to make any desired change in a given gene or its regulatory elements and to establish causal links between genetic variations and biological phenotypes. This new methodology completes a framework for studying gene function and regulation *in vitro* and *in vivo*.

© 2016 The Author(s). Licensee InTech. This chapter is distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

#### **1.1. Conventional gene knockout technology**

Mammalian genomes contain billions of DNA bases and are difficult to manipulate.[5-7] Conventional genome editing is inefficient and requires a number of complicated and time-consuming procedures to obtain double knockouts. Because this technology is built upon homologous recombination (HR), the targeted gene modifications occur at an extremely low frequency (1 in 10^6 ~ 10^9 cells).[8] This is why the conventional approach is mainly used for producing knockout mice, for which a large amount of high-quality embryonic stem cells is available. Although knockout mice have provided great insights into the fundamentals of developmental biology and physiology, the pathological roles of genes have yet to be addressed in relevant disease models or human samples.

#### **1.2. Novel genome editing technology: Targeted nucleases**

To overcome the limitations of conventional gene knockout technology, a number of nuclease-based methods have been developed in the past decade.[9-12] These novel technologies exploit the ability of endonucleases to induce double-stranded DNA breaks (DSBs) and stimulate subsequent damage repair mechanisms in mammalian cells.[11, 13-17] Remarkably, these nuclease-induced DSBs not only produce sequence changes at the break points but also raise the HR rate to an astonishing frequency of 1-40%.[7, 18] This approach provides a simple and effective way to streamline genome editing, with the potential to generate double knockout models in both cell lines and animals. Because of this technical breakthrough, nuclease-based methods quickly gained popularity for gene editing in a variety of cell types and species. To date, four major classes of targeted nucleases are prominent: meganucleases derived from microbial mobile elements, zinc finger nuclease (ZFN) based on a eukaryotic transcription factor DNA-binding motif, transcription activator-like effector nuclease (TALEN) derived from a plant-invasive bacterial protein and, most recently, the bacterial type II clustered, regularly interspaced, short palindromic repeat-associated nuclease Cas (CRISPR-Cas).[19] These technologies are collectively termed targeted nucleases because they all have a more or less programmable DNA-binding domain for specific genome targeting. The first three nucleases all recognize specific DNA sequences through protein-DNA interaction, whereas CRISPR-Cas is targeted by a short guide RNA that recognizes the unwound DNA via Watson-Crick base pairing. The purpose of the nuclease is to make localized DSBs. These DSBs invoke the powerful non-homologous end-joining and homology-directed recombination repair pathways, allowing versatile and precise modification of the complicated mammalian genome.[14, 18, 20-32]
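To put the two frequencies quoted above side by side, the following sketch computes the implied fold increase in HR rate. It uses only the figures given in the text (1 in 10^6 ~ 10^9 cells for conventional targeting, 1-40% after a nuclease-induced DSB); the function name is an illustrative choice, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction


def fold_enhancement(baseline_one_in, induced_percent):
    """Ratio of the nuclease-induced HR frequency to the baseline HR frequency.

    baseline_one_in: baseline expressed as "1 in N cells" (N).
    induced_percent: nuclease-induced frequency in percent.
    """
    induced = Fraction(induced_percent, 100)
    baseline = Fraction(1, baseline_one_in)
    return induced / baseline


# Most conservative pairing: best baseline (1 in 10^6) versus lowest induced rate (1%).
lowest = fold_enhancement(10**6, 1)
# Most favourable pairing: worst baseline (1 in 10^9) versus highest induced rate (40%).
highest = fold_enhancement(10**9, 40)

print(f"fold increase ranges from {int(lowest):.0e} to {int(highest):.0e}")
```

Even the most conservative pairing of these numbers corresponds to an increase of about four orders of magnitude.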

#### **2. miRNA**

A large fraction of the genome is transcribed without protein-coding potential.[33, 34] These non-coding transcripts can be broadly categorized into short and long non-coding RNAs, with an arbitrary size boundary at ~ 200 bases in length. Short non-coding RNAs are shorter than 200 bases and include microRNAs, tRNAs and snoRNAs. MicroRNAs (miRNAs) are an abundant class of small regulatory RNAs that play major roles in gene regulation.[35-38] Like coding mRNAs, miRNAs are transcribed by polymerase II; they are then processed into ~ 22 nucleotide non-coding transcripts, which repress translation by binding to target sites within the 3' untranslated region of mRNA.[39-41] Recent genomic studies have discovered over 1000 miRNAs in the human and mouse genomes.[34] It is estimated that up to 80% of human genes are regulated by miRNAs.[36, 42] Some miRNAs are clustered in polycistronic transcripts, which allows coordinated regulation, while others are expressed in a tissue-specific and developmental stage-specific manner.[35, 40] The roles of miRNAs have been extensively studied at the molecular and cellular levels in a number of species, including mammals.[43, 44] Furthermore, their roles in animal physiology and development have been established by conditional knockins and knockouts.[45-47] In fact, miRNAs are expressed across the genome, and many of them show spatially and temporally restricted expression. Similar to transcription factors, miRNAs may function as master regulators of embryonic pluripotency, differentiation, and tissue/organ formation.[37] Recently, increasing evidence suggests that miRNAs are implicated in numerous disease states as well. For instance, miR-21 was found to be overexpressed in virtually all types of human cancers and has thus emerged as an important therapeutic target in cancer treatment.[45, 48, 49] Additionally, miR-21 and other miRNAs have been shown to play crucial regulatory roles in basic cell functions such as cell growth, proliferation and apoptosis.[48, 49] Their pathological roles in multiple human diseases, such as autoimmune, cardiovascular and neurological disorders and obesity, are emerging.[50, 51]

#### **2.1. Genome organization and biogenesis**

MicroRNA genes show great diversity in mammalian genomes. They are located either in introns of annotated protein-coding genes or outside the context of any annotated gene.[34] Genome analysis studies reveal that up to 42% of human miRNAs are in clusters of two or more with a pairwise chromosomal distance of at most 3000 nucleotides. This pattern of clustering allows similar levels of expression and coordinated regulation. The two most famous clusters are miR-17~92 and miR-302~367. While deletion of the miR-17~92 cluster causes skeletal and growth defects, overexpression of the miR-302~367 cluster enhances somatic reprogramming to a pluripotent state. These clustered miRNAs appear to function together; therefore, alteration of a specific member may not result in the expected changes in physiology.
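The clustering criterion described above (two or more miRNAs separated by at most 3000 nucleotides) can be sketched as a simple grouping over genomic coordinates. This is a minimal illustration under stated assumptions: the loci are hypothetical, and chaining neighbouring loci on the same chromosome is one possible reading of "pairwise distance", not necessarily the procedure used in the cited analyses.

```python
from collections import defaultdict


def cluster_mirnas(loci, max_gap=3000):
    """Group miRNA loci whose neighbouring start positions lie within max_gap.

    loci: iterable of (name, chromosome, start) tuples.
    Returns a list of (chromosome, [names]) for groups with two or more members.
    """
    by_chrom = defaultdict(list)
    for name, chrom, start in loci:
        by_chrom[chrom].append((start, name))

    clusters = []
    for chrom, items in by_chrom.items():
        items.sort()
        current = [items[0]]
        for prev, item in zip(items, items[1:]):
            if item[0] - prev[0] <= max_gap:
                current.append(item)      # close enough: extend the running group
            else:
                if len(current) >= 2:
                    clusters.append((chrom, [n for _, n in current]))
                current = [item]          # too far: start a new group
        if len(current) >= 2:
            clusters.append((chrom, [n for _, n in current]))
    return clusters


# Hypothetical coordinates, for illustration only.
example_loci = [
    ("mir-A", "chr1", 1000),
    ("mir-B", "chr1", 1800),
    ("mir-C", "chr1", 4500),
    ("mir-D", "chr1", 20000),
    ("mir-E", "chr2", 500),
]
print(cluster_mirnas(example_loci))
```

In this toy input, mir-A, mir-B and mir-C form one cluster, while mir-D and mir-E remain singletons and are not reported.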

Although the exact mechanism regulating each miRNA remains to be determined, the biogenesis of miRNA is becoming clearer. From their gene loci, miRNAs are initially transcribed by RNA polymerase II as long primary transcripts, which are processed into approximately 70-nucleotide precursors by the RNase III enzyme Drosha in the nucleus.[52] These highly structured precursors are termed pre-miRNAs and are subsequently transported from the nucleus to the cytoplasm by the Exportin-5 shuttle protein.[40] In the cytoplasm, these pre-miRNAs are further cleaved by another RNase III enzyme, Dicer, resulting in imperfect miRNA:miRNA duplexes of about 18 ~ 24 bp in length.[53, 54] Although either strand of the duplex may potentially mature into a functional miRNA, usually only one strand is chosen and subsequently incorporated into the RNA-induced silencing complex (RISC), where the miRNA and its mRNA target interact.[55, 56]

#### **2.2. Molecular tools for miRNA study**

Despite their fundamental importance, the exact role of miRNAs in the context of human development and disease processes remains largely unknown. This is partly due to the lack of effective methods for completely abolishing their expression in human cells and disease-relevant models. Although knockdown by short-interfering RNAs (siRNAs) provides a rapid and inexpensive tool to study most protein-coding genes, it cannot be used to reduce mature miRNAs in a meaningful way at the cellular level. Other alternatives include small molecule inhibitors, antisense oligonucleotides, anti-miR vectors and miRNA sponges.[57, 58] However, the major limitations of these methods are the transient nature of their effects and a high risk of off-target effects as well as toxicity. It is no surprise to see reports of discrepancies between the effects of miRNA inhibitors and genetic knockouts. With the completion of genome sequencing, genome-wide knockout of miRNAs by gene targeting has been undertaken.[59] As a result, efforts have been made to build resources for the conditional ablation of miRNAs in the mouse.[60] In one study, the generation of 162 miRNA targeting vectors and 46 germline-transmitted miRNA knockout mice was reported.[60] However, this homologous donor-based knockout technology is expected to face tremendous hurdles when applied to other mammalian species such as rats, pigs and monkeys. These larger animals are closer to humans; however, they all lack the large-scale culture of high-quality embryonic cells required by this conventional technology. Nevertheless, the work in mice will provide an important basis for elucidating the physiological roles of certain miRNAs in at least one animal species.

#### **3. Targeted nucleases**

Programmable DNA-binding proteins have emerged as an exciting platform for engineering targeted nucleases for precise genome editing.[61] The key component of these engineered nucleases is the DNA recognition domain, which directs the nuclease to a specific genomic locus, where it generates DNA double-strand breaks (DSBs) at or near the target site.[62] In mammalian cells, these DSBs are repaired by one of two mechanisms, non-homologous end-joining (NHEJ) or homologous recombination (HR).[18] Repair by NHEJ is error-prone and often results in small insertion or deletion mutations, termed indels, at the break point. At the same time, DSBs can also greatly stimulate the high-fidelity HR repair mechanism in fast-dividing cells. In mammalian cells, NHEJ is dominant over HR, but the frequency of the latter can be substantially increased when a large amount of exogenous homologous donor is co-delivered into cells.[25]

With these strategies, various methods, including meganucleases, ZFNs, TALENs and CRISPR-Cas, have been reported for genome editing in a wide variety of mammalian species.[63-70] In each method, a DNA endonuclease that generates DSBs is brought into place by a guide molecule. In the first three cases the guide is a protein, whereas in the last (CRISPR-Cas) it is a short stretch of RNA. Figure 1 illustrates the action of targeted nucleases and the binding mode of the guide molecules. In general, a protein guide is more specific than an RNA guide, whose tolerance of mismatches is governed by the hybridization mechanism. The features of the four main technologies, together with their pros and cons, are discussed below.

**Figure 1.** Action of the Targeted Nucleases and Their DNA Binding Mode. Genomic DNA is shown horizontally in black and double stranded, with the site of DNA cleavage indicated by arrowheads. For meganuclease, the holoenzyme binds and cleaves the target DNA. ZFNs and TALENs function as pairs, with one zinc finger DNA-binding domain (ZF) binding to the upper strand and the other ZF binding to the lower strand. Once the fused FokI enzyme (purple oval) is oriented to form a homodimer, it is activated to cut DNA. For CRISPR-Cas, the Cas9 holoenzyme (orange oval) is directed to the target site by the guide RNA and cleaves the DNA at a position close to the PAM motif (grey arrow).

#### **3.1. Meganuclease**

Meganucleases are endonucleases with a large DNA recognition site of 12 ~ 45 bp in length.[71] As a result, this site may occur only once in most mammalian genomes.[6] Although meganucleases possess a high degree of precision and low toxicity, their target range is limited. Moreover, the intertwined DNA-binding and nuclease domains restrict the capacity to reprogram them for other targets, and the probability of finding a meganuclease that cuts a desired locus is extremely slim.[72] A related example is the phage phiC31 integrase, which mediates recombination between two 34 bp sequences termed attachment sites (att), one carried by the donor DNA and the other located in the mammalian genome. The phiC31 integrase can insert plasmid donor DNA of any size and requires no additional co-factors. Other advantages of phiC31 integrase include non-viral delivery, sustained transgene expression, and activity in species such as bacteria, yeast, plants, frogs, chickens, mice, rats, pigs, cows, and humans.[2, 73] However, the potential sites for phiC31-based genome editing are limited. In the human genome, of the 106 mapped integration sites of phiC31, ~ 39% are within coding genes and ~ 61% are in intergenic regions.[74]

#### **3.2. Zinc finger nuclease**

First described in 1996, ZFN is a chimeric protein composed of two distinct domains, a programmable DNA-binding domain and an endonuclease, FokI.[75] Because of their programmability, ZFNs have been successfully employed to modify the genomes of almost all types of organisms, including bacteria, plants, and animals.[1, 76, 77] In fact, ZFNs have been used for the correction of a number of hereditary diseases, such as hemophilia B, sickle cell anaemia and α-1 antitrypsin deficiency, and in gene therapies for viral infections.[77-80] ZFN-based HIV gene therapy is already in clinical trials because of its specificity and safety profile.[81, 82] However, the use of ZFNs has been partly hindered by the complex and time-consuming strategies needed to generate zinc-finger arrays with sufficient affinity and specificity. It is worth noting that ZFNs also suffer from some constraints in targeting range, with about one potential target site every 500 bp.[83] It is therefore conceivable that designing ZFNs to precisely target a small gene in the genome, such as a miRNA, will be difficult.
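A rough back-of-the-envelope estimate illustrates why a density of about one ZFN site per 500 bp is limiting for miRNA loci. The sketch below assumes, purely for illustration, that potential sites are scattered uniformly and independently along the genome (a Poisson approximation that the source does not state) and uses the ~70 bp pre-miRNA length mentioned in this chapter.

```python
import math


def expected_sites(locus_bp, bp_per_site=500):
    """Expected number of candidate target sites in a locus of locus_bp."""
    return locus_bp / bp_per_site


def prob_at_least_one(locus_bp, bp_per_site=500):
    """Poisson estimate of finding one or more sites (assumes uniform, independent sites)."""
    return 1.0 - math.exp(-expected_sites(locus_bp, bp_per_site))


pre_mirna_bp = 70  # approximate pre-miRNA length used in this chapter
print(f"expected sites: {expected_sites(pre_mirna_bp):.2f}")
print(f"P(at least one site): {prob_at_least_one(pre_mirna_bp):.2f}")
```

Under these simplifying assumptions, only a minority of pre-miRNA hairpins would be expected to contain a usable ZFN site at all, before affinity and specificity are even considered.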

#### **3.3. Transcription activator-like effector nuclease**

Similar to ZFNs, TALENs are fusion proteins that combine a programmable DNA-binding domain with an endonuclease, FokI. The DNA-binding domain of a TALEN consists of tandem repeats of ~34 amino acids each, followed by a single half repeat of 20 amino acids.[84] Interestingly, the tandem repeats are nearly identical, except for the two amino acids at positions 12 and 13, referred to as the "repeat variable di-residue" (RVD).[85] Each of the four most common RVDs specifies binding to one of the four nucleotide bases (Table 1). The natural RVD for G is NN, with asparagine at positions 12 and 13. NN binds G with high affinity, but also recognizes and binds A with relatively low affinity.[86] Although the artificial RVDs NH and NK provide good specificity for G, their binding affinity for G is relatively low compared to NN.[86] It is worth noting that TALE proteins bind DNA sequences with an invariable T at the first position of the target. This base is recognized not by a repeat but by the cryptic sequences flanking the repeats. Because the first bound base is invariable, the DNA-binding domain can theoretically be programmed to bind any sequence starting with a "T". Owing to the simplicity of this coding principle, the DNA-binding domain can be easily designed to bind almost any sequence within a genome. However, because of the repetitive nature of the DNA-binding domain, the assembly of custom TALENs by direct synthesis or traditional cloning is expensive and technically challenging.[87] Recognizing the potential of TALEN technology, researchers have devised a number of approaches for TALEN assembly that allow low to medium throughput, or high throughput with automation. Because assembly remains technically demanding, a number of biotech companies provide assembly kits and/or services (Table 2). Like ZFNs, this genome editing technology has been shown to function in a wide variety of cells and organisms, including bacteria, yeast, plants, insects, zebrafish and mammals.[21, 88-92] Furthermore, unlike meganucleases or ZFNs, which limit the choice of targets, TALENs can bind virtually any locus in the genome when a new design that removes the 5' "T" constraint is used.


\*Note: NH and NK favor specificity rather than activity. They bind to G more specifically but with less affinity as compared to NN.

**Table 1.** TALEN DNA binding repeat and its simple code scheme


**Table 2.** Selected companies with TALEN tool kit and service
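The one-repeat-to-one-base coding principle described in this section lends itself to a short sketch. The mapping below uses NN for G as stated in the text; the NI, HD and NG assignments for A, C and T are the commonly reported code and are included here as an assumption, since they are not spelled out in the surviving text. The function name and the example target are illustrative only.

```python
# RVD code: NN for G follows the text; NI/HD/NG are the commonly reported
# assignments and are treated as an assumption in this sketch.
RVD_FOR_BASE = {"A": "NI", "C": "HD", "G": "NN", "T": "NG"}


def rvds_for_target(target, g_rvd="NN"):
    """Return the RVD list for a TALE target site written 5'->3'.

    The site must start with the invariable T, which is read by the flanking
    N-terminal region rather than by a repeat, so it receives no RVD.
    g_rvd may be set to "NH" or "NK" to favour specificity over affinity.
    """
    target = target.upper()
    if not target or target[0] != "T":
        raise ValueError("target site must start with T")
    if set(target) - set("ACGT"):
        raise ValueError("target may contain only A, C, G and T")
    code = dict(RVD_FOR_BASE, G=g_rvd)
    return [code[base] for base in target[1:]]


print(rvds_for_target("TGACCT"))           # ['NN', 'NI', 'HD', 'HD', 'NG']
print(rvds_for_target("TGACCT", "NH"))     # ['NH', 'NI', 'HD', 'HD', 'NG']
```

Swapping `g_rvd` reflects the trade-off noted above: NH and NK favour specificity for G, whereas NN favours binding affinity.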

#### **3.4. CRISPR-Cas9**

Unlike ZFNs or TALENs, CRISPR-Cas needs a short RNA for target site recognition, which is mediated by Watson-Crick base pairing between the guide RNA and the target DNA. To form a stable complex with genomic DNA, CRISPR-Cas9 also requires an NGG protospacer adjacent motif (PAM) immediately downstream of the hybrid region.
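The PAM requirement can be illustrated with a short scan for candidate sites. This is a minimal sketch, not a guide-design tool: it searches only the strand given, assumes a 20-nucleotide protospacer, and ignores off-target scoring; the function name and the toy sequence are hypothetical.

```python
import re


def find_cas9_sites(dna, protospacer_len=20):
    """List (start, protospacer, PAM) for every protospacer followed by NGG.

    Only the given strand is scanned; a full search would also scan the
    reverse complement. A lookahead is used so overlapping sites are kept.
    """
    dna = dna.upper()
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % protospacer_len)
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(dna)]


# Toy sequence: 2 nt of flank, a 20-nt protospacer, a TGG PAM, 2 nt of flank.
toy = "TT" + "ACGTACGTACGTACGTACGT" + "TGG" + "AA"
for start, protospacer, pam in find_cas9_sites(toy):
    print(start, protospacer, pam)
```

In the toy sequence the only hit starts at position 2, with the TGG immediately downstream serving as the PAM.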

The bacterial CRISPR-Cas9 system can be reconstituted in mammalian cells using three components: a programmable, specificity-determining CRISPR RNA (crRNA), an auxiliary trans-activating RNA (tracrRNA), and the CRISPR-associated endonuclease Cas9.[93, 94] In current applications, the crRNA and tracrRNA are fused to generate a chimeric single guide RNA (sgRNA) that mimics the natural crRNA-tracrRNA hybrid. The sgRNA has been shown to interact with the Cas9 holoenzyme to generate efficient cleavage.[95] The adoption of CRISPR-Cas9 by the research community has been phenomenal, and the system has been used to modify genes in most model species, including Drosophila, silkworm, plants, *C. elegans*, zebrafish, Xenopus, rat, mouse, pig, and human.[96-106] This success is mainly due to several significant advantages. First, only a single protein, Cas9, is required, and it remains the same for every target. This means that no time-consuming protein engineering is needed, in contrast to meganucleases, ZFNs or TALENs. Second, genome targeting depends on an oligonucleotide (20 ~ 30 nucleotides in length), which is very easy and cheap to produce. Third, among the established programmable DNA-binding platforms, CRISPR-Cas9 most readily facilitates genome-scale perturbations because one guide corresponds to one target. Last, it is an open system: most reagents are established and can be obtained from the non-profit distributor Addgene in Cambridge, Massachusetts. However, the sequence of the individual guide RNA affects both efficacy and specificity, and not all guide RNAs provide the same high level of genome editing activity.[107] A poorly chosen guide RNA can elicit off-target effects and cause cytotoxicity.[108] This can be very problematic for therapeutic applications.

These technologies are still evolving quickly, and tools to better assess both efficiency and specificity will be essential to improve the systems. The choice of method should be carefully balanced against the needs of the research. For example, ZFNs, TALENs and CRISPR-Cas9 are all reasonable options for targeting most coding genes. ZFNs and TALENs should be preferred when specificity and low toxicity are required. For genome-wide screens or when dealing with multiple targets, CRISPR-Cas9 is more suitable because of its robustness and its use of one guide molecule per target. For manipulation of small genes, TALENs may be the method of choice, since they have broad access to the genome and edit the target precisely, with high specificity and much less toxicity and fewer off-target effects than CRISPR-Cas9.

#### **4. Design strategy and experimental approaches**

#### **4.1. Design strategies for miRNA targeting**

One of the challenges in knocking out a miRNA is that the mature and fully functional miRNA is only ~ 22 nucleotides in length. Therefore, sequence alterations outside this 22-nucleotide region may have little or no effect on miRNA function. To design useful targeted nucleases, sites have to be chosen carefully so that the nucleases are directed to the critical region of the miRNA gene locus. To this end, the relatively small but highly structured pre-miRNA (~70 bp) appears appropriate, since it contains sequences vital for miRNA biogenesis and function.[35, 43] Within the pre-miRNA, the seed regions of the 5p and 3p arms as well as the Dicer processing sites are the targets of choice (Figure 2). Indeed, a number of reports have used this strategy for miRNA knockouts; studies have shown that small indels at these sites effectively abolish miRNA expression in mammalian cells.[45, 46, 59]

Here, we use miR-21 as an example to illustrate valid target sites that have been used successfully in knockout studies. We chose miR-21 for a number of reasons. First, human miR-21 was one of the first mammalian miRNAs identified. The miR-21 gene is located on the plus strand of the chromosome, within the protein-coding gene TMEM49. It is independently transcribed as a primary transcript of ~3433 nucleotides, in which the pre-miR-21 (72 nucleotides) has a typical stem-loop (hairpin) structure similar to that of other miRNAs.[109] Second, the mature miRNA sequence is typically processed from the 5'-arm of the miR-21 precursor.[110] Last, miR-21 is strongly conserved, and its role in physiology and pathology has been extensively explored and firmly established.[45, 48, 49, 109] Together, these features make miR-21 a useful model for general design strategies in miRNA gene editing.

As illustrated in Figure 2, the region corresponding to nucleotides 1-8 of the mature miRNA is the preferred site of cleavage, because a small indel in this seed region is expected to abolish miRNA activity.[111] The second choice is the Dicer processing site adjacent to the seed region. In principle, the processing site closest to the seed region should be preferred, because indels that involve both the processing site and the seed sequence are most likely to result in a complete knockout. Based on these considerations, the 5' miRNA seed region and the adjacent Dicer processing site are usually chosen for miR-21 knockout. Similarly, the 3'-arm of the miRNA precursor should be preferred when the mature miRNA is the 3p miRNA.
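The targeting rule described above (aim the cut at nucleotides 1-8 of the mature miRNA) can be sketched as a coordinate check. The sequences below are hypothetical placeholders rather than verified database entries, and the function names and the convention for describing a cut position are assumptions made for this illustration.

```python
def seed_window(precursor, mature, seed_len=8):
    """Return the half-open interval [start, end) of the seed within the precursor.

    The seed is taken as nucleotides 1-8 of the mature miRNA, following the text.
    """
    start = precursor.find(mature)
    if start < 0:
        raise ValueError("mature sequence not found in precursor")
    return start, start + seed_len


def cut_in_seed(precursor, mature, cut_index):
    """True if a break between bases cut_index-1 and cut_index touches the seed.

    cut_index is 0-based; a cut at either boundary of the seed also counts.
    """
    start, end = seed_window(precursor, mature)
    return start <= cut_index <= end


# Hypothetical placeholder sequences, for illustration only.
mature = "UAGCUUAUCAGACUGAUGUUGA"              # 22-nt stand-in for a mature miRNA
precursor = "GGGUCC" + mature + "CUGUUGAAUC"   # stand-in for part of a hairpin

print(seed_window(precursor, mature))          # (6, 14)
print(cut_in_seed(precursor, mature, 10))      # True: cut lies inside the seed
print(cut_in_seed(precursor, mature, 20))      # False: cut lies 3' of the seed
```

For a 3p miRNA the same check applies once the mature sequence from the 3'-arm is supplied.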

**Figure 2.** MicroRNA knockout strategy and the preferred cleavage sites. The upper panel shows the stem-loop structure of miR-21, with mature miR-21 shown in red and the seed region underlined. The middle panel shows the TALEN pair and their binding sequences, separated by a 15-bp spacer. The cleavage site normally falls in the middle of the spacer, where FokI dimerizes and makes double-strand breaks (DSBs) at the seed and/or Dicer processing site. These DSBs can lead to two potential outcomes, as illustrated in the lower panel. The predominant route 1 leads to indels via the non-homologous end joining mechanism, whereas the alternative route 2 may lead to precise genome editing via homologous recombination. Repair by homologous recombination can be used to introduce any desired change at the targeted site. For knocking out miR-21, replacement of the pre-miRNA with selection markers can facilitate the enrichment and selection of edited cells.

Here we introduce the potential sites as a general strategy for miRNA knockout that exploits the dominant NHEJ mechanism. In fact, ZFNs, TALENs and CRISPR-Cas have all been used to functionally knock out a number of miRNAs in mammalian cells.[45, 59, 112-116] Among the three types of programmable nucleases, TALENs are well suited to small loci with narrow targetable regions, whereas ZFNs and CRISPR-Cas are limited by the availability of binding modules and by the requirement for a PAM motif, respectively. It is important to keep in mind that, in terms of design, the binding sites of targeted nucleases may differ from the cleavage sites. For ZFNs and TALENs, the cleavage site is situated midway between the two binding sites of the pair. For CRISPR-Cas, the cleavage site is within the sequence bound by the guide RNA and close to the PAM motif (Figure 2).

#### **4.2. TALEN design and assembly**

Free online tools such as TALEN Targeter and SAPTA are available for designing TALEN DNA-binding domains that are specific and have a low risk of off-target effects.[117, 118] Similar to ZFNs, TALENs require the design of protein pairs, which bind two optimal anchoring positions on opposite strands, usually spaced 15 ~ 25 bp apart to allow for FokI dimerization and cutting.[119] The length of the DNA-binding sequence may vary, typically ranging from 14 ~ 20 bp. In humans, 20 bp may offer high specificity, considering the genome size. It is worth noting that longer domains (18 ~ 20 bp) may decrease cell toxicity by reducing the risk of off-target effects.[120]
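The geometric constraints described above (each site starting with T, the two sites on opposite strands, and a 15 ~ 25 bp spacer) can be sketched as a simple enumeration. This is an illustrative sketch only and does not reproduce the scoring used by TALEN Targeter or SAPTA; the function names, the fixed site length of 18 bp and the toy sequence are assumptions.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def talen_pairs(dna, site_len=18, spacer_min=15, spacer_max=25):
    """Enumerate candidate TALEN pairs on a top-strand sequence.

    The left site starts with T on the top strand. The right site binds the
    bottom strand, so on the top strand its last base must be A; it is
    reported 5'->3' as read on the bottom strand (starting with T).
    """
    dna = dna.upper()
    pairs = []
    for left_start, base in enumerate(dna):
        if base != "T":
            continue
        left_end = left_start + site_len
        for spacer in range(spacer_min, spacer_max + 1):
            right_start = left_end + spacer
            right_end = right_start + site_len
            if right_end > len(dna):
                break  # longer spacers would run past the end as well
            if dna[right_end - 1] == "A":
                pairs.append({
                    "left_site": dna[left_start:left_end],
                    "spacer_bp": spacer,
                    "right_site": reverse_complement(dna[right_start:right_end]),
                })
    return pairs


# Toy sequence: an 18-bp left site, a 15-bp spacer and an 18-bp right site.
toy = "T" + "C" * 17 + "G" * 15 + "C" * 17 + "A"
for pair in talen_pairs(toy):
    print(pair)
```

A practical design would additionally filter the candidates so that the middle of the spacer falls on the seed region or Dicer processing site of the miRNA of interest.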

A number of approaches have been developed for the rapid assembly of custom TALENs. With these advances, TALEN pairs can be generated easily and economically in a matter of days.[121] Most of the methods rely on the ability of type IIS restriction enzymes to assemble pre-made repeats into a fully functional TALEN scaffold (Figure 3). A TALEN scaffold comprises a number of domains from the N-terminus to the C-terminus: a nuclear localization signal, part of the N-terminal sequence for recognition of the first "T", followed by the last half-repeat and part of the C-terminal sequence fused to the FokI nuclease. Because the last bound nucleotide can be any one of the four, TALEN scaffolds normally come in four different versions. To make the TALEN fully functional, a eukaryotic gene promoter is normally placed 5' of the TALEN coding region and a poly A signal sequence at its 3' end.

Most assembly protocols are based on the Golden Gate method, which relies on the ability of type IIS restriction enzymes to cut outside their recognition site. Type IIS recognition sites arranged in inverse orientation at the 5' and 3' ends of a DNA fragment are removed upon cleavage, allowing simultaneous restriction and ligation. The continuous re-digestion of unwanted ligation products increases the formation of the desired construct.[122, 123] Because type IIS fusion sites can be designed to have different sequences, Golden Gate cloning enables directional and seamless assembly of multiple DNA fragments. Based on this principle, one-step and two-step assembly protocols and kits have been developed and are commercially available, allowing do-it-yourself assembly of TALENs in any molecular biology laboratory.[121]
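Directional assembly follows from the fact that each fragment can only ligate to the neighbour carrying a matching overhang. The following toy model illustrates that logic; the function name `golden_gate_order` and the 4-nt overhang sequences in the usage note are invented for illustration and do not correspond to any published kit.

```python
def golden_gate_order(fragments, start_overhang, end_overhang):
    """Return fragment names in the order dictated by their overhangs.

    fragments maps a name to (left_overhang, right_overhang). Assembly starts
    at the vector's start_overhang and must finish at its end_overhang.
    """
    by_left = {}
    for name, (left, right) in fragments.items():
        if left in by_left:
            raise ValueError(f"overhang {left} is used by more than one fragment")
        by_left[left] = (name, right)

    order = []
    current = start_overhang
    while current != end_overhang:
        if current not in by_left:
            raise ValueError(f"no fragment can ligate to overhang {current}")
        # Each fragment is consumed once, so the walk always terminates.
        name, current = by_left.pop(current)
        order.append(name)
    if by_left:
        raise ValueError(f"fragments left unassembled: {sorted(v[0] for v in by_left.values())}")
    return order
```

With three repeat modules defined as `{"HD": ("CTGA", "GGAC"), "NI": ("AATG", "CTGA"), "NG": ("GGAC", "AGTC")}` and a vector opened at `AATG` and closed at `AGTC`, the only possible product is NI, then HD, then NG, regardless of the order in which the modules are listed.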

**Figure 3.** Golden Gate assembly of TALENs and molecular structure. A two-step assembly strategy can increase the correct joining of monomers into multimers and subsequently into the TALEN scaffold. A TALEN, from its N- to C-terminus, is composed of a nuclear localization signal peptide (NLS), modified N- and C-termini flanking the assembled RVDs, and the FokI nuclease fused at the C-terminus. The CMV (cytomegalovirus) promoter is located 5' of the TALEN and drives its expression in mammalian cells. The polyadenylation signal (Poly A) adds a poly(A) tail to the TALEN mRNA.

#### **4.3. HR donor design and construction**

While it is possible to disrupt genes in the mammalian genome with TALENs alone, the frequency of gene editing is typically 2 to 40%, averaging ~16% for mono-allelic disruptions.[87] Cells carrying bi-allelic disruptions are rare and require time-consuming single-cell derivation and subsequent screening.[5] One strategy is to combine TALENs targeting the miRNA seed region with a homologous recombination (HR) donor vector carrying a selectable marker.[46, 59, 114] This approach enables convenient positive selection, and the combination of NHEJ with stem-loop deletion results in efficient bi-allelic miRNA gene ablation, with frequencies that can exceed 90%.[46] Additionally, by using HR donors, endogenous loci can potentially be modified with custom sequences such as IRES-fluorescent proteins to allow functional assessment of endogenous miRNA expression and regulation.[124]

#### **4.4. Generation of knockout cell clones**

All targeted nucleases can be used in mammals to create miRNA knockouts.[45, 46, 59, 113-116] The success of miRNA gene editing depends largely on the ability to deliver all the reagents efficiently to the cells. For transgenics, direct injection of a DNA vector, or sometimes *in vitro* transcribed nuclease mRNAs, into embryos is effective.[88, 125] For cultured cells, the options include plasmid DNA transfection, viral delivery, and transfection with synthetic mRNA or proteins.[126-130] Donor DNAs are normally supplied in plasmid format as well and co-delivered with the nucleases using the same methodology. For difficult cell types, viral transduction appears to be effective. Although the lentiviral system appears to work well for both ZFNs and CRISPR-Cas, it is not suitable for TALEN delivery because of its incompatibility with the tandem repeats of TALENs.

Following delivery, small-scale sequence changes are introduced at the break by NHEJ.[18] These indels are typically assayed by polymerase chain reaction (PCR) amplification of the region, followed by DNA sequencing, by a gel electrophoresis assay using T7E1 endonuclease, or alternatively by high-resolution melting analysis.[131] In addition to detecting changes at the genomic level, reverse transcription PCR (RT-PCR) can be performed to confirm reduced or ablated expression of the miRNA at the transcript level. PCR-based genotyping can be used to distinguish between HR and NHEJ events when donor DNA is used.[131]
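Band intensities from a T7E1 gel are commonly converted into an estimated indel frequency. The sketch below uses the widely applied approximation that a heteroduplex forms whenever at least one of the two reannealed strands is mutant, which gives indel = 1 − sqrt(1 − fraction cleaved). The formula is not given in this chapter, so treat it, and the function name `t7e1_indel_percent`, as assumptions.

```python
import math

def t7e1_indel_percent(uncut: float, cleaved_1: float, cleaved_2: float) -> float:
    """Estimate indel frequency (%) from T7E1 gel band intensities.

    uncut is the intensity of the full-length band; cleaved_1 and cleaved_2
    are the intensities of the two digestion products.
    """
    total = uncut + cleaved_1 + cleaved_2
    if total <= 0:
        raise ValueError("band intensities must sum to a positive value")
    fraction_cleaved = (cleaved_1 + cleaved_2) / total
    return 100.0 * (1.0 - math.sqrt(1.0 - fraction_cleaved))
```

Because the relation is not linear, a gel in which 36% of the signal sits in the cleaved bands corresponds to an estimated indel frequency of about 20%, not 36%.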

#### **5. Applications of targeted nucleases for miRNA research**

Targeted nucleases are powerful genome editing tools for uncovering gene functions. Though relatively new, they have been successfully employed in a broad variety of systems and produced exciting results for miRNA research.

#### **5.1. Knockout of the miR-200 family**

To understand the biological and pathological significance of miRNAs, a TALEN-based knockout library for 274 highly conserved human miRNAs has been established.[59] To demonstrate the genome editing activity of the TALEN library, 66 TALEN pairs against 33 miRNA loci were selected. All TALEN pairs tested induced mutations, as assessed by a mismatch-sensitive T7E1 assay, with a frequency above 0.5%.[132] To gain insight into the functional roles of miRNAs, the authors conducted a detailed analysis of members of the miR-200 family.

There are at least two members in the highly conserved miR-200 family, miR-141 and miR-200c.[59] Interestingly, miR-141 and miR-200c have largely indistinguishable activity and differ in the seed region by only one nucleotide. This makes it very difficult to use either overexpression or complementary inhibitor-based knockdown to investigate their potential functional divergence without complications. Using TALEN technology, however, the authors could target the seed region of the 5p strand for miR-141 but choose the Drosha processing site in the 3p strand for miR-200c, hence avoiding cross-targeting.[52, 110] With this design, both single and double knockout clones were obtained, and their corresponding expression of mature miRNAs was confirmed by RT-PCR.[59] Using these cell models, the authors found that miR-141 represses the expression of mRNAs that have a miR-141 motif in the 3'-untranslated region. Similarly, miR-200c represses the expression of mRNAs that have a miR-200c motif. These data indicate that the two closely related miRNAs do not cross-react notably and may control largely nonoverlapping groups of genes. Together, these results suggest that TALEN-based methods may provide unprecedented tools for miRNA research with great precision and specificity.
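The single-nucleotide difference in the seed translates into distinct target motifs, which can be checked computationally. In the sketch below, the seed is taken as nucleotides 2 to 8 of the mature miRNA and the target motif as its reverse complement. The two mature sequences are quoted from memory of the public miRBase entries and should be verified before any real use; the 3'-UTR in the usage note is invented.

```python
# Assumed mature sequences (verify against miRBase before relying on them).
MIR_141 = "UAACACUGUCUGGUAAAGAUGG"
MIR_200C = "UAAUACUGCCGGGUAAUGAUGGA"

_COMPLEMENT = str.maketrans("ACGU", "UGCA")

def seed_site(mirna: str) -> str:
    """Return the mRNA motif complementary to the miRNA seed (nucleotides 2-8)."""
    seed = mirna.upper().replace("T", "U")[1:8]
    return seed.translate(_COMPLEMENT)[::-1]

def count_seed_sites(mirna: str, utr: str) -> int:
    """Count occurrences (overlaps allowed) of the seed motif in a 3'-UTR."""
    site = seed_site(mirna)
    utr = utr.upper().replace("T", "U")
    return sum(
        1 for i in range(len(utr) - len(site) + 1) if utr[i:i + len(site)] == site
    )
```

For a made-up UTR such as `GGGCAGUGUUAAACCC`, `count_seed_sites(MIR_141, utr)` finds one site while `count_seed_sites(MIR_200C, utr)` finds none, mirroring the non-overlapping target sets described above.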

#### **5.2. miR-21 ablation and cancer research**

MicroRNA-21 gene knockout in cultured human cells was achieved independently by two research groups.[45, 131, 133] The first group used a combination of a TALEN pair with an HR donor. The TALENs were designed to position the miR-21 seed region in the central portion of the spacer, directing cleavage to the functionally essential miRNA motif. The HR donor construct was designed around the cleavage location of the TALEN pair and carried 509-bp (5' arm) and 600-bp (3' arm) regions of homology to the miR-21 genomic sequence. The TALEN pair and HR donor were delivered together by transfection. When an HR event occurs, the donor replaces the entire miR-21 precursor with two selectable marker genes (red fluorescent protein and a puromycin resistance gene). Clonal populations of cells in which an HR event occurred can be easily selected by puromycin treatment. Because NHEJ is the predominant repair mechanism induced by DSBs, selecting for HR events would most likely produce clones that harbour bi-allelic modifications, with the second allele carrying an indel in the seed region. In fact, this approach achieved bi-allelic miR-21 gene disruption at a very high frequency of 87% in cultured HEK293 cells.[131] Analysis of three independent clones showed a total loss of miR-21 expression. Phenotypic examination revealed an increase in miR-21 target gene expression, reduced cell proliferation, and alteration of global miRNA expression profiles, in agreement with the role of miR-21 in cancer biology.[45, 48]
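The layout of such a donor can be sketched as a string operation: a 5' homology arm, the selection cassette, and a 3' homology arm. The arm lengths default to the 509 bp and 600 bp quoted above; the function name `build_hr_donor`, the coordinate convention, and the placeholder sequences in the usage note are assumptions for illustration.

```python
def build_hr_donor(genome: str, target_start: int, target_end: int, cassette: str,
                   left_arm_len: int = 509, right_arm_len: int = 600) -> str:
    """Assemble a donor that replaces genome[target_start:target_end] with cassette.

    The homology arms are copied from the genomic sequence immediately
    flanking the region to be replaced (0-based, end-exclusive coordinates).
    """
    if target_start - left_arm_len < 0 or target_end + right_arm_len > len(genome):
        raise ValueError("genomic sequence is too short for the requested arms")
    left_arm = genome[target_start - left_arm_len:target_start]
    right_arm = genome[target_end:target_end + right_arm_len]
    return left_arm + cassette + right_arm
```

Given a genomic string in which the precursor occupies positions 509 to 581, `build_hr_donor(genome, 509, 581, cassette)` returns a donor of 509 + len(cassette) + 600 bp in which the precursor itself is absent.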

#### **5.3. miRNA knockout for transgenics**

To explore gene-specific function *in vivo*, the generation of gene-targeted animals is a powerful strategy. To this end, all targeted nucleases have proved applicable for generating knockout animals and offer several advantages over conventional embryonic stem cell or nuclear transfer technologies. Recently, ZFNs, TALENs, and CRISPR-Cas have been used to generate knockout animals in several model species in addition to the established mouse model.[19] One of their major merits is that they are free from the constraint of using embryonic stem cells: the genome of the fertilized zygote can be modified directly by injecting the targeted nuclease. Additionally, these new tools are robust enough to allow simultaneous knockout of multiple genes, thus greatly accelerating experiments and helping to answer complicated biological questions.

Targeted knockout of miRNAs in mice by TALENs has also been reported.[113] In this study, synthesized TALEN mRNA was microinjected into one-cell-stage embryos. The embryos were allowed to develop to the two-cell stage and were subsequently transferred into pseudopregnant female mice. With an optimized protocol, these mice produced 29.6% mono-allelic offspring. Further crossing of the heterozygous founder mice with the wild-type strain produced heterozygous offspring, suggesting that the mutated miRNA allele is transmissible.

#### **6. Prospects and challenges**

Targeted nuclease technology has become one of the most powerful and versatile platforms for engineering biology. The new technology is enabling systematic interrogation of the genome, including miRNAs in mammals, with high precision and efficacy. It is superior to other technologies in that it creates null mutations that lead to complete suppression of gene expression. Furthermore, bi-allelic ablation of miRNAs in cultured somatic cells opens a new avenue to study the pathological role of miRNAs in relevant disease samples. Knowledge gained from these studies is more likely to lead to the discovery of novel drug targets, accelerating new therapeutics towards clinical application. Indeed, a number of animal studies as well as clinical trials using targeted nucleases already provide encouraging results. In addition to repairing mutations underlying inherited diseases, targeted nucleases can be used to create productive mutations in tissues to combat viral infections or complex diseases such as familial hypercholesterolemia and hypertension.

Targeted nucleases can be tweaked to carry out other functions, such as modifying DNA-associated histones, activating or inhibiting gene transcription, and monitoring chromatin dynamics in living cells. It is becoming increasingly apparent that targeted nucleases are a versatile and common platform for elucidating gene function and epigenetic regulation.

However, many challenges still lie ahead. The most prominent issue may be the off-target effect. In this respect, ZFNs and TALENs appear less problematic owing to the intimate interaction with their binding sites and the requirement for correct binding of two molecules. In contrast, CRISPR-Cas tends to have higher off-target effects because it binds DNA via base pairing. One or more mispairings may be tolerated, which may cause detrimental unintended effects.[134] Attempts to solve this issue include improvement of the guide RNA scaffold, use of computational algorithms to predict off-targets, and development of high-throughput methods to assess unintended cuts. The challenge, though, is that off-target cleavage is difficult to detect. Hence, data should be interpreted with caution, keeping in mind the potential off-target effects and their related toxicity. Second, although a nuclease may be highly active within cells, some tissue types remain difficult to deliver into. In such cases, multiple delivery methods should be tried and the most suitable one optimized. Finally, for further development toward clinical application, it will be essential to thoroughly evaluate the safety and toxicity profile using a variety of mammalian models.
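The idea behind computational off-target prediction can be illustrated with a naive mismatch scan. This is only a sketch: it assumes an NGG PAM, searches a single strand, weights all mismatch positions equally, and uses the hypothetical function name `candidate_off_targets`. Published prediction algorithms are considerably more elaborate.

```python
def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))

def candidate_off_targets(guide: str, genome: str, max_mismatches: int = 3):
    """Return (position, mismatches) for sites followed by an NGG PAM.

    A site is reported when it differs from the guide at no more than
    max_mismatches positions; a perfect match (0 mismatches) is the
    intended target and is reported as well.
    """
    guide, genome = guide.upper(), genome.upper()
    k = len(guide)
    hits = []
    for i in range(len(genome) - k - 2):
        if genome[i + k + 1:i + k + 3] != "GG":
            continue  # no NGG PAM directly downstream of this window
        mismatches = hamming(guide, genome[i:i + k])
        if mismatches <= max_mismatches:
            hits.append((i, mismatches))
    return hits
```

Every hit with one or more mismatches is a site where tolerated mispairing could, in principle, lead to an unintended cut and would merit experimental follow-up.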

#### **7. Conclusion**

Advances in genome engineering technology based on targeted nucleases are enabling the systematic interrogation of mammalian gene function. Using this system, miRNA gene sequences within the mammalian genome can be easily edited with high efficacy and precision. Targeted miRNA editing will empower researchers to reveal the complex regulatory circuits governed by miRNAs and to realize, in the long term, their full diagnostic and therapeutic potential.

#### **Author details**

Lei Yu1 , Jennifer Batara2 and Biao Lu2\*

\*Address all correspondence to: blu2@scu.edu

1 Institute for Advanced Interdisciplinary Research in Science and Technology, East China Normal University, Shanghai, China

2 Department of Bioengineering, Santa Clara University, Santa Clara, California, USA

#### **References**


## **Emerging Gene Correction Strategies for Muscular Dystrophies: Scientific Progress and Regulatory Impact**

Houria Bachtarzi and Tim Farries

Additional information is available at the end of the chapter

http://dx.doi.org/10.5772/62282

#### **Abstract**

Muscular dystrophies comprise a heterogeneous cluster of inherited muscle degenerative disorders with the common feature of progressive muscle weakness. These represent good candidates for treatment with gene-based therapies. Progress in gene transfer technologies has raised hopes for successful therapeutic restoration of mutated genes such as dystrophin in Duchenne muscular dystrophy. Delivery to enough muscle cells, however, remains a challenge for a successful gene replacement therapy. Other approaches based on exon skipping to correct mutant dystrophin's pre-mRNA splicing patterns have been tried, and partial restoration of dystrophin expression was reported in late-stage clinical trials, but full therapeutic efficacy is yet to be confirmed. The emergence of gene editing and its recent success in AIDS have opened a new therapeutic era for muscular dystrophies. This chapter will cover new gene correction strategies for muscular dystrophies and their regulatory challenges before they can become routine treatment modalities in the clinic.

**Keywords:** gene therapy, muscular dystrophies, regulatory impact, inherited muscle degenerative disorders, progressive muscle weakness

#### **1. Introduction**

Muscular dystrophy refers to a range of conditions of progressive muscle weakening generally due to genetic defects in proteins that are critical for muscle functioning. The most common form of the disease, and one of the more common of any seriously debilitating genetic disorders, is Duchenne muscular dystrophy (DMD), which is caused by the mutation of the dystrophin gene on the X chromosome and found in approximately 1 in 3600 male births. The dystrophin gene product is a large protein with repeated elements and is involved in connecting the muscle fibres to the extracellular matrix [1]. Other proteins, mutations of which may cause forms of muscular dystrophy, include poly(A) binding protein nuclear 1 (PABPN1), myotonic dystrophy (DM) protein kinase (DMPK), or the product of the Emery-Dreifuss muscular dystrophy gene.

© 2016 The Author(s). Licensee InTech. This chapter is distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

This chapter focuses on opportunities to develop treatments for these conditions with therapies that target the genetic defect that is the cause of the disease. The promise shown by new technologies for gene therapeutics makes this a particularly propitious time to consider the impact that such scientific progress may have on this clinically important set of conditions.

Notable among the new technological advances of relevance is the development of methods for specifically changing the sequence of the human genome, either by introducing a new gene sequence, which may repair the function of a defective gene, or by correcting the defect in the endogenous gene. Although the means to deliver genes to human cells still rely on the use of viral vectors, the methodology for the effective manufacture of these complex biologicals has matured. The approvals by the European Union (EU) of Glybera® and Imlygic® show that viral products can be manufactured to the quality standards acceptable for commercial products, with acceptable profiles of safety and efficacy. Another transformative technology is the derivation from stem cell precursors of different differentiated cell types with the potential to repair damaged or otherwise defective tissue. Putting these technologies together makes it possible to create, and manufacture to GMP standard, well-characterised patient-specific (or at least patient-compatible), genetically modified cells, a combination that may be anticipated in the near future to enable the viable therapeutic treatment of many previously intractable genetic diseases.

This chapter looks at the current status and future prospects of how the latest gene-based technologies are being applied to the alleviation, or even cure, of inherited muscular dystrophies. Furthermore, we consider the challenge of translating these treatment modalities into medicines that can be approved for commercial use by the regulatory authorities, notably within the EU. To do this, developers must go beyond the scientific mechanisms of efficacy to establish how the products will be manufactured to consistent quality standards and to demonstrate that they are clinically safe.

#### **2. Muscular dystrophies**

Muscular dystrophies comprise a heterogeneous cluster of inherited muscle degenerative disorders, each caused by a distinct gene mutation. More than 30 genes have been identified, each causing a different type of muscle pathology with different patterns of muscle weakness and disease progression (**Table 1**). These include but are not limited to Duchenne, Becker, congenital, myotonic, Emery-Dreifuss, facioscapulohumeral, oculopharyngeal, and limb-girdle muscular dystrophies. Each of these varies in terms of pattern of inheritance, age of disease onset, biochemical markers (such as creatine kinase's upper limits), types of muscles affected, and complications at other organ sites, including cardiac and pulmonary problems. These give rise to varied symptoms, including muscle weakness and wasting (a common feature of a number of muscular dystrophies), joint stiffness, and scoliosis in addition to respiratory complications (such as chest infections and shortness of breath). Other symptoms include ankle swelling often linked with cardiomyopathy, fainting, eyelid drooping, and dysphagia. Ophthalmological symptoms, such as myopia in facioscapulohumeral muscular dystrophy and severe congenital muscular dystrophy (CMD) variants, cataracts in DM, and eyelid drooping in oculopharyngeal muscular dystrophy (OPMD), have also been reported. Hearing loss and skin lesions are also common in facioscapulohumeral muscular dystrophy and Ullrich CMD, respectively. Overall, these cause profound impairments in physical activity and quality of life [2, 3].







MDC1A = congenital muscular dystrophy with merosin deficiency; MDC1C = congenital muscular dystrophy and abnormal glycosylation of dystroglycan; FCMD = Fukuyama congenital muscular dystrophy; WWS = Walker-Warburg syndrome; X-R = X-linked recessive; AD = autosomal dominant; AR = autosomal recessive.

**Table 1.** Summary of major muscular dystrophies.

With an estimated incidence of 1 in 3600 to 6000 boys, DMD is one of the most common and severe forms of muscular dystrophy, with early-onset and progressive muscle weakness leading to the loss of ambulation by the second decade. This X-linked recessive condition is caused by mutations in the dystrophin gene (*DMD*, locus Xp21.2), which is expressed in skeletal, smooth, and cardiac muscles; hence the pathological involvement of organs beyond skeletal muscle, including the respiratory and cardiac systems. The *DMD* gene comprises 79 exons and encodes a 14-kb mRNA transcript. This gives rise to a large protein product (427 kDa), a crucial cytoskeletal protein that mediates major structural and signalling functions within muscles. Four functional units make up the dystrophin protein. These include the actin-binding domain at the N-terminus, the rod domain consisting of 24 spectrin-like domains with four interspersing hinges, the cysteine-rich domain that mediates binding to β-dystroglycan, and the C-terminal domain mediating binding to syntrophin and dystrobrevin. These binding units confer a principal structural role on dystrophin, facilitating its assembly with other proteins to form the dystrophin-associated glycoprotein complex (DAGC) (**Figure 1**) [1, 4, 5].

**Figure 1. The role of dystrophin in muscle physiology and disease**. (A) The Dystrophin-associated Glycoprotein Complex (DAGC) in skeletal muscle; (B) schematic illustration of exons that make up the *DMD* gene; (C) consequences of a frameshift mutation in the *DMD* gene illustrated by exon 50 deletion; and restoration of dystrophin expression with exon skipping therapeutics.

Deletions in one or more of the 79 exons that make up the *DMD* gene are common in DMD and account for approximately two thirds of the reported mutations. Other documented disruptions within the *DMD* gene include duplications (~10%), point mutations (~10%), and smaller rearrangements (15%) [6]. These lead to loss of dystrophin, which in turn destabilises the DAGC, resulting in the weakening of muscle fibre strength, increased susceptibility to stretch-induced damage, and raised intracellular calcium influx. These physiological disturbances account for the underlying histopathological features often observed in skeletal and cardiac muscles from affected patients, including muscle fibre necrosis, inflammation, and substitution with fibroadipose tissue [7, 8].

No curative treatment is currently available for most muscular dystrophies, including DMD. Current approaches involve relieving symptoms, delaying disease progress, and preventing complications [2]. Although these interventions proved beneficial in the short term, none of them can provide a long-term treatment and a permanent correction of the underlying pathological features. From a molecular point of view and based on advances in the identification of genes behind the observed phenotypes, most of these muscle pathologies represent good candidates for treatment with gene-based therapies.

#### **3. Gene therapy for DMD**

DMD has been the main focus and the proof-of-principle model for most gene therapy strategies targeting neuromuscular disorders in recent years, with proof-of-concept validated in preclinical and clinical settings. In fact, the first clinical trial in the neuromuscular field involved a gene replacement approach to deliver full-length dystrophin via a nonviral method of transfer [9]. The low expression levels observed at local injection sites, however, highlighted the need for potent gene transfer systems. Of these, adeno-associated viruses (AAVs) are seen as the best available option. The efficiency of functional dystrophin expression in patients with DMD, using AAV as a vector, was assessed in a phase I clinical trial [10]. The approach was shown to be safe with no concerning side effects, although overall efficiency was compromised by the development of dystrophin-specific T-cell-mediated immune responses. This finding was complemented by observations that preexisting T-cell-mediated immune responses to AAV were present, which, together with the dystrophin-specific T-cell responses, could have contributed to the low transgene expression levels detected following intramuscular injection. These early conclusions highlighted the need to circumvent immune destruction of therapeutic transgenes by delivering unaffected homologs of dystrophin. Of these, a microutrophin-expressing recombinant AAV2/6 was shown to restore the dystrophin-glycoprotein complex and revert pathology in the dystrophin (-/-)/utrophin (-/-) double-knockout mouse model [11]. Delivering truncated versions of target proteins or their homologs, however, does not constitute a full gene replacement approach for DMD. Microdystrophin and minidystrophin transgenes often lack some crucial domains of full-length dystrophin, including rod and hinge regions and the binding sites for neuronal nitric oxide synthase, syntrophin, and dystrobrevin, hence compromising maximal dystrophin functionality and membrane rigidity.
This led to the engineering of triple AAV constructs using trans-splicing and hybrid methods, capable of delivering full-length dystrophin following coinfection of the tri-vectors in affected muscles *in vivo* [12, 13]. In trans-splicing, each vector acts as an independent construct holding sequential exonic sequences of human dystrophin's coding sequence. Coinjection of the vectors causes the constructs to join via their inverted terminal repeats and deliver a full-length therapeutic transgene. These results circumvent the limited packaging capacity of AAVs and offer clinical hope for boys with severe forms of DMD, for whom the reversion of phenotype to a milder disease form [Becker muscular dystrophy (BMD)] using microdystrophin- or minidystrophin-expressing AAV is not sufficient.
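The need for three vectors follows from simple arithmetic on packaging capacity. The sketch below assumes a capacity of about 4.7 kb per AAV genome and a dystrophin coding sequence of about 11 kb, both commonly cited figures that this chapter does not state; the per-vector overhead for terminal repeats and regulatory or splicing elements is a purely illustrative value, as is the function name `vectors_needed`.

```python
import math

def vectors_needed(cds_len_bp: int, capacity_bp: int = 4700, overhead_bp: int = 700) -> int:
    """Minimum number of AAV vectors required to carry a coding sequence.

    capacity_bp is the assumed packaging limit of one vector; overhead_bp is
    the assumed space taken by non-coding elements in each vector.
    """
    usable = capacity_bp - overhead_bp
    if usable <= 0:
        raise ValueError("overhead leaves no room for coding sequence")
    return math.ceil(cds_len_bp / usable)
```

Under these assumptions an 11,000-bp coding sequence requires three vectors, whereas a truncated transgene of 4,000 bp or less fits into one, which is the trade-off between microdystrophin and full-length delivery discussed above.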

Although these gene replacement therapies are still in the early clinical phase of research and development, exon skipping-based therapeutics for DMD are progressing faster. Antisense oligonucleotides (AONs) have long been an effective alternative to dystrophin gene replacement therapy. These work by binding to the dystrophin transcript at sites that interfere with normal RNA processing, so that exons containing the mutations are bypassed, giving rise to in-frame transcripts capable of producing shorter yet functional protein products (**Figure 1C**). This was based on observations that BMD-like patients have in-frame transcripts with shorter dystrophin yet still maintain ambulation [14]. Hence, manipulating the splicing pattern of mutant dystrophin in DMD patients with an AON-based approach has the potential to alter disease phenotype from a clinically severe form of DMD to a milder BMD phenotype. For instance, the reading frame of dystrophin could be restored by blocking specific enhancers or splicing regulatory elements responsible for controlling the gene's exon recognition. The approach was initially demonstrated *in vivo* in the *mdx* mouse model of DMD, bearing a single-point mutation in exon 23 that creates a stop codon with subsequent absence of dystrophin expression. Using a 2′-*O*-methyl (2′OMe) oligoribonucleotide complementary to the murine intron 22's 3′ splice site, it was possible to restore sarcolemmal expression of dystrophin in transfected myotubes *in vivo* [15]. The proof-of-concept has also been demonstrated in relevant human cell culture models derived from DMD patients bearing an exon 45 deletion, one of the most frequently deleted exons in DMD. The efficient restoration of dystrophin's coding frame was achieved by targeting splicing regulatory elements in exon 46 using a 2′OMe oligonucleotide [16].
Similarly, a successful correction was achieved through exon 51 skipping in a phase II clinical trial, where a successful restoration of both dystrophin and the DAGC was observed in patients with a deletion in exons 45 to 50 or exons 48 to 50 [8, 17], opening up great therapeutic hopes and paving the way towards late-stage clinical trials. A Biologics License Application (BLA) was submitted to the Food and Drug Administration (FDA) for approval of the antisense agent PRO051, 2′OMe (drisapersen) targeting exon 51 for DMD. However, this application was recently rejected, as a phase III study (NCT01803412) of long-term intake failed to meet its primary efficacy endpoint and also showed evidence of significant toxicity to a number of organs. Another similar antisense agent, eteplirsen, is currently under FDA review for approval.
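Whether skipping a given exon restores the reading frame is a matter of modular arithmetic: the total length of all missing exons must be a multiple of 3. The sketch below checks this for the deletions mentioned above. The exon lengths are quoted from memory of public *DMD* annotations and must be verified against a reference annotation before any real use; the function names are hypothetical.

```python
# Assumed lengths (bp) of DMD exons 45-53; verify against a reference annotation.
EXON_LEN = {45: 176, 46: 148, 47: 150, 48: 186, 49: 102,
            50: 109, 51: 233, 52: 118, 53: 212}

def in_frame(missing_exons) -> bool:
    """True when removing these exons keeps the downstream reading frame."""
    return sum(EXON_LEN[e] for e in missing_exons) % 3 == 0

def frame_restoring_skips(deleted_exons):
    """Return neighbouring exons whose additional skipping restores the frame.

    Only the exon directly before and directly after the deletion are tested,
    and only when their length is listed in EXON_LEN.
    """
    deleted = list(deleted_exons)
    options = []
    for neighbour in (min(deleted) - 1, max(deleted) + 1):
        if neighbour in EXON_LEN and in_frame(deleted + [neighbour]):
            options.append(neighbour)
    return options
```

With these lengths, a deletion of exons 48 to 50 or of exons 45 to 50 is out of frame and is rescued by skipping exon 51, while a deletion of exon 45 alone is rescued by skipping exon 46, consistent with the studies cited above.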

Although AON-based therapy for DMD has long been the prime focus of scientists and clinicians, this approach does not provide a long-term cure for DMD. Regular administration of high doses is required to achieve constant skipping and redirection of gene expression.

#### **4. Gene editing strategies**

With advances in human genomics and an increasing need to provide a simplified long-term curative approach for genetic muscle diseases, new sophisticated technologies based on gene editing have emerged. These aim to permanently correct disease phenotypes in affected individuals using site-directed endonucleases such as zinc finger nucleases (ZFNs), transcription activator-like effector nucleases (TALENs), and clustered regularly interspaced short palindromic repeats (CRISPR). These corrections occur through the generation of double-stranded DNA breaks that induce intrinsic cellular DNA repair mechanisms, mostly via nonhomologous end joining (NHEJ) [18, 19].

#### *Zinc finger nucleases (ZFNs)*

ZFNs are currently the most advanced gene editing system. Initially, these were rationally designed taking advantage of key biological properties of zinc finger transcription factors such as their DNA sequence recognition function. Each zinc finger component is composed of ~30 amino acids and is capable of recognising 3 bp of DNA. The overall structure is arranged in tandem repeats of zinc finger motifs, hence allowing the recognition of longer sequences of DNA. The nuclease activity of ZFNs is conferred through fusion to *Fok*I endonuclease's catalytic domain. This design ensures that enzymatic and subsequent DNA cleavage activities are only targeted to sites recognised by the binding domains of ZFNs. The latter consist of Cys2His2 zinc finger structures, each built around a single zinc atom. Most often, three to six zinc finger units are assembled, which allows the recognition of 9- to 18-bp DNA sequences. This is usually regarded as acceptable for targeting a single locus in a human genome. From a structure-activity perspective, however, full functionality is not achieved unless *Fok*I is presented in a dimerised form, so that two DNA cleavage domains dimerise around the target DNA sequence [19, 20]. In an attempt to maximise the biosafety profile of ZFNs, several methods have been employed to increase specificity. These include limiting the spacer length between the recognition sites of chimeric ZFN subunits [20]. In fact, ZFNs with shorter interdomain linkers connecting the Cys2His2 zinc finger and the nuclease domains were shown to have a restricted activity, with a 6-amino acid linker exerting the most selective activity at a target DNA site with a 6-bp spacer [21]. Another approach has been to generate obligate heterodimer nuclease domains with decreased off-target effects [22–24].
This is associated with a relatively weaker interaction between the heterodimer cleavage domains, which would necessitate stronger interactions between each monomer and the target site to achieve a site-specific cleavage while minimising cleavage at weakly bound nontarget sites. This low-affinity, high-avidity approach has been proposed as a plausible mechanism behind the site-specific cleavage activity of the newly engineered obligate heterodimer *Fok*I domains. An alternative to the obligate heterodimer method has been the expression of autonomous ZFN pairs, the combined expression of which was as effective as obligate heterodimer ZFN domains at inducing targeted chromosomal deletion in mammalian cells, with reduced toxic effects that are often thought to be linked with unwanted cross-reaction of individual ZFN subunits [25]. The enhancement of *Fok*I's enzymatic activity has also been reported, whereby an *in vivo* evolution-based method was employed to further increase the cleavage activity of *Fok*I. The incorporation of the enhanced domain in heterodimer ZFN structures resulted in a potent product with an improved overall cleavage profile [26].
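The statement that 9 to 18 bp is adequate for a single locus can be checked with a back-of-the-envelope estimate. The sketch below assumes a haploid human genome of roughly 3.1 × 10⁹ bp and a uniform random base composition, both simplifications; the function names `recognition_length` and `expected_random_hits` are hypothetical.

```python
def recognition_length(fingers_per_monomer: int, bp_per_finger: int = 3,
                       dimer: bool = True) -> int:
    """Total bases recognised by one ZFN monomer or by a dimerised pair."""
    return fingers_per_monomer * bp_per_finger * (2 if dimer else 1)

def expected_random_hits(site_len_bp: int, genome_bp: float = 3.1e9) -> float:
    """Expected chance occurrences of one specific site, counting both strands.

    Assumes each base is equally likely, which real genomes only approximate.
    """
    return 2 * genome_bp / 4 ** site_len_bp
```

A 9-bp site is expected to occur by chance tens of thousands of times, whereas an 18-bp site (a pair of three-finger ZFNs, or a single six-finger array) is expected fewer than once, which is why dimerisation contributes so much to specificity.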

The successful correction of different genetic mutations associated with various diseases, including sickle cell anaemia [27], haemophilia [28], α1-antitrypsin deficiency [29], X-linked severe combined immunodeficiency [30], and, more recently, HIV [31, 32], has led to the application of ZFNs in DMD, in which precise gene editing can be achieved by deleting targeted exons from the dystrophin gene. For instance, using extended Modular Assembly (eMA)/Context-Dependent Assembly (CoDA) methods, it was possible to generate several exon 51 targeted ZFNs. Two of these demonstrated a remarkable activity with mild off-target mutagenic effects in myoblast cells from DMD patients harbouring a deletion of exons 48 to 50. When implanted into the hind limb of immunodeficient mice, the corrected myoblasts were capable of maintaining dystrophin expression *in vivo*, with a correct sarcolemmal localisation [33]. This proof-of-concept study has shown that the ZFN-based approach could potentially be adopted as part of a cell-based therapy for DMD, which holds promise for a faster translation into the clinic, considering that the approach is already in clinical trials for *ex vivo* cell modifications in HIV [32].

#### *Transcription activator-like effector nucleases (TALENs)*

TALENs are derived from transcription activator-like effectors (TALEs), which were originally isolated from *Xanthomonas* bacteria, plant pathogens capable of employing up to 40 effector proteins to circumvent eukaryotic cell defences during a host infection [34]. These are composed of repeated motifs of 33 to 35 amino acid residues, identical with the exception of the 12th and 13th residues, which are known as the repeat-variable di-residues. The latter play a key role in TALEN's DNA-binding specificity, whereby each pair of amino acids exhibits specific binding to a corresponding nucleotide in the target sequence. For instance, the asparagine (Asn; N)-isoleucine (Ile; I) NI, the histidine (His; H)-aspartate (Asp; D) HD, the Asn-Asn NN or the Asn-lysine (Lys; K) NK, and the Asn-glycine (Gly; G) NG pairs preferentially bind to adenine, cytosine, guanine, and thymine, respectively. These constitute TALENs' DNA-specific binding domains, which, like those of ZFNs, are conjugated to nonspecific *Fok*I cleavage domains, hence directing them to the target site for gene editing and subsequent correction of the final protein product. A recognition sequence of 14 to 20 bp, together with appropriately spaced *Fok*I subunits separated by 12 to 19 bp, will ensure good genomic target recognition with maximal cleavage activity owing to efficient *Fok*I dimerisation [35, 36].
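The one-repeat-per-nucleotide code described above lends itself to a simple lookup table. The sketch below maps a target DNA sequence to the corresponding repeat-variable di-residues and back, using only the pairings listed in the text; the function names are hypothetical, and NN is arbitrarily chosen over NK for guanine when designing an array.

```python
# Illustrative sketch only: function names are hypothetical.
# RVD pairings as listed in the text: NI->A, HD->C, NN or NK->G, NG->T.
RVD_FOR_BASE = {"A": "NI", "C": "HD", "G": "NN", "T": "NG"}
BASE_FOR_RVD = {"NI": "A", "HD": "C", "NN": "G", "NK": "G", "NG": "T"}


def rvds_for_target(target: str) -> list[str]:
    """Return one RVD per nucleotide of the target sequence."""
    target = target.upper()
    unknown = set(target) - set(RVD_FOR_BASE)
    if unknown:
        raise ValueError(f"unsupported nucleotides: {sorted(unknown)}")
    return [RVD_FOR_BASE[base] for base in target]


def target_for_rvds(rvds: list[str]) -> str:
    """Return the DNA sequence preferentially bound by an RVD array."""
    return "".join(BASE_FOR_RVD[rvd.upper()] for rvd in rvds)


if __name__ == "__main__":
    print(rvds_for_target("ACGT"))
    print(target_for_rvds(["NI", "HD", "NK", "NG"]))
```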

Although TALENs are not yet as clinically advanced as ZFNs, their lower cost and ease of production have generated an increasing interest in the technology. Recent work in DMD has shown the feasibility of the approach using optimised exon 51 targeted TALEN-encoding plasmids transfected into myoblast cells isolated from two different patients with deletions in exons 48 to 50. A satisfactory gene correction was observed with up to 12.7% and 6.8% of alleles confirmed to have indels in the two treated patient myoblast cell lines, respectively. This correction was further correlated with a good restoration of dystrophin expression at the predicted size of ~412 kDa compared to its expression in the wild-type isolated myoblasts [37]. Whole exome sequencing of the successfully corrected TALEN-treated cells revealed no insertions or deletions, except at the exon 51 target locus. Further analysis using the TALE-NT 2.0 Paired Target Site Prediction web server clarified the nature of the single-nucleotide variants observed. These were in fact related to expected genomic mutations that normally occur during cell clonal expansion, as none of these showed any similarity to the employed TALEN target site with spacers of 1 to 30 bases [37].

Besides the dystrophin gene as a main target in DMD, other genes have received comparable interest as targets for TALEN-based gene editing. Of these, myostatin (MSTN) has been the centre of attention as a member of the transforming growth factor-β family. This is mostly linked to its recognised role in muscle physiology, acting as a negative regulator of skeletal muscle mass. In fact, studies have shown that mutations in the *MSTN* gene cause an increase in skeletal muscle fibre numbers and sizes, which in turn lead to muscle hypertrophy without any alarming consequences [38–40]. These observations were recently translated into therapeutic interventions, whereby inhibiting the MSTN signalling pathway using pharmacological agents has shown real benefits in DMD but also in other muscle wasting pathologies such as sarcopenia and cancer cachexia [41–45]. A phase I study is currently under way to assess the safety of an anti-MSTN monoclonal antibody in advanced cancer patients with cachexia (NCT01524224). The same approach using a different pharmacological agent, BMS-986089 (Bristol-Myers Squibb), will soon be tested in a first clinical trial involving patients with DMD (NCT02515669). From a molecular therapeutic point of view, *MSTN* gene editing provides a long-term means of switching off MSTN signalling in the affected wasted muscles. The approach was successfully demonstrated recently using a pair of TALEN-expressing plasmids targeting the human exon 2 locus, a highly conserved region within the coding sequence of the *MSTN* gene. Consistent with previous ZFN-related toxicity studies, the reported MSTN-TALEN targeted system was engineered using obligate heterodimers of the *Fok*I domain to minimise off-target effects often seen with homodimer variants. 
Initial experiments in the HEK293 cell line revealed that the mutation induced by the TALEN approach was persistent with indels still detected 1 month after transfection using the T7E1 assay, a commonly used enzyme mismatch cleavage method for detecting mutations. A similar finding was reported in four different cell lines from different species including human, bovine, and murine cells. Further investigation in primary myoblast cultures derived from a dysferlin-deficient mouse model as well as from patients with dysferlinopathy confirmed an efficiency of 10.3% to 24.6% of gene editing after treatment [46]. Although most ZFN- and TALEN-based gene editing systems for muscle diseases to date have been engineered based on NHEJ mechanisms, proof-of-concept data are now emerging on TALEN-mediated homology-directed DNA repair (HDR). Targeted integration of dysferlin, a mutation of which is associated with limb-girdle muscular dystrophy (LGMD) type 2B and Miyoshi myopathy, has been shown following cotransfection of HEK293 with MSTN targeted TALEN-expressing plasmid and a donor plasmid expressing dysferlin tagged to an enhanced cyan fluorescent protein under a cytomegalovirus (CMV) promoter [46]. Although this approach is still at its early research and development stage, the results obtained *in vitro* remain promising and could offer the basis for future TALEN-corrected myoblast transplantation therapy for a number of severe muscular dystrophies.

#### *Clustered regularly interspaced short palindromic repeats (CRISPR)*

Recent achievements in the field of gene editing have seen the emergence of CRISPR as a novel correction tool in biomedicine. CRISPRs were first described in prokaryotes as part of an RNA-guided adaptive immune system to protect against foreign intrusions such as viruses and plasmids. In response to these, bacteria and archaea are programmed to incorporate short sequences of foreign DNA at one end of a repeat element, which is often referred to as the CRISPR. The integrated inserts between the repeat elements of prokaryotic CRISPR-associated (Cas) loci hence confer a permanent mechanism of defence against past invaders [47]. Three types of CRISPR/Cas systems (I–III) have been described, with differing structural, functional, and mechanistic characteristics. Each system, however, shares a similar sequence arrangement of short repeated units of 30 to 40 nucleotides, separated by a unique "target" specific nonrepeated sequence (a spacer) of equal length [48, 49]. A leader sequence incorporating a promoter is often present, which initiates a unidirectional transcription of CRISPR sequences. Whereas types I and III systems are present in both bacteria and archaea, type II systems have been reported in bacteria only [50]. Type II systems that use a unique Cas protein referred to as Cas9 are of particular interest in medical application. The RNA-guided genome editing of a type II CRISPR system is mediated by Cas9, a nuclease that is directed to the target DNA site by a single-guide RNA recognising a specific locus next to the protospacer adjacent motif. This subsequently creates a double-stranded break, which could be repaired by NHEJ or by a homology-directed mechanism of repair provided that an exogenous plasmid donor is codelivered [51].
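The requirement that a Cas9 target lies next to a protospacer adjacent motif can be illustrated with a short sequence scan. The sketch below makes two assumptions that are not stated in the text: a 5'-NGG-3' motif (the one commonly associated with *Streptococcus pyogenes* Cas9) and a 20-nucleotide protospacer. It scans the given strand only and is not a guide design tool; the function name is hypothetical.

```python
import re

# Illustrative sketch only. Assumptions not taken from the text:
# an NGG motif and a 20-nt protospacer immediately 5' of it.


def find_cas9_sites(seq: str, protospacer_len: int = 20) -> list[tuple[int, str, str]]:
    """Return (start, protospacer, pam) for each candidate site on the given strand."""
    seq = seq.upper()
    sites = []
    # A lookahead is used so that overlapping motifs are all reported.
    for match in re.finditer(r"(?=([ACGT]GG))", seq):
        pam_start = match.start()
        start = pam_start - protospacer_len
        if start >= 0:  # skip motifs too close to the 5' end
            sites.append((start, seq[start:pam_start], match.group(1)))
    return sites


if __name__ == "__main__":
    example = "ACGTACGTACGTACGTACGTAGGCC"  # made-up sequence
    for start, protospacer, pam in find_cas9_sites(example):
        print(start, protospacer, pam)
```

A real design workflow would also scan the reverse complement and score candidate guides for off-target similarity, which this sketch does not attempt.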

In muscular dystrophy, the CRISPR/Cas9-based technology could theoretically be employed to correct germ-line DNA prenatally in one-cell zygotes or as a postnatal treatment. The first approach was recently tested *in vivo* in *mdx* zygotes. By injecting Cas9 and a single-guide RNA together with an appropriate HDR template, Long et al. demonstrated for the first time the feasibility of the approach to correct a nonsense mutation in exon 23 of the *DMD* gene [52]. Although the CRISPR/Cas9-treated *mdx* zygotes produced mice with varying degrees of gene correction, those showing 41% of gene correction by HDR and 83% correction by NHEJ repair mechanisms demonstrated a complete disease-free phenotype with normalisation of dystrophin expression at the histological level [52]. Despite showing some promising results in DMD, a successful germ-line DNA gene editing strategy would (if ever ethically permissible) require a good knowledge of the nature of the mutation affecting the maternal disease carrier to allow a site-specific targeted correction. This prior knowledge, however, could be problematic in X-linked DMD, in which it is estimated that one third of all mutations in the *DMD* gene arise spontaneously [53] and hence would not qualify for a Cas9 gene editing-based treatment. This would restrict the approach to the correction of well-characterised known point mutations, which make up only 15% of DMD mutations. One alternative approach to consider would be a postnatal *DMD* gene correction. This was recently demonstrated using multiplexed single-guide RNAs to direct the Cas9 nuclease to delete exons 45 to 55 of the dystrophin gene [54], an approach that has the potential to correct more than 60% of mutations in DMD patients owing to the large mutational hotspot deletion achieved. 
Despite a good restoration of dystrophin expression both *in vitro* and *in vivo* following transplantation of the CRISPR/Cas9-treated DMD patient myoblasts into immunodeficient mice, the overall deletion efficiency was less than that obtained following exon 51 deletion [54]. This suggests that the size of the targeted sequence is crucial in dictating the capacity of Cas9-associated gene editing to mediate an efficient repair by NHEJ or HDR, as a size-dependent decrease in nuclease-mediated gene deletion was previously noted [55]. Although proof-of-concept has been shown in research settings, CRISPR/Cas9-based gene editing in DMD could only be envisioned in the clinic if appropriate gene delivery methods are employed. Gene transfer systems capable of directing the elements of the CRISPR/Cas9 cassette to the diseased muscles *in vivo* are needed to ensure satisfactory *DMD* gene editing and a full therapeutic outcome. For instance, AAV has previously been shown to be a safe and effective gene transfer system in some clinical trials for gene replacement therapy [56]. In this regard, the AAV8 and AAV9 serotypes would be useful delivery tools for CRISPR/Cas9-mediated gene editing for DMD, owing to their reported efficient gene transfer to skeletal muscles and the heart following systemic administration [57, 58], which could therefore potentially result in a robust gene correction at the key target tissues harbouring the mutation in DMD patients. Recent studies have demonstrated the efficiency of AAV as a platform for the transfer of CRISPR/Cas9-mediated gene editing to dystrophic muscles in DMD disease models *in vivo*. The approach was recently tested following administration by three different routes (intraperitoneal, intramuscular, and retro-orbital injection) in postnatal *mdx* mice and proved to be successful, with an increasing dystrophin expression reported from weeks 3 to 12 after virus injection. 
This correlated with an overall improvement in skeletal muscle function [59]. Targeted deletion of the mutated exon 23 was also demonstrated to ameliorate the disease phenotype in both neonatal and adult DMD models *in vivo* by systemic and local delivery of a CRISPR/Cas9-expressing AAV [60]. A similar viral-mediated transfer of a CRISPR/Cas9 system coupled with paired guide RNAs flanking the mutated exon 23 in DMD proved that AAV is a good gene editing transfer system for local and systemic restoration of dystrophin expression not only in muscle cells but also in myogenic stem cells in diseased muscle *in vivo* [61].

#### **5. Application in trinucleotide repeat expansion muscular dystrophies**

Proof-of-concept studies in DMD have opened new horizons in the application of gene editing in other muscular pathologies for which the correction of toxic genes is believed to address the underlying dynamic mutations often associated with triplet repeat expansion and some locus contractions disorders.

#### *Oculopharyngeal muscular dystrophy*

OPMD is an inherited autosomal dominant, slow-progressing, late-onset degenerative muscle disorder characterised by progressive eyelid drooping (ptosis), swallowing difficulties (dysphagia), and proximal limb weakness. Whereas the incidence in Europe is 1/100,000, the disease has been largely reported in the Bukhara Jew population in Israel (1/700) and the French Canadian population in Quebec (1/1,000) [62].

The underlying genetic defect behind OPMD is an abnormal expansion of a (GCG)n trinucleotide repeat in exon 1 of the PABPN1 gene, which leads to an expanded polyalanine tract at the N-terminus of the PABPN1 protein (12–17 repeats are often detected in mutant PABPN1 compared with only 10 repeats in the wild-type protein) [62].

From a molecular therapeutic point of view, OPMD is a good candidate for gene editing-based treatment. Previous studies have shown the importance of abrogating intranuclear inclusions (INIs) of mutant PABPN1, the main pathological hallmark of the disease, with neutralising agents capable of binding aggregated mutant proteins. Of these, intrabodies were shown to be effective in reducing mutant PABPN1-associated INIs and restoring a normal pattern of gene expression in a *Drosophila* OPMD model [63]. Although these show antiaggregate properties by tackling the disease phenotype at the protein level, they do not address the underlying molecular pathology behind it, hence necessitating repeated administration to keep the disease under control. In this regard, gene editing could offer a long-term, permanent therapeutic advantage. Allele-specific correction of expanded PABPN1 (expPABPN1) could be achieved by CRISPR/Cas9-mediated editing targeting the expanded GCG moiety. Unlike the recently tested short hairpin RNA (shRNA)-based knockdown of PABPN1 [64], this gene-level correction could result in a permanent production of functional PABPN1 protein that is capable of correcting the OPMD phenotype at both histological (reduction of pathological aggregates) and molecular (abrogation of mutant PABPN1) levels. The adjunct gene replacement therapy that is often required in knockdown approaches [62] might no longer be necessary. However, thorough characterisation and mechanistic studies would need to be conducted both *in vitro* and *in vivo* to confirm CRISPR/Cas9 editing specificity towards the expanded mutant PABPN1 with no obvious off-target effects on the wild-type PABPN1 gene or indeed other genes within the treated tissues.

#### *Myotonic dystrophies*

DM is an autosomal dominant, slow-progressing inherited multisystem genetic disorder affecting the muscles (causing wasting), the eyes (leading to cataracts), and the heart (causing conduction defects). The disease is also characterised by metabolic disturbances (endocrine changes) and prolonged contraction of skeletal muscles (myotonia).

Two types of DM have been defined to date. These include DM type 1 (DM1) and DM type 2 (DM2), with DM1 being the most severe and most common form of DM. Overall incidence has been estimated at 3 to 15 per 100,000 in Europe with a higher prevalence in Iceland (1:10,000) and a reported incidence of as high as 1:500 in Quebec [65]. The disease is caused by a CTG trinucleotide expansion in the 3′-untranslated region of the DMPK gene. CTG repeats exceeding 37 are considered abnormal with larger repeats (below 400 repeats) leading to a more severe disease [66, 67]. Like most debilitating muscular pathologies, no curative treatment currently exists for DM. The use of pharmacological agents [68, 69] and antisense RNA-based therapies [70–72] has shown improvements in disease phenotype, but overall benefits in the long term remain, in some cases, limited and full therapeutic efficacy is yet to be demonstrated in the clinic. Although still in its infancy, gene editing for DM treatment is an appealing alternative to most pharmacological and gene therapy approaches reported to date. Correction of mutant DMPK at the gene level offers the possibility of a permanent modulation of pathological phenotypes and long-term disease control. A recent study has shown the feasibility of editing intron 9 in the DMPK gene in DM1 neural stem cells derived from human DM1 induced pluripotent stem cells (iPSCs) using a TALEN-mediated homologous recombination-expressing cassette integrated upstream of the CTG repeats [73]. A significant reduction in nuclear RNA foci, together with restoration of normal microtubule-associated protein τ (MAPT) and muscleblind-like (MBNL) splicing patterns, was observed [73]. Although transition into clinical use may be long and difficult, this proof-of-concept *ex vivo* study offers a rationale for the genetic correction of DM1-derived stem cells as a potential autologous cell therapy for DM1 patients in the future.
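The repeat threshold quoted above (more than 37 CTG repeats considered abnormal) can be illustrated by counting the longest uninterrupted repeat run in a sequence. This is a toy sketch on plain strings with hypothetical function names; it is not a diagnostic method, and sizing expanded alleles in practice relies on dedicated laboratory assays.

```python
import re

# Illustrative sketch only: function names are hypothetical.
ABNORMAL_ABOVE = 37  # CTG repeats exceeding 37 are considered abnormal (from the text)


def longest_repeat_run(seq: str, unit: str = "CTG") -> int:
    """Number of units in the longest uninterrupted tandem run of `unit`."""
    runs = re.findall(f"(?:{re.escape(unit.upper())})+", seq.upper())
    return max((len(run) // len(unit) for run in runs), default=0)


def is_expanded(seq: str, threshold: int = ABNORMAL_ABOVE) -> bool:
    """True when the longest CTG run exceeds the abnormality threshold."""
    return longest_repeat_run(seq, "CTG") > threshold


if __name__ == "__main__":
    normal = "AA" + "CTG" * 12 + "TT"    # made-up sequences
    expanded = "AA" + "CTG" * 50 + "TT"
    print(longest_repeat_run(normal), is_expanded(normal))
    print(longest_repeat_run(expanded), is_expanded(expanded))
```

The same helper could be called with `unit="GCG"` to count the PABPN1 repeat discussed for OPMD.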

#### **6. Regulatory challenges and pathways into the clinic**

On October 29, 2015, the United Kingdom became the first country in the world to legally approve one type of human germ-line gene modification based on mitochondrial replacement. This approach is thought to save at least 10 children each year from mitochondrial diseases by preventing maternal transfer of mutations in mitochondrial DNA to offspring (Department of Health, 2014). Human germ-line genetic engineering, however, is currently not permitted in the United Kingdom and in other European countries. Although a somatic gene editing-based approach would be less questionable from an ethical point of view, a number of safety concerns would need to be addressed from a regulatory perspective. In accordance with Article 2(1)(a) of Regulation (EC) No. 1394/2007 and as per the European Medicines Agency (EMA) classification, gene editing-based products, including cells modified *ex vivo*, will be regulated as advanced therapy medicinal products (ATMPs) for which a central authorisation procedure governed by the EMA would apply (**Figure 2**). This would eventually lead to a single marketing authorisation that is valid across the entire EU and the European Economic Area (EEA) countries. A number of guidelines on the requirements for product quality and preclinical and clinical studies have been issued by the EMA over the past years to facilitate the transition of promising experimental advanced therapies into the clinic.

**Figure 2. Regulatory pathway for advanced therapy medicinal products**. MAA = Marketing Authorisation Application; BLA = Biologics License Application; EMA = European Medicines Agency; FDA = U.S. Food and Drug Administration; PIP = Paediatric Investigation Plan; ATMPs = Advanced Therapy Medicinal Products.

#### *Safety considerations for gene editing*

The risk of inducing modifications at off-target genes, leading to unwanted side effects, is a major safety concern. The degree of off-target events and their clinical implications are crucial questions that need to be carefully addressed as part of a regulatory new investigational drug development plan. The primary concern is that modifications could be carcinogenic. It must also be remembered that (unless the therapy is based on a cell type that is selected and clonally expanded after modification) the off-target effects will be heterogeneous from cell to cell, so the analysis must encompass a suitably large (and potentially diverse) population of cells exposed to the modifying agent to evaluate if any cells may be adversely transformed even at low frequency. There may also be risks associated with changes that have cytotoxic effects, and if such effects are frequent in the same cells that undergo correct gene editing, the consequences could include loss of efficacy.

In this regard, rigorous quality assurance tests would need to be conducted to identify any insertions or deletions that could result from a gene editing NHEJ-based treatment. These include gene sequencing, mismatch cleavage assays based on CelI or T7 endonuclease I enzymes, and the tracking of indels by decomposition (TIDE) method. These tests could be used as quality assurance tools to determine the number and location of indels or substitution events occurring after gene editing treatment, more often based on *in silico* predictions. However, as previously noted with ZFNs, some mutations could occur near cryptic off-target sites that are not predictable *in silico* [74]. Care should also be taken when validating the assays employed for product characterisation purposes. This follows from previous observations whereby initial analysis by whole genome sequencing (WGS) in gene-corrected human iPSCs (hiPSCs) revealed a large number of indels, of which some were confirmed to be false positive [75]. This highlights some limitations of WGS and calls for a multitesting approach employing different analytical methodologies.

Although current assays would be valuable in identifying the location of suspected off-target mutations that could arise from NHEJ or HDR-based gene editing treatments, they do not provide precise information on the ability of these mutations to cause carcinogenicity in the long term. Hence, additional functional studies are required to assess the significance of these off-target events at the molecular and cellular levels. For instance, an unwanted insertion in the middle of an essential gene sequence could have dramatic consequences. One should have enough knowledge of the biological and physiological functions of the affected gene to reach an informed decision on the overall safety of the proposed approach. Similarly, an off-target insertion into an enhancer or a repressor region would disturb genetic homeostasis, resulting in an unwanted down-regulation or up-regulation of the genes controlled by the affected regulatory region. This could subsequently interrupt normal cellular activity with potential alteration of phenotype at histological and physiological levels. Although not all mutations affect cellular proliferation, an unwanted "indel" mutation occurring within genes known to regulate cell replication or involved in programmed cell death raises an alarming concern from a safety point of view. This is due to potential risks arising from a compromised cellular viability leading to severe toxicity in healthy tissues or an uncontrolled replication that could eventually lead to tumour formation. These potentially serious consequences warrant careful consideration during early product development stages; hence, basic biological studies to further characterise the sites at which these mutations have occurred should be conducted as ad hoc validation studies to rule out any unwanted insertional mutagenesis and/or tumourigenic consequences following a gene editing-based treatment.

Although clinical safety experience with biological therapies based on gene editing is lacking, current product development and regulatory strategies should also draw from past and available human clinical trial data on retroviral and lentiviral vectors' integration sites. A key safety parameter would be to satisfy the regulatory bodies that those detected off-target effects have been thoroughly characterised and occur at "low risk" sites unlikely to cause insertional oncogenesis following integration of foreign DNA. In fact, not all events at off-target genomic sites would result in serious side effects. For instance, those occurring at "extragenic" sites distant from essential gene regulatory sequences are less likely to cause harm than those affecting "intragenic" sites. Modifications to sites that do not fall within a gene transcription unit as well as those situated more than 50 kb away from the 5' end of any gene and more than 300 kb from genes linked with cancer or microRNA sites are generally considered safer. Genomic sites not within ultraconserved regions and outside long noncoding RNAs would also represent low-risk sites in the event of off-target DNA integration [19, 76].
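The "low risk" criteria listed above can be expressed as a simple rule check. The sketch below encodes only the thresholds stated in the text (outside any transcription unit, more than 50 kb from the 5' end of any gene, more than 300 kb from cancer-linked genes or microRNA sites, outside ultraconserved regions and long noncoding RNAs). The class and field names are hypothetical, and the annotation values would in practice come from a genome annotation pipeline that is not shown here.

```python
from dataclasses import dataclass

# Illustrative sketch only: class and field names are hypothetical.
MIN_KB_FROM_GENE_5PRIME = 50          # from the text
MIN_KB_FROM_CANCER_GENE_OR_MIRNA = 300  # from the text


@dataclass(frozen=True)
class OffTargetSite:
    in_transcription_unit: bool
    kb_to_nearest_gene_5prime: float
    kb_to_nearest_cancer_gene_or_mirna: float
    in_ultraconserved_region: bool
    in_long_noncoding_rna: bool


def is_low_risk(site: OffTargetSite) -> bool:
    """True only when every criterion listed in the text is satisfied."""
    return (
        not site.in_transcription_unit
        and site.kb_to_nearest_gene_5prime > MIN_KB_FROM_GENE_5PRIME
        and site.kb_to_nearest_cancer_gene_or_mirna > MIN_KB_FROM_CANCER_GENE_OR_MIRNA
        and not site.in_ultraconserved_region
        and not site.in_long_noncoding_rna
    )


if __name__ == "__main__":
    extragenic = OffTargetSite(False, 120.0, 450.0, False, False)  # made-up values
    intragenic = OffTargetSite(True, 0.0, 450.0, False, False)
    print(is_low_risk(extragenic), is_low_risk(intragenic))
```

Such a check could only flag candidate sites for follow-up; it does not replace the functional studies discussed in this section.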

Furthermore, validation studies should focus not only on the affected off-target sites but also on neighbouring sites, by measuring the effect of these mutations on neighbouring gene expression. These studies should ideally be validated *in vivo*. The choice of appropriate animal models, however, remains a challenge and care should be taken when interpreting data [19, 76, 77].

#### *Validation and choice of functional studies for assessment of toxicity*

Although basic biological studies are valuable in assessing the extent of genome modification as a whole and increasing our understanding of different alterations taking place, including those that are unlikely to result in clinical toxicity, they do not fill all the regulatory requirement gaps. Functional toxicity studies would, therefore, need to be conducted in parallel early during the preclinical stage of product development. Two paramount questions would need to be carefully addressed to satisfy regulators' concerns on serious toxicity issues that are specific to genome editing-based medicines. These are cytotoxicity and genotoxicity. Tumourigenicity is the most important safety consequence to consider, but germ-line effects are also not expected to be permissible. Cytotoxicity may be caused by (i) the vector itself (including expression of viral vector antigens and potential persistence of the vector in cells), (ii) gene editing machinery (and any persistence thereof, especially DNase), (iii) off-target gene editing, or (iv) on-target gene editing (possible if the editing is either not accurately restoring the wild-type or is generating new antigenicity and/or genetic instability).

Carefully designed studies and approaches would be needed to address these issues. At the preclinical stage, these could include viability assays based on GFP-positive (GFP+) cells, whereby the treated cells are cotransfected with the tested nuclease-expressing construct and a GFP-expressing plasmid. This would allow investigators to track and quantify any observed decline in the GFP+ cell population as a result of a nuclease-related toxicity. The approach would also be valuable in dose escalation studies when deciding on optimal dosage administration for subsequent first-in-human clinical studies. Monitoring of clonal changes has also been used as an informative way to assess genotoxicity *in vitro* in cells pretagged with unique short sequence identifiers to allow tracking of changes in starting clone dynamics over a period of time [78]. Similarly, other approaches have relied on fluorescently tagged cell cycle indicators in cell lines such as HeLa FUCCI cells as a complementary *in vitro* validation method to assess the genotoxic effects of genome editing at the cell cycle level [79].
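The GFP-based viability readout described above amounts to comparing how the GFP+ fraction changes over time in nuclease-treated cells against a control. The sketch below shows one way this could be calculated; normalising to a GFP-only control is an assumption made for the example rather than a protocol given in the text, and the function names and numbers are hypothetical.

```python
# Illustrative sketch only: function names and values are hypothetical.
# Assumption: toxicity is expressed as the change in GFP+ fraction in
# nuclease-treated cells, normalised to the change in a GFP-only control.


def gfp_retention(early_pct: float, late_pct: float) -> float:
    """Fraction of the initial GFP+ population still present at the late time point."""
    if early_pct <= 0:
        raise ValueError("early GFP+ percentage must be positive")
    return late_pct / early_pct


def relative_survival(nuclease_early: float, nuclease_late: float,
                      control_early: float, control_late: float) -> float:
    """GFP+ retention with nuclease relative to control (1.0 = no excess loss)."""
    return gfp_retention(nuclease_early, nuclease_late) / gfp_retention(control_early, control_late)


if __name__ == "__main__":
    # Made-up percentages of GFP+ cells at an early and a late time point.
    score = relative_survival(nuclease_early=40.0, nuclease_late=20.0,
                              control_early=40.0, control_late=40.0)
    print(f"relative survival: {score:.2f}")
```

Values well below 1.0 would suggest a nuclease-related loss of transfected cells at the dose tested.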

However, it is important to bear in mind that measuring cytotoxicity and genotoxicity using approaches based on reporter genes and tagging systems is restricted to preclinical assessments and cannot be employed at later stages of product development when assessing potential toxicity in human clinical trials. Hence, the need for clinical toxicity assays that are fit for purpose should not be neglected. Furthermore, correlation of these assays with clinical outcomes is yet to be demonstrated. For this reason, all functional toxicity studies conducted *in vitro* should use cellular models and gene transfer methods that are similar to those intended to be applied in clinical settings to fulfil some of the safety regulatory requirements.

#### *Vector considerations for in vivo and ex vivo gene editing*

Potential cytotoxic effects arise not only from on-target and off-target effects but also from the employed gene transfer system itself. In this regard, the type of vector (viral or nonviral) and the gene (nuclease) transfer approach (direct *in vivo* delivery versus *ex vivo*) will have to be decided early during product development and considered as part of the regulatory strategy for the nuclease-based medicine under consideration.

For severe neuromuscular disorders in which there is a widespread distribution of affected muscles, AAV vectors and in particular serotypes 8 and 9 are seen as ideal systems for gene editing-based nuclease transfer to diseased muscle tissues owing to their relatively high tropism for skeletal muscle cells and the heart [57, 58] as well as their documented safety profile in the clinic following recent approval of Glybera® in Europe [80]. From a product development point of view, this can be achieved through an HDR-based gene editing construct packaged within a recombinant AAV (rAAV). Although technically this is achievable with ZFNs due to a relatively small monomer insert size of ~1.2 kb, this might not be the case for Cas9 nucleases. With a Cas9 insert size of ~4.1 kb, packaging a bicistronic construct harbouring a donor DNA template [81] could be an issue, considering the limited packaging capacity of rAAVs. Taking into account the EMA's guidelines on quality, nonclinical and clinical issues related to the development of recombinant adeno-associated viral vectors (EMEA/CHMP/GTWP/587488/2007 Rev. 1) [82], caution should be taken during the manufacturing stage of the vector to avoid the possibility of producing AAV particles whose packaged DNA is greater than that of wild-type virus. Technically, this could be overcome by splitting the Cas9 gene between two vectors. However, this markedly increases the developmental and regulatory complexity, with each product warranting a full characterisation and a thorough assessment of approaches employed during manufacturing, quality control, and preclinical evaluation (including choice of animal models for testing, vector persistence and tropism, reactivation of virus infection, and germ-line transmission) in addition to clinical studies. 
The latter would have to be based on a data-driven dose selection for each vector used and show a comprehensive picture of virus biodistribution and shedding, immunogenicity profile, and germ-line transmission, considering the permanent gene modification achieved with gene editing-based nucleases and the risks associated if the virus expressing these genes accidentally infects germ-line cells. A long-term follow-up is therefore highly recommended and should not be neglected as part of a complete regulatory plan for a gene editing-expressing AAV vector.
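The packaging constraint discussed above is essentially a size budget. The sketch below uses the insert sizes quoted in the text (~1.2 kb per ZFN monomer, ~4.1 kb for Cas9) together with an assumed rAAV cargo capacity of ~4.7 kb, a commonly cited approximate figure that is not stated in the text. The donor template and regulatory element sizes are made-up values, and the function names are hypothetical.

```python
# Illustrative sketch only: function names are hypothetical.
ZFN_MONOMER_KB = 1.2   # from the text
CAS9_KB = 4.1          # from the text
AAV_CAPACITY_KB = 4.7  # ASSUMPTION: approximate rAAV capacity, not given in the text


def fits_in_raav(insert_sizes_kb: list[float], capacity_kb: float = AAV_CAPACITY_KB) -> bool:
    """True when the summed insert sizes do not exceed the assumed capacity."""
    return sum(insert_sizes_kb) <= capacity_kb


def remaining_capacity_kb(insert_sizes_kb: list[float], capacity_kb: float = AAV_CAPACITY_KB) -> float:
    """Space left for promoters, donor template, etc. (negative if over budget)."""
    return capacity_kb - sum(insert_sizes_kb)


if __name__ == "__main__":
    donor_kb = 1.0  # hypothetical donor DNA template size
    print("ZFN pair + donor fits:", fits_in_raav([ZFN_MONOMER_KB, ZFN_MONOMER_KB, donor_kb]))
    print("Cas9 + donor fits:", fits_in_raav([CAS9_KB, donor_kb]))
    print(f"room left after Cas9 alone: {remaining_capacity_kb([CAS9_KB]):.1f} kb")
```

Under these assumptions a ZFN pair with a small donor fits in a single vector whereas Cas9 with the same donor does not, which is consistent with the two-vector strategy mentioned in the text.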

Similarly, the use of genetically modified and nonmodified cells, including myoblasts and myogenic stem cells, for the treatment of monogenic inherited neuromuscular diseases has been well validated in preclinical and clinical studies [83–86], and it is anticipated that these therapies will become standard treatment modalities in the clinic in the future.

Although gene editing-based cell therapies for HIV and leukaemia are progressing rapidly in clinical trials and are being extended to other debilitating diseases, a smooth and safe progression of these life-changing advanced therapies to the market will require a number of regulatory criteria to be fulfilled. The ultimate goal is to ensure that these therapies reach severely affected individuals with minimal health and safety risks to the treated patients themselves as well as to third parties and the environment at large. According to the EMA's guidelines, a multifactorial risk assessment approach needs to be considered, taking into account the origin of the cells involved, the type of vector employed during the genetic modification procedure, the manufacturing process, the noncellular components used as part of the formulated product, and the intended specific therapeutic use of the final product.

Lentiviral vectors have long been the vector of choice for the *ex vivo* engineering of stem cells and muscle progenitor cells for the treatment of muscular dystrophies [87–89]. Their use adds a further layer of regulatory requirements from quality, nonclinical and clinical perspectives. The manufacture of viral vectors to the GMP quality standards required for an approvable medicinal product is expensive and often inefficient, but developers are helped by the guidance issued in these areas by the EMA (for example, "Guideline on the quality, nonclinical and clinical aspects of gene therapy medicinal products (EMA/CAT/80183/2014)") [90], as well as the monograph of the European Pharmacopoeia (Ph. Eur. 5.14). A thorough product characterisation is required as part of this process and should include different but complementary methods based on molecular, biological, and immunological assays, with the aim of assuring the identity, purity, and potency of the genetically modified cells produced as the final product. Major considerations for viral vectors used *in vivo* are the biodistribution, the potential for generation and/or shedding of infective virus, the potential for germ-line modification, and the impact of interaction with the recipient's immune status. Clinical use of viral vectors and genetically modified cells should also comply with applicable regulations for genetically modified organisms.

Complexities arise from the genetic manipulation (gene correction) as well as the differentiation status and capacity of the modified cells, which could result in a mosaic cell population. Furthermore, intrinsic variations between cells as a result of donor differences give rise to massive batch variations in the final product. From a regulatory perspective and in line with the EMA's guidelines, nonclinical and clinical studies need to be conducted with cell medicinal products that are well characterised. The manufacturing process needs to be robust and focused on quality control, and capable of maintaining the consistency and reproducibility of the final cell-based product. For this, all starting materials need to be well defined and carefully documented. For treatment of muscle diseases, it is often reasonable to administer cell-based products that are in a differentiated state, which may pose less tumourigenic risk. However, one cannot exclude the existence of a subpopulation of cells in an undifferentiated proliferative state. For iPSCs, for example, it is imperative to conduct additional testing of cell transformation and tumour formation during the early manufacturing stage of the product as a precautionary measure. This would often need to be combined with the selection of appropriate markers during critical manufacturing steps to assure the defined stage of differentiation that is intended for therapeutic use.

For any cell-based product that has undergone substantial *ex vivo* manipulation, robust process validation is therefore paramount and should include a combination of genetic stability, tumourigenicity, and phenotypic profile assessments of both wanted and unwanted cell populations at all critical stages of manufacturing to ensure that safety requirements are met. The mosaic nature of cell-based medicinal products often complicates their identity and purity-related characterisation. Most often, it is reasonable to accept that purity does not always equate to homogeneity. Truly selective markers that could accurately map and distinguish different cell types and differentiation stages are yet to be identified, which often renders product characterisation a challenging task for most developers. Nevertheless, one should not underestimate the importance of a thorough demonstration of product consistency as a minimum requirement for characterisation purposes.

The ability to track any cell-based therapy following administration in patients is crucial for clinical monitoring purposes. However, limitations in current medical methodologies do not allow a full biodistribution profile to be drawn from human studies. Hence, thoroughly designed biodistribution studies in nonclinical models are important and should take into account the multistep biodistribution characteristics of cell products, including migration, niche, engraftment, differentiation, and persistence, together with reliable *in vivo* tracking methods such as the use of marker genes or labelled cells. Current European regulatory requirements do not give exemptions when the risk profile of the cell-based product under investigation is subject to a safety concern or when its route of administration (such as intravenous delivery) warrants special attention. In such cases, noninvasive methods based on clinically accepted tracers should be considered and their use should be justified when conducting biodistribution studies in human trials.

#### **7. Conclusion**

For diseases caused by genetic defects, means to correct the defect are attractive routes to cure the disease at its source. Medical scientists working in the field of muscular dystrophies are therefore excited by the opportunities that recent developments in gene therapeutics bring to the field. Technologies now exist for manipulating the human genome in *ex vivo* cultured cells with considerable specificity. However, there are further technical challenges to solve before such technologies translate into viable medical treatments for affected individuals. The medical need is evident from the severity and incidence of these conditions, which are frequently highly debilitating. It should also be noted that muscular dystrophies are not a single disease but cover a range of conditions caused by different mutations, and it is to be expected that, for specific gene-based therapeutics, a different medicinal product will generally be required for each disease-causing mutation.

The increasing number of investigational medicinal products entering clinical trials involving patients with DMD and BMD has prompted the EMA to publish its guidelines on the clinical investigation of medicinal products for the treatment of DMD and BMD (EMA/CHMP/236981/2011, Corr. 1) [91], expected to come into effect in July 2016. These provide general guidance to be taken into account during the clinical development and evaluation of currently investigated therapies for DMD and BMD [91].

As described in this chapter, major recent advances for the enablement of gene-based treatments of muscular dystrophies include the following:


#### **Author details**

Houria Bachtarzi\* and Tim Farries

\*Address all correspondence to: houria.bachtarzi@eraconsulting.com

ERA Consulting (UK) Ltd. (European Regulatory Affairs), London Gas Museum, Twelvetrees Crescent, London, E3 3JG, United Kingdom

#### **References**


## *Edited by Michael S.D. Kormann*

Site-specific endonucleases create double-strand breaks within the genome and can be targeted to literally any genetic mutation. Together with a repair template, a correction of the defective locus becomes possible. This book offers insight into the modern tools of genome editing, their hurdles and their huge potential. A new era of in vivo genetic engineering has begun.

Photo by Sergey Nivens / AdobeStock

Modern Tools for Genetic Engineering
