### **1. Introduction**

The world population is currently about 8 billion people. According to UN data, the number of people experiencing moderate or severe food shortages has reached 2 billion, or 26.4% of the world population [1]. Considerable efforts are being made to eradicate hunger and malnutrition worldwide, and many of them rely on scientific breakthroughs in the life sciences and agriculture [2]. However, despite the achievements of plant breeding, the rapid creation of new high-yielding and stress-resistant crop varieties remains an urgent issue. Such varieties are needed to address crop losses caused by climate change, the reduction of cultivated areas, and the spread of more aggressive and resistant pathogens. World population growth is a no less important reason [2, 3]. At present, these problems cannot be solved without biotechnological and genetic engineering approaches.

In the 20th century, classical crop breeding approaches were based on either natural mutations or artificially induced mutagenesis [4]. However, traditional breeding methods have a significant disadvantage: the long time required to create new varieties of any crop with the desired agronomic characteristics. This depends not only on the length of the growing season and the time plants need to reach maturity (especially for slow-growing plants, e.g. trees), but also on the multiple stages of crossing, selection and testing in the breeding process. In addition, neither natural mutations nor conventional chemical and physical mutagenesis allow the plant genome to be targeted [4].

At the turn of the 21st century, the development and introduction of molecular DNA markers made it possible to significantly reduce the time required to create new lines and varieties of agricultural crops. In other words, marker-assisted selection approaches appeared, which significantly increased the effectiveness of breeding programs aimed at raising crop productivity under a wide range of environmental conditions [5, 6]. However, these approaches also do not make it possible to target the crop genome.

At the same time, advances in next-generation sequencing, the large number of sequenced genomes of major crops, and newly identified genes and their functions motivate researchers to pursue targeted plant breeding. All of this has significantly promoted the development of targeted genome editing (GE) approaches [7–10]. One of the first GE technologies was RNA interference (RNAi) [4, 11, 12]. Despite its successful application in functional genomics and plant breeding [15–17], this method has a number of disadvantages, such as only partial suppression of gene function and the unpredictable insertion site of the RNAi construct in the genome [4].

These breeding problems were addressed by GE methods that use sequence-specific nucleases (SSNs) to introduce targeted mutations in crops with high efficiency and accuracy [7–10, 12]. Artificially engineered SSNs such as zinc finger nucleases (ZFNs), transcription activator-like effector nucleases (TALENs), and clustered regularly interspaced short palindromic repeats (CRISPR) associated with the endonuclease Cas (CRISPR/Cas) have been shown to be highly effective for targeted mutagenesis in a wide range of model plants and crops [7–10, 12]. In addition, oligonucleotide-directed mutagenesis (ODM) makes it possible to edit the genome at the single-nucleotide level. Moreover, base editors (BEs) have recently been developed to convert A-T base pairs to G-C base pairs [12, 13].

Nowadays, GE technologies are widely applied both in functional genomics and in the development of new crop varieties with valuable new properties and resistance to various biotic and abiotic stresses [7–10, 12, 13]. However, although modern GE technologies are much more accurate than conventional mutagenesis, the legislation governing GE crops remains the main bottleneck [13, 14]. A particular difficulty is the biosafety assessment of such crops, namely the impossibility of determining the subsequent effects of single-base mutations introduced by ODM and BEs [13, 14].

This review discusses GE mechanisms and their use for crop improvement, as well as the problems associated with these approaches.

### **2. Genome editing mechanisms**

#### **2.1 Programmable nucleases**

Currently, there are three main GE methods, classified according to their mechanism of action. The one most commonly applied in plant genomics is the targeted generation of double-stranded DNA breaks (DSBs) by SSNs [12, 13]. These DSBs are repaired by the cell's own endogenous repair mechanisms, either through non-homologous end joining (NHEJ) or homology-directed repair (HDR), i.e. homologous recombination [7–10, 12–16]. Repair of the target DNA sequence gives rise to mutations such as indels that change or shift the reading frame, nucleotide substitutions, and chromosomal rearrangements [8, 15].

#### *Using of Genome Editing Methods in Plant Breeding DOI: http://dx.doi.org/10.5772/intechopen.96431*

Targeted induction of DSBs is made possible by programmable nucleases. The prevailing nucleases for GE are zinc finger nucleases (ZFNs), transcription activator-like effector nucleases (TALENs), and clustered regularly interspaced short palindromic repeats (CRISPR) associated with the endonuclease Cas (CRISPR/Cas). These three classes of nucleases differ in structure, activity and mechanism of action, which leads to differences in target selection, efficiency, specificity and the nature of the resulting mutations [7–10]. Each type of programmable nuclease is examined more closely below.

##### *2.1.1 Zinc finger nucleases (ZFNs)*

Zinc finger nucleases (ZFNs) were the first artificial endonucleases designed for GE [17]. Each ZFN is built by fusing a DNA-binding domain, which contains several linked zinc finger (ZF) motifs, to the nonspecific endonuclease FokI [12, 17]. Each ZF motif consists of approximately 30 amino acids and has a ββα structure stabilized by chelation of a zinc ion; linked ZF motifs form a ZF protein (ZFP) [17]. ZFPs can be combined with effector domains such as a methylase, FokI or a transcription activator/repressor, and the fusion with FokI gives rise to a ZFN [12, 13, 17]. FokI is an endonuclease that recognizes the 5-mer non-palindromic sequence 5'-GGATG-3' : 5'-CATCC-3' and cleaves DNA 9/13 nucleotides downstream of the recognition site [12, 17, 18].
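
The cleavage geometry of native FokI can be illustrated with a short Python sketch. It assumes the usual reading of the "9/13" notation: the top strand is cut 9 nucleotides downstream of 5'-GGATG-3' and the bottom strand 13 nucleotides downstream, which leaves a 4-nucleotide 5' overhang. The function name `foki_cut_positions` and the example sequence are hypothetical, and only the given strand is scanned.

```python
import re

FOKI_SITE = "GGATG"   # recognition sequence on the top strand
TOP_OFFSET = 9        # top strand is cut after the 9th downstream nucleotide
BOTTOM_OFFSET = 13    # bottom strand is cut after the 13th downstream nucleotide


def foki_cut_positions(seq):
    """Return (site_start, top_cut, bottom_cut) for every GGATG in seq.

    Cut positions are 0-based indices (top-strand coordinates) of the first
    base located after the cut. Sites too close to the end are skipped.
    """
    seq = seq.upper()
    hits = []
    # A lookahead is used so that overlapping sites would also be reported.
    for match in re.finditer(f"(?={FOKI_SITE})", seq):
        site_end = match.start() + len(FOKI_SITE)
        top_cut = site_end + TOP_OFFSET
        bottom_cut = site_end + BOTTOM_OFFSET
        if bottom_cut <= len(seq):
            hits.append((match.start(), top_cut, bottom_cut))
    return hits


example = "AA" + "GGATG" + "ACGT" * 4 + "AC"  # hypothetical 25-nt sequence
print(foki_cut_positions(example))  # [(2, 16, 20)]
```

The difference between the two cut positions (20 - 16 = 4) corresponds to the length of the single-stranded overhang.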

When interacting with DNA, each ZF motif binds one nucleotide triplet by inserting its α-helix into the major groove of the DNA double helix [12, 18]. A single ZF does not have sufficient specificity to bind a unique target in the genome. However, an artificial ZFN usually contains three or four ZFs, so that a pair of ZFNs binds an 18–24-mer site after FokI dimerization, which is necessary for efficient DNA cleavage [17]. For FokI to dimerize, the two ZFNs bind the forward and reverse DNA strands, respectively, and the two target sequences must be separated by a spacer of 5 to 7 bp [7]. The ZFN thus acts as a dimer and generates DSBs with short 5'-cohesive overhangs. Repair of these breaks, typically by error-prone NHEJ, gives rise to indels in the genome [7–10, 12, 17, 18].
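
The site arithmetic above (3 bp per finger, two monomers, a 5–7 bp spacer) can be checked with a minimal Python sketch. The helper names and the two half-site sequences are hypothetical and chosen only for illustration; the right half-site is written 5'→3' as it is bound on the reverse strand.

```python
BP_PER_FINGER = 3  # each ZF motif binds one nucleotide triplet


def zfn_site_lengths(fingers_per_monomer, spacer):
    """Return (half_site, recognized_by_dimer, total_footprint) in bp."""
    half_site = fingers_per_monomer * BP_PER_FINGER
    recognized = 2 * half_site
    return half_site, recognized, recognized + spacer


def reverse_complement(seq):
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_zfn_pair_sites(seq, left_half, right_half, spacers=(5, 6, 7)):
    """Find positions where both ZFN monomers can bind in a tail-to-tail layout.

    Returns a list of (left_start, spacer_length) in top-strand coordinates.
    """
    seq = seq.upper()
    left_half = left_half.upper()
    right_on_top = reverse_complement(right_half)
    hits = []
    start = seq.find(left_half)
    while start != -1:
        for spacer in spacers:
            right_start = start + len(left_half) + spacer
            if seq[right_start:right_start + len(right_on_top)] == right_on_top:
                hits.append((start, spacer))
        start = seq.find(left_half, start + 1)
    return hits


print(zfn_site_lengths(3, 5))  # (9, 18, 23)
print(zfn_site_lengths(4, 7))  # (12, 24, 31)
```

With three or four fingers per monomer the dimer recognizes 18 or 24 bp, which matches the 18–24-mer range given in the text.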

It should also be noted that, despite their sufficient binding specificity, ZFNs can tolerate nucleotide mismatches [17]. Heterodimerization of the FokI nuclease is used to minimize off-target effects and, accordingly, the cellular toxicity of ZFNs [19].

Since they were first reported in 1996, ZFNs have been successfully applied for gene modification in plants [17, 18]. ZFN technology has been used to edit the genomes of tobacco and *Arabidopsis* [18]. In tobacco, it was used to restore the function of the defective reporter gene *GUS:NPTII* [18]. In *Arabidopsis*, induction of ZFN expression under the control of a heat shock protein promoter led to 106 mutations, of which 83 (78%) were deletions of 1–52 bp, 14 (13%) were insertions of 1–4 bp, and 9 (8%) were deletions accompanied by insertions [18]. There are now many studies confirming that GE with ZFNs is possible in tobacco, *Arabidopsis*, maize, soybean, canola and other plants [8, 15, 18]. For example, ZFNs were used to introduce mutations into the endochitinase-50 gene (*CHN50*) in tobacco, leading to the emergence of resistance to *pat* herbicides [18]. Similar results were obtained by targeted editing of the *IPK1* (inositol-1,3,4,5,6-pentakisphosphate kinase 1) gene, which is responsible for phytic acid biosynthesis in maize. ZFN-based targeting of the *ABI4* (ABA Insensitive-4) gene in *Arabidopsis*, the Dicer-like genes (*DCL4a* and *DCL4b*) in soybean, and the alcohol dehydrogenase and chalcone synthase genes in *Arabidopsis* has also been reported [18].

However, despite their rather successful use, ZFNs have not become widespread as a GE tool because of a number of disadvantages. The main ones are the complexity and high cost of the technology, the need to construct protein domains for each specific genomic locus [18], and the likelihood of inaccurate cleavage of the target DNA due to single-nucleotide substitutions or incorrect interaction between domains [8, 15, 17–19].

##### *2.1.2 Transcription activator-like effector nucleases (TALENs)*

Like ZFNs, TALENs are enzymes that consist of a specific DNA-binding domain, made up of highly conserved repeats derived from transcription activator-like effectors (TALEs), fused to FokI [20]. TALE domains contain 15–30 tandem repeats of 33–34 highly conserved amino acids each [20]. The exceptions are the 12th and 13th amino acid residues, which are highly variable (repeat-variable diresidues, RVDs) [17, 20]. This makes it possible to establish a recognition code in which the pair of such amino acids within each repeat specifies a particular nucleotide [20]. The code is degenerate, but there is a clearly pronounced preference for some amino acid combinations [17, 20]. This allows recombinant proteins to be designed that recognize specific DNA sequences [20]. TALEN activity depends on the number of amino acids between the TALE domain and FokI, as well as on the number of bases between the binding sites [17, 20].

In contrast to a ZF, each repeat in the TALE domain recognizes a single nucleotide [17, 20]. A TALE domain recognizes 15–30 nucleotides, which amounts to 30–60 nucleotides for each TALEN dimer after FokI dimerization. Moreover, although TALE domains have higher binding specificity, they can still tolerate nucleotide mismatches [12, 13, 17]. As with ZFNs, heterodimerization of FokI is applied to minimize the off-target effects of TALENs [21].
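
The one-repeat-per-nucleotide rule lends itself to a small lookup sketch in Python. The specific RVD-to-base pairs used here (NI for A, HD for C, NN for G, NG for T) are the commonly cited preferences; they are not listed in this chapter and should be read as an assumption of the example, as should the function names.

```python
# Commonly cited RVD preferences (assumption, not taken from this chapter).
# The code is degenerate: NN, for instance, is also reported to bind A.
RVD_TO_BASE = {"NI": "A", "HD": "C", "NN": "G", "NG": "T"}
BASE_TO_RVD = {base: rvd for rvd, base in RVD_TO_BASE.items()}


def rvds_for_target(target):
    """Return the list of RVDs, one per repeat, that would read the target."""
    return [BASE_TO_RVD[base] for base in target.upper()]


def target_for_rvds(rvds):
    """Return the preferred DNA sequence for an ordered list of RVDs."""
    return "".join(RVD_TO_BASE[rvd] for rvd in rvds)


def talen_dimer_span(repeats_per_monomer):
    """Nucleotides recognized by a TALEN dimer (one nucleotide per repeat)."""
    return 2 * repeats_per_monomer


print(rvds_for_target("ACGT"))                    # ['NI', 'HD', 'NN', 'NG']
print(talen_dimer_span(15), talen_dimer_span(30)) # 30 60
```

The last line reproduces the 30–60 nucleotide range quoted for a TALEN dimer built from monomers with 15–30 repeats.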

Analysis of the mutations that occur during GE with TALENs shows that deletions are far more frequent than insertions (89% versus 1.6%). The reason is the greater length of TALEN spacers, which leaves longer protruding ends on the DNA fragments after DSBs [22].

Theoretically, TALENs make it possible to introduce a DSB into any part of the genome. There is only one limitation: a thymidine must be present immediately upstream of the 5' end of the target sequence at each TALEN recognition site. However, varying the spacer length makes it possible to select suitable cleavage sites [20].
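
The thymidine requirement can be expressed as a simple filter over candidate sites. The Python sketch below is a minimal illustration that scans only the given strand; the function name and the default site length of 15 nucleotides (the lower end of the 15–30 range mentioned earlier) are assumptions of the example.

```python
def talen_candidate_sites(seq, site_len=15):
    """Return 0-based start indices of sites of length site_len whose
    immediately preceding base (position start - 1) is a T."""
    seq = seq.upper()
    last_start = len(seq) - site_len
    return [start for start in range(1, last_start + 1) if seq[start - 1] == "T"]


print(talen_candidate_sites("ATGCCTA", site_len=3))  # [2]
```

In the toy sequence only the site starting at index 2 is preceded by a T and still fits within the sequence.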

##### *2.1.3 Clustered regularly interspaced short palindromic repeats (CRISPR/Cas9)*

CRISPR/Cas technology makes it possible to introduce various changes into a DNA sequence [12–16]. Moreover, this GE technology is much cheaper, faster, more efficient and simpler to apply in practice than ZFNs or TALENs [17, 23]. It is based on a mechanism of adaptive "immunity" discovered in bacteria: a specific antiviral defense of bacterial cells that relies on complementary binding of viral DNA followed by its destruction [7–10, 12–14].

In this system, small guide RNAs (crRNAs) are used for sequence-specific interference with foreign nucleic acids. CRISPR/Cas includes a genetic locus, called CRISPR, that contains short repeats separated by unique sequences (spacers) [24–29]. The CRISPR array is preceded by an AT-rich leader sequence and flanked by *cas* genes [24–27].

Depending on the classification of the *cas* genes, CRISPR/Cas systems are divided into two classes. Class 1 CRISPR/Cas systems (types I, III, and IV) use protein complexes of several Cas proteins for interference, while class 2 systems (types II, V, and VI) use a single effector protein [26, 28, 29]. Type I systems are characterized by the presence of Cas3. Type II systems use Cas1, Cas2, Cas9 and Csn2 or Cas4, and type III systems use Cas10, the role of which has not yet been identified [29].

Currently, type II CRISPR/Cas is the system most often used for genome editing. It contains the protein Cas9, which is necessary for interference and bacterial immunity [26]. This genome editing system is considered in more detail below.

To edit target genes, CRISPR/Cas9-based GE requires CRISPR-associated protein 9 (Cas9), CRISPR RNA (crRNA), trans-activating crRNA (tracrRNA) and ribonuclease III (RNase III) [12–14]. The crRNA and tracrRNA are combined into a single guide RNA (sgRNA) [28].

Cas9 is an endonuclease that cleaves double-stranded DNA (dsDNA) [24–27]. This nuclease has been isolated from various bacteria, such as *Brevibacillus laterosporus*, *Staphylococcus aureus*, *Streptococcus pyogenes* and *Streptococcus thermophilus* [24–26]. Cas9 from *Streptococcus pyogenes* (SpCas9) is the one most often used for genome editing [27].

Cas9 contains two nuclease domains, His-Asn-His (HNH) and RuvC-like, which cleave dsDNA 3 bp upstream of the protospacer adjacent motif (PAM) (5′-NGG or 5′-NAG for SpCas9) [24, 25, 29]. The HNH domain cleaves the DNA strand complementary to the crRNA, while the RuvC-like domain cleaves the opposite strand of the dsDNA [47]. The resulting DSBs are then repaired by NHEJ or HDR [24–29].

The sgRNA is a synthetic RNA of about 100 nucleotides that combines the crRNA and tracrRNA. Its 5' end contains a 20-nucleotide guide sequence that identifies the target sequence, which in the target DNA is followed by the consensus PAM sequence [27]. Its 3' end has a loop structure that helps fix the guide sequence to the target site and interacts with Cas9 to form a ribonucleoprotein complex (RNP), which generates a DSB in the target DNA region [27, 29].

Efficient DNA cleavage is provided by the ribonucleoprotein complex (RNP) [29]. The crRNA plays an important role in recognizing the target DNA: its sequence directs the RNP to a specific locus by base pairing with the target DNA, forming an R-loop structure [29]. R-loop formation activates the HNH and RuvC-like endonuclease domains of Cas9, which cleave the dsDNA and create a blunt-ended DSB 3 bp upstream of the PAM site [25]. Thus, CRISPR/Cas9 gene editing proceeds in three stages. In the first stage, Cas9 is expressed. In the second stage, an sgRNA containing 20 nucleotides complementary to the target region is generated. The third stage requires an NGG PAM recognition site located at the 3' end of the target region; the RNP, guided by the sgRNA, generates a blunt-ended DSB 3 bp upstream of the PAM site [25].
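
The targeting rule described here (a 20-nucleotide protospacer, an NGG PAM at its 3' end, and a blunt cut 3 bp upstream of the PAM) can be sketched in a few lines of Python. This is a simplified illustration that scans only the given strand; the function name `find_spcas9_targets` and the example sequence are hypothetical.

```python
import re


def find_spcas9_targets(seq, guide_len=20):
    """List candidate SpCas9 targets on the given strand.

    Each hit is a dict with the protospacer, its PAM, the 0-based start of
    the protospacer, and cut_index, the index of the first base after the
    blunt cut (3 bp upstream of the PAM).
    """
    seq = seq.upper()
    # Lookahead so that overlapping candidates are all reported.
    pattern = re.compile(rf"(?=([ACGT]{{{guide_len}}})([ACGT]GG))")
    targets = []
    for match in pattern.finditer(seq):
        start = match.start()
        pam_start = start + guide_len
        targets.append({
            "protospacer": match.group(1),
            "pam": match.group(2),
            "start": start,
            "cut_index": pam_start - 3,
        })
    return targets


example = "ACGT" * 5 + "TGG" + "AAAA"  # hypothetical target followed by a PAM
for hit in find_spcas9_targets(example):
    print(hit["start"], hit["protospacer"], hit["pam"], hit["cut_index"])
# 0 ACGTACGTACGTACGTACGT TGG 17
```

Targets on the opposite strand could be found by running the same function on the reverse complement of the sequence.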

One limitation of CRISPR/Cas9 is that sgRNA design requires a PAM sequence at the 3' end of the target [24–26]. For SpCas9 this sequence is 5'-NGG-3' [25]. The frequency of PAM sites in plant genomes, as determined *in silico*, is 5–12 sites per 100 bp [30]. This makes it difficult to identify a unique target sequence, especially in large genomes with numerous repetitive sequences, such as those of maize, cotton and wheat [13]. This is one of the factors behind the off-target effects of CRISPR/Cas9 [30]. To reduce such unintended effects, other Cas9 variants can be used, for example Cas9 from *S. aureus*, which recognizes the less common NNGRRT PAM [31], or mutant SpCas9 variants that recognize non-canonical PAMs [32].
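
An *in silico* PAM count of the kind mentioned above can be approximated with a short Python sketch. It counts PAM occurrences on both strands and normalizes them per 100 bp; R is expanded to A or G according to the IUPAC code. The function names are hypothetical, and the toy sequences are far too short to say anything about real genomes.

```python
import re

# Minimal IUPAC expansion covering the symbols used in NGG and NNGRRT.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "[ACGT]", "R": "[AG]"}


def reverse_complement(seq):
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def count_pam_sites(seq, pam):
    """Count (possibly overlapping) PAM matches on both DNA strands."""
    body = "".join(IUPAC[symbol] for symbol in pam.upper())
    pattern = re.compile(f"(?={body})")
    forward = sum(1 for _ in pattern.finditer(seq.upper()))
    reverse = sum(1 for _ in pattern.finditer(reverse_complement(seq)))
    return forward + reverse


def pam_density_per_100bp(seq, pam):
    """PAM sites per 100 bp of sequence, both strands combined."""
    return 100 * count_pam_sites(seq, pam) / len(seq)


print(count_pam_sites("AGGTTTTTTT", "NGG"))        # 1
print(pam_density_per_100bp("AGGTTTTTTT", "NGG"))  # 10.0
print(count_pam_sites("AAGAGT", "NNGRRT"))         # 1
```

Running both PAM patterns over the same region gives a rough sense of how much rarer the longer NNGRRT motif is than NGG.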

#### **2.2 Oligonucleotide-directed mutagenesis (ODM)**

ODM is a site-directed mutagenesis tool that uses mutagenic DNA fragments 20–200 nucleotides in length [33]. In this approach, the fragment sequence matches a target sequence in the genome except for a single base pair, which is the intended mutation to be introduced into the genome [33, 34].
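
The design principle (an oligonucleotide identical to the target except for one base) can be illustrated with a minimal Python sketch. The function name `design_odm_oligo`, the default length of 41 nucleotides and the choice to centre the mismatch where possible are assumptions made for the example; only the 20–200 nucleotide range comes from the text.

```python
def design_odm_oligo(target, pos, new_base, length=41):
    """Return (start, oligo): a window of the target carrying one substitution.

    target   : sequence of the genomic target region
    pos      : 0-based position of the base to change
    new_base : the base to introduce at pos
    length   : oligonucleotide length, kept within the 20-200 nt range
    """
    target = target.upper()
    new_base = new_base.upper()
    if not 20 <= length <= 200:
        raise ValueError("length should be within the 20-200 nt range")
    if length > len(target):
        raise ValueError("target is shorter than the requested oligonucleotide")
    if not 0 <= pos < len(target):
        raise ValueError("pos is outside the target sequence")
    if len(new_base) != 1 or new_base not in "ACGT":
        raise ValueError("new_base must be one of A, C, G, T")
    if target[pos] == new_base:
        raise ValueError("new_base is identical to the original base")

    # Centre the mismatch when possible, shifting the window at the edges.
    start = max(0, min(pos - length // 2, len(target) - length))
    window = target[start:start + length]
    offset = pos - start
    oligo = window[:offset] + new_base + window[offset + 1:]
    return start, oligo


region = "ACGT" * 10  # hypothetical 40-nt target region
start, oligo = design_odm_oligo(region, pos=20, new_base="G", length=20)
mismatches = sum(a != b for a, b in zip(oligo, region[start:start + 20]))
print(start, oligo, mismatches)
```

The returned oligonucleotide differs from the corresponding genomic window at exactly one position, which is the property the approach relies on.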

In eukaryotic cells, the oligonucleotide for ODM passes through the cell membrane into the cytoplasm, then enters the nucleus and pairs with the complementary target DNA sequence. The mismatched nucleotide then triggers a specific sequence change in the target gene through the cell's mismatch repair mechanism. This two-stage process first requires annealing of the specific oligonucleotide to the target DNA, followed by repair of the nucleotide mismatch, which leads to a directed mutation [33]. This system was first demonstrated in mammalian cells and later in plants [35]. It should be noted, however, that in plants the oligonucleotide does not integrate into the genome, both because modifications of its 5' and 3' ends prevent DNA ligation and because endonucleases degrade the oligonucleotides [34].

ODM can be carried out with single-stranded DNA fragments (ssODNs), but their use is limited by a short intracellular half-life [13]. To overcome this disadvantage, stabilizing modifications of the ssODN are necessary. These include chimeraplasts (DNA/RNA duplexes modified by methylation), phosphorothioate modification, and ssODNs with a 5' Cy3 tag and a modified 3' idC reverse base [33]. In addition, ODM has a rather low efficiency, which correlates positively with the length of the oligonucleotide fragments. Increasing the ssODN length to 200 nucleotides was shown to raise the editing efficiency to as much as 0.05% in *Arabidopsis thaliana* [33]. The use of chimeraplasts did not increase the mutation frequency above the level of spontaneous mutations in *Nicotiana tabacum* or *Brassica napus* [36]. For this reason, to increase the mutagenesis efficiency and the frequency of target mutations, ODM is often combined with DSB-inducing reagents, such as nonspecific antibiotics, or TALENs and CRISPR/Cas9, as in *Arabidopsis* and *Linum usitatissimum* [37].
