separated by the tobacco etch virus (TEV) NIa protease recognition sequence (heptapeptide cleavage recognition sequence ENLYFQS) together with the NIa proteinase [16-18]. The utility of the NIa protease is limited by the presence of a nuclear-localizing signal (NLS) within the protease and by the amount of energy necessary to express the 49 kDa protease. It is also possible to use linker sequences that are putative substrates of known endogenous plant proteases [19].

To bypass the need for an endogenous or recombinant accessory protease acting on the translated polypeptide product, a different approach uses self-processing viral 2A peptide bridges [reviewed in 20, 21]. The designation "2A" derives from the systematic nomenclature of protein domains within the polyproteins of picornaviruses. In foot-and-mouth disease virus (FMDV) and some other picornaviruses, the oligopeptide 2A region of the polyprotein manipulates the ribosome to "skip" the synthesis of the glycyl-prolyl peptide bond at its own carboxyl terminus, leading to the release of the nascent protein and translation of the downstream sequence [22]. Known variously as "Skipping", "Stop-Carry On" and "StopGo" translation, this mechanism allows the stoichiometric production of multiple, discrete protein products from a single transgene [23, 24]. Several recent review articles have amply covered the role of 2A biotechnology in animal systems [20, 25]. This summary-review provides an up-to-date overview of 2A and covers the wider application of 2A-polyproteins to the expression of multiple proteins in plants.

**2. The 2A story — The end of the beginning**

#### **2.1. The co-translational model of 2A-mediated "cleavage"**

FMDV, like other members of the family *Picornaviridae*, is a non-enveloped RNA virus which contains a single-stranded, positive-sense RNA molecule of approximately 8500 nt that functions as an mRNA [26]. This (+) RNA encodes a high molecular mass polyprotein that undergoes co-translational processing to yield the structural proteins (1A, 1B, 1C and 1D, commonly known as VP4, VP2, VP3 and VP1, respectively), which comprise the viral capsid, and the non-structural proteins (2A, 2B, 2C, 3A, 3B, 3Cpro and 3Dpol) that control the viral life cycle within host cells [27]. The 2A oligopeptide is only 18 amino acids (aa) long (-LLNFDLLKLAGDVESNPG-) and is defined by the co-translational "cleavage" at its C-terminus and a post-translational cleavage at its N-terminus, mediated by the virus-encoded proteinase 3Cpro [28]. Analysis of recombinant FMDV polyproteins [29] and artificial polyprotein systems in which 2A was inserted between two reporter proteins [22] showed that just 2A, plus the N-terminal proline of the downstream protein 2B, was sufficient for highly efficient co-translational "cleavage" (Figure 1, Panel A). Quantification of products using *in vitro* cell-free translation systems showed that the product upstream of 2A accumulated in a molar excess over that downstream, at variance with a proteolytic model of 2A, which predicts a 1:1 stoichiometry of the cleavage products [23, 30, 31]. We and others have shown that 2A is not a proteinase, nor a substrate for a host-cell proteinase, but an autonomous element mediating a co-translational "recoding" event [27, 29]. From these observations we proposed a model of the 2A reaction based on hydrolysis of the nascent chain from ribosome-associated tRNA at the peptidyl-transferase centre [23, 24, 30]. For in-depth reviews of the model see [24, 32, 33].

**Figure 1. Schematic overview of 2A function and gene fusion constructs.** Panel A: Two individual polypeptides can be generated from one transcript using F2A to link the individual genes. Panel B: F2A cleavage efficiency. All constructs shared a common core consisting of CFP-F2A-RABD2a. Panel C: Sequences encoding HA-tagged CAH1 were fused in-frame to wild type and mutant versions of genes encoding the GTPases RABD2a, SAR1, and ARF1 linked by F2A; the pre-protein has the endomembrane targeting sequence. These synthetic polyproteins were efficiently cleaved when transiently expressed in protoplasts and *in planta*. CFP, enhanced cyan fluorescent protein; GUS, β-glucuronidase; SP, ER signal peptide of CAH1 protein; GS, Golgi targeting signal of N-acetylglucosaminyl transferase 1; RABD2a, *Arabidopsis* RABD2a GTPase; SAR1, *Nicotiana tabacum* SAR1p; ARF1, *Arabidopsis* ADP-ribosylation factor 1; HA, hemagglutinin epitope tag; CAH1, *Arabidopsis* α-CAH1 (adapted from [60]).

2A comprises two parts: an N-terminal region (without sequence conservation) predicted to form an alpha helix, and a C-terminal motif, -DxExNPG-, followed by a proline required for the reaction. Recently it was shown that the synonymous codon usage of this conserved motif is biased [34]: the amino acids E, S, N, P, G and P tend to use GAG, TCC, AAC, CCT, GGG and CCC, respectively. The results also indicate that the synonymous codon usage of the 2A peptide has no effect on 2A activity. In summary, our results indicate that the conserved -DxExNPG- motif is conformationally restricted within the peptidyl transferase centre (PTC) of the ribosome, where it forms a tight turn. This shifts the ester bond between the C-terminal glycine and tRNAGly (in the P site of the ribosome) into a conformation that rules out nucleophilic attack by prolyl-tRNAPro (in the A site), so no peptide bond is formed. Although no stop codon is involved, eukaryotic translation release (termination) factors 1 and 3 (eRF1/eRF3) release the nascent protein from the ribosome [35-37]. Due to its mode of action, the 2A peptide has been described as a "*cis*-acting **hy**drola**s**e **el**ement" (CHYSEL) [32]. Our model of this translational recoding event predicts two outcomes: either ribosomes terminate translation, or translation of the downstream sequences resumes. Skipping induced by 2A sequences gives approximately equal expression of the proteins upstream and downstream of the 2A site as measured by: i) CAT and GUS enzyme activity [38]; ii) cell-free translation *in vitro* and Western blot [22, 23, 31, 39, 40]; iii) GFP/FACS with antibiotic resistance [41]; iv) co-fluorescence reporting [42, 43]; v) fluorescence resonance energy transfer (FRET) analysis [44]; and vi) protein segregation in transgenic animals [45, 46]. Since these sequences act co-translationally, artificial polyprotein systems may include signal sequences to localize different protein translation products to discrete sub-cellular sites.
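The outcome of the skipping event described above can be illustrated with a short sketch. It takes a polyprotein sequence, finds each -DxExNPG- motif that is followed by a proline, and reports the translation products expected if skipping occurs at every site: the upstream product retains 2A as a C-terminal extension, and the downstream product begins with proline. The function name and the two short flanking sequences are hypothetical and chosen only for illustration; the 2A sequence is the 18-aa FMDV peptide quoted earlier.

```python
import re

# -DxExNPG- followed by the proline that starts the downstream product.
# The lookahead keeps that proline out of the match, so it stays downstream.
SKIP_SITE = re.compile(r"D.E.NPG(?=P)")

F2A = "LLNFDLLKLAGDVESNPG"  # 18-aa FMDV 2A, as quoted in the text


def predicted_products(polyprotein: str) -> list[str]:
    """Return the products expected if skipping occurs at every 2A site."""
    products = []
    start = 0
    for site in SKIP_SITE.finditer(polyprotein):
        # The "break" falls between the C-terminal Gly of 2A and the next Pro.
        products.append(polyprotein[start:site.end()])
        start = site.end()
    products.append(polyprotein[start:])
    return products


# Hypothetical toy flanking sequences (not real proteins).
UPSTREAM = "MASKGEELF"
DOWNSTREAM = "PMVSKGEED"  # begins with the required proline

if __name__ == "__main__":
    for product in predicted_products(UPSTREAM + F2A + DOWNSTREAM):
        print(product)
```

The sketch assumes complete skipping at every site; it does not model uncleaved fusion protein or ribosomes that terminate after releasing the upstream product.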

#### **2.2. 2A and 2A-like sequences**

Probing databases for the presence of the "signature" motif (-DxExNPGP-) showed that "2A-like" sequences were present in several genera of the *Picornaviridae* (aphtho-, cardio-, erbo-, tescho- and certain parechoviruses), single-stranded RNA insect viruses (iflaviruses, dicistroviruses, tetraviruses), double-stranded RNA viruses of the *Reoviridae* (type C and non-ABC rotaviruses, cypoviruses) and penaeid shrimp viruses [47, reviewed in 20, 48]. Previously we demonstrated the activity of "2A-like" sequences within non-long terminal repeat retrotransposons (non-LTRs) of *Trypanosoma brucei*, *T. cruzi*, *T. vivax* and *T. congolense* [31, 49] and, more recently, within the non-LTRs of a wide range of multicellular organisms: *Xenopus tropicalis* (Western clawed frog, vertebrate), *Branchiostoma floridae* (Amphioxus, Florida lancelet, cephalochordate), *Strongylocentrotus purpuratus* (purple sea urchin, echinoderm), *Aplysia californica* (California sea slug, mollusc), *Crassostrea gigas* (Pacific oyster, mollusc), *Lottia gigantea* (owl limpet, mollusc) and *Nematostella vectensis* (sea anemone, cnidarian) [50]. Presently, *in silico* searches have identified the 2A motif in a range of putative retrotransposon domains (Table 1).
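A database search of the kind described above reduces, at its simplest, to a pattern match on protein sequence. The sketch below scans a handful of sequences for -DxExNPGP- and returns the 1-based position of each hit. The viral entries are the 2A sequences listed in Table 2; the negative control and the function name are hypothetical, and a real search would still need candidate sequences to be tested for activity.

```python
import re

# Signature motif from the text: D-x-E-x-N-P-G-P, where x is any residue.
SIGNATURE = re.compile(r"D.E.NPGP")

# 2A sequences as listed in Table 2, plus one made-up negative control.
CANDIDATES = {
    "FMDV": "PVKQLLNFDLLKLAGDVESNPGP",
    "ERAV": "QCTNYALLKLAGDVESNPGP",
    "PTV-1": "ATNFSLLKQAGDVEENPGP",
    "TaV": "EGRGSLLTCGDVESNPGP",
    "control (hypothetical)": "MKTAYIAKQRQISFVKSHFSRQ",
}


def find_signature(sequence: str) -> list[int]:
    """Return 1-based start positions of every -DxExNPGP- match."""
    return [match.start() + 1 for match in SIGNATURE.finditer(sequence)]


if __name__ == "__main__":
    for name, sequence in CANDIDATES.items():
        hits = find_signature(sequence)
        print(f"{name}: {hits if hits else 'no motif found'}")
```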

The –DxExNPGP- motif conserved among 2A/2A-like sequences is shown in red.

**Table 1.** Active 2A cellular sequences.

Chimeric polyproteins incorporating 2A have been widely tested in eukaryotic systems, including mammalian [22], plant [38], insect [51], yeast [39] and fungal cells [52]. The 2A system does not work in prokaryotic cells: the reported proteolysis activity of 1D-2A in *Escherichia coli* cells [53] was not detected in equivalent constructions in our laboratory, showing "cleavage" specificity for eukaryotic systems alone [54]. The unique activity of 2A peptides has led to their use as tools for co-expression of two (or more) proteins in biomedicine and biotechnology [reviewed in 20, 21, 55]. The most widely used 2A sequence is derived from FMDV (hereafter referred to as "F2A") [42]. Other 2A peptides used successfully include "T2A" from *Thosea asigna* virus (TaV), "E2A" from equine rhinitis A virus (ERAV) and "P2A" from porcine teschovirus-1 (PTV-1) (Table 2). Comparing the *in vitro* activity of different 2As inserted between GFP and GUS, we have shown that T2A20 has the highest cleavage efficiency, followed by E2A20, P2A20 and F2A20 [31]. In 2A peptide-linked TCR:CD3 constructs, Szymczak and colleagues demonstrated that F2A22 and T2A18 have higher efficiency than E2A20 [44]. In human cell lines, zebrafish and mice, cleavage and targeting of NLS-EGFP and mCherry-CAAX to the nucleus and plasma membrane, respectively, was most efficient in P2A19-linked constructs, followed by T2A18, E2A20 and F2A22 [56]. To allay public fears of, and opposition to, plants carrying a transgenic viral sequence, efficient 2A-like cellular sequences could be used (Table 1) (unpublished data).


**Table 2.** Examples of 2A/2A-like sequences used in biomedicine and biotechnology

| Virus | Abbreviation | 2A/2A-like sequence | References |
|---|---|---|---|
| *Picornaviridae* | | | |
| Foot-and-mouth disease virus | FMDV | -PVKQLLNFDLLKLAGDVESNPGP- | [38, 44, 83, 99, 106, 150] |
| Equine rhinitis A virus | ERAV | -QCTNYALLKLAGDVESNPGP- | [44] |
| Porcine teschovirus-1 | PTV-1 | -ATNFSLLKQAGDVEENPGP- | [45] |
| *Tetraviridae* | | | |
| *Thosea asigna* virus | TaV | -EGRGSLLTCGDVESNPGP- | [44, 46] |

The –DxExNPGP- motif conserved among 2A/2A-like sequences is shown in red.

#### **2.3. Intracellular protein targeting of 2A constructs**

For effective technologies, some synthesized proteins must be transported across membranes and directed towards other sites in order to function. Protein targeting occurs either co-translationally (targeting to endoplasmic reticulum [ER], Golgi, vacuole, plasma membrane) or post-translationally (targeting to nucleus, mitochondria, chloroplast, etc.) and is orchestrated by distinct signal sequences encoded within the polypeptide [42]. In plants, the original FMDV 2A sequence was tested in various artificial polyproteins using the reporter genes chloramphenicol acetyltransferase (CAT), β-glucuronidase (GUS) and green fluorescent protein (GFP) expressed in transgenic tobacco plants. This preliminary series of studies suggested that 2A cleaves proteins properly in plant cells [38, 57] and directs protein targeting to different cellular compartments *via* either co- or post-translational mechanisms [58]. Subsequently, Samalova and co-workers questioned its use in plant systems, suggesting that the 2A sequence was dispensable for efficient cleavage of polyproteins carrying a single internal signal peptide; it appears that signal peptide cleavage by signal peptidase was responsible for processing the polyprotein. The use of a self-cleaving 2A was required when both halves of the fusion were translocated across the ER membrane; however, the upstream product was mis-sorted to the vacuole. Furthermore, it was shown that the FMDV 2A peptide resulted in low rates of polypeptide separation in plant cells when placed downstream of common fluorescent proteins (GFP and RFP derivatives) [43].

The *Arabidopsis* carbonic anhydrase (CAH1) is one of the few plant proteins known to be targeted to the chloroplast *via* the secretory pathway; the pre-protein has the endomembrane targeting sequence. The need for post-translational modifications, such as N-glycosylation, for proper folding, and to enhance stability and/or function of these proteins probably explains the use of this alternative trafficking pathway [59]. Recently, the FMDV 2A co-expression system was re-assessed to study the effects of three Ras-like small GTPase proteins, RABD2a, ARF1 and SAR1, on CAH1 protein trafficking in plant cells [60]. Members of this superfamily share several common structural features and act as molecular switches that regulate many aspects of plant vesicular transport [61, 62]. Rabs regulate virtually all steps of membrane traffic, from the specification of membrane identity to the accuracy of vesicle targeting [63]. ARF1 has been shown to play a critical role in COPI-mediated retrograde trafficking, while SAR1 is involved in COPII-mediated ER-to-Golgi protein transport [reviewed in 64]. In this study, targeting information and the sequence N-terminal of 2A proved to be important for efficient cleavage when translated by membrane-bound and cytosolic ribosomes, respectively (Figure 1, Panel B). In addition, the expected subcellular localization of the fluorescent marker protein suggested no significant mis-targeting of the 2A-tagged markers. After optimization of 2A cleavage efficiency, mutant forms of the three small GTPases (HA-CAH1-2A-RABD2a/SAR1/ARF1; Figure 1, Panel C) were successfully used to study trafficking of CAH1 through the endomembrane system, demonstrating the versatility of 2A in plant systems.

Exchange factors for ARF GTPases (ARF-GEFs) regulate vesicle trafficking in a variety of organisms. In animals and fungi there are eight ARF-GEF families, but only the apparently ancestral GBF and BIG families are present in plants, suggesting that plant ARF-GEFs have acquired multiple roles in different trafficking pathways [65, 66]. In *Arabidopsis*, the ARF-GEF GNOM-like 1 (GNL1) and its close homologue GNOM jointly regulate the retrograde COPI-mediated traffic from the Golgi to the ER, which is the ancient eukaryotic function of the GBF1 class [67]. Another line of research by Teh and Moore (2007) revealed that secretory traffic is resistant to the trafficking inhibitor brefeldin A (BFA), whereas endosomal recycling involves GNOM: GNL1 is a BFA-resistant GBF protein that functions with the BFA-sensitive ARF-GEF GNOM [68]. The 20aa 2A peptide from FMDV was used in this study to construct polyproteins that expressed trafficked fluorescent protein markers in fixed stoichiometry in different cellular compartments: N-ST-RFP-2A-GFP-HDEL produces a Golgi-localized RFP (red) and an ER-localized GFP (green), while N-secRFP-2A-GFP-HDEL produces an ER-localized GFP and an RFP that is targeted to the vacuole *via* the ER and Golgi.

#### **2.4. The use of 2A multigene expression strategies in plant science — Caveats and proposals**

The take-home message from F2A mutagenesis experiments is that the sequence is largely intolerant to amino acid substitution over its entire length [31, 37]. While mutations of conserved amino acids have, in general, more pronounced effects than changes to non-conserved ones [31], variations at most positions within the peptide reduce activity: 2A peptides are optimized to function as a whole [37]. Sequences immediately upstream of 2A are known to be either critical or very important for activity [57, 69-72]. Longer versions of F2A with extra sequences derived from the capsid protein ("1D"), which lies upstream of 2A in the FMDV polyprotein, produce higher levels of cleavage [23, 29, 47]. Specifically, N-terminal extension of 2A by 5aa of 1D improved "cleavage", but extension by 14aa of 1D or longer (21 and 39aa) produced complete "cleavage" and an equal stoichiometry of the up- and downstream translation products [23]. After "fine-tuning" of the F2A sequence we suggest that researchers opt for F2A30 (+11aa 1D). This 2A proved to be the most favourable in terms of both length and cleavage efficiency and was unaffected by the sequence of the upstream gene [73, 74]. In the case of shorter 2As, cleavage efficiency has been improved by insertion of various spacer sequences such as Gly-Ser-Gly or Ser-Gly-Ser-Gly [41, 44, 45, 75-77], the V5 epitope tag (-GKPIPNPLLGLDST-) [78], or a 3xFlag epitope tag [79] ahead of the 2A sequence. If opting for a shorter sequence, users should be aware that activity can be affected by the short amino acid tract, introduced by the cloning strategy, that links the upstream protein with 2A. For example, the F2A20 encoded by pGFP-F2A20-GUS was highly active [29, 31], whereas the activity of that encoded by pGFP-F2A20-CherryFP was noticeably lower [73]. The only difference was the short "linker" between GFP and F2A created by the cloning strategy: -SGSRGAC- (pGFP-F2A20-GUS; linker derived from *Xba*I and *Sph*I restriction sites) and -RAKRSLE- (pGFP-F2A20-CherryFP; linker derived from a furin cleavage site and an *Xho*I restriction site) [73]. Taken together, these observations are consistent with our translational model, in which 2A activity is a product of its interaction with the exit tunnel of the ribosome, which is thought to accommodate a nascent peptide of 30-40 amino acids [80].

When using the 2A system, it should be noted that the 2A oligopeptide remains as a C-terminal extension of the upstream fusion partner and that the downstream protein must have an N-terminal proline residue. Although an N-terminal proline confers a long half-life upon a protein [81], it does prevent many N-terminal post-translational modifications that may be essential for activity. If this is the case, proteins that require authentic termini can be introduced as the first polyprotein domain. The targeting of proteins to different subcellular locations within plant cells by C-terminal localization signals may be compromised if they carry a 2A extension. In the case of proteins translocated into the ER, a strategy was adopted to include a furin proteinase cleavage site between the upstream protein and 2A [82, 83]. Furin is a subtilisin-like serine endoprotease that cleaves precursors on the C-terminal side of the consensus sequence -Arg-X-Lys/Arg-Arg↓ (-RX(K/R)R-) in the *trans*-Golgi network (TGN) [84, 85]. The furin cleavage sequences -↑RKRR-, -↑RRRR- and -↑RRKR-, which consist only of basic amino acids that can be efficiently removed by carboxypeptidases (↑), were used to remove 2A peptide-derived amino acids from the upstream antibody heavy chain during protein secretion [83]. Proteins expressed in plants could have their 2A extensions removed by endogenous proteinases acting on similar hybrid linker peptides. In 2004, François and colleagues connected the first nine amino acids (SN↑AADEVAT) of the LP4 peptide to the 20aa F2A to generate a hybrid linker peptide, LP4-2A [86]. LP4 is the fourth linker peptide of a naturally occurring polyprotein precursor originating from seeds of *Impatiens balsamina* [87]. Cleavage of the polyprotein with plant defensin DmAMP1 from *Dahlia merckii* at its amino-terminus and plant defensin RsAFP2 from *Raphanus sativus* at its carboxy-terminus resulted in the release and targeting of DmAMP1-SN and RsAFP2 to different cellular compartments [86]. Recently, 2A and LP4-2A were used to connect the *Bacillus thuringiensis* (Bt) *cry1Ah* gene, which encodes a protein exhibiting strong insecticidal activity, and the *mG2-epsps* gene, which encodes a protein tolerant to glyphosate, the world's most important and widely used herbicide [72]. The expression level of the two genes linked by LP4-2A was higher than that of the genes linked by 2A, regardless of the order of the genes within the vector. Furthermore, tobacco plants transformed with the LP4-2A fusion vectors showed better pest resistance and glyphosate tolerance than plants transformed with the 2A fusion constructs.
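The furin consensus described above, -RX(K/R)R-, can be checked programmatically when designing a linker. The sketch below tests whether a sequence contains the consensus and, if so, returns the two fragments expected after cleavage on the C-terminal side of the site. The three all-basic sites are those quoted in the text; the function name and the placement of each site directly ahead of 2A are hypothetical, and the sketch does not model the subsequent removal of basic residues by carboxypeptidases.

```python
import re
from typing import Optional

# Furin consensus from the text: Arg-X-(Lys/Arg)-Arg, cleaved after the last Arg.
FURIN_SITE = re.compile(r"R.[KR]R")

# All-basic furin sites quoted in the text.
BASIC_SITES = ["RKRR", "RRRR", "RRKR"]

F2A = "LLNFDLLKLAGDVESNPG"  # 18-aa FMDV 2A, as quoted in the text


def cleave_at_furin(sequence: str) -> Optional[tuple[str, str]]:
    """Split at the first furin site; return (upstream, downstream) or None."""
    match = FURIN_SITE.search(sequence)
    if match is None:
        return None
    # Furin cuts on the C-terminal side of the consensus sequence.
    return sequence[:match.end()], sequence[match.end():]


if __name__ == "__main__":
    for site in BASIC_SITES:
        # Hypothetical linker: furin site placed directly ahead of 2A.
        print(site, cleave_at_furin(site + F2A))
```

Applied to the -RAKRSLE- linker mentioned earlier, the same pattern matches RAKR, consistent with that linker being derived from a furin site.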
