268 Gene Duplication

[Figure 7: phylogenetic tree. Clade labels: Mammalian Factor X; Reptilian Factor X; Group C Prothrombin Activators; Group D Prothrombin Activators. Taxon labels: Human FX, Murine FX, Rat FX, Rabbit FX, Bovine FX, Zebrafish FX, Trocarin D, *Tropidechis carinatus* FX, *Pseudonaja textilis* PFX1, *Pseudonaja textilis* PFX2, Pseutarin C (PCCS), *Pseudechis porphyriacus* vPA, *Hoplocephalus stephensii* vPA, *Notechis scutatus* vPA, *Oxyuranus microlepidotus* vPA, *Oxyuranus scutellatus* vPA. Scale bar: 0.05.]

Fig. 7. Phylogenetic relationships of snake venom and plasma prothrombin activators with other known FX sequences (Reza et al. 2006). "vPA" is an abbreviation for venom prothrombin activators. Arrows indicate the three independent "recruitment" events of snake venom prothrombin activators.

**5. Comparison of trocarin D and TrFX genes**

In the previous sections, we have described how the venom prothrombin activators have been modified to gain certain characteristics, such as resistance to inactivation, which enable them to function better as toxins. However, differential and tissue-specific expression of venom prothrombin activators and their plasma coagulation factor counterparts is also important for their respective physiological roles. The expression of toxins should be venom gland-specific and inducible to high levels, so that the snake can protect itself against its own venom toxins and replenish its venom supply quickly. Conversely, plasma coagulation factors are mainly expressed in the liver at constitutively low levels so that they can be activated to induce blood coagulation upon vascular injury.

To understand how the venom prothrombin activators are regulated for tissue specificity and level of expression, we determined the gene structure of trocarin D and TrFX. Based on the cDNA sequences of trocarin D and TrFX, and that of mammalian FX gene, primers were

**5.1 *Cis*-elements in trocarin D promoter region**

The overlapping promoter regions of trocarin D and TrFX were characterized by comparing them with the previously characterized human (Hung et al. 2001; Hung and High 1996) and murine (Wilberding and Castellino 2000) FX promoter regions (Reza et al. 2007) (Figure 9). Based on these comparisons, four conserved *cis*-regulatory elements were identified in the trocarin D and TrFX promoter regions (Figure 9): (i) a CCAAT box (Hung et al. 2001; Hung and High 1996; Wilberding and Castellino 2000), (ii) a binding site for the gut-specific transcription factor GATA-4 (Hung et al. 2001), (iii) a binding site for the liver-specific transcription factor HNF-4 (Hung and High 1996), and (iv) multiple Sp1/Sp3 binding sites (Hung et al. 2001).
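As a rough illustration of how conserved *cis*-elements of this kind can be located by sequence, the sketch below scans both strands of an upstream region for two simplified consensus patterns. It is not the method used in the cited studies. The patterns (the literal CCAAT pentamer and the WGATAR consensus for GATA factors), the function names and the toy sequence are all assumptions chosen for illustration; real promoter analysis would rely on curated position weight matrices and experimental validation.

```python
import re

# Simplified consensus patterns (illustrative assumptions, not the cited method).
MOTIFS = {
    "CCAAT box": "CCAAT",
    "GATA site": "[AT]GATA[AG]",  # WGATAR consensus
}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def scan_promoter(upstream):
    """List (motif, strand, position) hits in a sequence ending just before the ATG.

    Positions are relative to the start codon: the last base of `upstream`
    is -1, and each hit is reported at its leftmost forward-strand base.
    """
    seq = upstream.upper()
    n = len(seq)
    rc = reverse_complement(seq)
    hits = []
    for name, pattern in MOTIFS.items():
        lookahead = f"(?=({pattern}))"  # lookahead also finds overlapping hits
        for m in re.finditer(lookahead, seq):
            hits.append((name, "+", m.start() - n))
        for m in re.finditer(lookahead, rc):
            # Map the reverse-strand match back to forward-strand coordinates.
            fwd_start = n - (m.start() + len(m.group(1)))
            hits.append((name, "-", fwd_start - n))
    return sorted(hits, key=lambda hit: hit[2])


# Toy sequence, invented for this example only.
print(scan_promoter("GGCCAATCGTGATAAGC"))
# [('CCAAT box', '+', -15), ('GATA site', '+', -8)]
```

Short consensus strings such as these match by chance very often, which is one reason the comparison with experimentally characterized human and murine promoters matters for interpreting any hit.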

Comparison of the trocarin D and TrFX promoter regions reveals that trocarin D has a 264 bp insertion (Figures 8 and 9). This 264 bp segment is located between -33 and -297 bp upstream of the trocarin D start codon (ATG) (Figure 9). The insertion is postulated to play a major role in the recruitment of the duplicated TrFX gene by causing it to be expressed exclusively in the venom gland as the procoagulant toxin, trocarin D. Hence, it was termed *VE*nom *R*ecruitment/*S*witch *E*lement (*VERSE*). This segment was characterized for its *cis*-elements and gene-regulatory role using luciferase assays in primary venom gland cells and mammalian cell lines (Kwong et al. 2009). The *VERSE* promoter was found to be responsible for the elevated expression level of trocarin D, but not for its tissue-specific expression. In terms of *cis*-element characterization, besides confirming the presence of two TATA boxes, one GATA box and one Y-box, three novel *cis*-elements were also identified (Figure 9). Both TATA boxes (TLB2 and TLB3) were found to be functional. However, TLB2 is the primary TATA box, which initiates transcription and directs the transcription start site (Kwong et al. 2009) (Figure 9).
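The relation between the reported coordinates and the 264 bp length can be checked with a short sketch. It assumes that -297 and -33 mark the boundaries of the segment, so that the segment holds 297 - 33 = 264 bases; this boundary convention, the function name `upstream_segment` and the placeholder sequence are assumptions made here for illustration, not details taken from the cited work.

```python
def upstream_segment(upstream, far, near):
    """Return the bases lying between `far` and `near` bp upstream of the ATG.

    `upstream` must end immediately before the start codon. The two
    coordinates are treated as boundaries (an assumption), so the
    returned segment contains far - near bases.
    """
    if not 0 <= near < far <= len(upstream):
        raise ValueError("coordinates fall outside the supplied sequence")
    n = len(upstream)
    return upstream[n - far : n - near]


# Placeholder sequence of unknown bases; the real promoter is not reproduced here.
placeholder = "N" * 400
verse_like = upstream_segment(placeholder, far=297, near=33)
print(len(verse_like))  # 264
```

Under an inclusive convention the same coordinates would span 265 positions, so the stated length of 264 bp is consistent with the boundary interpretation used above.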

Duplication of Coagulation Factor Genes and Evolution of Snake Venom Prothrombin Activators 271

[Figure 9: schematic of the promoter regions of Human FX, Murine FX, TrFX and Trocarin D. Element labels: CCAAT box, GATA-4, HNF-4, SP1/SP3, TLB1, TLB2, TLB3, Y box, and *VERSE* (264 bp, -297 to -33).]

Fig. 9. Comparison of promoter regions in mammalian and *T. carinatus* prothrombin activator genes (Reza et al., 2007).

**5.2 Comparison of trocarin D and TrFX first introns**

The intron 1 size of trocarin D is 7911 bp, while that of TrFX is 5293 bp. The difference in size is explained by three insertions and two deletions in the trocarin D intron 1 region (Reza et al. 2007) (Figure 10). Bioinformatics analysis of these insertion/deletion segments reveals that they are novel. The three insertion segments within intron 1 of trocarin D are 214 bp, 1975 bp and 2174 bp in size, with respective positions at 128 bp, 914 bp and 3300 bp on the trocarin D gene (Figure 10). Closer analysis of the insertion sequences shows that the first insert is almost an exact repeat (96.33% identity) of the intron 1 segment spanning from 3082 bp to 3299 bp. The other two inserts seem to be inverted repeats of each other (~71% identity). The third insert shows potential of being a Scaffold/Matrix Attachment Region (S/MAR) due to: (i) a high AT content (Cockerill and Garrard 1986; Liebich et al. 2002; Zhou and Liu 2001), (ii) a topoisomerase II site (Boulikas 1993), (iii) a S/MAR consensus motif (van Drunen et al. 1997), (iv) a significant over-representation of characteristic hexanucleotides (Liebich et al. 2002), and (v) an ATTA motif and an AT-rich region with H-box (Will et al. 1998). The two deletion segments within intron 1 of trocarin D are 255 bp and 1406 bp in size, with respective positions at 2610 bp and 3770 bp on the TrFX gene (Reza et al. 2007) (Figure 10). As the *VERSE* promoter of trocarin D does not regulate tissue-specific expression (Kwong et al. 2009), it is postulated that these insertions and deletions in intron 1 of trocarin D could contain *cis*-elements that are responsible for the venom gland-specific expression of trocarin D. In summary, the characterization of the *VERSE* promoter and intron 1 regions of trocarin D has increased our understanding of the gene regulation of venom prothrombin activators.

Fig. 10. Comparison of intron 1 regions in TrFX and trocarin D genes (Reza et al., 2007).

**6. Gene duplication in snake venom toxin diversification**

Besides "recruitment", gene duplication also plays an important role in the diversification of venom toxins. This diversification is essential for the development of novel toxins, and is evident from the many toxin isoforms present in snake venom. Interestingly, each isoform varies in its function and gene regulation. Gene duplication has also led to neofunctionalization of venom toxins, which has given rise to new families of snake venom toxins (St Pierre et al. 2008) and to the addition of new members within these families (Fry et al. 2003; Landan et al. 1991b; Lynch 2007; Moura-da-Silva et al. 1996). The three-finger toxin (3FTx) multigene family is a good example of neofunctionalization by gene duplication (Fry et al. 2003). Structurally, all the members of this family have very well-conserved cysteine residues and share a common structure of three beta-stranded loops extending from a central core. However, they exhibit a wide variety of pharmacological effects, including acetylcholinesterase inhibition (fasciculin from *Dendroaspis angusticeps* venom), neurotoxicity (α-bungarotoxin from *Bungarus multicinctus* venom), cardiotoxicity (β-cardiotoxin from *Ophiophagus hannah* venom), and many others (for details, see Kini and Doley 2010). Neofunctionalization occurs when a toxin gene undergoes gene duplication and the duplicated gene is mutated within the functional sites, which often results in new ligand-binding specificities (Kini 2002).

Besides neofunctionalization, changes in gene regulation are also an outcome of gene duplication. This can be seen in two isoforms present in the venom of *Naja sputatrix*: cardiotoxin and α-neurotoxin (Ma et al. 2001). Besides varying in function, these two isoforms have different expression levels in the venom gland. Cardiotoxin constitutes 60% of the venom, while the α-neurotoxin makes up only 3%. Gene duplication is evident from the gene comparison whereby the structures and amino acid sequence of these
