**3. Results and discussions**

## **3.1. Parasite proteins involved in the nuclear transport machinery**

Table 2 summarizes the protein sequences used in this *in silico* analysis. A total of 904 and 642 protein sequences were retrieved in FASTA format from the NCBI server and the UniProt/SwissProt database, respectively. Eighteen protein sequences with fewer than 100 amino acid residues were excluded from the study because they were considered not completely functional [17]. Hence, 1528 protein sequences were used for protein sequence clustering. Identity of 30% or more at the amino acid level is considered sufficient to imply functional relatedness [17]. Therefore, clustering the retrieved protein sequences at more than 30% similarity produced a non-redundant data set of 248 protein sequences.


**Table 2.** Summary of protein sequences retrieved in *in silico* analysis.
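The length filter and the sequence counts described above can be sketched as follows. This is a minimal illustration rather than the pipeline actually used in the study: the function names and the toy records are hypothetical, and only the 100-residue threshold and the counts (904, 642, 18) come from the text.

```python
from typing import Iterable, Iterator, List, Tuple

MIN_LENGTH = 100  # residues; shorter sequences were excluded in the text


def parse_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header, chunks = None, []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def filter_by_length(records: Iterable[Tuple[str, str]],
                     min_length: int = MIN_LENGTH) -> List[Tuple[str, str]]:
    """Keep only records with at least min_length residues."""
    return [(h, s) for h, s in records if len(s) >= min_length]


# Hypothetical toy input: one short record and one 120-residue record.
toy = [">short_seq", "MKT", ">long_seq", "M" * 60, "A" * 60]
kept = filter_by_length(parse_fasta(toy))

# Bookkeeping from the text: 904 + 642 retrieved, 18 excluded.
retained = 904 + 642 - 18
print([h for h, _ in kept], retained)
```

The same arithmetic confirms that the 1528 sequences passed to clustering are the 1546 retrieved sequences minus the 18 short ones.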

38 Bioinformatics

functional annotation which makes use of Conserved Domain Database (CDD) [12], Simple Modular Architecture Research Tool (SMART) [13] and InterPro [14] programs. The protein sequences were submitted in FASTA format as queries.

Post-translational modification (PTM) is the chemical modification of a protein after its translation. It is one of the later steps in protein biosynthesis, and thus gene expression, for many proteins. In this part of the study, in relation to regulatory aspects of the nuclear transport mechanism, we focused on potential glycosylation and phosphorylation sites. To analyze the post-translational modification sites, all protein sequences of *T. brucei* from TriTrypDB were subjected to the PROSITE [15] programme. The protein sequences were submitted in FASTA format as queries.

Protein-protein interactions occur when two or more proteins bind together, often to carry out their biological function. Proteins might interact for a long time to form part of a protein complex, a protein may be carrying another protein, or a protein may interact briefly with another protein just to modify it. To analyze the participation of parasite proteins in protein-protein interactions, all protein sequences of *T. brucei* from TriTrypDB were subjected to mining of the STRING 8.2 database [16]. The STRING 8.2 database integrates information from numerous sources, including experimental repositories, computational prediction methods and public text collections. The protein sequences were submitted in FASTA format as queries. All information on protein-protein interactions was recorded and evaluated accordingly.

The degree of similarity between amino acids occupying a particular position in the protein sequence can be interpreted as a rough measure of how conserved a particular region or sequence motif is. To compare the parasite proteins with human homologues, all protein sequences of *T. brucei* from TriTrypDB were subjected to BLASTp analysis against *Homo sapiens* proteins. The protein sequences were submitted in FASTA format as queries. Cutoff criteria of an E-value of less than 1e-06 and a score of more than 100 were used.

The BLASTp analyses against TriTrypDB, using a cutoff of an E-value of less than 1e-06 and a score of more than 100 for all 248 query protein sequences, resulted in 34 hits of parasite proteins. However, our approach failed to identify a Ran GTPase-activating protein (RanGAP) in this parasite. Reference [18] also reported that sequence similarity searches have been unable to identify a RanGAP protein in any protozoan. Keyword searches among annotated proteins in the *T. gondii* genome database identified one candidate, which was shown to have strong similarity to Ran-binding protein 1 (RanBP1) based on sequence analysis. Perhaps the RanGAP function in apicomplexans is performed by a single protein with multiple cellular responsibilities (i.e., a fusion of Ran-binding protein 1 and RanGAP). It is also possible that a completely unique parasite protein possesses the RanGAP function.
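The hit-selection rule stated above (E-value below 1e-06 and score above 100) can be expressed as a simple filter. The sketch below is illustrative only: the `Hit` class and the two failing hits are hypothetical, while the two passing hits reuse E-values and scores reported in Table 3.

```python
from dataclasses import dataclass
from typing import List

E_VALUE_CUTOFF = 1e-06  # hits must have an E-value below this
SCORE_CUTOFF = 100.0    # hits must have a score above this


@dataclass(frozen=True)
class Hit:
    subject: str
    evalue: float
    score: float


def passes_cutoffs(hit: Hit) -> bool:
    """Apply both cutoffs used in the text to a single BLASTp hit."""
    return hit.evalue < E_VALUE_CUTOFF and hit.score > SCORE_CUTOFF


hits: List[Hit] = [
    Hit("Tb927.3.1120", 1.70e-72, 718),            # values from Table 3
    Hit("Tb11.03.0140", 5.80e-09, 107),            # values from Table 3
    Hit("hypothetical_weak_hit", 2.0e-03, 45),     # invented, fails both cutoffs
    Hit("hypothetical_low_score", 1.0e-10, 80),    # invented, fails the score cutoff
]

accepted = [h.subject for h in hits if passes_cutoffs(h)]
print(accepted)
```

Requiring both conditions is stricter than either alone: a hit with a very small E-value but a score of 100 or less is still rejected.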

Table 3 shows the identified and characterized parasite proteins involved in the nuclear transport machinery. Functional annotation based on protein domains showed that, out of 34 hits, only 22 parasite protein sequences were predicted with a high confidence level to be involved in the nuclear transport mechanism, based on the presence of relevant protein domains. These include the guanosine triphosphate (GTP)-binding domain, Nucleoporin (NUP) C terminal domain, Armadillo repeat, Importin Beta N-terminal domain, regulator of chromosome condensation 1 (RCC1) repeat and Exportin domain (Table 4). All these protein domains have been experimentally verified to regulate the nuclear transport mechanism in eukaryotes. Seven *T. brucei* proteins exhibited functional features of the Importin receptor. This finding is consistent with the number of Importin receptors in another eukaryotic pathogen, *Toxoplasma gondii* [8]. In addition, our results for other nuclear transport constituents in *T. brucei*, such as RCC1, Ran, nuclear transport factor 2 (NTF2), cell apoptosis susceptibility (CAS), Exportin and Ran-binding proteins, were also in agreement with reference [18].

The nuclear and cytoplasmic compartments are divided by the nuclear envelope in eukaryotes. By using this compartmentalization and controlling the movement of molecules between the nucleus and the cytosol, cells are able to regulate numerous cellular mechanisms such as transcription and translation. Proteins with a molecular size lower than 40 kDa are able to diffuse passively through the nuclear pore complex (NPC), whereas larger proteins require active transport with the assistance of Karyopherins, specific transport receptors that shuttle between the nucleus and cytosol. Karyopherins, which are able to distinguish specific cargo molecules within the diverse proteome, can be subdivided into those that transport molecules into the nucleus (Importins) and those that transport molecules out of the nucleus (Exportins). It has been reported that more than 2000 proteins are shuttled between the nucleus and the cytoplasm in yeast [19]. From our results, with the identification of Karyopherin and Nucleoporin proteins in *T. brucei*, we expect that the parasite employs the typical components of the nuclear transport machinery.

Investigation on Nuclear Transport of *Trypanosoma brucei*: An *in silico* Approach 41

| Subject sequence | E-value | Score | Functional protein domains |
| --- | --- | --- | --- |
| Tb927.3.1120 | 1.70E-72 | 718 | Ran GTPase, GTP-binding domain |
| Tb09.211.4360 | 5.50E-33 | 348 | Karyopherin Importin Beta, Armadillo repeat |
| Tb11.01.5940 | 9.30E-149 | 1391 | Exportin-1 C terminal, Importin Beta N terminal domain |
| Tb11.02.0870 | 3.20E-16 | 187 | Ran binding domain |
| Tb927.2.2240 | 2.40E-15 | 190 | Exportin-like protein |
| Tb927.6.2640 | 9.10E-83 | 815 | Karyopherin Importin Beta, Armadillo repeat |
| Tb927.6.4740 | 1.10E-75 | 748 | CAS/CSE domain, Importin Beta N terminal domain |
| Tb927.7.1190 | 6.90E-20 | 172 | RCC1 repeat |
| Tb11.03.0140 | 5.80E-09 | 107 | NUP C terminal domain |
| Tb927.10.8170 | 2.10E-28 | 315 | NUP C terminal domain |
| Tb927.8.3370 | 2.50E-48 | 281 | Ran-binding protein Mog1p |
| Tb11.01.7010 | 8.20E-42 | 464 | Armadillo repeat, Karyopherin Importin Beta |
| Tb11.02.1720 | 2.60E-26 | 276 | Armadillo-like helical |
| Tb11.01.8030 | 1.70E-18 | 218 | HEAT repeat, Armadillo repeat, Importin Beta N terminal domain |
| Tb11.01.7200 | 7.10E-07 | 137 | Nsp1-like |
| Tb927.7.6320 | 1.20E-11 | 136 | RCC1 repeat |
| Tb927.3.4600 | 3.70E-08 | 149 | Armadillo-like helical |
| Tb09.160.2360 | 1.40E-36 | 379 | WD40 repeat |
| Tb927.6.3870 | 8.50E-14 | 164 | RNA recognition motif |
| Tb927.7.5760 | 1.30E-08 | 115 | Nuclear transport factor 2 domain |
| Tb10.70.4720 | 4.60E-77 | 761 | Importin Beta N terminal domain, Karyopherin domain |
| Tb927.8.4280 | 2.90E-08 | 112 | Nuclear transport factor 2 domain |

Key: GTP, guanosine triphosphate; CAS, cell apoptosis susceptibility; CSE, chromosome segregation; RCC1, regulator of chromosome condensation 1; NUP, Nucleoporin; HEAT, Huntingtin, elongation factor 3 (EF3), protein phosphatase 2A (PP2A), and the yeast PI3-kinase TOR1; WD, Trp-Asp (W-D) dipeptide; RNA, ribonucleic acid.

**Table 3.** Identified and characterized *T. brucei* proteins of nuclear transport. Protein domain identification involved CDD, SMART, InterPro and PROSITE programs.

| Protein domain | Accession | Description |
| --- | --- | --- |
| Ran GTPase | SM00176 | Ran is involved in the active transport of proteins through nuclear pores. |
| Ran binding domain | PDOC50196 | This domain binds RanGTP and increases the rate of RanGAP1-induced GTP hydrolysis. |
| Armadillo | IPR000225 | The Armadillo (Arm) repeat is an approximately 40 amino acid long tandemly repeated sequence motif first identified in the *Drosophila melanogaster* segment polarity gene armadillo, involved in signal transduction through wingless. Animal Arm-repeat proteins function in various processes, including intracellular signalling and cytoskeletal regulation, and include such proteins as beta-catenin, the junctional plaque protein plakoglobin, the adenomatous polyposis coli (APC) tumour suppressor protein, and the nuclear transport factor importin-alpha, amongst others. |
| Importin beta | IPR001494 | Members of the Importin-beta (Karyopherin-beta) family can bind and transport cargo by themselves, or can form heterodimers with Importin-alpha. As part of a heterodimer, Importin-beta mediates interactions with the pore complex, while Importin-alpha acts as an adaptor protein to bind the nuclear localisation signal (NLS) on the cargo through the classical NLS import of proteins. |
| HEAT | IPR000357 | Arrays of Huntingtin, elongation factor 3 (EF3), protein phosphatase 2A (PP2A), and the yeast PI3-kinase TOR1 (HEAT) repeats consist of 3 to 36 units forming a rod-like helical structure and appear to function as protein-protein interaction surfaces. It has been noted that many HEAT repeat-containing proteins are involved in intracellular transport processes. |
| Exportin 1-like protein | pfam08389 | The sequences featured in this family are similar to a region close to the N-terminus of yeast exportin 1 (Xpo1, Crm1). This region is found just C-terminal to an importin-beta N-terminal domain (pfam03810) in many members of this family. Exportin 1 is a nuclear export receptor that interacts with leucine-rich nuclear export signal (NES) sequences and Ran-GTP, and is involved in translocation of proteins out of the nucleus. |
| CAS/CSE | IPR005043 | In the nucleus, cell apoptosis susceptibility (CAS) acts as a nuclear transport factor in the Importin pathway. The Importin pathway mediates the nuclear transport of several proteins that are necessary for mitosis and further progression. CAS is therefore thought to affect the cell cycle through its effect on the nuclear transport of these proteins. |
| WD40 | IPR001680 | WD-repeat proteins are a large family found in all eukaryotes and are implicated in a variety of functions ranging from signal transduction and transcription regulation to cell cycle control and apoptosis. Repeated WD40 motifs act as a site for protein-protein interaction, and proteins containing WD40 repeats are known to serve as platforms for the assembly of protein complexes or mediators of transient interplay among other proteins. |
| RCC1 | PDOC00544 | The regulator of chromosome condensation (RCC1) is a eukaryotic protein which binds to chromatin and interacts with Ran, a nuclear GTP-binding protein (see PDOC00859), to promote the loss of bound GDP and the uptake of fresh GTP, thus acting as a guanine-nucleotide dissociation stimulator (GDS). |
| NUP C-terminal | PDOC51434 | Communication between the nucleus and cytoplasm of a eukaryotic cell is mediated by the nuclear pore complexes (NPCs), which act as selective molecular gateways. Through these gateways, ribonucleic acids (RNAs) and proteins are transported into and out of the nucleus. Each NPC consists of ~30 distinct proteins termed Nucleoporins, each present in at least eight copies, reflecting the octagonal symmetry of the complex. |
| NSP 1 | IPR007758 | The NSP1-like protein appears to be an essential component of the nuclear pore complex; for example, preribosome nuclear export requires the Nup82p-Nup159p-Nsp1p complex. |
| NTF 2 | IPR002075 | Nuclear transport factor 2 (NTF2) is a homodimer which stimulates efficient nuclear import of a cargo protein. NTF2 binds to both RanGDP and FxFG repeat-containing Nucleoporins. |

**Table 4.** Summary of protein domains

## **3.2. Regulatory aspect of the parasite nuclear transport**

Table 5 shows the presence of phosphorylation and glycosylation sites in the parasite proteins. Phosphorylation sites were found in all parasite proteins, which were predicted to be phosphorylated at serine, threonine and tyrosine residues. However, O-glycosylation sites were absent from three parasite proteins, namely Tb11.02.0870, Tb927.8.3370 and Tb927.7.5760.

| Subject sequence | Phosphorylation site | Glycosylation site |
| --- | --- | --- |
| Tb927.3.1120 | + | + |
| Tb09.211.4360 | + | + |
| Tb11.01.5940 | + | + |
| Tb11.02.0870 | + | - |
| Tb927.2.2240 | + | + |
| Tb927.6.2640 | + | + |
| Tb927.6.4740 | + | + |
| Tb927.7.1190 | + | + |
| Tb11.03.0140 | + | + |
| Tb927.10.8170 | + | + |
| Tb927.8.3370 | + | - |
| Tb11.01.7010 | + | + |
| Tb11.02.1720 | + | + |
| Tb11.01.8030 | + | + |
| Tb11.01.7200 | + | + |
| Tb927.7.6320 | + | + |
| Tb927.3.4600 | + | + |
| Tb09.160.2360 | + | + |
| Tb927.6.3870 | + | + |
| Tb927.7.5760 | + | - |
| Tb10.70.4720 | + | + |
| Tb09.211.2550 | + | + |
| Tb927.8.4280 | + | + |

Key: (+) indicates presence; (-) indicates absence.

**Table 5.** Phosphorylation and O-glycosylation sites in the *T. brucei* proteins. Identification of these functional sites involved the ScanProsite programme.

Most of the parasite proteins were predicted to be involved in O-linked glycosylation. In eukaryotes, O-linked glycosylation takes place in the Golgi apparatus. It also occurs in archaea and bacteria. Phosphorylation was reported to be crucial in the regulation of protein-protein interactions of the NADPH oxidase in phagocytic cells [20]. Phosphorylation-based signaling in *T. brucei* has been reported by reference [21]. Thus, we believe that phosphorylation could also regulate the nuclear transport components of *T. brucei* so that they participate in various functional interactions. Meanwhile, it has been suggested that O-linked glycosylation may be analogous to protein phosphorylation. According to [22], sites phosphorylated by proline-directed kinases are shared with those potentially O-glycosylated by O-linked N-acetylglucosamine transferase (OGT). From this, it is possible that O-glycosylation and phosphorylation compete for sites of modification. Therefore, it is likely that the nuclear transport of *T. brucei* could be regulated by both phosphorylation and O-glycosylation.
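Motif-based site prediction of the kind performed by ScanProsite can be illustrated with a small pattern scanner. The sketch below is an assumption-laden example, not the ScanProsite implementation: it converts simple PROSITE-style patterns into regular expressions and reports 1-based match positions. The two patterns shown are the PROSITE protein kinase C (`[ST]-x-[RK]`) and casein kinase II (`[ST]-x(2)-[DE]`) phosphorylation site patterns; the text does not state which patterns were used in the study, and the toy sequence is hypothetical.

```python
import re
from typing import List


def prosite_to_regex(pattern: str) -> str:
    """Translate a simple PROSITE pattern (no '<' or '>' anchors) into a regex."""
    parts = []
    for element in pattern.rstrip(".").split("-"):
        match = re.fullmatch(r"(.+?)(?:\((\d+)(?:,(\d+))?\))?", element)
        core, low, high = match.groups()
        if core == "x":                 # any residue
            core = "."
        elif core.startswith("{"):      # excluded residues, e.g. {P}
            core = "[^" + core[1:-1] + "]"
        if low and high:                # repetition range, e.g. x(2,4)
            core += "{%s,%s}" % (low, high)
        elif low:                       # fixed repetition, e.g. x(2)
            core += "{%s}" % low
        parts.append(core)
    return "".join(parts)


def find_sites(sequence: str, pattern: str) -> List[int]:
    """Return 1-based start positions of all (possibly overlapping) matches."""
    regex = re.compile("(?=(" + prosite_to_regex(pattern) + "))")
    return [m.start() + 1 for m in regex.finditer(sequence)]


toy_sequence = "MSTKAASPEDR"  # hypothetical sequence, for illustration only
pkc_sites = find_sites(toy_sequence, "[ST]-x-[RK]")
ck2_sites = find_sites(toy_sequence, "[ST]-x(2)-[DE]")
print(pkc_sites, ck2_sites)
```

Short patterns such as these match frequently by chance, which is one reason predicted sites are treated as candidates rather than confirmed modifications.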


Apart from acting simply as an architectural structure which facilitates nuclear transport, the NPC may also play a more dynamic role in regulating transport. The specificity of import and export may be influenced by recognition of different substrates and alteration of Nucleoporin expression. This would allow different interactions between the NPC and Karyopherins and modulate nuclear import and export. However, the most common impact on nucleocytoplasmic movement comes from the post-translational modifications of the cargo proteins themselves [23]. Post-translational modification of the NPC was reported by [24]. Post-translational modification of NUPs by ubiquitylation and phosphorylation can affect NUP turnover and pore disassembly, respectively. Our study identified four parasite proteins containing a Nucleoporin-related domain. We anticipate that the assembly and disassembly of the parasite Nucleoporin proteins might also be modulated by phosphorylation.

The NPC is therefore an ideal target for inhibition of nuclear import or export. One of the most common features of Nucleoporins is the presence of conserved FG or FXFG repeats that bind to Importin family members [25]. Monoclonal antibodies such as mAb414 and RL2 can interrupt translocation through the NPC by blocking the FG and FXFG epitopes of the Nucleoporins. Consequently, several Nucleoporin proteins were identified by their reactivity against anti-FG antibodies. Most of these FG repeat proteins exist as cytoplasmic fibrils or as projections on the nuclear side of the NPC. The monoclonal antibodies prevent cargo from associating with the edge of an NPC so that it cannot cross the membrane [26]. Thus, there is a possibility that the pathogenesis of *T. brucei* could be controlled by inhibiting its Nucleoporin proteins.

## **3.3. Participation of parasite proteins in functional interaction network**

Figure 5 illustrates the protein interaction data obtained from the STRING 8.2 database. The mining of protein interaction data, which is useful in contextual annotation of protein function, showed that, out of 22 parasite homologues, only nine parasite proteins were interacting with each other. Out of the seven identified *T. brucei* Importins, only two, namely Tb927.6.2640 and Tb10.70.4720, were found to be involved in that protein interaction network. This database mining approach indicated that *T. brucei* nuclear transport is typical of eukaryotic organisms. Importins initially recruit cargo at low RanGTP concentrations in the cytoplasm and release cargo at high RanGTP levels in the nucleus. Importin–RanGTP complexes afterwards return to the cytoplasm, where the Ran-bound GTP is finally hydrolysed and Ran dissociates from the receptor. The Importin can then bind and import another cargo molecule, while nuclear transport factor 2 (NTF2) recycles RanGDP back to the nucleus. Cargo binding to Exportins is controlled in the reverse manner: they recruit cargo at high RanGTP levels in the nucleus and release cargo at low RanGTP concentrations in the cytoplasm.


**Figure 5.** Protein functional interaction network in the nuclear transport of *T. brucei*. This protein interaction data was obtained from STRING 8.2 database. The letters (a-i) indicate the parasite proteins involved in the nuclear transport.

Table 6 shows the evaluation of the protein interaction data obtained for the parasite nuclear transport. Thirteen functional interactions between parasite proteins were identified from the mining of the STRING 8.2 database. The score values of the functional interactions range from 0.45 to 0.976. Importin alpha (Tb927.6.2640) was found to be the most interactive parasite protein, participating in six functional interactions. Based on the relevant protein domains and previous reports, four out of the 13 functional interactions were considered to have a high confidence level. It should be emphasized that our approach only considered protein interaction data derived from experiments, gene fusion and text mining. To our knowledge, this is the first report of functional protein interactions in the nuclear transport of eukaryotic parasites. Whether other eukaryotic parasites share a common protein interaction network for the nuclear transport machinery remains to be elucidated.
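The summary statistics quoted above can be recomputed from the interaction list itself. The following sketch treats each reported interaction as an undirected edge and counts how many interactions each protein takes part in. The edge list is transcribed from the interaction table. Because that table is flattened in the source, the assignment of the scores 0.812 and 0.976 to their two rows is an assumption, although it does not affect any of the totals below.

```python
from collections import Counter

# (protein A, protein B, STRING score, confidence level) from the interaction table.
edges = [
    ("Tb927.3.1120", "Tb11.01.5940", 0.45, "High"),
    ("Tb927.3.1120", "Tb927.8.4280", 0.534, "High"),
    ("Tb11.01.5940", "Tb11.02.0870", 0.88, "High"),
    ("Tb11.01.5940", "Tb927.6.2640", 0.812, "Moderate"),  # score pairing assumed
    ("Tb927.6.2640", "Tb927.6.4740", 0.976, "Moderate"),  # score pairing assumed
    ("Tb11.01.5940", "Tb927.6.4740", 0.46, "Moderate"),
    ("Tb927.3.1120", "Tb11.02.0870", 0.512, "Moderate"),
    ("Tb11.02.0870", "Tb927.6.2640", 0.453, "Moderate"),
    ("Tb927.6.2640", "Tb09.160.2360", 0.647, "Moderate"),
    ("Tb927.6.2640", "Tb927.8.4280", 0.641, "Moderate"),
    ("Tb10.70.4720", "Tb927.8.4280", 0.535, "Moderate"),
    ("Tb927.8.4280", "Tb927.8.3370", 0.502, "Moderate"),
    ("Tb927.6.2640", "Tb10.70.4720", 0.769, "High"),
]

# Degree of each protein: the number of interactions it participates in.
degree = Counter()
for a, b, _score, _level in edges:
    degree[a] += 1
    degree[b] += 1

most_connected, n_interactions = degree.most_common(1)[0]
scores = [score for _a, _b, score, _level in edges]
n_high = sum(1 for *_rest, level in edges if level == "High")

print(len(edges), len(degree), most_connected, n_interactions)
print(min(scores), max(scores), n_high)
```

Counting degrees this way reproduces the figures in the text: 13 interactions among nine proteins, with Tb927.6.2640 taking part in six of them and four interactions rated as high confidence.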


Investigation on Nuclear Transport of *Trypanosoma brucei*: An *in silico* Approach 47

**similarity (%)** 

counterparts in human. The degree of sequence similarity between parasite proteins and human counterparts range from 19% to 72%. The resulting score values range from 49.3 to 558. Meanwhile, all the identified human proteins contain the same protein domains

**Subject sequence Human counterparts Score E-value Sequence** 

Tb927.3.1120 NP\_006316.1 313 1.00E-109 72% Tb09.211.4360 NP\_694858.1 221 6.00E-62 25% Tb11.01.5940 NP\_003391.1 558 0 33% Tb11.02.0870 AAA85838.1 79.3 5.00E-20 40% Tb927.2.2240 AAH20569.1 79.3 2.00E-16 29% Tb927.6.2640 NP\_036448.1 360 4.00E-119 42% Tb927.6.4740 AAC50367.1 368 9.00E-113 29% Tb927.7.1190 AAI42947.1 453 2.00E-27 27% Tb11.03.0140 AAH45620.2 258 2.00E-09 38% Tb927.10.8170 NP\_705618.1 134 1.00E-33 28% Tb927.8.3370 AAF36156.1 70.9 2.00E-17 27% Tb11.01.7010 NP\_002262.3 207 2.00E-56 23% Tb11.02.1720 NP\_006382.1 156 9.00E-28 24% Tb11.01.8030 NP\_002262.3 101 3.00E-23 21% Tb11.01.7200 CAA41411.1 59.7 4.00E-10 19% Tb927.7.6320 NP\_001041659.1 146 4.00E-17 28% Tb927.3.4600 NP\_006381.2 65.9 2.00E-12 20% Tb09.160.2360 NP\_003601.1 142 2.00E-39 30% Tb927.6.3870 NP\_001073956.2 75.5 8.00E-18 31% Tb927.7.5760 NP\_037380.1 49.3 6.00E-11 26% Tb10.70.4720 NP\_002256.2 277 2.00E-81 28% Tb927.8.4280 NP\_005787.1 73.6 1.00E-19 31%

**Table 7.** Comparison of the identified parasite proteins with human counterparts at protein sequence

A study reported by [28] showed that despite the high degree of similarity in the primary structure of human and *T. cruzi* ubiquitins, the three amino acid difference is sufficient to distinguish parasite versus host proteins. In this study, a simplified one step purification

level. This comparison involved BLASTp programme.

involved in the nuclear transport.

**Table 6.** Evaluation on protein interaction data obtained from STRING 8.2 database. The evaluation was based on the identified protein domains.

To gain an insight into nuclear transport, understanding on interactions between transport receptors and proteins of the nuclear pore complex (NPC) is essential. According to [27], the fluorescence resonance energy transfer (FRET) can be employed between enhanced cyan and yellow fluorescent proteins (ECFP, EYFP) in living cells in order to explain the transport of receptor through the NPC. A FRET assay has been used to analyze a panel of yeast strains expressing functional receptor--ECFP and nucleoporin-EYFP fusions. Based on this approach, points of contact in the NPC for the related Importin Pse1/Kap121 and Exportin Msn5 were successfully characterized. That study proved the advantage of FRET in mapping dynamic protein interactions in a genetic system. In addition, both Importin and Exportin have overlapping pathways through the NPC. However, our database mining approach did not reveal any functional interaction between Nucleoporin and Karyopherin proteins of *T. brucei*.

#### **3.4. Sequence similarity between parasite proteins and their human counterparts**

Table 6 shows the degree of protein sequence similarity between parasite and human proteins. The similarity search for the sequence was carried out with the help of BLASTp tool. All the parasite proteins of nuclear transport machinery were found to have their counterparts in human. The degree of sequence similarity between parasite proteins and human counterparts range from 19% to 72%. The resulting score values range from 49.3 to 558. Meanwhile, all the identified human proteins contain the same protein domains involved in the nuclear transport.

46 Bioinformatics

**Subject sequence**  **Interacting** 

Tb11.01.5940 Tb927.6.2640 Experiment,Text

Tb11.01.5940 Tb927.6.4740 Text mining,Co-

| Protein | Partner | Source | Score | Confidence level | Reference |
|---|---|---|---|---|---|
| Tb927.6.2640 | Tb927.6.4740 | Experiment, Text… | | | |
| Tb927.3.1120 | Tb11.01.5940 | Experiment | 0.45 | High | Lounsbury and Macara (1997) |
| Tb927.3.1120 | Tb927.8.4280 | Experiment | 0.534 | High | Fried and Kutay (2003) |
| Tb11.01.5940 | Tb11.02.0870 | Experiment, Text mining | 0.88 | High | Lounsbury and Macara (1997) |
| | | …mining, Co-expression | 0.812 | Moderate | None |
| | | …mining, Co-expression | 0.976 | Moderate | None |
| | | …expression | 0.46 | Moderate | None |
| Tb927.3.1120 | Tb11.02.0870 | Experiment, Text mining | 0.512 | Moderate | None |
| Tb11.02.0870 | Tb927.6.2640 | Experiment, Text mining | 0.453 | Moderate | None |
| Tb927.6.2640 | Tb09.160.2360 | Experiment, Text mining | 0.647 | Moderate | None |
| Tb927.6.2640 | Tb927.8.4280 | Experiment, Text mining | 0.641 | Moderate | None |
| Tb10.70.4720 | Tb927.8.4280 | Experiment, Text mining | 0.535 | Moderate | None |
| Tb927.8.4280 | Tb927.8.3370 | Experiment | 0.502 | Moderate | None |
| Tb927.6.2640 | Tb10.70.4720 | Experiment, Text mining | 0.769 | High | Fried and Kutay (2003) |

**Table 6.** Evaluation of protein interaction data obtained from the STRING 8.2 database. The evaluation was based on the identified protein domains.

To gain insight into nuclear transport, it is essential to understand the interactions between transport receptors and proteins of the nuclear pore complex (NPC). According to [27], fluorescence resonance energy transfer (FRET) between enhanced cyan and yellow fluorescent proteins (ECFP, EYFP) can be used in living cells to follow the passage of receptors through the NPC. In that work, a FRET assay was used to analyze a panel of yeast strains expressing functional receptor-ECFP and nucleoporin-EYFP fusions. With this approach, points of contact in the NPC for the related Importin Pse1/Kap121 and Exportin Msn5 were successfully characterized. That study demonstrated the advantage of FRET for mapping dynamic protein interactions in a genetic system. In addition, both Importin and Exportin have overlapping pathways through the NPC. However, our database mining approach did not reveal any functional interaction between Nucleoporin and Karyopherin proteins of *T. brucei*.

**3.4. Sequence similarity between parasite proteins and their human counterparts**

Table 7 shows the degree of protein sequence similarity between parasite and human proteins. The similarity search was carried out with the BLASTp tool. All the parasite proteins of nuclear transport machinery were found to have their


**Table 7.** Comparison of the identified parasite proteins with their human counterparts at the protein sequence level. The comparison was carried out with the BLASTp programme.
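A parasite-versus-human BLASTp comparison of this kind could be post-processed along the following lines. This is a minimal sketch, not the procedure used in this study: it assumes BLASTp was run with tabular output (`-outfmt 6`, whose default twelve columns are listed in `COLUMNS`), and the sequence identifiers and numbers in `SAMPLE` are hypothetical placeholders rather than results.

```python
import csv
import io

# Default column order of BLAST+ tabular output (-outfmt 6).
COLUMNS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
           "qstart", "qend", "sstart", "send", "evalue", "bitscore"]


def best_hits(tabular_text):
    """Return the highest-bitscore hit for each query as {qseqid: row}."""
    best = {}
    reader = csv.reader(io.StringIO(tabular_text), delimiter="\t")
    for fields in reader:
        # Skip blank lines and comment lines.
        if not fields or fields[0].startswith("#"):
            continue
        row = dict(zip(COLUMNS, fields))
        for key in ("pident", "evalue", "bitscore"):
            row[key] = float(row[key])
        current = best.get(row["qseqid"])
        if current is None or row["bitscore"] > current["bitscore"]:
            best[row["qseqid"]] = row
    return best


# Hypothetical placeholder hits (NOT data from this study).
SAMPLE = (
    "parasite_A\thuman_X\t42.5\t200\t110\t3\t1\t198\t5\t204\t1e-40\t150\n"
    "parasite_A\thuman_Y\t31.0\t180\t120\t4\t10\t185\t12\t190\t2e-15\t75.5\n"
    "parasite_B\thuman_Z\t55.2\t310\t135\t2\t1\t308\t1\t310\t3e-90\t290\n"
)

hits = best_hits(SAMPLE)
for query, hit in hits.items():
    print(query, hit["sseqid"], hit["pident"], hit["evalue"])
```

Keeping only the best-scoring human hit per parasite protein is one possible design choice; a stricter analysis might also apply an E-value cutoff or require reciprocal best hits.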

A study reported by [28] showed that, despite the high degree of similarity in the primary structure of human and *T. cruzi* ubiquitins, a difference of three amino acids is sufficient to distinguish parasite from host proteins. In that study, a simplified one-step procedure was used to partially purify *T. cruzi* ubiquitin. ELISA and Western blots performed on this preparation showed that chagasic sera recognise *T. cruzi* ubiquitin but not human or *Leishmania* ubiquitin, indicating a species-specific response. Thus, it is probable that the *T. brucei* proteins could also be distinguished from their human counterparts at the primary sequence level by immunodetection.
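The idea that a handful of substitutions can separate two otherwise similar proteins can be made concrete by listing the positions at which two aligned sequences differ. The sketch below is illustrative only: the helper name `differing_positions` is hypothetical, and the two 20-residue strings are toy peptides, not real ubiquitin sequences.

```python
def differing_positions(seq_a, seq_b):
    """List (position, residue_a, residue_b) for each mismatch.

    Both sequences must already be aligned to the same length;
    positions are 1-based, as is usual for protein sequences.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return [(i + 1, a, b)
            for i, (a, b) in enumerate(zip(seq_a, seq_b))
            if a != b]


# Toy peptides (hypothetical, for illustration only).
host = "ACDEFGHIKLMNPQRSTVWY"
parasite = "ACEEFGHIKIMNPQRSSVWY"

diffs = differing_positions(host, parasite)
identity = 100.0 * (len(host) - len(diffs)) / len(host)

print(diffs)
print(f"{identity:.1f}% identity, {len(diffs)} differing residues")
```

In this toy case the two peptides share 85% identity yet differ at three positions, which is the kind of residue-level difference that could in principle form a species-specific epitope.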


link which ultimately controls the Ran concentration gradient. Furthermore, ERK-dependent phosphorylation of Nup50 reduces its association with importin-β1 and transportin *in vitro*, and ERK2 is responsible for the oxidant-induced collapse of the Ran gradient [42]. It remains unknown how much the modulation of individual transport factors contributes to the overall regulation of nuclear trafficking. However, it is noteworthy that the kinase inhibitor PD98059, which targets ERK1/2 and ERK5, significantly increases classical nuclear import under both normal and stress conditions. Taken together, these results highlight a critical role of ERK activity in nuclear transport, with ERK kinases targeting both soluble factors and nucleoporins [41]. Thus, there is an urgent need to investigate the possible connection between the upstream signaling apparatus and nuclear transport components in *T. brucei*.

**4.3.** *In silico* **approach for drug target discovery**

We have provided an interpretation of heterologous data sets for the nuclear transport system of *T. brucei* from various resources. With the availability of protein databases and computer-aided software, we are able to explain various functional interactions between the identified parasite proteins and how these interactions give rise to the functionality and behavior of parasite nuclear transport. This would partially facilitate the exhaustive effort to obtain a system-level understanding of *T. brucei* pathogenesis. Our *in silico* approach has the potential to speed up drug target discovery while reducing the need for expensive laboratory work and clinical trials. Conventional *in vivo* and *in vitro* approaches tend to be inefficient when investigating complex, large-scale data such as the proteins associated with the shuttling of macromolecules across the nuclear envelope. Therefore, the systematic *in silico* approach from this study provides a tremendous opportunity for cost-effective drug target discovery for the pharmaceutical industry.

**4.4. Experimental validation of** *in silico* **data**

Experimental techniques such as the yeast two-hybrid assay and affinity purification combined with mass spectrometry are useful for investigating possible protein-protein interactions. However, they have limitations in detecting certain types of interactions, and they are technically difficult to scale up for high-throughput analysis. In this respect, an *in silico* approach may help overcome these problems when inferring protein function. With the availability of databases containing *in silico* data such as protein domains and 3D structures, the scope of experimental data can be expanded to increase the confidence in certain interacting protein pairs. These databases integrate information from various resources, such as computational prediction methods and public text collections. Since *in silico* and experimental approaches are complementary, combining them is very useful for obtaining a more accurate picture of *T. brucei* nuclear transport.
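One way to picture how several weak lines of evidence can raise the confidence in an interacting protein pair is a noisy-OR style combination of per-channel scores. The sketch below is a simplified illustration, not the scoring scheme of any particular database: it assumes the evidence channels are independent, and the channel names and values are hypothetical.

```python
from math import prod


def combine_evidence(scores):
    """Combine independent per-channel confidence scores in [0, 1].

    Noisy-OR: the pair is considered unsupported only if every
    channel fails, so combined = 1 - product(1 - s_i).
    """
    scores = list(scores)
    for s in scores:
        if not 0.0 <= s <= 1.0:
            raise ValueError(f"score out of range: {s}")
    return 1.0 - prod(1.0 - s for s in scores)


# Hypothetical per-channel scores for a single protein pair.
channels = {
    "experiment": 0.45,
    "text_mining": 0.30,
    "domain_evidence": 0.50,
}

combined = combine_evidence(channels.values())
print(f"combined confidence: {combined:.4f}")
```

Under the independence assumption, three individually modest scores yield a noticeably higher combined confidence, which mirrors the argument that experimental and *in silico* evidence are most informative when used together.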
