**4. Study strain variation of** *E. histolytica*

#### **4.1 Isoenzymes or zymodeme analysis**

The first observation of variation within *E. histolytica* came from isoenzyme studies by Sargeaunt and his colleagues (1978) [34]. These studies not only discriminated the 'non-pathogenic' *E. dispar* from the 'pathogenic' *E. histolytica* but also identified variation within each group. A zymodeme can be described as a group of amoeba variants or strains that share the same electrophoretic pattern for several enzymes. Zymodemes are defined by the electrophoretic patterns of the malic enzyme, glucose phosphate isomerase, hexokinase and phosphoglucomutase isoenzymes [35]. Since then, a total of 24 different zymodemes have been identified, 21 of which are from human isolates (9 of *E. histolytica* and 12 of *E. dispar*) [35]. However, the presence of starch in the medium influences the patterns of the most variable zymodemes, and many zymodemes disappear when the bacterial flora is removed from the medium, suggesting that at least some of the bands are of bacterial rather than amoebic origin [36, 37]. Accordingly, only four stable isoenzyme patterns remain, three for *E. histolytica* (II, XIV and XIX) and one for *E. dispar* (I) [36]. Isoenzyme (zymodeme) analysis was the classical gold standard for differentiating *E. histolytica* subgroups prior to the development of DNA-based techniques. It has a number of drawbacks, however, such as the difficulty of performing the test and the fact that it captures only a limited diversity of strains. It is time-consuming, depends on growing amoebae in culture and requires harvesting a large number of cells for the enzyme analysis. Cultivation of amoebae is not always successful and may introduce selection bias, because one species or strain may outgrow the other, which is undesirable when studying zymodemes. Zymodeme analysis is also not easily incorporated into routine clinical laboratory work because of the expertise required to culture the parasites, the complexity of the diagnostic process and the cost. Isoenzyme analysis has therefore been replaced by DNA-based methods as the approach of choice for studying *Entamoeba* species [38].
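The definition of a zymodeme as a group of strains sharing the same electrophoretic pattern across several enzymes can be illustrated with a minimal sketch. Only the four enzyme names come from the text; the isolate names and band labels are invented for illustration and do not represent real mobility data.

```python
from collections import defaultdict

# The four enzymes named in the text; band tuples follow this order.
ENZYMES = ("malic enzyme", "glucose phosphate isomerase",
           "hexokinase", "phosphoglucomutase")

def group_into_zymodemes(isolates):
    """Group isolates whose band pattern is identical for every enzyme."""
    groups = defaultdict(list)
    for name, bands in isolates.items():
        if len(bands) != len(ENZYMES):
            raise ValueError(f"{name}: expected one band label per enzyme")
        groups[tuple(bands)].append(name)
    # Sort for a stable, readable result.
    return sorted(sorted(names) for names in groups.values())

# Hypothetical isolates with made-up band labels (one per enzyme).
isolates = {
    "isolate_A": ("slow", "alpha", "fast", "beta"),
    "isolate_B": ("slow", "gamma", "slow", "beta"),
    "isolate_C": ("slow", "alpha", "fast", "beta"),
}

zymodemes = group_into_zymodemes(isolates)
print(zymodemes)  # [['isolate_A', 'isolate_C'], ['isolate_B']]
```

In this toy data set, isolates A and C fall into one zymodeme because all four patterns match, while isolate B forms its own group.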

#### **4.2 PCR-restriction fragment length polymorphism**

Polymerase chain reaction-restriction fragment length polymorphism (PCR-RFLP) analysis is a molecular technique used to differentiate species within a genus or to reveal the genetic diversity of a given species or strain. The method involves digesting the PCR products of a specific gene with restriction enzymes to produce fragments whose number and size depend on the number and location of restriction sites present in the amplicon. Target genes and restriction enzymes are selected so that, for instance, different species of a genus yield amplicons of the same size but show different banding patterns on gel electrophoresis after digestion with a specific restriction enzyme [39].
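The principle can be sketched as an in silico digest. The example below is a simplified illustration, not a published protocol: the two 50 bp "amplicons" are invented toy sequences, and EcoRI (recognition site GAATTC, cutting after the first G) is used only because its site is well known, not because the source names it.

```python
def digest(amplicon, site, cut_offset):
    """Return fragment lengths after cutting at every occurrence of `site`.

    `cut_offset` is the position inside the recognition site at which
    the enzyme cuts (1 for EcoRI, which cuts G^AATTC).
    """
    cuts = []
    start = 0
    while True:
        idx = amplicon.find(site, start)
        if idx == -1:
            break
        cuts.append(idx + cut_offset)
        start = idx + 1  # keep scanning; sites may overlap
    boundaries = [0] + cuts + [len(amplicon)]
    return [b - a for a, b in zip(boundaries, boundaries[1:])]

# Invented toy amplicons of identical length (50 bp) that differ by a
# single base inside the recognition site.
species_1 = "ATGC" * 5 + "GAATTC" + "ATGC" * 6
species_2 = "ATGC" * 5 + "GAGTTC" + "ATGC" * 6

print(digest(species_1, "GAATTC", 1))  # [21, 29]
print(digest(species_2, "GAATTC", 1))  # [50]
```

Both toy amplicons would run as a single 50 bp band before digestion, but only the first is cut, so the two would show different banding patterns afterwards.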

Molecular typing based on polymorphic genetic loci has been shown to aid the precise examination and identification of population structures of *E. histolytica* isolates in nature [40]. In addition, differences in the clinical outcome of *E. histolytica* infections may be due to genetic variability and inter-strain differences in virulence present in the genomic sequence of *E. histolytica* [41].

#### **4.3 Serine-rich** *E. histolytica* **protein (***SREHP***)**

Most biological functions are carried out by proteins; identifying particular proteins and estimating their functions therefore enables their use as markers for studying the nature of populations and genetic diversity. The most immunogenic protein identified in *Entamoeba* species is *SREHP*, an extracellular protein with a surface antigen exposed to the host immune system [42]. *SREHP* plays a vital role in parasite virulence through several mechanisms: it possesses a signal peptide and participates in signalling pathways, it takes part in phosphorylation and protein modification, it has peptidase activity, and it acts as a potent chemoattractant for trophozoites [43]. The presence of multiple tandem-repeat sequences in the central region of *SREHP* genes can be used to study inter-strain variation and genetic polymorphism in *E. histolytica* [44].

Furthermore, *SREHP* is critical in phagocytosis and immune evasion. As it also plays a role in adherence to apoptotic cells, it influences the virulence of the parasite [43]. In addition, the polymorphism of *SREHP* (the variable number of tandem repeats) might mediate a wider adherence range or variable affinity, suggesting that the tandem repeats are binding domains. Thus, this polymorphism plays a vital role in the pathogenicity of different *E. histolytica* strains [43, 45, 46]. The presence of polymorphisms in *SREHP* between homologous loci on allelic chromosomes could also be due either to the presence of more than one strain of *E. histolytica* in a single sample or to the presence of repeated loci at several locations in the genome, each resulting in a different PCR product [28]. A high level of genetic diversity may reflect the existence of several different clones in a limited geographic area and/or the rapid production of *SREHP* repeats. Evidence suggests that diversity among *E. histolytica* strains based on gene polymorphism may result from the creation of new haplotypes through the shuffling of alleles during genetic recombination and reassortment [47]. Notably, sexual reproduction in the natural population of *E. histolytica* has been suggested on the basis of the discovery, in the *E. histolytica* genome, of a complement of genes necessary for meiosis. Sexual reproduction is enormously important in gene exchange (e.g., of drug resistance and virulence genes) and consequently generates genotypes that spread rapidly. This may also help to identify linkage disequilibrium patterns among genetic markers such as *SREHP* gene polymorphisms [26].

#### **4.4 Short tandem repeats (STRs)**

The tRNA genes of *E. histolytica* are highly polymorphic and are present in clusters of one to five distinct types, interspersed with non-coding short tandem repeats (STRs); these clusters are in turn repeated to form long arrays. The arrays make up about 13% of the genome. Because the STR regions display a high degree of intra-specific variation in type, repeat number and arrangement pattern between tRNA array units, STRs are a very useful genetic tool for quantifying *Entamoeba* evolutionary divergence and assessing the virulence of individual *E. histolytica* strains [48].
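Variation in repeat number, one of the features described above, can be sketched by counting back-to-back copies of a repeat unit. The repeat unit, flanking bases and strain names below are invented for illustration and are not taken from any real tRNA-linked locus.

```python
import re

def max_tandem_repeats(sequence, unit):
    """Length, in copies, of the longest back-to-back run of `unit`."""
    runs = re.findall(f"(?:{re.escape(unit)})+", sequence)
    return max((len(run) // len(unit) for run in runs), default=0)

# Hypothetical STR locus: same flanks, different copy numbers.
UNIT = "TATTA"
strain_1 = "GGCC" + UNIT * 4 + "CCGG"
strain_2 = "GGCC" + UNIT * 7 + "CCGG"

print(max_tandem_repeats(strain_1, UNIT))  # 4
print(max_tandem_repeats(strain_2, UNIT))  # 7
print(len(strain_2) - len(strain_1))       # 15
```

With primers placed in the flanking regions, the 15 bp length difference between these toy alleles is what would appear as a size shift on a gel.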

A number of primers lying in the non-coding regions were used to identify *E. histolytica* strains on the basis of tRNA-associated STRs. Later, primers were designed to enable concurrent differentiation and strain typing of *E. histolytica* and *E. dispar* [49]. These markers have been shown to be stable and suitable for tracking the transmission of a known strain within an individual, family unit and/or community [50].

Moreover, it has been documented that the same strain of *E. histolytica* was never identified in epidemiologically unlinked patients, which reflects a remarkable degree of genetic diversity within this relatively limited geographic area. In regions where both *E. histolytica* and *E. dispar* are endemic, the use of species-specific primers is of great importance because a significant number of individuals could be infected with both species.

#### **4.5 Chitinase**

The repeat-containing region of the *chitinase* gene of *E. histolytica* is less polymorphic than the tRNA-linked loci and *SREHP*; therefore, this gene is less commonly used for strain differentiation and for studying genetic diversity [46]. *Chitinase* gene repeats range from 84 to 252 nucleotides, consistent with 4 heptapeptide repeats (28 amino acids) to 12 heptapeptide repeats (84 amino acids). Studies have revealed that the nucleotide sequence of the *chitinase* gene repeat of a certain isolate was identical to that of the standard strain, suggesting that the similarity could be due to chance convergence rather than a common ancestor [51].
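The arithmetic linking repeat length in nucleotides to the number of heptapeptide repeats can be checked with a short sketch. It assumes only the standard three nucleotides per codon and seven amino acids per heptapeptide; the function name is illustrative.

```python
CODON_LENGTH = 3   # nucleotides per amino acid
HEPTAPEPTIDE = 7   # amino acids per repeat unit

def heptapeptide_repeats(nucleotides):
    """Convert a repeat-region length in nucleotides to (amino acids, repeats)."""
    amino_acids, partial_codon = divmod(nucleotides, CODON_LENGTH)
    if partial_codon:
        raise ValueError("length is not a whole number of codons")
    repeats, partial_repeat = divmod(amino_acids, HEPTAPEPTIDE)
    if partial_repeat:
        raise ValueError("length is not a whole number of heptapeptides")
    return amino_acids, repeats

print(heptapeptide_repeats(84))   # (28, 4)
print(heptapeptide_repeats(252))  # (84, 12)
```

Each additional heptapeptide repeat therefore adds 21 nucleotides to the amplified region.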

#### **4.6 Use of retrotransposons as genetic markers**

#### *4.6.1 Transposon display*

The genomic sequence of *E. histolytica* contains abundant non-LTR (long terminal repeat) retrotransposons that are dispersed uniformly throughout the genome. In *Entamoeba*, the term long interspersed elements (EhLINEs) describes the autonomous non-LTR retrotransposon elements, while short interspersed elements (EhSINEs) refers to the short non-autonomous elements. Several families of EhLINEs and EhSINEs have been detected in *E. histolytica* [28, 52, 53]. EhLINEs and EhSINEs together have been reported to account for approximately 6% of the *Entamoeba* genome. Moreover, EhLINEs and EhSINEs are present on all chromosomes, with an estimated 140 copies of these elements per genome, and appear to occupy non-telomeric positions [54]. It has been suggested that EhLINEs and EhSINEs were inserted at different genomic locations during the course of evolution in various strains; therefore, they can be used as genetic markers for strain identification of *E. histolytica* [55].

Amplified fragment length polymorphism (AFLP) is a highly sensitive technique for detecting polymorphisms in DNA. The method is based on restriction enzymes that cut the DNA and on adaptors that are attached to the ends of the fragments. The DNA fragments are then amplified by PCR, and their varying lengths can be visualized on a gel after electrophoresis. Transposon display (TD) is a modified AFLP technique that uses a specific primer anchored in a transposon to identify up to several hundred markers in the genome simultaneously [56, 57]. TD consists of amplification of the sequences flanking the transposon by ligation-mediated PCR. The resulting fragments are locus-specific and can be analysed by polyacrylamide gel electrophoresis. Transposon display has been used to investigate the behaviour and stability of transposable elements in plants [58]. The technique has also been used effectively to display yeast mutants conferring quantitative phenotypes [59].

Laboratory strains of *E. histolytica* grown axenically showed different TD patterns when primers targeted various regions of EhSINE1 [60]. The TD technique has many potential advantages over other methods. It is being developed to study DNA isolated directly from both ALA pus and faecal samples. Furthermore, TD yields more than one polymorphic band, compared with the single-band polymorphism of a normal PCR. It is a low-cost technique, since it relies on a single reaction with a single specific primer that can display a whole range of bands in each strain. The use of a transposon-specific primer makes it more sensitive and reliable than AFLP. Thus, this technique could be used for significant epidemiological studies and large-scale molecular typing of this parasite. Intensive analysis of the bands may help in understanding the dynamics of EhSINE retrotransposition in various strains of *E. histolytica*.

*Genetic Variability of* Entamoeba histolytica *Strains DOI: http://dx.doi.org/10.5772/intechopen.106828*

#### *4.6.2 REP PCR*

The repetitive element palindromic-polymerase chain reaction (REP PCR) was first devised for strain and serotype identification in enteric bacteria [61]. REP PCR is commonly used in clinical laboratories for detecting strains of bacteria, fungi and dermatophytes [62]. The method targets interspersed repetitive consensus sequences in the genome, which may be present in either orientation on the chromosome, and thereby enables amplification of DNA fragments of diverse sizes.

The PCR primers must be designed to 'read outward' from the repeats, amplifying the region between two such elements in either direction. These primers are complementary to, and anneal to, the dispersed repeated sequences. Varying band patterns result when the repetitive sequences are located at different positions in the genomes of different strains. This principle was employed successfully using the EhLINEs/EhSINEs dispersed in the *E. histolytica* genome. For this purpose, several sets of primers were designed from EhLINEs and EhSINEs to cover the entire stretch of each element. Each strain generated a unique REP PCR fingerprint consisting of multiple bands of different sizes [63], which could be used to establish relationships between different strains. This procedure can reveal more variation between strains than the tandem-repeat technique because many loci are investigated simultaneously.
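The idea that element positions determine the band pattern can be sketched very roughly as follows. This is a deliberately simplified model and rests on several assumptions: element coordinates are invented, element orientation and primer position within the element are ignored, the spacing between neighbouring elements stands in for band size, and the 3000 bp amplification limit is an arbitrary illustrative value.

```python
def predicted_bands(elements, max_product=3000):
    """Spacings (bp) between neighbouring elements short enough to amplify.

    `elements` is a list of (start, end) coordinates of repeat elements
    on one hypothetical chromosome.
    """
    ordered = sorted(elements)
    gaps = [next_start - prev_end
            for (_, prev_end), (next_start, _) in zip(ordered, ordered[1:])]
    return [gap for gap in gaps if 0 < gap <= max_product]

# Invented insertion sites for two hypothetical strains.
strain_x = [(1000, 1500), (2300, 2800), (9000, 9500)]
strain_y = [(1000, 1500), (3500, 4000), (5200, 5700)]

print(predicted_bands(strain_x))  # [800]
print(predicted_bands(strain_y))  # [2000, 1200]
```

Even in this toy model, strains that share one insertion site but differ in the others give distinct fingerprints.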

#### **4.7 Single-nucleotide polymorphism**

Single-nucleotide polymorphisms (SNPs) are the simplest form of variation in genomic DNA sequence, involving a change at a single nucleotide. To investigate the genetic diversity of *E. histolytica* strains, 9077 bp were sequenced from 14 isolates [64]. It was proposed that coding and non-coding regions are subject to several selection pressures, which could be related to the specific clinical outcome of the disease. A statistically significant difference was recorded in the occurrence of SNPs, with higher rates in non-coding than in coding regions. SNP markers are of great importance in evolutionary analyses because they are evolutionarily stable and unlikely to mutate. Nevertheless, for *Entamoeba* this method is still in its early stages and needs to be explored further. The procedure also requires large-scale sequencing of the PCR products.
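A comparison of SNP rates between coding and non-coding regions can be sketched as below. The three aligned 20 bp sequences and the region boundaries are invented toy data, chosen only to show the calculation; they are not drawn from the isolates in [64].

```python
def snp_positions(aligned):
    """Indices at which the aligned sequences do not all share one base."""
    return [i for i, column in enumerate(zip(*aligned))
            if len(set(column)) > 1]

def snp_rate(positions, region):
    """SNPs per base within a half-open region [start, end)."""
    start, end = region
    hits = sum(start <= p < end for p in positions)
    return hits / (end - start)

# Hypothetical alignment: bases 0-9 "coding", bases 10-19 "non-coding".
aligned = [
    "ATGGCTAAAGTTACGTCAGT",
    "ATGGCTAAAGTTTCGTCAAT",
    "ATGGCAAAAGTTACGACAGT",
]
CODING, NON_CODING = (0, 10), (10, 20)

positions = snp_positions(aligned)
print(positions)                        # [5, 12, 15, 18]
print(snp_rate(positions, CODING))      # 0.1
print(snp_rate(positions, NON_CODING))  # 0.3
```

A real analysis would additionally test whether such a difference in rates is statistically significant, which this sketch does not attempt.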

#### **4.8 Microarray**

Microarray is a genomic tool used to detect the expression of thousands of genes simultaneously from a sample. Microarray assays have been used successfully for the identification of *Entamoeba* and other water-borne protozoa [65]. Genotyping of *E. histolytica* based on microarray assays can be useful for studying genetic diversity and potential genotype-phenotype relations in clinical isolates. The technique involves DNA amplification combined with subsequent hybridization to oligonucleotide probes specific for several target sequences. A distinct advantage of this detection method is that it can investigate thousands of genes at once. In addition to the study of inter-strain variability, many biological properties of genes can be revealed by microarray, such as the detection of genes regulated by drug exposure, tissue invasion and developmental changes. For studying genetic diversity, sequenced genomic DNA (gDNA) clones of *E. histolytica* (HM-1:IMSS) were used to generate an 11,328-clone genomic DNA microarray; all clones on the array are useful for analysis, since genetic differences in coding and non-coding regions are equally important in determining the genotype of a strain [66].

Besides conserved genes such as rRNA and *hsp*, which have been used extensively as diagnostic markers, several genus- and species-specific genes, such as the cysteine protease gene (*cp1*), were selected as amplification targets to avoid possible cross-hybridization and co-amplification issues.

The DNA microarray assay has been used in large-scale expression profiling of *Entamoeba* species and strains, which permits the study of genetic and expression differences that may be associated with parasite virulence. However, more studies are needed to confirm the significance of these genes in amoebic virulence and pathogenesis. The main limitations of the technique for diagnostic use are its cost, its robustness and the labour it requires [66].

#### **5. Conclusions**

Several powerful molecular techniques and genetic biomarkers are now available, such as chitinase and SREHP polymorphisms, SNPs, STRs, retrotransposons and microarrays, all of which provide insight into parasite biology. For example, studying STR regions in the tRNA genes of *E. histolytica* revealed interspecies variation that may be related to the clinical outcome of the disease and may provide evidence for the geographic origin of the infection [67]. Moreover, studies using PCR-RFLP analysis of the *SREHP* gene documented high genetic diversity among *E. histolytica* strains and addressed the association of strains with the clinical presentation of the disease [8, 44, 68]. Investigations based on *chitinase* gene sequence repeats suggested the existence of *E. histolytica* strains that could be non-pathogenic, invasive for the intestinal mucosa or invasive for liver tissue [46, 69, 70]. As another approach, the TD technique was developed to identify strains of *E. histolytica* using the retrotransposon EhSINE1, and it proved to be a cost-effective, sensitive and reliable procedure that successfully characterizes strains according to geographical distribution [60]. A polymorphism study using SNPs showed 14 genotypic patterns among 14 isolates of *E. histolytica* and recorded extensive diversity between coding and non-coding regions [64]. Microarray analysis was applied to distinguish symptomatic from asymptomatic strains of *E. histolytica* by investigating the gene expression profiles of a virulent strain (HM-1:IMSS) and a non-virulent strain (Rahman) [71]. The study examined a total of 54 hybridizations, with statistical analysis performed for each gene; the most common differentially regulated genes were involved in carbohydrate metabolism, virulence-related functions (CPs, Gal/GalNAc lectin), signal transduction pathways, antibacterial activity (AIG1) and transcription factors [71]. Consequently, all these tools and biomarkers can aid in determining the role of *E. histolytica* genetics in the outcome of infection and can be used for population-based studies as well as to develop an improved evolutionary and phylogenetic framework for the parasite.
