**8.2. Heterogeneity of populations and phase variation**

Phase variation is used by various bacterial species to generate diversity within a population. It is a process of change in the expression of the epitopes of the bacterial cell surface [19]. Bacterial cells may vary phenotypically even within a clonal population, which allows them to adapt to their environment or to evade the immune response of the host. Phase variation generates this phenotypic heterogeneity by means of gene regulation, which switches genes between a state of expression ("turned on") and a state of non-expression ("turned off"). The state of expression is heritable and reversible, and affects the same phenotype.
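The reversible, heritable on/off switching described above can be illustrated with a minimal sketch. It assumes that each lineage switches state independently with a fixed probability per generation; the function name, the switching probabilities and the seed are hypothetical choices for illustration, not values from the literature cited here.

```python
import random

def simulate_phase_variation(n_cells, generations, p_on_to_off, p_off_to_on, seed=0):
    """Return the fraction of ON cells after each generation.

    Assumption: every lineage inherits its parent's state and may
    switch once per generation with the given probability.
    """
    rng = random.Random(seed)
    cells = [True] * n_cells  # clonal population, all cells start ON
    history = []
    for _ in range(generations):
        next_cells = []
        for is_on in cells:
            switch_prob = p_on_to_off if is_on else p_off_to_on
            if rng.random() < switch_prob:
                is_on = not is_on  # reversible switch, passed on to the next generation
            next_cells.append(is_on)
        cells = next_cells
        history.append(sum(cells) / n_cells)
    return history

if __name__ == "__main__":
    # Hypothetical switching rates; prints the ON fraction per generation.
    print(simulate_phase_variation(1000, 10, p_on_to_off=0.05, p_off_to_on=0.01))
```

Even when the population starts as a single phenotype, a mixture of ON and OFF cells appears after a few generations under these assumptions, which is the heterogeneity that the text attributes to phase variation.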

Antigenic variation refers to the expression of alternative forms of a cell-surface antigen, such as polysaccharides, lipoproteins and type IV pili; at the molecular level, it shares characteristics with the mechanisms of phase variation. During this adaptation process, the bacteria display reversible phenotypic changes as a result of genetic changes or epigenetic alterations at a specific locus. The mechanisms that allow for phase variation are genetic (slipped-strand mispairing, recombination) and epigenetic (DNA methylation) [47,48].

#### a. Genetic mechanisms

Numerous studies have been performed to reveal the genetic basis of the variation of the O antigen. In certain cases, the variability in the expression of the genes is regulated by elements in *cis*, which cause changes in the composition of the structure of the antigens of the bacterial surface. Certain pathogens change the structure of the O antigen through the acquisition of phage genetic material followed by recombination processes [7].

#### i. Slipped-strand mispairing

One of the mechanisms that regulate phase variation at the molecular level is the slipping of one of the DNA strands, which causes mispairing between the daughter strand and the parent strand during DNA replication. This process is known as slipped-strand mispairing (SSM). The genomic sequences susceptible to SSM are those that contain short repeats, microsatellites or a variable number of tandem repeats. Depending on the location of the repeated sequence relative to the promoter and the coding sequence, SSM may change gene expression at the level of transcription or translation [49]. At the transcriptional level, SSM may lead to the activation or deactivation of the promoter region of the target gene, as occurs in *H. influenzae* (*hifA/B*). At the translational level, SSM may affect the coding region, as for example with the genes involved in the biosynthesis of the LPS of *H. influenzae* and *Neisseria* spp. [50].

Within the genome of *H. pylori*, certain loci have been identified with repeated sequences of a single nucleotide or a pair of nucleotides. Several of these repeats lie within open reading frames (ORFs) (**Figure 8**). When one DNA strand slips over the other, the resulting mispairing causes the "gain" or "loss" of a repeat unit in the reading frame, which leads to the loss of the start codon or to mutations in the proteins. Therefore, SSM increases the genetic variability of *H. pylori*. Similar repetitive sequences have been found in other microorganisms, such as *H. influenzae* [51].
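As a rough illustration of how such single-nucleotide and dinucleotide repeats might be located in a sequence, the following sketch scans a DNA string with a regular expression. The function name, the minimum repeat count and the example sequences are assumptions made for this illustration; they are not taken from the *H. pylori* genome.

```python
import re

def find_repeat_tracts(seq, unit_len=1, min_units=6):
    """Return (start, unit, n_units) for tandem repeats of a unit_len-nt motif.

    Assumption: a tract is reported only if the motif is repeated at
    least min_units times in a row.
    """
    # One captured unit followed by at least (min_units - 1) further copies.
    pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_units - 1))
    tracts = []
    for match in pattern.finditer(seq.upper()):
        unit = match.group(1)
        if unit_len > 1 and len(set(unit)) == 1:
            continue  # skip homopolymers when searching for longer motifs
        tracts.append((match.start(), unit, len(match.group(0)) // unit_len))
    return tracts

if __name__ == "__main__":
    # Hypothetical sequences with a poly(C) tract and an AT dinucleotide repeat.
    print(find_repeat_tracts("ATG" + "C" * 8 + "GTA"))
    print(find_repeat_tracts("GG" + "AT" * 5 + "CC", unit_len=2, min_units=4))
```

Tracts found this way are only candidates for SSM; whether a given repeat actually mediates phase variation has to be established experimentally.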

One group of genes subject to phase variation comprises those that code for enzymes involved in the biosynthesis of LPS, which may give rise to variants of the gene product within the same bacterial population. The LPS of the majority of *H. pylori* strains contains complex carbohydrates known as Lewis antigens. Type 1 (Lea, Leb) and type 2 (Lex, Ley) Lewis antigens are epitopes of fucosylated oligosaccharides. At least 80% of *H. pylori* strains express type 2. Research on the antigenic determinants involved in the biosynthesis of the Lewis antigens has allowed for the identification of the fucosyltransferases (FucTs) that are involved in the formation of these antigens [19].

The genes that code for the FucTs have elements in *cis* that are distinguished by poly(A) and poly(C) tracts of different lengths that mediate SSM. The length of these tracts regulates the activation and deactivation of the FucT genes. However, in some cases, such as in *H. pylori* UA948, the inhibition of the expression of *futB* is due to mutations outside of the hypervariable region (the deletion of 80 nucleotides in the promoter region) [52].
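The dependence of gene activation on tract length can be sketched with a toy reading-frame check. The sequence below is a hypothetical construct (a start codon, a poly(C) tract and a short downstream region), not the real sequence of any FucT gene; the sketch only shows that adding or removing a single repeat unit moves the downstream region out of frame.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def build_toy_orf(tract_len):
    """Hypothetical gene: ATG, a poly(C) tract, then a fixed downstream region."""
    return "ATG" + "C" * tract_len + "GATGAAGCTTAA"

def first_stop_index(orf):
    """Return the codon index of the first in-frame stop codon, or None."""
    for i in range(0, len(orf) - 2, 3):
        if orf[i:i + 3] in STOP_CODONS:
            return i // 3
    return None

def is_switched_on(tract_len):
    """ON only if the first in-frame stop codon is the final codon of the ORF."""
    orf = build_toy_orf(tract_len)
    stop = first_stop_index(orf)
    return stop is not None and (stop + 1) * 3 == len(orf)

if __name__ == "__main__":
    for n in range(8, 13):
        print(n, "ON" if is_switched_on(n) else "OFF")
```

In this toy construct, tracts of 9 or 12 cytosines keep the downstream region in frame, a tract of 10 exposes a premature TGA stop codon, and a tract of 8 shifts the frame so that the intended stop codon is no longer read.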


**Figure 8.** The nucleotide sequence of the central region of the *H. pylori* *fucT2* gene. The sequences show the characteristics (simple repeats) responsible for the ORF in the *H. pylori* J99 strain and the 26695 type variant. Due to the different number of repeats of poly(C) residues, start sequences of an ORF of *fucT2* of the 26695 strain are found in the TGA stop codon (marked with asterisks) shortly after reading 1 (HP0093), which is the same as the marker of the *fucT* J99 reading. The three supposed motifs X XXY YYZ are highlighted in bold and are underlined.

The α1,2-fucosyltransferase (FutC) catalyses the addition of fucose in the conversion of Lex to Ley. Sanabria-Valentín et al. confirmed, through *in vitro* and *in vivo* studies, the main role of slipped-strand mispairing in *futC* in the variation of the Le antigen [53].

The *futC* gene includes an internal Shine-Dalgarno-like sequence and a heptamer (AAAAAAG) followed by a loop structure. During translation, when the ribosomes reach the heptameric sequence of the mRNA, a shift in the reading frame (frameshift) occurs. The presence of the Shine-Dalgarno-like sequence and the loop structure accelerates the translation process through an interaction with the ribosome components [19].

#### ii. Phage recombination: seroconversion

86 The Complex World of Polysaccharides

The O antigen is a virulence determinant necessary for the pathogenicity of *S. flexneri*. The basic O antigen of *S. flexneri*, known as the Y serotype, consists of repeated units of the tetrasaccharide N-acetylglucosamine-rhamnose I-rhamnose II-rhamnose III, which forms the backbone of the polysaccharide unit of all the serotypes of *S. flexneri* except serotypes 6 and 6a. There are 13 serotypes, which are differentiated by the addition of glucosyl or acetyl groups to the different sugar molecules in the tetrasaccharide unit [54,55].

The temperate phages of *S. flexneri* play an important role in the processes of seroconversion (antigenic variation). The bacteriophages SfV, SfII and SfX and the cryptic prophages SfI and SfIV carry the *gtr* genes, which code for the proteins involved in the glycosylation of the O antigen. When these phage elements lysogenise, Y serotype strains are converted into the 5a, 2a, X, 1a and 4a serotypes, respectively (**Figure 9**).

**Figure 9.** The chemical composition of the different serotypes of *S. flexneri*. Serotype Y is formed by repeated units of the tetrasaccharide N-acetylglucosamine-rhamnose I-rhamnose II-rhamnose III. The serotypes are differentiated by the attachment of glucosyl or acetyl groups. Adapted from [54].
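The pairing between lysogenising phage elements and the resulting serotypes can be written down as a simple lookup. This is only a sketch of the relationships listed in the text for a serotype Y host; the dictionary and function names are hypothetical.

```python
# Serotype obtained when each phage element lysogenises a serotype Y strain,
# following the pairing given in the text (SfV, SfII, SfX, SfI, SfIV).
SEROCONVERSION = {
    "SfV": "5a",
    "SfII": "2a",
    "SfX": "X",
    "SfI": "1a",
    "SfIV": "4a",
}

def seroconvert(host_serotype, phage):
    """Return the serotype after lysogeny; only serotype Y hosts are modelled."""
    if host_serotype != "Y":
        raise ValueError("this sketch only covers conversion of serotype Y")
    return SEROCONVERSION[phage]

if __name__ == "__main__":
    for phage in SEROCONVERSION:
        print(phage, "->", seroconvert("Y", phage))
```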

The bacteriophage codes for an acetyltransferase and produces a conversion to the 3b serotype. Lysogenisation by the SfV bacteriophage produces the type V modification of the O antigen, which involves the addition of a glycosyl group through an α1,3 bond to the rhamnose II of the repeated tetrasaccharide unit. As in other phages that intervene in the glycosylation process, the genes involved in the conversion of serotypes are located immediately downstream from the *attP* locus, which is preceded by the genes *int* and *xis*. These phages are inserted into the *thrW* locus of the host [56].

The *gtr* genes of the temperate phages, which code for the glycosyltransferases, are located in the phage genome downstream from the *attP* locus. These genes form a cluster of three cotranscribed genes: *gtr*A, *gtr*B and *gtr*(type). *gtr*A and *gtr*B are homologous and interchangeable among the serotypes of *S. flexneri*. The Gtr(type) protein is specific for the formation of the glucosyl bond to a particular sugar molecule of the O antigen [54].

It has been suggested that GtrB catalyses the transfer of glucose from UDP-glucose to bactoprenol phosphate (UndP) to form UndP-glucose in the cytoplasm. This molecule is subsequently translocated by GtrA into the periplasm before the glucosyl residue is attached by Gtr(type) to the growing O antigen unit [57].

The genes *gtr*V and *gtr*X code for the glycosyltransferases GtrV and GtrX, respectively. These membrane proteins catalyse the transfer of glucosyl residues through a 1,3 bond to rhamnose II (GtrV) and rhamnose I (GtrX) of the O antigen unit, which converts *S. flexneri* from the Y serotype to the 5a serotype and the X serotype, respectively. GtrIV adds glucosyl residues to the N-acetylglucosamine of the repeated unit of the O antigen through a 1,6 bond, converting the Y serotype into the 4a serotype [57,58].
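The enzyme-specific modifications described above can be summarised in a small data model. The sketch below encodes only what the text states (target sugar, bond and resulting serotype for GtrV, GtrX and GtrIV); the class name, the string notation for the modified unit and the abbreviations GlcNAc, Rha and Glc are choices made for this illustration.

```python
from dataclasses import dataclass

# Serotype Y backbone: the unmodified tetrasaccharide repeat unit.
BACKBONE = ("GlcNAc", "Rha I", "Rha II", "Rha III")

@dataclass(frozen=True)
class GtrModification:
    enzyme: str
    target_sugar: str  # sugar of the repeat unit that receives the glucosyl residue
    linkage: str       # bond type as given in the text
    serotype: str      # serotype obtained from serotype Y

MODIFICATIONS = {
    "GtrV": GtrModification("GtrV", "Rha II", "1,3", "5a"),
    "GtrX": GtrModification("GtrX", "Rha I", "1,3", "X"),
    "GtrIV": GtrModification("GtrIV", "GlcNAc", "1,6", "4a"),
}

def glucosylate(enzyme):
    """Return (serotype, annotated repeat unit) for a serotype Y backbone."""
    mod = MODIFICATIONS[enzyme]
    unit = [
        f"{sugar}(Glc {mod.linkage})" if sugar == mod.target_sugar else sugar
        for sugar in BACKBONE
    ]
    return mod.serotype, "-".join(unit)

if __name__ == "__main__":
    for name in MODIFICATIONS:
        print(name, glucosylate(name))
```

Keeping the backbone fixed and varying only the Gtr(type) entry mirrors the observation that the backbone is shared while the type-specific enzyme determines the serotype.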

#### b. Epigenetic mechanisms: DNA methylation

The term epigenetic is defined as "inheritable changes in genetic expression that occur without alterations in the DNA nucleotide sequence". Thus, an epigenetic mechanism may be understood as a complex system for using the genetic information selectively by activating and deactivating various functional genes. Epigenetic modifications may involve methylation of cytosine residues in the DNA. DNA methylation has been observed in various bacterial species. In bacteria, methylation is part of a defence mechanism that reduces the amount of horizontal gene transfer among species. DNA methylation also constitutes an epigenetic marker that identifies the template strand during DNA replication. Generally, the methylation of the regulatory elements of genes, such as promoters, enhancers, insulators and repressors, suppresses their function [59].

The modifications of the O antigen that may affect the serotype are associated with the presence of the operon that codes for the glycosyltransferases (*gtr*).

Within a clonal population of *S. enterica* serovar Typhimurium, the lysogenic phage P22 may lead to variability of the O antigen. The phase variation of *gtr* (glycosylation of the O antigen) indirectly contributes to the diversity of the serotypes of *Salmonella*. The cluster that codes for the glycosyltransferases consists of three genes: *gtrA* codes for a membrane protein, *gtrB* codes for a glycosyl translocase and *gtrC* codes for the glycosyltransferase, which mediates the bonding of glucose to the O antigen [58].

Through studies based on the analysis of gene expression, the presence of mutations, the level of DNA methylation and *in vitro* DNA-protein interactions, Broadbent et al. demonstrated that the Dam methyltransferase, together with OxyR, regulates phase variation at the level of the *gtr* P22 promoter in *S. enterica* serovar Typhimurium. OxyR acts as an activator or a repressor of the *gtr* system, depending on the alternative site (GATC sequences) to which OxyR binds in the *gtr* P22 regulatory region (**Figure 10**). The binding of OxyR is inhibited by the methylation of the Dam target sequence, and the state of expression of the system is heritable [60].
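The relationship between Dam target sequences and OxyR binding can be sketched as follows. The code locates GATC sites in a DNA string and treats binding to a site as blocked when a GATC within it is methylated, as stated above. The example sequence, the function names and the representation of methylation as a set of positions are assumptions for illustration; the sketch does not model which binding site corresponds to the activated or the deactivated phase.

```python
def find_gatc_sites(seq):
    """Return the 0-based start positions of every GATC (Dam target) in seq."""
    seq = seq.upper()
    return [i for i in range(len(seq) - 3) if seq[i:i + 4] == "GATC"]

def oxyr_can_bind(site_gatc_positions, methylated_positions):
    """Binding is treated as blocked if any GATC in the site is methylated."""
    return not any(pos in methylated_positions for pos in site_gatc_positions)

if __name__ == "__main__":
    # Hypothetical regulatory fragment with two GATC sequences.
    region = "AAGATCTTGATCGG"
    sites = find_gatc_sites(region)
    print(sites)
    # Methylating the first GATC blocks a binding site that overlaps it.
    print(oxyr_can_bind([sites[0]], methylated_positions={sites[0]}))
    print(oxyr_can_bind([sites[1]], methylated_positions={sites[0]}))
```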

**Figure 10.** Model of the phase variation of the regulatory region of the *gtr* operon. Illustration of the DNA-protein interactions in the regulatory region of the *gtr* P22 operon, which involve methylation and demethylation of the GATC sequences in the activated and deactivated phases. Adapted from [60].

Understanding the variation of the LPS structure is important because the composition and the length of the O antigen chain may be an indicator of the virulence, and this characteristic often differs within a single bacterial strain [7].
