104 The Complex World of Polysaccharides

The *pssA* gene, which controls the first step in repeating unit assembly, is located approximately 90 kb from the Pss-I cluster. The gene was shown to be transcribed as a monocistronic mRNA [45]. Upstream of *pssA* is the *pssB* gene encoding inositol monophosphatase [49-51]. The effects of mutations within *pssB* on EPS synthesis and symbiotic behavior have been analyzed in the *Rlv* VF39 and *Rlt* TA1 backgrounds, with contradictory results. In *Rlv* VF39, *pssB* mutants retained the ability to produce EPS in amounts equal to those of the wild-type strain. In *Rlt* TA1, *pssB* inactivation increased overall EPS production relative to the wild-type strain and altered the LPS PAGE banding pattern and the O-antigen sugar composition [51,52]. Nevertheless, *pssB* mutants of both strains elicited non-effective nodules on *Vicia faba* or *Trifolium pratense* roots, respectively [49,52].

For the GL1 and GL3 mutants we did not carry out extended chromosomal walking around the Tn*5* insertions; only short genome sequences flanking Tn*5* were determined. We localized the Tn*5* insertion in the GL3 mutant within a small *orf* of only 213 bp. The ortholog of this *orf* (RL2260) was found in the chromosome sequence of *Rlv* 3841, far from the Pss-I cluster. It encodes a 7.2 kDa positively charged protein (pI 10.8) that is conserved in numerous bacteria. We have not yet been able to map the GL1 mutation. This mutation probably targets a gene located on one of the *Rlv* VF39 strain-specific plasmids.

Taken together, these data indicate that the core set of EPS-synthesizing genes is clustered in the chromosomal Pss-I region. Clustering of genes involved in EPS biosynthesis is not unique to *R*. *leguminosarum* but is widespread among polysaccharide-producing bacteria [53,54]. This type of gene organization could reflect their coordinated expression and tightly regulated control.

Evidently, EPS biosynthesis is linked with other metabolic pathways in the cell, and the localization of mutations affecting EPS production far from the Pss-I region can reflect this linkage. For example, the *pssA* gene is located in another chromosomal region, probably because it is involved in initiating the synthesis not only of EPS but also of other polysaccharides. Recently, several regulatory genes influencing EPS production (*psiA*, *psrA*, *exoR*, *expR*, *rosR*, *praR*) were found to be located either in different regions of the chromosome or on endogenous plasmids (reviewed in [24]). We cannot exclude that the mutations in GL3 and GL1 also target regulatory genes.

**4. Synthesis of nucleotide-sugar precursors**

According to the sugar composition of the acidic EPS, biosynthesis of its repeating units requires the nucleotide sugars UDP-glucose, UDP-glucuronic acid and UDP-galactose, which are formed by central carbon metabolism. Several genes involved in the synthesis of nucleotide-sugar precursors have been identified in *R*. *leguminosarum* genomes. The *exoB* gene encodes UDP-glucose 4-epimerase, which is responsible for UDP-galactose production [55]. Mutations in this gene have a pleiotropic effect and influence the synthesis of different classes of galactose [...]

**5. Assembly of the EPS repeating unit**

As mentioned above, heteropolysaccharides are polymers consisting of identical repeating units that can vary only in the distribution of modifying groups per monomer. Their assembly therefore has to be stringently controlled. The biosynthetic pathways for a number of heteropolysaccharides have been elucidated [54,59]. The data obtained show that the unique structure of the repeating unit is governed by the specificity of the non-processive glycosyltransferase (GT) catalyzing each step of the biosynthesis. This specificity is likely based on the ability of the GT to recognize the sugar residue to be transferred, the acceptor, and the linkage to be formed. At present the complete pathway of EPS biosynthesis has not been determined for any *R*. *leguminosarum* biovar, and only some individual steps have been characterized. Nevertheless, detailed consideration of even these fragmentary data, together with a comparative analysis of the available *R*. *leguminosarum* genome sequences, allowed us to predict the genetic control of all steps in repeating unit assembly, at least for *Rlv* VF39 and closely related strains.

It was shown previously that assembly of the repeating unit of *R. leguminosarum* EPS starts with the addition of a glucose residue to the lipid carrier [60]. Biochemical studies and complementation analysis provided strong evidence that this reaction is carried out by the *pssA* gene product [36,61]. The *pssA* gene encodes an integral inner membrane protein, UDP-glucose:polyprenyl-phosphate glucosephosphotransferase, which belongs to a family of diverse bacterial sugar transferases. Members of this family catalyze the formation of a phosphodiester bond between polyprenol phosphate and a hexose-1-phosphate donated by a nucleotide sugar. The *pssA* gene is highly conserved in *R*. *leguminosarum* biovars and *R. etli* [44-46,50]. *pssA* mutants do not produce EPS and, as a consequence, are impaired in the development of nitrogen-fixing nodules on the appropriate plant hosts and in the formation of biofilms both *in vitro* and on root hairs [5,45,46,50]. Contradictory results exist concerning the influence of *pssA* mutations on the synthesis of CPS, which is similar to EPS in glycosyl composition [5,30,62]. In the *Rlt* 5599 genetic background, *pssA* mutants still produce CPS at a level similar to that of the wild-type strain [46]. In contrast, both EPS and CPS are absent in the *pssA* mutant of *Rlv* 3841 [5], which indicates that the PssA protein might also be involved in the initiation of CPS synthesis. Expression of *pssA* has been shown to depend on environmental factors such as phosphate and ammonium concentrations, and on root exudates [63].

Bossio and co-workers demonstrated that, after the addition of glucose to isoprenylpyrophosphate, two glucuronic acid residues are attached [64]. The attachment of the first GlcA is catalyzed by the *pssE* and *pssD* gene products [47,61]. This conclusion is based on *in vivo* reciprocal complementation between the *pssED* genes and the *spsK* gene of *Sphingomonas* strain S88 and on *in vitro* sugar incorporation studies. We recently confirmed the function of PssDE directly. The corresponding mRNAs were translated in a wheat-germ cell-free system in the presence of liposomes prepared from *Rlv* VF39 phospholipids, and the resulting proteoliposomes were used as the enzyme source in experiments on PssDE specificity (Ivashina et al., unpublished).

Both PssD and PssE display similarity to GT family 28 (CAZy database, http://www.cazy.org/). Notably, the amino acid sequences of PssD and PssE are similar to the N-terminal and C-terminal halves, respectively, of the glucuronosyl-(β-1,4)-glucosyltransferase SpsK. This observation led us to conclude that PssD and PssE are two subunits of the same enzyme. The proposed catalytic domain is located in the PssE subunit, a peripheral inner membrane protein, whereas PssD was shown to be an integral inner membrane protein. We have also found that the integration of PssE into a membrane or liposomes depends strongly on the presence of PssD and *vice versa* (Ivashina et al., unpublished). The same was observed for the yeast proteins Alg13 and Alg14: Alg14 is needed for the correct positioning of Alg13 on the cytosolic face of the endoplasmic reticulum membrane, mediating the formation of the active UDP-N-acetylglucosamine transferase complex [65,66]. Mutations in the *pssE* and *pssD* genes fully abolished EPS production and, as a consequence, resulted in defects of nodule infection [67-69].

PssC belongs to GT family 2, which contains a variety of inverting glycosyltransferases (enzymes that form glycosidic bonds with stereochemistry opposite to that of the glycosyl donor) that use a diverse range of nucleotide-sugar donors and participate in the synthesis of various types of polysaccharides [70]. Pollock and co-workers assigned this GT as a glucuronosyl-(β-1,4)-glucuronosyltransferase that catalyzes the attachment of the third sugar residue (GlcA) to the lipid-linked disaccharide intermediate (GlcA-β-1,4-Glc) with formation of a β-1,4 glycosidic bond [61]. This conclusion was based on comparative data on the genetic control of the first three steps of *R. leguminosarum* EPS assembly and of sphingan assembly in *Sphingomonas* strain S88. These data cannot be considered direct evidence, and the PssC assignment requires additional experimental proof.

Several *pssC* mutations have been characterized to date in various *R*. *leguminosarum* biovar backgrounds [46,47]. All of them mapped to the N-terminus of PssC and resulted in a decreased amount (27-38%) of EPS in culture supernatants. However, structural analysis showed that the EPSs secreted by these mutants are identical to those of the wild-type strains [46,47]. We proposed that translation of *pssC* could initiate from a second potential start codon, GTG, located downstream of the mutation sites, leading to the synthesis of a protein that retains enzymatic activity. Indeed, Western blot analysis with antibodies against PssC demonstrated the synthesis of a truncated protein in the *pssC* mutant (Ivashina et al., unpublished). Attempts to introduce mutations into the central part of *pssC* in the *Rlv* VF39 background were unsuccessful. However, it was easy to homogenote *pssC* in the same strain carrying a mutation in the *pssD* gene, which fails to produce EPS. These results point to a detrimental effect of such mutations, most probably due to the accumulation of lipid-linked intermediates in the cytoplasmic membrane and the resulting inability of the lipid carrier to be released for other essential cellular functions. Data obtained with different genetic systems led to the same conclusions [61].
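
The reinitiation hypothesis for the N-terminal *pssC* mutations can be illustrated with a minimal sketch that scans a coding sequence for in-frame candidate start codons downstream of a mutation site. The sequence, the mutation position and the function name are hypothetical and chosen only for illustration; they are not taken from the real *pssC* gene, and only ATG and GTG are treated as start codons here.

```python
# Candidate start codons considered in this sketch (assumption: ATG and GTG only).
START_CODONS = {"ATG", "GTG"}
STOP_CODONS = {"TAA", "TAG", "TGA"}


def downstream_inframe_starts(cds: str, mutation_pos: int) -> list[int]:
    """Return 0-based positions of in-frame start codons located after mutation_pos.

    The sequence is read in frame 0 from its first base, and scanning stops
    at the first in-frame stop codon.
    """
    cds = cds.upper()
    hits = []
    for i in range(0, len(cds) - 2, 3):
        codon = cds[i:i + 3]
        if codon in STOP_CODONS:
            break
        if i > mutation_pos and codon in START_CODONS:
            hits.append(i)
    return hits


# Hypothetical toy sequence, NOT the real pssC gene.
toy_cds = "ATGAAACCCGGGGTGCTTGCAGATTAA"
# Hypothetical insertion site near the 5' end of the reading frame.
candidates = downstream_inframe_starts(toy_cds, mutation_pos=7)
print(candidates)
```

In this toy case the only candidate is the in-frame GTG at position 12; a protein initiated there would lack the N-terminal residues but keep the rest of the reading frame, which is the situation proposed for the truncated PssC.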

PssJ is the last glycosyltransferase whose biochemical function has been ascertained experimentally. The *Rlt* RBL5515 strain carrying a mutation in the *pssJ* gene (known as *exo*344::Tn*5*) synthesizes residual amounts of EPS whose repeating unit lacks the terminal galactose of the side chain. On the basis of the structural features of the polysaccharides synthesized and an analysis of the enzyme activities involved, it was hypothesized that this strain is affected in the galactosyltransferase catalyzing formation of the 1-3 linkage between the sub-terminal (Glc) and terminal (Gal) sugar residues of the octasaccharide unit [6]. PssJ has no homologs in protein databases and can therefore be assigned to the family of "not-classified glycosyltransferases" (CAZy database).

The EPS repeating unit structure shows that the third (GlcA) and fourth (Glc) sugar residues of the backbone chain are linked by a 1-4 glycosidic bond (Table 1). PssS is the only enzyme that can be responsible for this reaction. According to homology search data, PssS was assigned to GT family 1 (CAZy database), which comprises retaining glycosyltransferases that form glycosidic bonds with stereochemistry identical to that of the glycosyl donor. These enzymes were shown to be involved in the biosynthesis of exopolysaccharides, lipopolysaccharides, and the slime polysaccharide colanic acid. We were unable to disrupt the *pssS* gene in the wild-type strain *Rlv* VF39 but easily inactivated it in the *pssD* mutant (Eps-). It is likely that inactivation of *pssS*, like that of *pssC*, leads to the accumulation of toxic lipid-linked intermediates.

In summary, specific GTs were assigned to the assembly of the backbone chain of the octasaccharide unit and to the attachment of the terminal Gal in the side chain. It should be emphasized that the backbone chains are identical in all known *R*. *leguminosarum* EPS structures (Table 1). At the same time, orthologs of the *pssAEDCS* genes are present in the genomes of *R*. *leguminosarum* strains that synthesize similar EPSs (Fig. 1). Therefore, we conclude that the prediction is correct. The same holds for PssJ, which catalyzes the attachment of the terminal Gal residue: in the *R*. *leguminosarum* genomes where *pssJ* has not been found, *pssK*, which has been predicted to modify this Gal residue (see below), is also missing.

As mentioned above, *R*. *leguminosarum* and *R. etli* can produce EPSs with side chains that vary in length, sugar composition and type of glycosidic linkage. The set of GT genes in Pss-I clusters can also vary. Studies on the genetic control of EPS biosynthesis are considerably impeded by the fact that, where genome sequences are available, nothing is known about the structure of the synthesized EPS, and *vice versa*. *Rlv* VF39 and *Rlt* TA1 represent the only pair for which (i) the structures of the EPSs and the sequences of the Pss-I clusters have been determined; (ii) both strains produce structurally identical EPSs but differ in their sets of GT genes; and (iii) data on mutational analysis of *Rlv* VF39 GT genes are available. Taking all these considerations into account, we chose the *Rlv* VF39/*Rlt* TA1 pair for prediction of the pathway of side chain biosynthesis. The question then arises of which glycosyltransferase initiates branching by attaching the Glc residue via the β1-6 bond, and which GT(s) is (are) responsible for attaching the two subsequent Glc residues via β1-4 glycosidic linkages. In *Rlt* TA1, only two GTs (PssF and PssI) can perform these functions. In *Rlv* VF39, two other GTs (PssH and PssG) can participate in side chain assembly in addition to PssF and PssI. We introduced mutations into all four genes (*pssFGHI*) in *Rlv* VF39 and found that the EPS structures of the mutant strains were identical to that of the parental strain; only the level of acidic EPS production decreased. Based on these results, we conclude that the actions of the GTs considered in this system can be interchangeable.

In our opinion, PssF is the best candidate for the GT that catalyzes the attachment of the Glc residue via the β1-6 bond. Firstly, in all *Rhizobium leguminosarum* strains the EPS side chain starts with a Glc residue attached to the backbone chain via the β1-6 bond, and PssF is present in all Pss-I clusters sequenced to date. Secondly, PssI, PssH and PssG show a rather high level of similarity to each other, especially in the N-terminal parts of their amino acid sequences, where the catalytic domains are located. In contrast, PssF shows practically no homology to these three GTs but displays weak yet reliable homology of its N-terminal half to GTs that attach a Glc residue via the β1-6 bond (e.g. ExoO from *S. meliloti* [20]).

If our prediction of the PssF function is correct, the attachment of the two subsequent Glc residues could be achieved by a single GT (PssI) in *Rlt* TA1, whereas as many as three GTs (PssI, PssH and PssG) could participate in this process in *Rlv* VF39. Apparently, PssI in the *Rlt* TA1 strain is to a certain extent tolerant of the acceptor structure, and the identity of the EPS repeating units is probably ensured by the high specificity of PssJ, which catalyzes the last step of EPS assembly.

It seems that in *Rlv* VF39 the successive attachment of Glc residues is achieved by two separate GTs, namely PssI and PssH. This assumption is based on a comparison of the amino acid sequences of these homologous GTs: their C-terminal parts, which contain the putative acceptor recognition domain, show a rather low level of similarity. In contrast, PssG shows a very high level of homology to PssI over its entire amino acid sequence (more than 80% similarity). It is plausible that PssI and PssG are isoenzymes that handle the same step in EPS assembly. Thus, the genetic control of repeating unit biosynthesis in *Rlv* VF39 resembles that of *S. meliloti*, where the attachment of the sugar residue at each step of biosynthesis is catalyzed by a specific GT, and two GTs can even participate in catalysis at some steps of the pathway.

A presumptive scheme of EPS repeating unit assembly in *Rlt* TA1 and *Rlv* VF39 is presented in Figure 2. PssA, PssDE, PssC, PssS, PssF and PssI/PssG apparently comprise the basic set of GTs involved in the biosynthesis of all EPSs under consideration. Given the variability of the side chains of these EPSs, one can expect the basic set of GTs to be augmented in the corresponding *R. leguminosarum* strains. Indeed, PssJ is present in *Rlv* 3841, *Rlv* VF39, *Rlt* TA1, *Rlt* WSM2304 and *Re* CFN42 but absent in *Re* CNPAF512 and *Re* CIAT 652. At the same time, genes for the family 2 GT PsaC and the family 1 GT PsaD were found in the Pss-I clusters of the latter strains. In addition, genes encoding the family 2 GTs PsaB and PsaE, whose amino acid sequences are similar to those of GTs attaching a Glc residue via the β1-6 bond, were found in *Rlt* WSM2304 and *Re* CIAT 652, respectively.

**Figure 2.** Model for the EPS repeating unit assembly in *Rlv* VF39 and *Rlt* TA1 strains. Abbreviations: Glc, glucose; GlcA, glucuronic acid; Gal, galactose and Pyr, ketal pyruvate group.
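
The model in Figure 2 can also be written down as an ordered list of assembly steps, which makes it easy to check that the proposed GT assignments add up to an octasaccharide. The following is a minimal sketch based only on the assignments discussed in the text; the class and variable names are hypothetical, linkage details are omitted, and the relative order of the PssI/PssG and PssH steps in *Rlv* VF39 is an assumption, since the text does not state which of them acts first.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    enzyme: str    # GT(s) proposed for this step
    residue: str   # sugar added: "Glc", "GlcA" or "Gal"
    chain: str     # "backbone" or "side"


# Proposed order for Rlv VF39. Which of PssI/PssG and PssH acts first is NOT
# stated in the text; the order of those two steps is an assumption.
VF39_MODEL = [
    Step("PssA", "Glc", "backbone"),    # Glc-1-P onto the lipid carrier
    Step("PssDE", "GlcA", "backbone"),  # first GlcA
    Step("PssC", "GlcA", "backbone"),   # second GlcA (comparative evidence only)
    Step("PssS", "Glc", "backbone"),    # fourth backbone residue
    Step("PssF", "Glc", "side"),        # branching Glc (predicted)
    Step("PssI/PssG", "Glc", "side"),   # putative isoenzymes, same step
    Step("PssH", "Glc", "side"),
    Step("PssJ", "Gal", "side"),        # terminal Gal
]

# In Rlt TA1, PssH and PssG are absent, so PssI is assumed to perform both
# of the Glc additions that follow the branching step.
TA1_MODEL = [
    Step("PssI", s.residue, s.chain) if s.enzyme in ("PssI/PssG", "PssH") else s
    for s in VF39_MODEL
]


def composition(model):
    """Count the sugar residues added over all steps of a model."""
    return Counter(step.residue for step in model)


print(len(VF39_MODEL), dict(composition(VF39_MODEL)))
print(composition(VF39_MODEL) == composition(TA1_MODEL))
```

Both models yield eight residues (five Glc, two GlcA, one Gal), consistent with the two strains producing structurally identical octasaccharide units despite differing sets of GT genes.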

**6. Genes involved in modification of EPS**

Three genes, *pssR*, *pssM* and *pssK*, that may be involved in the modification of EPS were identified within the Pss-I cluster in all *R. leguminosarum* strains as well as in *R. etli* CFN42 (Fig. 1). A homology search for the *pssR* gene product revealed similarity of its C-terminal region (amino acids 87-136) to the corresponding parts of a large number of acetyltransferases, including the well-characterized rhizobial NodL [71-73], *E*. *coli* LacA and CysE [74,75], and *S*. *aureus* Cap1G [76]. The hexapeptide repeat [LIV]-G-x(4) (IPR001451, bacterial transferase hexapeptide repeat) is present in the homologous regions of all these proteins, including PssR. PssR also shares similarity over nearly its entire length with conserved putative acetyltransferases from different representatives of the *Bacillaceae*. Proteins of this family use acetyl coenzyme A as the acetyl donor and acetylate different substrates, including capsular and extracellular polysaccharides, lipooligosaccharides, chitin fragments, N-acetylglucosamine and antibiotics [64,77,78].
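
As an illustration of how a motif such as [LIV]-G-x(4) can be searched for, the sketch below translates the pattern, as written in the text, into a regular expression and scans a peptide for matches. The helper function and the toy peptide are hypothetical; the peptide is not the PssR sequence, and the converter handles only the pattern elements needed here.

```python
import re


def prosite_to_regex(pattern: str) -> str:
    """Translate a simple PROSITE-style pattern such as '[LIV]-G-x(4)' into a regex.

    Only residue classes in brackets, single residues, and 'x' or 'x(n)'
    wildcards are handled.
    """
    parts = []
    for element in pattern.split("-"):
        m = re.fullmatch(r"x(?:\((\d+)\))?", element)
        if m:
            parts.append("." if m.group(1) is None else ".{" + m.group(1) + "}")
        else:
            parts.append(element)
    return "".join(parts)


HEXAPEPTIDE = prosite_to_regex("[LIV]-G-x(4)")

# Hypothetical toy peptide, NOT the PssR sequence.
toy_protein = "MKLGDNAVIGANSTIGEHVKQ"
hits = [(m.start(), m.group()) for m in re.finditer(HEXAPEPTIDE, toy_protein)]
print(HEXAPEPTIDE)
print(hits)
```

The toy peptide contains three consecutive matches, six residues apart, which is the tandem arrangement that gives the hexapeptide repeat its name.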

Notably, *pssR* orthologs were found in all Pss-I clusters (Fig. 1). This observation agrees with the data on the major site of O-acetylation, which is located at the second GlcA residue of the backbone chain; the backbone chains are identical in all EPSs of known structure.

Insertional inactivation of *pssR* in the *Rlv* VF39 genome does not result in a complete absence of acetyl groups in EPS. This suggests that other gene(s) needed for EPS acetylation exist elsewhere in the *Rlv* VF39 genome. The decreased level of acetylation has no effect on nodule development or nitrogen fixation. Similar data were obtained for *S*. *meliloti* *exoZ* mutants, which fail to acetylate succinoglycan. It was shown that the acetyl decoration of succinoglycan is not absolutely required for nodule formation; however, it increases the efficiency of infection thread initiation [79,80].

The amino acid sequence of PssM shares homology with several known and putative ketal pyruvate transferases, including ExoV from *S. meliloti* and GumL from *Xanthomonas campestris*. Knock-out of the *pssM* gene does not abolish the ability to produce HMW EPS but leads to the absence of the pyruvic acid ketal group at the subterminal glucose of the EPS repeating unit, as shown by ¹³C and ¹H NMR analyses. Complementation *in trans* restored the EPS modification in the *pssM* mutant [81]. Disruption of the *pssM* gene led to substantial disturbances in symbiosis: the *pssM* mutation resulted in the formation of aberrant non-nitrogen-fixing nodules on peas. Ultrastructural studies of mutant nodules indicated that infection thread formation, release of bacteria into the plant cell cytoplasm and the early steps of bacteroid differentiation were not affected. However, further stages of symbiosome development and maintenance were arrested. We proposed that the induction of early senescence of symbiosomes depends on a failure in recognition mechanisms and, importantly, that recognition of the microsymbiont by the host plant is important not only at early stages of symbiosis but also during its intracellular period of life [81]. Moreover, the accumulation of very large starch granules observed in infected and non-infected cells suggests that the plant-derived photosynthates, which serve as an energy source for nitrogen fixation [82], are not fully consumed in *pssM*-induced nodules. The mechanisms that switch the "symbiotic" nodule to starch accumulation may include alterations in starch phosphorylase activity and/or expression [83].

Our finding that mutation of *pssM* abolishes pyruvylation of only one of the two sugar residues in *Rlv* VF39 EPS suggests that pyruvylation of the terminal galactose may be controlled by the *pssK* gene located within the Pss-I cluster. The PssK amino acid sequence is similar to those of proteins containing the pyruvyltransferase domain IPR007345, including Pvg1p from *Schizosaccharomyces pombe*, YveS, YvfF and YxaB from *Bacillus subtilis* [84], and EpsL from *Streptococcus thermophilus* [85]. Pvg1p was shown to catalyze the transfer of the pyruvyl group to Galβ1,3-residues in N-linked galactomannan chains [86]. Interestingly, no sequence homology was observed between the PssM and PssK proteins, which may reflect different substrate specificities of these enzymes. No direct evidence for the *pssK* function has been obtained in any *R*. *leguminosarum* strain. Our preliminary data indicate that knock-out of *pssK* abolishes EPS synthesis and results in a non-slimy colony phenotype (Ivashina et al., unpublished). It is possible that pyruvyl modification of the terminal sugar residue is necessary for efficient polymerization or export of EPS, as was proposed for *S. meliloti* *exoV*, which is involved in pyruvylation of succinoglycan [19,21].

As seen from Figure 1, in *Re* CNPAF512 and *Re* CIAT 652 the *pssK* gene is absent and *pssM* is replaced by the non-orthologous *psaG* and *psaF* genes, respectively. The latter genes presumably also encode ketal pyruvate transferases, since the IPR007345 domain (Polysacch\_pyruvyl\_Trfase) was found in their amino acid sequences. Unfortunately, the EPS structures of both *R. etli* strains remain unknown, but one can suppose that they differ from that of *Rlv* VF39 at least in their side chains.

This assumption is based on the observation that the sets of GTs in their Pss-I clusters differ from that of *Rlv* VF39. The genes for ketal pyruvate transferases differ as well, which further argues for the high substrate specificity of these modifying enzymes.
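
The co-occurrence argument used in this chapter (strains lacking *pssJ* also lack *pssK*) can be checked mechanically once the gene content of each cluster is tabulated. The sketch below lists only the genes named in the text for each strain, as read from the prose; Figure 1 remains the authoritative source, and the table and function names are hypothetical.

```python
# Gene content of the Pss-I clusters as read from the text. Only genes named
# in the text are listed, so the sets are deliberately incomplete.
PSS_CLUSTER_GENES = {
    "Rlv 3841": {"pssJ", "pssK", "pssM", "pssR"},
    "Rlv VF39": {"pssJ", "pssK", "pssM", "pssR"},
    "Rlt TA1": {"pssJ", "pssK", "pssM", "pssR"},
    "Rlt WSM2304": {"pssJ", "pssK", "pssM", "pssR", "psaB"},
    "Re CFN42": {"pssJ", "pssK", "pssM", "pssR"},
    "Re CNPAF512": {"pssR", "psaC", "psaD", "psaG"},
    "Re CIAT 652": {"pssR", "psaC", "psaD", "psaE", "psaF"},
}


def co_occur(gene_a, gene_b, table):
    """True if gene_a and gene_b are present or absent together in every strain."""
    return all((gene_a in genes) == (gene_b in genes) for genes in table.values())


def strains_lacking(gene, table):
    """Return the sorted names of strains whose cluster lacks the given gene."""
    return sorted(name for name, genes in table.items() if gene not in genes)


print(co_occur("pssJ", "pssK", PSS_CLUSTER_GENES))
print(co_occur("pssJ", "pssR", PSS_CLUSTER_GENES))
print(strains_lacking("pssK", PSS_CLUSTER_GENES))
```

With this table, *pssJ* and *pssK* are always present or absent together, whereas *pssR* is found in every cluster regardless of *pssJ*, matching the distinction drawn in the text between the conserved backbone modification and the strain-specific side chain modification.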
