**3. Organization of exopolysaccharide biosynthesis genes**

According to the current view, the synthesis of heteropolysaccharides requires a complex pathway that starts with the synthesis of sugar nucleotide precursors and of non-carbohydrate donors, followed by sequential assembly of the repeating units on polyprenyl lipid carriers, their modification, polymerization, and export out of the cell [20,21,43].

We began studying the genetic control of acidic exopolysaccharide biosynthesis by isolating non-mucoid Tn*5*-derived mutants of *Rlv* VF39. Five non-slimy mutants (GL1-5) were obtained, and the mutations were mapped to four separate chromosomal loci. The open reading frames (*orfs*) interrupted by the Tn*5* insertions were named *pss* (**p**oly**s**accharide **s**ynthesis), following Borthakur and coworkers [44]. The Tn*5* insertion in the GL4 mutant was localized within the *pssA* gene [45], an ortholog of which had previously been identified in *Rlp* 8002 [44]. The mutations in GL2 and GL6 were mapped within the *pssE* and *pssD* genes, respectively; their orthologs had been found earlier in *Rlt* LPR5 [46]. Chromosomal walking around these genes in *Rlv* VF39 allowed us to identify a 15.5-kb multi-cistronic operon that includes a core set of genes needed for the assembly of the EPS repeating unit (*pssEDCFGHIJS*), its modification (*pssKMR*), polymerization (*pssL*) and processing (*pssW*) (Fig. 1) [47]. Notably, the *pssV-E* operon is present in all *R. leguminosarum* and *R. etli* genomes for which complete or partial sequences are currently available. Moreover, nine of the fifteen genes of this operon have orthologs in all these genomes. At the same time, certain *Rlv* VF39 genes are absent from some other genomes, certain genes are replaced by non-orthologous genes, and some additional genes are also present (Fig. 1). We will discuss the functions of all these genes below. Here we consider only the problem of their names.

All fifteen genes of the *pssV-E* operon were named *pss* genes. In addition, the same abbreviation was assigned to six genes (*pssA*, *pssB*, *pssN*, *pssO*, *pssP* and *pssT*) located in other operons. A simple count shows that only five letters of the alphabet remain that can be combined with the "*pss*" root to name new genes involved in EPS biosynthesis. Yet, in our opinion, new names already have to be assigned to eight genes from *Rlt* WSM2304, *Re* CFN42, *Re* CNPAF512 and *Re* CIAT 652. Therefore, we propose (i) to retain the existing names for all orthologous genes, and (ii) to introduce a new set of gene names with the root "*psa*" (**p**oly**s**accharide repeating unit **a**ssembly). Our proposals for new names of certain genes involved in EPS biosynthesis are summarized in Table 2.
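The letter count above can be checked with a short sketch. It takes the gene suffixes exactly as listed in the text; the one assumption is that the fifteenth operon gene, which the text does not list by function, is *pssV* (inferred from the operon name *pssV-E*).

```python
import string

# Suffix letters of pss genes named in the text.
# Assumption: pssV is the fifteenth operon gene, inferred from the name "pssV-E".
operon_genes = "EDCFGHIJS" + "KMR" + "L" + "W" + "V"  # pssV-E operon, 15 genes
other_genes = "ABNOPT"                                # pss genes in other operons, 6 genes

used = set(operon_genes) | set(other_genes)
free = sorted(set(string.ascii_uppercase) - used)

print(len(used), "letters used;", len(free), "free:", "".join(free))
```

Under this assumption, 21 letters are taken and the five still available are Q, U, X, Y and Z, which agrees with the count given in the text.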

**Figure 1.** Gene arrangements in the Pss-I cluster of *R*. *leguminosarum* and *R*. *etli* strains.

The acidic nature of EPS is explained by the presence of uronic acids and negatively charged pyruvyl groups. Like other representatives of *Rhizobiaceae*, *R*. *leguminosarum* strains synthesize EPS in high-molecular-weight (HMW) and low-molecular-weight (LMW) forms [13,42]. The latter were proposed to act as signaling factors during the development of symbiosis [14,16-17,42].



\* EMBL/GenBank/DDBJ accession numbers: *Rlt* WSM2304 (CP001191), *Re* CFN42 (CP000133), *Re* CNPAF512 (AEYZ01000266) and *Re* CIAT 652 (CP001074)

**Table 2.** Proposed names for genes controlling EPS biosynthesis in some *R. leguminosarum* and *R. etli* strains.

The *pssV-E* operon lies next to a region comprising several operons that contain genes for the Type I secretion system (*prsED*), EPS processing (*plyA*), and EPS polymerization/export (*pssTNOP*). This whole chromosomal region is now known as the Pss-I gene cluster [48].

The *pssA* gene, which controls the first step in repeating unit assembly, is located approximately 90 kb from the Pss-I cluster. The gene was shown to be transcribed as a monocistronic mRNA [45]. Upstream of *pssA* is the *pssB* gene, which encodes inositol monophosphatase [49-51]. The effects of mutations within *pssB* on EPS synthesis and symbiotic behavior have been analyzed in the *Rlv* VF39 and *Rlt* TA1 backgrounds, and the results are contradictory. In *Rlv* VF39, *pssB* mutants retained the ability to produce EPS in amounts equal to those of the wild-type strain. In *Rlt* TA1, *pssB* inactivation increased overall EPS production relative to the wild-type strain and altered the LPS PAGE banding pattern and the O-antigen sugar composition [51,52]. Nevertheless, *pssB* mutants of both strains elicited ineffective nodules on the roots of *Vicia faba* and *Trifolium pratense*, respectively [49,52].

For the GL1 and GL3 mutants we did not carry out extended chromosomal walking around the Tn*5* insertions; only short genome sequences flanking Tn*5* were determined. We localized the Tn*5* insertion in the GL3 mutant within a small *orf* (only 213 bp). The ortholog of this *orf* (RL2260) was found in the chromosome sequence of *Rlv* 3841, far away from the Pss-I cluster. It encodes a 7.2 kDa positively charged protein (pI 10.8) that is conserved in numerous bacteria. We have not yet been able to map the GL1 mutation. This mutation probably targets a gene located on one of the *Rlv* VF39 strain-specific plasmids.
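As a rough plausibility check on the size of this small *orf*, the protein mass can be estimated from the *orf* length. The sketch below rests on two assumptions that are not from the source: that the 213 bp include the stop codon, and that an average residue mass of about 110 Da (a common rule of thumb) is adequate.

```python
# Rough size check for the small orf interrupted in GL3 (213 bp per the text).
ORF_LENGTH_BP = 213
AVG_RESIDUE_MASS_DA = 110  # assumption: rule-of-thumb average amino acid residue mass

codons = ORF_LENGTH_BP // 3   # number of codons in the orf
residues = codons - 1         # assumption: the 213 bp include the stop codon
approx_mass_kda = residues * AVG_RESIDUE_MASS_DA / 1000

print(residues, "residues, roughly", approx_mass_kda, "kDa")
```

The estimate of about 70 residues and roughly 7.7 kDa is of the same order as the reported 7.2 kDa; an exact value depends on the actual amino acid composition.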

Taken together, these data indicate that the core set of EPS synthesis genes is clustered in the chromosomal Pss-I region. Clustering of genes involved in EPS biosynthesis is not unique to *R*. *leguminosarum* but is widespread among polysaccharide-producing bacteria [53,54]. This type of gene organization could reflect coordinated expression and tight regulatory control.

Evidently, EPS biosynthesis is linked with other metabolic pathways in the cell, and mutations that affect EPS production but map far from the Pss-I region can reflect this linkage. For example, the *pssA* gene is located in another chromosomal region, probably because it is involved in initiating the synthesis not only of EPS but also of other polysaccharides. Recently, several regulatory genes influencing EPS production (*psiA*, *psrA*, *exoR*, *expR*, *rosR*, *praR*) were found to be located either in different regions of the chromosome or on endogenous plasmids (reviewed in [24]). We cannot exclude that the mutations in GL3 and GL1 also target regulatory genes.
