**3. Molecular genetics of Nod-factor signaling in legumes**

to NFs [43, 44]. These genes were named *NFR1* and *NFR5*, for Nod-Factor Receptor. Cloning of these genes revealed that they encode receptor-like kinases comprising LysM domains (LysM-RLK). LysM domains occur in a variety of proteins in bacteria and eukaryotes and have been shown to bind glycan-containing ligands (such as chitin) [45]. They consist of a repetition of a small motif typically containing from 44 to 65 amino acid residues – the LysM sequence, or LysM module [46, 47]. One LysM sequence has a βααβ secondary structure with the two helices packing onto the same side of an antiparallel β sheet. Multiple LysM modules in a protein are often separated by small Ser-, Thr-, and Asn-rich intervening sequences [48].

Only in plants are LysM domains associated with a kinase-like domain [49], forming two main LysM-RLK gene families: the LYK family and the LYR family. All LysM-RLKs are predicted to contain three LysM modules, although these modules exhibit a high degree of divergence, both within a protein and between proteins. The initial function of LysM-RLKs is considered to have been recognition of chitin-based signal molecules produced by hostile microbes (termed MAMPs, "microbe-associated molecular patterns", or PAMPs, "pathogen-associated molecular patterns"), similar to the function of the CERK1 receptor-like kinase from *Arabidopsis thaliana* [2]. Based on microsyntenies between genomic regions around LysM-RLK genes in legume and non-legume (*A. thaliana*, rice) plants, it has been speculated that these genes are the descendants of a common ancestor [50]. Zhang et al. (2007) [51] proposed that in Leguminosae LysM-RLKs have undergone further duplication and diversification, with some LysM-RLKs acquiring the ability to perceive bacterial NFs, leading to mutually beneficial endosymbiosis with rhizobia. One aspect of this diversification is the adaptation of extracellular LysM domains to recognize specific structures of NFs; another is the evolution of the intracellular kinase domains to switch the signals from cascades inducing defense responses to symbiotic gene cascades. Recently, the function of NFRs as NF receptors was confirmed by demonstration of their ability to directly bind NF molecules *in vitro* [52].

142 Plants for the Future

In *Medicago* and pea, which belong to IRLC (see above), NF perception seems to be more complicated than in *Lotus*. Genes orthologous to *NFR1* and *NFR5* were identified in *Medicago truncatula* (*LYK3* and *NFP*) and in *Pisum sativum* (*Sym37* and *Sym10*), with careful description of the corresponding mutant phenotypes [44, 53-55]. While the phenotype of *nfp* and *sym10* mutants (in *Medicago* and pea, respectively) coincided with that of *nfr5* mutants in *Lotus*, mutations in *LYK3* and *Sym37* (orthologs of *NFR1*) led to a significantly different phenotype: successful penetration of bacteria into the root hair with a subsequent block of IT progress, instead of a complete absence of responses to rhizobia [55, 56]. These data support the "two-receptor" model of Nod-factor perception proposed more than 20 years ago [40]. According to this model, which was developed on the basis of the infection phenotype of several *S. meliloti nod* mutants, there are two different types of NF receptors: the "recognition" (or "signaling") receptor, which induces early responses and has high affinity for Nod-factor but low requirements toward its structure, and the "entry" receptor, which controls penetration of bacteria into the plant cell and has more stringent demands [40].

It is significant to note that NFR5 (and its homologs, NFP in *Medicago* and Sym10 in *Pisum*) lacks independent kinase activity and thus can function properly only in complex with an active kinase (suggested to be NFR1) [52]. It can be assumed, based on the above, that in general the "recognition" receptor (NFR5, NFP or Sym10) perceives NF and afterwards

As reviewed in our recent publication [57], plant genes involved in the development of RN symbiosis may be divided into two groups according to the approach used for gene identification. The first group, *Sym*-genes, was identified by formal genetic analysis (starting from selection of plant mutants defective in nodule development). The other group of genes, called nodulins, was identified by molecular genetic methods, through identification of proteins and/or RNAs synthesized *de novo* in root nodules.

The large genomes of crop legumes (e.g., soybean or pea), in which the formal genetics of symbioses was initially developed, as well as their low capability for genetic transformation, greatly complicate the cloning of symbiotic genes, analysis of their primary structures, and gene manipulations. Therefore, in the early 1990s, *Lotus japonicus* [58] and *Medicago truncatula* [59, 60] were introduced into symbiogenetic studies as model plants. These species are characterized by relatively small genomes (470-500 Mb; [61]) and can be easily genetically transformed [60, 62-64]. In addition, their short life cycle and high seed productivity made them attractive and convenient model objects for studying the molecular bases of RN symbioses, as well as other types of plant-microbial symbioses.

The analysis of the signaling pathway in RN symbiosis started with experimental mutagenesis. Large-scale programs of insertion, chemical and X-ray mutagenesis, performed by different research groups, generated numerous symbiotic mutants in *L. japonicus* and *M. truncatula* [65, 66], which allowed researchers to identify and characterize a series of *Sym*-genes. The genes involved at the initial stages of nitrogen-fixing symbiosis (named "early *Sym*-genes") were of primary interest, allowing dissection of the mechanisms by which the NF signal is perceived and transduced by host plants.

#### **3.1. Nod-factor signaling in model legumes**

After the first step of NF reception by LysM receptor kinases (described above), the symbiotic signal is transmitted to the Common Symbiosis Pathway (CSP), so named because it shares components with another interaction: arbuscular mycorrhiza (AM) symbiosis, the association with obligate biotrophic fungi of the phylum *Glomeromycota*. Arbuscular mycorrhiza is formed by at least 80% of contemporary land plants and is believed to be the most ancient plant-microbe symbiosis, one that played a decisive role in the adaptation of plants to terrestrial life [67-69]. AM is the main source of phosphorus nutrition for plants, although in many temperate and boreal species it is supplemented or even completely replaced by other forms of mycorrhiza (ectotrophic, ericoid) with various representatives of the *Ascomycota* and *Basidiomycota*, and for some plants (orchids) fungi supply not only mineral nutrition but also organic carbon compounds [69, 70]. Being the first beneficial association with microorganisms known for plants (it arose approximately 400 million years ago), AM is considered an ancestor of other mutualistic plant-microbe interactions, such as RN symbiosis. Therefore, it is supposed that NF signaling evolved on the basis of previously existing AM signaling. Intriguingly, arbuscular mycorrhizal fungi excrete a set of chitin-derived Myc-factors structurally similar to Nod-factors [71], which also serve as signaling molecules. It remains unknown, however, exactly how Myc-factors are perceived by plants.

The first player in the CSP was identified more than 10 years ago. It is an LRR receptor kinase described for *Lotus* as SymRK (Symbiotic Receptor Kinase) and for *Medicago* as NORK (Nodulation Receptor Kinase) [72, 73]. In pea, the gene *Sym19* is orthologous to *SymRK* in *Lotus* and *NORK* (also known as *DMI2*, for Doesn't Make Infections) in *Medicago* [72]. The ligand of this receptor kinase is not yet known (Figure 3). Interestingly, the activity of SymRK is also required for proper progression of later symbiotic stages, at least for rhizobial infection [74]. The SymRK kinase domain has been shown to interact with 3-hydroxy-3-methylglutaryl CoA reductase 1 (HMGR1) from *M. truncatula* [75] and with an ARID-type DNA-binding protein [76]. These results suggest that SymRK may form a complex with key regulatory proteins of downstream cellular responses. Symbiotic Remorin 1 (SYMREM1) from *M. truncatula* and SymRK-interacting E3 ligase (SIE3) from *L. japonicus* have also been shown to interact with SymRK [77, 78].

From left to right: stages of symbiosis.

**Figure 3.** Receptor kinases of pea participating in nodulation signaling.

The symbiosis receptor kinase SymRK acts upstream of NF-induced Ca²⁺ spiking, which occurs in the perinuclear region of root hairs within a few minutes after NF application [79]. Perinuclear calcium spiking involves the release of calcium from a storage compartment (probably the nuclear envelope) through as-yet-unidentified calcium channels. Potassium-permeable channels might compensate for the resulting charge imbalance and could regulate the calcium channels in plants [80-84]. The nucleoporins NUP85 and NUP133 (described only in *Lotus* so far) are also required for calcium spiking, although their mode of involvement is currently unknown. They might be part of a specific nuclear pore subcomplex that plays a crucial role in the signaling process, which requires interaction at the cell plasma membrane and at nuclear and plastid organelle membranes to induce Ca²⁺ spiking [85, 86]. Recently, a third constituent of a conserved subcomplex of the nuclear pore scaffold, NENA, was identified as an indispensable component of RN endosymbiotic development [87].


Ca²⁺ spikes are thought to activate a calcium- and calmodulin-dependent protein kinase (CCaMK). This kinase contains an autoinhibition domain; its removal leads to spontaneous activation of downstream transcription events and induction of nodule formation even in the absence of rhizobia [88]. Thus, CCaMK appears to be a general "manager" for both RN and AM symbioses and the last member of the Common Symbiosis Pathway, because the next steps of nodulation signaling are independent of those of AM: mutations in downstream *Sym*-genes do not affect the AM symbiotic properties of legumes. Interestingly, mutations in *Sym*-genes do not influence defense reactions, suggesting that the signaling pathways of mutualistic symbioses and pathogenesis are sufficiently different.

CCaMK is known to form a complex with CYCLOPS, a phosphorylation substrate, within the nucleus [89]. In *cyclops* mutants of *Lotus*, the infection process induced by the bacterial or fungal symbionts is severely impaired. During RN symbiosis, *cyclops* mutants exhibit specific defects in IT initiation but not in nodule organogenesis [90], indicating that CYCLOPS acts in an infection-specific branch of the symbiotic signaling network [35]. *Cyclops* encodes a protein with no overall sequence similarity to proteins of known function, but containing a functional nuclear localization signal and a carboxy-terminal coiled-coil domain.

It is supposed that CCaMK, with the help of CYCLOPS, phosphorylates specific transcription factors already present in the cell, NSP1 and NSP2, which alter the expression of several genes related to symbiosis development [91, 92]. The activity of these proteins leads to transcriptional changes in root tissues, for instance increased levels of the early nodulins ENOD40, ENOD11, ENOD12 and ENOD5, which are known as potential regulators of IT growth and nodule primordium formation [93-95]. Changes in the cytokinin status of the plant are also detected, followed by up-regulation of genes encoding RN symbiosis-specific cytokinin receptors [96-98]. Moreover, the transcription regulators NIN and ERN are induced specifically downstream of the early NF signaling pathway to coordinate the correct temporal and spatial formation of root nodules [99-102].
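The order of events described so far can be summarized in a small sketch. This is only an illustrative simplification: the step labels are informal, the gene names follow *Lotus* nomenclature, and the strictly linear order is an assumption made for clarity (several of the links, such as phosphorylation of NSP1 and NSP2, are stated in the text as suppositions).

```python
# Ordered sketch of the signaling steps described in the text.
# Step labels are informal; the linear order is a simplifying assumption.
SIGNALING_STEPS = [
    ("Nod-factor perception", ("NFR1", "NFR5")),
    ("Symbiosis receptor kinase", ("SymRK",)),
    ("Perinuclear Ca2+ spiking", ("NUP85", "NUP133", "NENA")),
    ("Decoding of Ca2+ spikes", ("CCaMK", "CYCLOPS")),
    ("Transcription factors", ("NSP1", "NSP2")),
    ("Induced regulators and early nodulins",
     ("NIN", "ERN", "ENOD40", "ENOD11", "ENOD12", "ENOD5")),
]


def step_index(component: str) -> int:
    """Return the position of the step that lists the given component."""
    for index, (_label, components) in enumerate(SIGNALING_STEPS):
        if component in components:
            return index
    raise KeyError(f"unknown component: {component}")


def acts_upstream(first: str, second: str) -> bool:
    """True if `first` belongs to an earlier step than `second`."""
    return step_index(first) < step_index(second)


if __name__ == "__main__":
    print(acts_upstream("SymRK", "CCaMK"))  # True
    print(acts_upstream("NIN", "NFR5"))     # False
```

Components listed within the same step are not ordered relative to each other, so `acts_upstream` returns `False` for any pair drawn from a single step.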

The genes described above are responsible for the signal cascade that induces the nodulin genes involved in building the symbiotic structures and implementing their biochemical functions. It is supposed that this signaling pathway did not appear *de novo* in legumes when they became able to form nodules, but developed from the already existing system of AM formation, into which novel, nodule-specific genes were recruited. Still, new genes became involved in RN symbiosis development, especially those encoding receptors that recognize hormones (e.g., cytokinins) and hormone-like molecules (Nod-factors).

Another important signaling process in RN symbiosis is the autoregulation of nodule formation. It takes place after successful mutual recognition of the partners and signal exchange. The legume host is considered to control root nodule numbers by sensing external and internal cues. A major external cue is the concentration of soil nitrate, whereas an important internal cue is a feedback regulatory system in which nodules formed earlier suppress further nodulation through shoot-root communication. The latter is known as the autoregulation of nodulation (AON) and is believed to consist of two long-distance signals: a root-derived signal that is generated in infected roots and transmitted to the shoot, and a shoot-derived signal that inhibits nodulation systemically [103, 104]. AON therefore represents a strategy through which the host plant can balance symbiotrophic N nutrition with the energetically "cheaper" combined N nutrition.

Recent findings on autoregulation of nodulation suggest that the root-derived ascending signals to the shoot are short peptides belonging to the CLE peptide family [105, 106]. The leucine-rich repeat receptor-like kinase HAR1 of *Lotus* and its homologues in *M. truncatula* and *P. sativum* (SUNN and Sym29, respectively) mediate AON and also the nitrate inhibition of nodulation, presumably by recognizing the root-derived signal [107-110] (Figure 3).

It was suggested that NF signaling induces expression or post-translational processing of CLE peptides, which likely function as ascending long-distance signals to the shoot [110]. Thus, NF signaling is related to autoregulation as well, albeit indirectly. It is also worth noting that the NF signaling pathway appears to work in mature nodules, since the aforementioned "early nodulation genes" belonging to the CSP, as well as NF receptor kinase genes, are highly expressed in nodule tissues [76, 111]. Perhaps active NF signaling is needed to prevent the induction of defense-like responses and/or to restrict the release of rhizobia to precise cell layers, thus regulating the formation of the symbiotic interface [112].
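For orientation, the orthologous receptor-kinase genes named in this section can be gathered into a small lookup table. The sketch below records only the correspondences stated in the text; the row labels and the helper function `ortholog` are illustrative names introduced here, not established nomenclature.

```python
# Orthologous receptor-kinase genes as named in the text.
# Row labels are informal descriptions introduced for this sketch.
ORTHOLOGS = {
    "LysM-RLK, NFR1 group": {
        "Lotus japonicus": ["NFR1"],
        "Medicago truncatula": ["LYK3"],
        "Pisum sativum": ["Sym37"],
    },
    "LysM-RLK, NFR5 group": {
        "Lotus japonicus": ["NFR5"],
        "Medicago truncatula": ["NFP"],
        "Pisum sativum": ["Sym10"],
    },
    "LRR receptor kinase of the CSP": {
        "Lotus japonicus": ["SymRK"],
        "Medicago truncatula": ["NORK", "DMI2"],  # two names for one gene
        "Pisum sativum": ["Sym19"],
    },
    "LRR-RLK mediating AON": {
        "Lotus japonicus": ["HAR1"],
        "Medicago truncatula": ["SUNN"],
        "Pisum sativum": ["Sym29"],
    },
}


def ortholog(gene: str, target_species: str) -> list[str]:
    """Return the name(s) used in `target_species` for the ortholog of `gene`."""
    for row in ORTHOLOGS.values():
        if any(gene in names for names in row.values()):
            return row[target_species]
    raise KeyError(f"gene not in table: {gene}")


if __name__ == "__main__":
    print(ortholog("NFR1", "Pisum sativum"))    # ['Sym37']
    print(ortholog("DMI2", "Lotus japonicus"))  # ['SymRK']
```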

#### **3.2. Pea (***Pisum sativum* **L.) as a unique example of increased specificity in plant-microbe interaction**

One of the most ancient crops known to humanity, garden pea (*Pisum sativum* L.) is now widely distributed throughout the world. According to recent data, pea is the third most important legume for the food industry, following beans and soybeans [113]. It is also a popular model for various kinds of genetic and physiological research, including the study of symbiosis with nodule bacteria. Although work with pea is complicated by some unfavorable properties, such as a relatively large genome (about 4000 Mb), low seed productivity, and poor transformation capability, the use of this object in the study of symbiotic relationships continues and brings significant results.

Several pea genes are known to participate in NF reception, the most interesting of them being *Sym2*. This gene was first described in the 1970s as a determinant of "resistance" to nodulation in pea cultivars from Afghanistan and Iran [114, 115]. While unable to form nodules with the majority of natural *Rhizobium leguminosarum* bv. *viciae* (*Rlv*) strains obtained from European soils, these cultivars interact normally with strains from the Middle East, such as strain *Rlv* TOM [115]. This feature is controlled by a specific recessive allele of *Sym2* named the "Afghan allele" (*Sym2A*). The presence of *Sym2A* in the homozygous state leads to a block of infection thread progression in the root hair, similar to the phenotype of *sym37* mutants [55]. Later it was shown that *Rlv* strains able to nodulate "Afghan" cultivars carry a special gene called *nodX*, which is involved in the modification of NF structure [116, 117]. *NodX* encodes an acetyltransferase that provides O-acetylation at the reducing end of the NF sugar backbone. Thus, only *nodX*-modified NFs can be recognized by plants with the *Sym2A* allele, although Ovtsyna et al. (2000) [118] showed that fucosylation at the same position, controlled by the *nodZ* gene, can also induce nodulation of "Afghan" peas.


More than 20 years ago, *Sym2* was localized on the pea genetic map. Using RAPD (Random Amplification of Polymorphic DNA) markers, Kozik and colleagues [119] created a detailed map of the fragment of pea linkage group I that includes *Sym2* and a few other symbiotic genes (such as *Nod3* and *PsENOD7*). Because plants with the *Sym2A* allele show the "Afghan" phenotype when exposed to NF of a specific structure, it was suggested that the Sym2 protein could act as an "entry" receptor during the preinfection stage (similar to NFR1 in *Lotus* or LYK3 in *Medicago*).

When the *Pisum* gene *Sym37* was shown to be orthologous to *NFR1* [55], it was at first proposed as a candidate for *Sym2*. This was strongly supported by the fact that the missense mutation in *Sym37* carried by the *Pisum* mutant line RisNod4 led to a Nod⁻ phenotype (the absence of nodulation), which could be suppressed by *Rlv* strain A1, known to produce a broad spectrum of NFs, including a *nodX*-modified one [55]. However, a paralogue of *Sym37*, the gene *K1*, was discovered shortly after. Its similar structure indicated a possible involvement in NF reception, although the purpose of this additional NF receptor remained unclear.

Comparison of the *Sym37* and *K1* nucleotide sequences obtained from "Afghan" (*Sym2A*) and "European" pea varieties, as well as of the amino acid sequences of the corresponding proteins, shows that neither of these genes possesses any features correlating with the "Afghan" phenotype [55]. Thus, there must be another determinant corresponding to *Sym2*. Recently, a promising candidate was found: the gene named *LykX* by the authors, which is the second paralogue of *Sym37* localized in the same region of the pea genome (Sulima et al., 2015, in preparation). Analysis of LykX protein sequences revealed amino acid substitutions within the first LysM module of the receptor domain that are typical for plants with the "Afghan" phenotype [120]. Simultaneously, Li and colleagues [121] compared the sequence of *Sym37* from a series of pea genotypes that differ in their interaction with rhizobia mutant in the *nodE* gene, which determines the structure of the fatty acid at the nonreducing end of NF. The efficiency of interaction with the mutant strain was shown to correlate strictly with a particular variant of *Sym37*. A similar situation was observed for the interaction between the *nodX* and *Sym2* (*LykX*) genes: "Afghan" pea varieties, which require NF with an additional acetyl group at the reducing end of the molecule, also display characteristic features in the structure of the receptor protein LykX.

Based on the above, we proposed a model according to which the less specific "recognition" receptor (Sym10, perhaps in complex with other proteins) perceives the NF signal *per se* and "anchors" the NF molecule on the membrane, subsequently "presenting" it to other components of the reception complex, with the reducing end being tested by Sym2 (LykX) and the nonreducing end by Sym37 [122]. Only if all participants in the process react positively is the signal considered adequate, and symbiogenesis starts properly (see Figure 4). So, in pea not just one ortholog of *Lotus NFR1* but two closely related paralogs, *Sym37* and *Sym2*, are involved in the genetic control of Nod-factor reception. This is not surprising, given the complexity of the Nod-factor molecule and the importance of its proper recognition for successful development of symbiosis.

**Figure 4.** Hypothetical model for precise recognition of Nod-factor structure by receptor kinases in pea. The model is proposed by Dr. V.A. Zhukov (ARRIAM, St. Petersburg, Russia). In the first step, a less specific receptor (probably Sym10) anchors the NF molecule onto the membrane; it then presents the molecule to Sym37, which tests the structure of the nonreducing end, and to Sym2, which tests the structure of the reducing end. When both Sym37 and Sym2 bind NF, they activate downstream components of the signal transduction pathway.
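The logic of this hypothetical model can be written down as a toy check in which the signal passes only if every receptor accepts its part of the molecule. The sketch below is purely illustrative: the class names, the string labels for NF decorations, and the sets of structures each genotype accepts are assumptions chosen to mirror the text, not measured binding data. In particular, treating the "European" *Sym2* allele as permissive toward all three reducing-end variants is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodFactor:
    reducing_end: str     # "unmodified", "O-acetyl" (nodX) or "fucosyl" (nodZ)
    nonreducing_end: str  # informal label for the fatty-acid variant


@dataclass(frozen=True)
class PeaGenotype:
    sym2_accepts: frozenset   # reducing-end structures accepted by Sym2 (LykX)
    sym37_accepts: frozenset  # nonreducing-end structures accepted by Sym37
    sym10_functional: bool = True


def signal_accepted(nf: NodFactor, plant: PeaGenotype) -> bool:
    """All three checks of the model must pass (a logical AND)."""
    if not plant.sym10_functional:
        return False  # no "recognition" receptor to anchor the NF molecule
    reducing_ok = nf.reducing_end in plant.sym2_accepts
    nonreducing_ok = nf.nonreducing_end in plant.sym37_accepts
    return reducing_ok and nonreducing_ok


# Hypothetical genotypes; the accepted sets are illustrative assumptions.
AFGHAN = PeaGenotype(
    sym2_accepts=frozenset({"O-acetyl", "fucosyl"}),
    sym37_accepts=frozenset({"typical acyl"}),
)
EUROPEAN = PeaGenotype(
    sym2_accepts=frozenset({"unmodified", "O-acetyl", "fucosyl"}),  # assumed
    sym37_accepts=frozenset({"typical acyl"}),
)

if __name__ == "__main__":
    plain = NodFactor("unmodified", "typical acyl")
    acetylated = NodFactor("O-acetyl", "typical acyl")
    print(signal_accepted(plain, AFGHAN))       # False
    print(signal_accepted(acetylated, AFGHAN))  # True
    print(signal_accepted(plain, EUROPEAN))     # True
```

Representing each receptor's preference as a set keeps the AND structure of the model explicit: a change in either Sym2 or Sym37 specificity alters the outcome independently of the other.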
