**1. Introduction**

Soil is a dynamic environment due to fluctuations in climatic conditions that affect pH, temperature, water and nutrient availability. These factors, along with agricultural management practices, affect the health of the soil microflora and the capacity for effective plant-microbe interactions. Despite these constant changes, soil constitutes one of the most productive of earth's ecospheres and is a hub for evolutionary and other adaptive activities.

### **1.1. Biological nitrogen fixation**

Biological nitrogen fixation (BNF) is one of the most important phenomena occurring in nature, exceeded only by photosynthesis [1,2]. One of the most common limiting factors in plant growth is the availability of nitrogen [3]. Although nitrogen makes up about four-fifths of earth's atmosphere, the ability to utilize atmospheric nitrogen is restricted to a few groups of prokaryotes that are able to convert atmospheric nitrogen to ammonia and, in the case of the legume symbiosis, make some of this available to plants. Predominantly, members of the plant family Leguminosae have evolved with nitrogen fixing bacteria from the family Rhizobiaceae. In summary, the plants excrete specific chemical signals to attract the nitrogen fixing bacteria towards their roots. They also give the bacteria access to their roots, allowing them to colonize and reside in the root nodules, where the modified bacteria (bacteroids) can perform nitrogen fixation [1,4,5]. This process is of great interest to scientists in general, and to agriculture specifically, for two reasons: the highly complex recognition and elicitation is co-ordinated through gene expression and cellular differentiation, followed by plant growth and development, and it has the potential to minimize the use of artificial nitrogen fertilizers and pesticides in crop management. Biological nitrogen fixation is complex, and has been examined in most detail in the context of soybean-*Bradyrhizobium* plant-microbe interactions.

© 2013 Subramanian and Smith; licensee InTech. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

### **1.2. Soybean – The plant**

Soybean (*Glycine max* (L.) Merrill) is a globally important commercial crop, grown mainly for its protein, oil and nutraceutical contents. The seeds of this legume are 40% protein and 20% oil. Each year soybean provides more protein and vegetable oil than any other cultivated crop in the world.

Soybean originated in China, where it has been under cultivation for more than 5000 years [6]. The annual wild soybean (*G. soja*) and the current cultivated soybean (*G. max*) can be found growing in China, Japan, Korea and the far east of Russia, with the richest diversity and broadest distribution in China, where extensive germplasm collections are available. The National Gene Bank at the Institute of Crop Germplasm Resources, part of the Chinese Academy of Agricultural Sciences (ICGR-CAAS), Beijing, contains close to 24,000 soybean accessions, including wild soybean types. Soybean was introduced into North America during the 18th century, but intense cultivation started in the 1940s to 1950s, and North America is now the world's largest producer of soybean [7,8]. Although soybean is grown worldwide for its protein and oil, high value added products such as plant functional nutraceuticals, including phospholipids, saponins, isoflavones, oligosaccharides and edible fibre, have gained importance in the last decade. Interestingly, while genistein and daidzein are signal molecules involved in the root nodulation process, the same compounds can attenuate osteoporosis in post-menopausal women. The other isoflavones have anti-cancer and anti-oxidant effects, as well as positive cardiovascular and cerebrovascular effects [9]. More recently, soybean oil has also been used as an oil source for biodiesel [10-14].

Table 1 provides the latest statistics on soybean cultivation and production as available at FAOSTAT [15].

| | World | Africa | Americas | Asia | Europe | Oceania | Canada |
|---|---|---|---|---|---|---|---|
| **Area harvested (Ha)** | 102,386,923 | 1,090,708 | 78,811,779 | 19,713,738 | 2,739,398 | 31,300 | 1,476,800 |
| **Yield (Hg/Ha)** | 25,548 | 13,309 | 28,864 | 14,100 | 17,491 | 19,042 | 29,424 |
| **Production (Tonnes)** | 261,578,498 | 1,451,646 | 227,480,272 | 27,795,578 | 4,791,402 | 59,600 | 4,345,300 |
| **Seeds (Tonnes)** | 6,983,352 | 43,283 | 4,838,633 | 1,906,313 | 193,870 | 1,252 | 154,300 |
| **Soybean oil (Tonnes)** | 39,761,852 | 390,660 | 24,028,558 | 12,442,496 | 2,890,760 | 9,377 | 241,300 |

**Table 1.** Soybean production statistics (FAOSTAT 2010)

Soybean is a well-known nitrogen fixer and has been a model plant for the study of BNF. Its importance in BNF led to the genome sequencing of soybean; details of the soybean genome are available at soybase.org (*G. max* and *G. soja* sequences are available at NCBI as well). Although considerable work has been conducted on other legumes with respect to biological nitrogen fixation, we focus only on soybean for this review.

The efficiency of BNF depends on climatic factors such as temperature and photoperiod [16]. The effectiveness of a given soybean cultivar in fixing atmospheric nitrogen depends on the interaction between the cultivar's genome and conditions such as soil moisture and soil nutrient availability [17,18]. It also depends on the competitiveness of the applied bacterial strains relative to indigenous and less effective strains, on the amount and type of inoculant applied, and on interactions with other, possibly antagonistic, agrochemicals used in crop protection [19]. The most important criterion, however, is the selection of an appropriate strain of *B. japonicum*, since strains can be highly specific to soybean cultivar and are subject to influence by specific edaphic factors [20,21,22]. Under most conditions, soybean meets 50-60% of its nitrogen demand through BNF, but it can obtain 100% from this source [23].

### **1.3.** *Bradyrhizobium japonicum*

*B. japonicum* is a gram-negative, rod-shaped, nitrogen fixing member of the rhizobia and is an N2-fixing symbiont of soybean. *B. japonicum* strain USDA110 was originally isolated from soybean nodules in Florida, USA, in 1957 and has been widely used in molecular genetics, physiology, and ecology, owing to its superior symbiotic nitrogen fixation activity with soybean relative to other evaluated strains. The genome sequence of this strain has been determined; the bacterial genome is circular, 9.11 million bp long and contains approximately 8373 predicted genes, with an average GC content of 64.1% [24,25].
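The genome figures above can be related to each other with a little arithmetic. The following is a minimal sketch, not the method used in the cited genome papers: `gc_content` and `genome_summary` are illustrative helper names, the short test sequence is hypothetical, and the genome length is taken as the rounded 9.11 million bp quoted in the text.

```python
def gc_content(seq: str) -> float:
    """Return the fraction of G and C among the A/C/G/T bases of a DNA sequence."""
    called = [base for base in seq.upper() if base in "ACGT"]  # skip N and other codes
    if not called:
        raise ValueError("sequence contains no A, C, G or T bases")
    gc = sum(1 for base in called if base in "GC")
    return gc / len(called)


def genome_summary(genome_bp: int, genes: int, gc_fraction: float) -> dict:
    """Derive simple summary figures from genome length, gene count and GC fraction."""
    return {
        "gc_bases": round(genome_bp * gc_fraction),  # approximate number of G+C bases
        "bp_per_gene": genome_bp / genes,            # average genome span per predicted gene
    }


# Figures quoted in the text for strain USDA110 (genome length is rounded).
USDA110_BP = 9_110_000
USDA110_GENES = 8373
USDA110_GC = 0.641

if __name__ == "__main__":
    # Hypothetical six-base fragment, used only to exercise gc_content.
    print(f"GC fraction of ATGCGC: {gc_content('ATGCGC'):.3f}")
    summary = genome_summary(USDA110_BP, USDA110_GENES, USDA110_GC)
    print(f"Approximate G+C bases: {summary['gc_bases']:,}")
    print(f"Average bp per predicted gene: {summary['bp_per_gene']:.0f}")
```

With the quoted values, the genome averages roughly one predicted gene per 1.1 kb, which is consistent with the compact gene spacing typical of bacterial genomes.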

Rhizobia initially attach to the root-hair tips of soybean plants, then colonize the roots and are eventually localized within symbiosomes, surrounded by plant membrane. This symbiotic relationship provides a safe niche and a constant carbon source for the bacteria, while the plant derives the benefits of bacterial nitrogen fixation, which supplies readily available nitrogen for plant growth. Inoculation of soybean with *B. japonicum* often increases seed yield [e.g., 26].

*B. japonicum* synthesizes a wide array of carbohydrates, such as lipopolysaccharides, capsular polysaccharides, exopolysaccharides (EPS), nodule polysaccharides, lipo-chitin oligosaccharides, and cyclic glucans, all of which play a role in the BNF symbiosis. The bacteria produce polysaccharide degrading enzymes, such as polygalacturonase and carboxymethylcellulase, which cleave glycosidic bonds of the host cell wall at areas where bacteria are concentrated, creating erosion pits in the epidermal layer of the roots and allowing the bacteria to gain entry to the roots [27]. The energy source for *B. japonicum* is the sugar trehalose, which is taken up readily and converted to CO2 [28,29,30,31]. On the other hand, UDP-glucose is taken up in large quantities but metabolized slowly, like sucrose and glucose. Promotion of plant growth causes more O2 to be released and more CO2 to be taken up [24,27].

### **1.4. Lipo-chitooligosaccharide (LCO) from** *Bradyrhizobium japonicum*

As mentioned earlier in this review, the process of nodulation in legumes begins with a complex signal exchange between host plants and rhizobia. The first step in rhizobial establishment in plant roots is production of isoflavonoids as plant-to-bacterial signals, the most common in the soybean-*B. japonicum* symbiosis being genistein and daidzein [32]. These trigger the *nod* genes in the bacteria, which in turn produce LCOs, or Nod factors, that act as return signals to the plants and start the process of root hair curling, leading to nodule formation. Some recent literature has shown that jasmonates can also cause *nod* gene activation in *B. japonicum*, although the strain specificities are very different from those of isoflavonoids such as genistein [33-36]. LCOs are oligosaccharides of β-1,4-linked N-acetyl-D-glucosamine coded for by a series of *nod* genes and are rhizobia specific [37,38]. The *nodDABCIJ* genes, conserved in all nodulating rhizobia [37,39,40], are organized as a transcriptional unit and regulated by plant-to-rhizobia signals such as isoflavonoids [41-43].

Nodulation and subsequent nitrogen fixation are affected by environmental factors. It has been observed that, under sub-optimal root zone temperatures (for soybean 15-17 °C), pH stress and in the presence of nitrogen, isoflavonoid signal levels are reduced, while high temperature (39 °C) increases non-specific isoflavonoid production and reduces *nod* gene activation, thereby affecting nodulation [44]. Our laboratory has isolated and identified the major LCO molecule produced by *B. japonicum* 532C as Nod Bj V (C18:1;MeFuc) [45]. This Nod factor contains a methyl-fucose group at the reducing end that is encoded by the host-specific *nodZ* gene [46], which is an essential component for soybean-rhizobia interactions.

LCOs also positively and directly affect plant growth and development in legumes and non-legumes. The potential role of LCOs in plant growth regulation was first reported by Denarie and Cullimore [47]. Nod genes A and B from *R. meliloti*, when introduced into tobacco, altered the phenotype by producing bifurcated leaves and stems, suggesting a role for *nod* genes in plant morphogenesis [48]. The development of somatic embryos of Norway spruce is enhanced by treatment with purified Nod factor from *Rhizobium* sp. NGR234. It has been suggested that these Nod factors can substitute for auxin and cytokinin like activities in promoting embryo development, and that the chitin core of the Nod factor is an essential component for regulation of plant development [49,50]. Some of the LCO-induced enod genes in non-legumes seem to encode defence related responses, such as chitinase and PR proteins [42,43], peroxidase [51] and enzymes of the phenylpropanoid pathway, such as L-phenylalanine ammonia-lyase (PAL) [52]. Seed germination and seedling establishment are enhanced in soybean, common bean, maize, rice, canola, apple and grapes, accompanied by increased photosynthetic rates [53]. Hydroponically grown maize showed an increase in root growth when LCO was applied to the hydroponic solution [54,55], and foliar application to greenhouse grown maize resulted in increases in photosynthetic rate, leaf area and dry matter [56]. Foliar application to tomato, during early and late flowering stages, increased flowering and fruiting and also fruit yield [57]. An increase in mycorrhizal colonization (*Gigaspora margarita*) was observed in *Pinus abies* treated with LCO [50,58]. Recent research in our laboratory, on soybean leaves treated with LCOs under sub-optimal growth conditions, revealed the up-regulation of over 600 genes, many of which are defense and stress response related, or transcription factors; microarray results show that the transcriptome of the leaves is highly responsive to LCO treatment at 48 h post treatment [59]. These results suggest the need to investigate more carefully the mechanisms by which microbe-to-plant signals help plants accommodate abiotic and biotic stress conditions.

Since the protein quality of soybean plays an important role in overall agricultural and nutraceutical production, it is imperative that we study the proteomics of soybean and its symbiont *B. japonicum*, not only for better understanding of the crop, but also for the betterment of agricultural practices and the production of better high value added food products for human consumption.

### **1.5. Proteomics as a part of integrative systems biology**

The "omics" approach to knowledge gain in biology has advanced considerably in recent years. The triangulation approach of integrating transcriptomics, proteomics and metabolomics is currently being used to study the interconnectivity of molecular level responses of crop plants to various stress conditions, and their tolerance and adaptation to them, thus improving systems level understanding of plant biology [60, 61].

While transcriptomics is an important tool for studying gene expression, proteomics actually portrays the functionality of the genes expressed. Several techniques are available for studying differential expression of protein profiles, and they can be broadly classified as gel-based and MS (mass spectrometry)-based quantification methods. The gel-based approach uses conventional two-dimensional (2-D) gel electrophoresis and 2-D fluorescence difference gel electrophoresis (2D-DIGE), both based on separation of proteins according to isoelectric point, followed by separation by molecular mass. The separated protein spots are then isolated and subjected to MS analysis for identification. Major drawbacks of these techniques are laborious sample preparation and the inability to identify low abundance, hydrophobic and basic proteins.

The MS-based approach can use label-based quantitation, where the plants or cells are grown in media containing a 15N-labelled metabolite or with 15N as the nitrogen source. Label-free quantitation, however, is easier and allows analysis of multiple samples without a fixed limit on their number. This technique, also referred to as MudPIT (multidimensional protein identification technology), is a method used to study proteins from whole-cell lysate and/or a purified complex of proteins [62,63]. The total set of proteins, or proteins from designated target sites, is isolated and subjected to standard protease digestion (e.g., tryptic digestion). In brief, flash-frozen leaf samples are ground in liquid nitrogen, and polyphenols, tannins and other interfering substances such as chlorophyll are removed. The processed tissue is resuspended in a chaotropic reagent to extract proteins into the upper phase, and the plant debris is discarded [64-70]. The total protein in the resulting solution is then quantified using the Lowry method [71]. The protein samples (2 µg of total protein each), once digested with trypsin, can then be loaded onto a microcapillary column packed with reverse phase and strong cation exchange resins. The peptides are separated in the column on the basis of their charge and hydrophobicity. The columns are connected to a quaternary high-performance liquid chromatography pump and coupled with an ion trap mass spectrometer, so that the samples are ionized within the column and sprayed directly into a tandem mass spectrometer. This allows for a very effective and high level of peptide separation within the mixture, and the eluting peptides are detected to produce a mass spectrum. The detected peptide ions, at measured mass-to-charge (m/z) ratios with sufficient intensity, are selected for collision-induced dissociation (CID). This procedure fragments the peptides to produce a product ion spectrum, the MS/MS spectrum. The fragmentation occurs preferentially at the amide bonds, generating N-terminal fragments (b ions) and C-terminal fragments (y ions) at specific m/z ratios, which provide structural information about the amino acid sequence and sites of modification. The b ion and y ion patterns are matched to a peptide sequence in a translated genomic database to help identify the proteins present in the sample [72-75]. A variety of database searching and compiling algorithms are used to interpret the data obtained for structure and function of the identified proteins.
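To make the digestion and fragmentation steps concrete, the sketch below cuts a protein sequence in silico with a simplified trypsin rule and then computes the theoretical m/z values of the singly charged b and y ions for each peptide. It is an illustration under stated assumptions rather than the database search software referred to above: the function names `tryptic_digest` and `fragment_ions` and the example sequence are hypothetical, the trypsin rule is reduced to "cleave after K or R unless the next residue is P", standard monoisotopic residue masses are rounded to five decimals, and no modifications, missed cleavages or higher charge states are considered.

```python
# Monoisotopic residue masses in daltons (standard values, rounded to 5 decimals).
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276, "V": 99.06841,
    "T": 101.04768, "C": 103.00919, "L": 113.08406, "I": 113.08406, "N": 114.04293,
    "D": 115.02694, "Q": 128.05858, "K": 128.09496, "E": 129.04259, "M": 131.04049,
    "H": 137.05891, "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056   # mass of H2O added to the C-terminal fragment
PROTON = 1.00728   # mass of the charge-carrying proton


def tryptic_digest(protein: str) -> list[str]:
    """Split a protein after each K or R, except when the next residue is P."""
    peptides, start = [], 0
    for i, residue in enumerate(protein):
        next_residue = protein[i + 1] if i + 1 < len(protein) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        peptides.append(protein[start:])  # C-terminal peptide without K/R at its end
    return peptides


def fragment_ions(peptide: str) -> dict[str, list[float]]:
    """Return m/z of singly charged b ions (N-terminal) and y ions (C-terminal)."""
    masses = [MONO[residue] for residue in peptide]
    n = len(masses)
    # b_i holds the first i residues; y_i holds the last i residues plus water.
    b_ions = [sum(masses[:i]) + PROTON for i in range(1, n)]
    y_ions = [sum(masses[n - i:]) + WATER + PROTON for i in range(1, n)]
    return {"b": b_ions, "y": y_ions}


if __name__ == "__main__":
    # Hypothetical sequence chosen only to show a K-P/R-P style exception.
    for peptide in tryptic_digest("MKWVTFRPLLK"):
        ions = fragment_ions(peptide)
        print(peptide)
        print("  b:", [round(mz, 4) for mz in ions["b"]])
        print("  y:", [round(mz, 4) for mz in ions["y"]])
```

A search engine compares lists like these against the peaks observed in each MS/MS spectrum. Because each b ion and its complementary y ion together account for the whole peptide, their m/z values sum to the neutral peptide mass plus two protons, which is a useful sanity check on any fragment calculation.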
