**2. Mitochondrial genome**

### **2.1. Genomes as markers**

In principle, any sufficiently variable DNA region can be used in genetic studies of populations and in interspecific studies. Because chloroplasts and mitochondria in seed plants are mainly inherited uniparentally, organelle genomes are often used: they can carry more information than nuclear markers, which are inherited biparentally. The main benefit is that there is only one allele per cell and per organism and, consequently, no recombination between two alleles can occur. Because dispersal distances differ, genomes inherited biparentally, maternally and paternally also reveal significant differences in their genetic variability among populations. In particular, maternally inherited markers reveal diversity within a population much more clearly [1].

© 2013 Skuza et al.; licensee InTech. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

In gymnosperms the situation is somewhat different. Here, chloroplasts are inherited mainly paternally and are therefore transmitted through both pollen and seeds, whereas mitochondria are largely inherited maternally and are therefore transmitted only by seeds [2]. Since pollen is dispersed over far greater distances than seeds [3], mitochondrial markers show greater diversity among populations than chloroplast markers and therefore serve as important tools in genetic studies of gymnosperms [4]. Mitochondrial markers are also sometimes used in conjunction with cpDNA markers [5].

Mitochondrial regions used in interspecific studies of plants, mainly gymnosperms, include, for example, introns of the NADH dehydrogenase gene *nad1* [4, 5, 6], *nad7* intron 1 [7], *nad5* intron 4 [3] and an internal transcribed spacer (ITS) of mitochondrial ribosomal DNA [8, 9].

In addition to the aforementioned organelle markers, microsatellite markers [10, 11], also known as simple sequence repeats (SSRs), are often used in population biology, and sometimes also in phylogeographic studies. Microsatellites are much less common in plants than in animals [12]. However, they are present in both the nuclear genome and the organelle genomes. Microsatellites can reveal high variability, which is useful in genetic studies of populations where other sequences or methods such as fingerprinting do not detect enough mutations [9, 10, 13]. Because they are inherited uniparentally, organelle markers have a particular advantage in phylogeographic analyses. Since they are haploid, the effective population size for these markers should be smaller than for nuclear markers [1, 14]. Smaller effective population sizes should bring about faster turnover of newly evolving genotypes, resulting in a clearer picture of past migration history than that obtained using nuclear markers [15-17].
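The reduction in effective population size can be illustrated by simply counting transmitted gene copies. The sketch below is a textbook-style idealization and not a calculation taken from the cited studies: it assumes an idealized population in which every individual carries two copies of a nuclear locus, while only the transmitting parents pass on a single organelle haplotype. The function name and the population numbers are hypothetical.

```python
def copy_number_ratio(n_transmitting: int, n_total: int) -> float:
    """Ratio of transmitted organelle gene copies to nuclear gene copies.

    Assumption (idealized): each of the n_total individuals contributes two
    nuclear copies (diploid, biparental), whereas only the n_transmitting
    individuals contribute one organelle copy each (haploid, uniparental).
    """
    if n_total <= 0 or not 0 <= n_transmitting <= n_total:
        raise ValueError("need 0 <= n_transmitting <= n_total and n_total > 0")
    nuclear_copies = 2 * n_total
    organelle_copies = n_transmitting
    return organelle_copies / nuclear_copies


# Hypothetical population of 1000 individuals.
ratio_separate_sexes = copy_number_ratio(500, 1000)    # only one sex transmits
ratio_hermaphrodite = copy_number_ratio(1000, 1000)    # every plant can transmit
print(ratio_separate_sexes, ratio_hermaphrodite)
```

Under these assumptions the organelle marker is represented by one quarter (separate sexes, equal sex ratio) or one half (hermaphrodites) of the gene copies available to a nuclear locus, which is the sense in which genetic drift acts faster on organelle markers.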

Initially, mitochondrial markers were used mainly in phylogeographic studies of animal species [18]. These studies have provided interesting data on the origins and evolutionary history of human populations [19]. In contrast, the use of mitochondrial markers in studies of plants, especially angiosperms, is limited [20]. At present, cpDNA markers are most commonly used in phylogeographic studies of angiosperms, whereas mitochondrial markers are prevalent in studies of gymnosperms.

### **2.2. Plant mitochondrial DNA**

Mitochondrial genomes of higher plants (208-2000 kbp) are much larger than those of vertebrates (16-17 kbp) or fungi (25-80 kbp) [21, 22]. In addition, there are clear differences in the size and organization of mitochondrial genomes between different plant species. Intramolecular recombination in mitochondria leads to complex reorganizations of genomes and, in consequence, to alternative arrangements of genes, even within individual plants; duplications and deletions are also common [23]. In addition, the nucleotide substitution rate in plant mitochondria is rather low [24], causing only minor differences within certain loci between individuals or even species. The extensively characterized circular mitochondrial genomes of animals are highly conserved within a given species; they do not contain introns and have a very limited number of intergenic sequences [25]. Plant mitochondrial DNA (mtDNA) contains introns in multiple genes and, compared with animal mitochondria, several additional expressed genes, but most of the additional sequences in plants are not expressed and do not appear to be essential [26]. Completely sequenced mitochondrial genomes are available for several plants, including *Arabidopsis thaliana* [27] and *Marchantia polymorpha* [28].

Restriction maps of nearly all plant mitochondrial genomes are consistent with a master circle accompanied by circular subgenomic molecules that arise through recombination between large direct repeats (> 1 kbp) [21, 29-36], which are present in most mitochondrial genomes of higher plants. However, such molecules, whose sizes can be predicted, are very rare or very difficult to observe. This may be explained by plant mitochondrial genomes being circularly permuted, as in phage T4 [37, 38]. Oldenburg and Bendich reported that the mostly linear molecules of *Marchantia* mtDNA are circularly permuted with random ends [39]. This suggests that plant mtDNA replicates by a recombination-dependent mechanism similar to that of phage T4 [38].

Many reports in recent years indicate that the mitochondrial genomes of yeasts and higher plants exist mainly as linear and branched DNA molecules of variable size, much smaller than the predicted size of the genome [39-44]. Pulsed-field gel electrophoresis (PFGE) of in-gel lysed mitochondria from different species revealed that only about 6-12% of the molecules are circular [41, 44]. The observed branched molecules closely resemble the molecules seen in yeast at intermediate stages of mtDNA recombination [45] or during phage T4 DNA replication [37, 38].

In all but one known case (*Brassica hirta*) [46], plant mitochondrial genomes contain recombination repeats. These sequences, ranging in length from several hundred to several thousand nucleotides (nt), exist at two different loci in the master circle, yet in four mtDNA sequence configurations [47]. These four configurations correspond to the reciprocal exchange of the sequences flanking the repeat on its 5' and 3' sides in the master circle, which suggests that the repeat mediates homologous recombination. Depending on the number and orientation of repeats, the master circle can be resolved into a more or less complex set of subgenomic molecules [48].
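The way two copies of a repeat yield four sequence configurations can be sketched as follows. This is a minimal illustration only: the flank labels `A` to `D` and the repeat label `R` are hypothetical placeholders, and the model ignores repeat orientation and the resulting topology of the molecules.

```python
def recombine_across_repeat(locus_1, locus_2):
    """Reciprocal exchange between two copies of the same repeat.

    Each locus is a tuple (5' flank, repeat, 3' flank). Crossing over inside
    the shared repeat swaps the 3' flanks between the two loci.
    """
    (flank5_1, repeat_1, flank3_1) = locus_1
    (flank5_2, repeat_2, flank3_2) = locus_2
    if repeat_1 != repeat_2:
        raise ValueError("homologous recombination needs the same repeat at both loci")
    return (flank5_1, repeat_1, flank3_2), (flank5_2, repeat_2, flank3_1)


def repeat_configurations(locus_1, locus_2):
    """Return the two parental and the two recombinant configurations."""
    return [locus_1, locus_2, *recombine_across_repeat(locus_1, locus_2)]


# Hypothetical master-circle loci: the repeat R sits between flanks A/B and C/D.
configs = repeat_configurations(("A", "R", "B"), ("C", "R", "D"))
for flank5, repeat, flank3 in configs:
    print(f"{flank5}-{repeat}-{flank3}")
```

In this toy model the two parental environments (A-R-B, C-R-D) and the two recombinant ones (A-R-D, C-R-B) together give the four configurations in which a recombinationally active repeat is detected.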

Maternally inherited mutations associated with mitochondria in higher plants most often arise through intra- and intergenic recombination. This is the case in most instances of cytoplasmic male sterility (cms) [41, 49-51], in the *chm*-induced mutation in *Arabidopsis* [52] and in non-chromosomal stripe mutations in maize [53]. It is therefore assumed that recombination activity explains the complexity of the variation detected in the mitochondrial genomes of higher plants.

### **2.3. Mitochondrial genome of soybean**


A Comprehensive Survey of International Soybean Research - Genetics, Physiology, Agronomy and Nitrogen Relationships

The size of soybean mtDNA has been estimated at approximately 400 kb [54-56]. Circular molecules have also been observed by electron microscopy [55, 57].

Repeated sequences of 9, 23 and 299 bp have been characterized in soybean mitochondria [58, 59]. Numerous rearrangements of genome sequences have also been characterized among different soybean cultivars. They have been shown to arise through homologous recombination mediated by these repeat sequences [58, 60, 61], or through short elements that are part of a 4.9-kb *Pst*I fragment of soybean mtDNA [62]. The 299-bp repeat sequence has been found in several copies in soybean mtDNA and in several other higher plants, suggesting that this repeat may represent a hot spot for mtDNA recombination in many plant species [59, 62]. Earlier results suggested that active homologous recombination of mtDNA occurs in at least some plant species. In 2007, a mitochondrial-targeted homolog of the *Escherichia coli* *recA* gene was identified in *A. thaliana* [63]. However, direct data on recombination activity in plant mitochondria are still limited. The first data on such activity in soybean were obtained in 2006 [64]. This finding is supported by analyses of soybean mtDNA using electron microscopy and 2D electrophoresis. The results suggest that only a small portion of mtDNA molecules undergoes recombination at any given time. The question therefore remains whether this recombination is essential to the functioning of mitochondria and to plant growth.

The repeated sequences of the *atp6*, *atp9* and *coxII* genes have also been characterized, but their recombination activity has not been analysed [65].

The first data for the restriction map of soybean mtDNA were obtained from the analysis of loci of the *atp4* gene [48]. In the vicinity of this gene, two repeated sequences that show characteristics of recombination repeats have been found [47, 48]. Active recombination repeats were also identified in circular molecules smaller than 400 kb [55, 66]. These observations suggest that soybean mtDNA has a multipartite structure similar to that of other plant mitochondrial genomes containing recombination repeats.

In the mitochondrial genome of the cultivar Williams 82, recombinationally active repeats of 1 kb and 2 kb have been described [48]. Within a separate 10-kb repeat that surrounds both the 1-kb and 2-kb repeats, two breakpoints have been identified. Recombination among these smaller and larger repeats probably leads to the complex structure of the genome.

The analysis of restriction fragment length polymorphism (RFLP) of mtDNA seems to be a useful method in studying phylogenetic relationships within species.
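The principle behind RFLP analysis can be sketched as an in silico digest. The example below is a simplified illustration, not the protocol of the cited studies: the two sequences are invented, the molecule is treated as linear, and the hybridization step with a labeled probe (which in a real gel blot selects only some fragments) is not modeled. The recognition sites used are the standard ones for *Hind*III, *Bam*HI and *Eco*RI.

```python
import re

# Recognition sequence and cut offset within the site (top strand).
SITES = {
    "HindIII": ("AAGCTT", 1),  # A^AGCTT
    "BamHI": ("GGATCC", 1),    # G^GATCC
    "EcoRI": ("GAATTC", 1),    # G^AATTC
}


def digest(sequence: str, enzyme: str) -> list[int]:
    """Return the sorted fragment lengths of a linear sequence cut by one enzyme."""
    site, offset = SITES[enzyme]
    # The lookahead finds every occurrence of the site, including overlapping ones.
    cuts = [m.start() + offset for m in re.finditer(f"(?={site})", sequence)]
    bounds = [0, *cuts, len(sequence)]
    return sorted(end - start for start, end in zip(bounds, bounds[1:]))


# Two invented haplotypes that differ by one base in the second HindIII site.
haplotype_a = "T" * 10 + "AAGCTT" + "C" * 20 + "AAGCTT" + "G" * 5
haplotype_b = "T" * 10 + "AAGCTT" + "C" * 20 + "AAGCTA" + "G" * 5

print(digest(haplotype_a, "HindIII"))  # [10, 11, 26]
print(digest(haplotype_b, "HindIII"))  # [11, 36]
```

A single substitution that destroys a restriction site changes the fragment pattern, and larger rearrangements near a probed region do the same, which is why fragment-length patterns can be used to group cytoplasms.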

Grabau et al. (1992) analyzed the genomes of 138 soybean cultivars [60]. A 2.3-kb *Hind*III mtDNA probe from the soybean cultivar Williams 82 revealed restriction fragment length polymorphisms (RFLPs) that allowed many soybean cultivars to be divided into four cytoplasmic groups: Bedford, Arksoy, Lincoln and soja-forage.

Subsequent analyses showed variation within, and adjacent to, a 4.8-kb repeat. Bedford cytoplasm turned out to be the only one that contains copies of the repeat in four different genomic environments, which indicates that the repeat is recombinationally active [61]. Lincoln and Arksoy cytoplasms contain two copies of the repeat and a unique fragment that appears to result from rare recombination events outside, but near, the repeat. In contrast, soja-forage cytoplasm contains no complete repeat, but it does contain a unique truncated version of the repeat [61]. Sequence analysis revealed that the truncation was caused by recombination with a 9-bp repeat, CCCCTCCCC. The structural reorganization that occurred in the region around the 4.8-kb repeat may provide a way to analyze the relationships between species and evolution within the soybean subgenus.

To determine the sources of cytoplasmic variability, Hanlon and Grabau (1995) studied old soybean cultivars with the same 2.3-kb *Hind*III fragment and with an mtDNA fragment containing the *atp6* gene [62]. They showed that mtDNA RFLP analysis with these probes is useful for classifying the mitochondrial genomes of soybean. Grabau and Davies (1992) made a general classification of wild soybean using the 2.3-kb *Hind*III fragment as a probe [68].





| Mt type | *coxI* *Hind*III | *coxI* *Bam*HI | *coxI* *Eco*RI | *coxII* *Hind*III | *coxII* *Bam*HI | *coxII* *Eco*RI | *atp6* *Bam*HI | *atp6* *Eco*RI | Reference |
|---|---|---|---|---|---|---|---|---|---|
| mt-f | | | | | | | 2.4; 3.5; 5.0 | | [87] |
| mt-g | | | | | | | 1.0; 2.6 | | [87] |
| mt-h | | | | | | | 2.6; 2.9 | | [87] |
| mt-m | | | | | | | 2.9 | | [87] |
| mt-n | | | | | | | 12.0 | | [87] |
| Ic | 5.6 | 0.8; 2.5; 5.0 | 10.5 | 1.6 | 5.8 | 1.9 | 5.0 | 8.2; 12.0 | [58] |
| Id | 5.6 | 0.8; 2.5; 5.0 | 10.5 | 1.6 | 5.8 | 1.9 | 5.0; 6.0; 12.0 | 2.8; 6.0; 12.0 | [58] |
| Ie | 5.6 | 0.8; 2.5; 5.0 | 10.5 | 1.6 | 5.8 | 1.9 | 5.0; 12.0 | 2.8; 6.0; 12.0 | [58] |
| Ik | 5.6 | 0.8; 2.5; 5.0 | 10.5 | 1.6 | 5.8 | 1.9 | 5.0; 5.4; 5.8 | 2.8; 6.0; 12.0 | [58] |
| IIIb | 5.6 | 0.8; 2.5; 5.0 | 10.5 | 1.2 | 8.5 | 6.2; 6.5 | 2.9; 5.0 | 6.0; 8.2; 12.0 | [58] |
| IIId | 5.6 | 0.8; 2.5; 5.0 | 10.5 | 1.2 | 8.5 | 6.2; 6.5 | 5.0; 6.0; 12.0 | 3.2; 6.2; 12.0 | [58] |
| IVa | 5.6 | 0.8; 2.5; 5.0 | 10.5 | 3.5 | 8.1 | 5.0 | 2.4; 5.0 | 3.0; 6.0; 12.0 | [58] |
| IVb | 5.6 | 0.8; 2.5; 5.0 | 10.5 | 3.5 | 5.8 | 5.0 | 2.9; 5.0 | 6.0; 8.2; 12.0 | [58] |
| IVf | 5.6 | 0.8; 2.5; 5.0 | 10.5 | 3.5 | 5.8 | 5.0 | 2.4; 3.5; 5.0 | 3.2; 6.2; 12.0 | [58] |
| IVh | 5.6 | 0.8; 2.5; 5.0 | 10.5 | 3.5 | 5.8 | 5.0 | 2.6; 2.9 | 3.2; 6.2; 12.0 | [58] |
| IVi | 5.6 | 0.8; 2.5; 5.0 | 10.5 | 3.5 | 5.8 | 5.0 | 5.2; 12.0 | 3.2; 6.2; 12.0 | [58] |
| Va | 5.6 | 0.8; 2.5; 5.0 | 10.5 | 5.8 | 5.8 | 12.0 | 2.4; 5.0 | 3.0; 6.0; 12.0 | [58] |
| Vb | 5.6 | 0.8; 2.5; 5.0 | 10.5 | 5.8 | 5.8 | 12.0 | 2.9; 5.0 | 6.0; 8.2; 12.0 | [58] |
| V'j | 5.6 | 0.8; 2.5; 5.0 | 10.5 | 5.8 | 15.0 | 1.6 | 5.0; 6.0 | 2.8; 6.0; 12.0 | [58] |
| Vc | 5.6 | 0.8; 2.5; 5.0 | 10.5 | 5.8 | 5.8 | 12.0 | 5.0 | 8.2; 12.0 | [58] |
| IVc | 5.6 | 0.8; 2.5; 5.0 | 10.5 | 3.5 | 5.8 | 5.0 | 5.0 | 8.2; 12.0 | [58] |
| IIg | 8.5 | 0.8; 2.5; 5.0 | 9.0 | 1.3 | 7.0 | 4.8 | 1.0; 2.6 | 2.8; 3.0; 9.5 | [58] |

**Table 1.** Classification of mitochondrial genome types based on RFLPs using *coxI*, *coxII* and *atp6* as probes. Sizes of hybridization signals (kb) are shown.
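How a combined type name such as IVa could be looked up from hybridization patterns is sketched below. This is an interpretation of Table 1, not a procedure stated in the cited studies: it assumes that the Roman numeral follows the *coxII*/*Hind*III signal and the letter follows the *atp6*/*Bam*HI signals, it covers only a subset of the table, and it ignores type V'j, which shares its *coxII*/*Hind*III signal with type V. The size tolerance and all function names are hypothetical.

```python
# Signal sizes in kb, copied from Table 1 (subset only).
COXII_HINDIII_KB = {"I": (1.6,), "II": (1.3,), "III": (1.2,), "IV": (3.5,), "V": (5.8,)}
ATP6_BAMHI_KB = {
    "a": (2.4, 5.0),
    "b": (2.9, 5.0),
    "c": (5.0,),
    "f": (2.4, 3.5, 5.0),
    "g": (1.0, 2.6),
    "h": (2.6, 2.9),
}


def match_pattern(observed_kb, reference_patterns, tolerance_kb=0.05):
    """Return the name of the reference pattern matching the observed sizes, if any."""
    observed = sorted(observed_kb)
    for name, reference in reference_patterns.items():
        same_count = len(reference) == len(observed)
        if same_count and all(abs(o - r) <= tolerance_kb for o, r in zip(observed, reference)):
            return name
    return None


def classify_mt_type(coxii_hindiii_kb, atp6_bamhi_kb, tolerance_kb=0.05):
    """Combine the coxII-based numeral and the atp6-based letter, or return None."""
    numeral = match_pattern(coxii_hindiii_kb, COXII_HINDIII_KB, tolerance_kb)
    letter = match_pattern(atp6_bamhi_kb, ATP6_BAMHI_KB, tolerance_kb)
    if numeral is None or letter is None:
        return None
    return numeral + letter


print(classify_mt_type([3.5], [5.0, 2.4]))  # IVa
print(classify_mt_type([5.8], [2.9, 5.0]))  # Vb
```

A lookup of this kind returns `None` for any pattern that is not in the reference tables, which corresponds to a new or unclassified mtDNA type.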

Tozuka et al. (1998) used two mtDNA fragments as probes: the 0.7-kb *Hind*III-*Nco*I fragment containing *coxII* (the gene encoding mitochondrial cytochrome oxidase subunit II) of wild soybean and the 0.66-kb *Sty*I fragment containing *atp6* (the gene encoding mitochondrial ATPase subunit 6) from *Oenothera* [69, 70] (Table 1).

Based on the RFLPs detected in gel-blot analysis with the *coxII* and *atp6* probes, the collected plants were divided into 18 groups. Five mtDNA types accounted for 94% of the surveyed plants. The geographical distribution of mtDNA types revealed that in many regions soybean growing wild in Japan consisted of a mixture of plants with different types of mtDNA, sometimes even within a single location. Some of these mtDNA types showed marked geographic clines among the regions. In addition, some wild soybeans had mtDNA types identical to those described in cultivated soybeans. These results suggest that mtDNA analysis could resolve maternal origins within the genus *Glycine* subgenus *Soja* [69].

Kanazawa et al. (1998) collected 1097 *G. soja* plants from all over Japan and analyzed RFLPs of their mitochondrial DNA (mtDNA) using five probes (*coxI*, *coxII*, *atp6*, *atp9* and *atp1*, also known as *atpA*) [58] (Table 1). Twenty different types of mitochondrial genome, labeled as combinations of types I to VII and types a to k, were identified and characterized in this study. Nearly all the mtDNA types described for soybean cultivars also occurred in wild soybean.

The mitochondrial *atpA* gene was also analysed [48]. In soybean this gene is 90-97% identical in sequence to the mitochondrial *atpA* genes of other plants [71-81]. Sequence similarity is limited to the *atpA* coding region. An intriguing feature of the soybean *atpA* open reading frame is that its putative translation termination site overlaps the start of an unidentified 642-nt open reading frame, *orf214*. The *atpA* open reading frame ends with four tandem UGA codons that overlap four tandem AUG codons initiating *orf214*. The *atpA-orf214* region was found in soybean mtDNA in multiple sequence contexts. This can be attributed to the presence of two recombination repeats.
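The overlap between tandem stop and start codons can be made concrete with a toy sequence. The junction below is not the published soybean sequence; it is a hypothetical RNA string constructed only to satisfy the description of four tandem UGA codons overlapping four tandem AUG codons. The arithmetic at the end checks that 642 nt corresponds to 214 codons, in line with the name *orf214*.

```python
def codon_positions(rna: str, codon: str) -> list[int]:
    """Return every 0-based position at which the given triplet starts."""
    return [i for i in range(len(rna) - 2) if rna[i:i + 3] == codon]


# Hypothetical junction: AUG triplets in one frame, UGA triplets shifted by one base.
junction = "AUGAUGAUGAUGA"

start_codons = codon_positions(junction, "AUG")
stop_codons = codon_positions(junction, "UGA")
print(start_codons)  # [0, 3, 6, 9]
print(stop_codons)   # [1, 4, 7, 10]

# The two sets of codons lie in different reading frames.
start_frame = {p % 3 for p in start_codons}
stop_frame = {p % 3 for p in stop_codons}
print(start_frame, stop_frame)

# 642 nucleotides read as triplets give 214 codons.
print(642 // 3, 642 % 3)
```

Because the stop codons of one frame and the start codons of the other share nucleotides, terminating translation of the upstream frame and initiating the downstream frame occur at the same place in the transcript.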

*orf214* shares 79% nucleotide identity with *orf209* of common bean, which occupies the same position relative to *atpA* [82]. This organization resembles the overlapping *atpB* and *atpE* reading frames found in several chloroplast genomes [83, 84], so *orf214* might encode another ATPase subunit; this possibility cannot be evaluated, however, because small ATPase subunits are poorly conserved [85].

So far, a total of 26 mtDNA haplotypes of wild soybean have been identified based on RFLPs with probes from two mitochondrial genes, *coxII* and *atp6* [69, 86] (Table 1). The three most common haplotypes (Id, IVa and Va) are present in 43 populations. The distribution of mtDNA haplotypes varies among populations [87]. Shimamoto (2001) analyzed the genetic polymorphisms of mitochondrial genes in subgenus *Soja* originating from China and Japan [88] (Table 1). As a result of these studies, six types of mitochondrial genome were distinguished.
