153, 158, 160, 161]. *LEC1* and *L1L* can activate the promoter of *CRUCIFERIN C* (*CRC*), and *LEC1* can also regulate *CRC* and other SSP genes together with *FUS3* and *ABI3* [161]. In addition to RY motifs, the presence of G-Box elements is also required for proper activation of the target promoters of *LEC1*, *LEC2*, *ABI3* and *FUS3* [162]. Some studies showed that *LEC2*, *ABI3* and *FUS3* collaborate with *bZIP* TFs that interact with these G-Box elements to activate SSP genes [163, 164]. Furthermore, *GmDOF4* and *GmDOF11* can bind to the promoter of *CRA1* to regulate the expression of SSP genes [41]. *GmDREBL* can be upregulated by *GmABI3* and *GmABI5* and is regulated by late-stage SSP genes [44]. *DGAT* can reduce the soluble carbohydrate content of mature seeds and increase the seed protein content at the same time [165]. Therefore, in addition to *WAR1*, *LEC1*, *LEC2*, *ABI3* and *FUS3*, transcription factors of the *MYB*, *bZIP*, *MADS*, *DOF* or *AP2* families are also involved in the accumulation of storage compounds (oil and SSPs) and in the seed development regulatory network, as partners or direct target genes [162].

**1.4. Small RNA regulation of seed composition**

Small RNAs, such as miRNAs and short interfering RNAs (siRNAs), are key components of the evolutionarily conserved system of gene regulation in eukaryotes [166]. Among them, microRNAs (miRNAs) are a class of non-coding small RNAs of 20–24 nt in length that play an important role in plant growth and development. Structurally, apart from segment-specific characteristics, all miRNA precursors have well-predicted stem-loop hairpin structures, and this fold-back hairpin structure has a low free energy [167].

The microRNA database miRBase (http://www.mirbase.org/) is a searchable database of published miRNA sequences and annotations. miRBase has collected miRNA information for 1269 species, including 399 soybean miRNAs. For example, gma-MIR156d belongs to the MIR156 gene family (MIPF0000008) and is described as the *Glycine max* miR156d stem-loop; its annotation notes that the microRNA (miRNA) precursor mir-156 is a family of plant non-coding RNAs. This microRNA has now been predicted or experimentally confirmed in a range of plant species (MIPF0000008). The products are thought to have regulatory roles through complementarity to mRNA. SFGD is a comprehensive database that integrates genomic and transcriptome data with soybean acyl-lipid metabolic pathways, including a coexpression regulatory network of 23,267 genes and 1873 miRNA-target pairs as well as a set of acyl-lipid pathways containing 221 enzymes and more than 1550 genes, providing biologists with a useful toolbox [168]. In addition, SoyKB is a web resource that provides information on soybean genomics, transcriptomics, proteomics and metabolomics as well as gene function and biology annotation, including information on genes, microRNAs, metabolites and single nucleotide polymorphisms (SNPs) [169].

Shi and Chiang developed a simple and effective method to determine miRNA expression, using miRNA-specific forward primers and, as reverse primers, sequences complementary to poly(T) linkers. Total RNA (including miRNAs) was polyadenylated and reverse transcribed into cDNA using poly(T) linkers for real-time PCR.

There are few studies on miRNAs related to plant quality. Soybean cotyledons affect soybean seed yield and quality. Goettel et al. analyzed 304 miRNA genes expressed in soybean cotyledons and predicted complex miRNA networks involving 1910 genes. By analyzing extensive biological pathways present in soybean cotyledons, the evolutionary pathways of

**2. Conclusion and perspectives**

With the development of sequencing technology, the genome of the soybean cultivar Williams 82 was released by Schmutz et al. [172], and the quality of the reference genome assembly has been updated year by year. In the present version (*Glycine max Wm82.a2.v1*), 56,044 protein-coding loci and 88,647 transcripts have been predicted, and all related data have been released in Phytozome (https://phytozome.jgi.doe.gov/pz/portal.html#!info?alias=Org\_Gmax). On the basis of the reference genome, around 265 cultivated, 92 wild and 10 semi-wild soybean varieties have been resequenced; this information provides a foundation for functional genomic analyses such as transcriptomic, proteomic, epigenomic and non-coding RNA analyses [173].
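As a quick check on the figures quoted in the preceding paragraph, the short Python sketch below tallies the resequenced panel and the average number of predicted transcripts per protein-coding locus. The variable names are illustrative only, and the counts are simply those stated in the text (the accession counts are given there as approximate), not values retrieved from Phytozome.

```python
# Counts as quoted in the text for Glycine max Wm82.a2.v1 (not fetched from Phytozome).
protein_coding_loci = 56_044
predicted_transcripts = 88_647

# Resequenced accessions by type; the text gives these as approximate ("around").
resequenced = {"cultivated": 265, "wild": 92, "semi_wild": 10}

# Size of the resequenced panel across all three accession types.
total_accessions = sum(resequenced.values())

# Average number of predicted transcripts per protein-coding locus.
transcripts_per_locus = predicted_transcripts / protein_coding_loci

print(f"Resequenced accessions: {total_accessions}")
print(f"Mean transcripts per locus: {transcripts_per_locus:.2f}")
```

On these numbers the panel comprises 367 accessions and each protein-coding locus carries about 1.58 predicted transcripts on average, which gives a rough sense of how much transcript-level annotation is available for downstream transcriptomic analyses.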

Although many genes and regulators of seed oil content and SSP have been identified and their associated regulatory networks have been well studied in Arabidopsis, much remains unclear in soybean beyond *WAR1*, *LEC1*, *LEC2*, *ABI3* and *FUS3*, owing to the 75% duplication of the soybean genome [172]. A combination of multiple omics approaches (genomics, functional genomics, transcriptomics, proteomics and epigenomics) and advanced biotechnology (genome editing) is needed to clarify the genes and regulatory networks underlying soybean seed oil content and SSP. Secondary populations, including recombinant heterozygous lines (RHLs), chromosome segment substitution lines (CSSLs) and/or near isogenic lines (NILs), need to be applied to reduce the number of variables when analyzing the effects of single genes or transcription factors, and to identify effective alleles and evaluate their effects and contributions. Combinations of the identified loci could further be used to design selection chip assays, which may lay the foundation for high-oil or high seed storage protein breeding.
