**3.3 The genomic revolution**

The recent development of massively parallel DNA sequencing platforms, so-called next-generation sequencing (NGS), has democratized genomic and metagenomic approaches through lower costs and wide availability (Shendure & Ji, 2008). The fungal research community has welcomed this development, as it permits rapid studies of deeper scope than was previously possible. Fungal communities can now be described from millions of sequences in a very short time and at relatively low cost. NGS is also increasing the number of sequenced fungal genomes, providing information crucial for a better understanding of fungal biology and evolution (Brockhurst et al., 2011).

Metagenomic studies of fungal communities have been based on massively parallel (454) pyrosequencing, a technology able to generate over a million ~500 base-pair sequences in a day (Margulies et al., 2005). 454 pyrosequencing has been preferred to other available technologies precisely because of the long sequence reads it generates, which are crucial for the OTU identification step. 454 has been used to study a wide array of fungal communities, including phyllosphere fungi (Jumpponen & Jones, 2009, 2010), ECM fungi (Jumpponen et al., 2010, Wallander et al., 2010), AMF (Öpik et al., 2009, Lumini et al., 2010), soil fungi (Buée et al., 2009, Rousk et al., 2010), and indoor fungi (Amend et al., 2010). These studies are important first steps in using metagenomics to study fungal diversity. Interestingly, the 454 results published so far confirm the trends of hyperdiversity and rarity described by traditional sequencing methods (Buée et al., 2009, Jumpponen & Jones, 2009, Tedersoo et al., 2010).
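To make the OTU identification step more concrete, the following is a minimal sketch of greedy similarity-based clustering. It is an illustration only and not the procedure of any study cited here: the function names (`edit_distance`, `identity`, `greedy_otus`) are hypothetical, the default 97% identity threshold is an assumption, and real analyses rely on dedicated alignment and clustering software rather than a plain edit distance.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, computed one dynamic-programming row at a time."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def identity(a: str, b: str) -> float:
    """Crude pairwise identity: 1 - edit distance / length of the longer read."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - edit_distance(a, b) / longest


def greedy_otus(reads, threshold=0.97):
    """Assign each read to the first OTU whose seed read is at least
    `threshold` identical to it; otherwise the read seeds a new OTU.
    The 0.97 default is an assumed, illustrative cut-off."""
    otus = []  # list of (seed_read, member_reads)
    for read in reads:
        for seed, members in otus:
            if identity(read, seed) >= threshold:
                members.append(read)
                break
        else:
            otus.append((read, [read]))
    return otus
```

Because each read is compared only against OTU seeds, the result depends on input order; this is a known property of greedy clustering and one reason short or error-prone reads make OTU delimitation harder.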

As with any new technology, pyrosequencing introduces biases that are still not completely understood, such as artefactual singletons caused by sequencing errors and chimeric sequences formed unintentionally during the polymerase chain reaction step (Bellemain et al., 2009, Quince et al., 2010, Tedersoo et al., 2010). Several efforts are under way to overcome these biases, including a chimera checker (Nilsson et al., 2010) and a method for extracting the variable and informative regions of NGS-generated sequences (Nilsson et al., 2010b). The usefulness of pyrosequencing data for determining fungal abundances has also been debated, with some studies advising caution when using 454 data to quantify fungal communities (Amend et al., 2010b, Unterseher et al., 2011). Technological improvements in high-throughput sequencing, coupled with refinement of analytical tools, should significantly increase the quality of metagenomic results in the near future, making NGS an even more powerful and informative approach.
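One simple way to limit the influence of artefactual singletons is to discard OTUs represented by a single read. The sketch below assumes each read has already been assigned an OTU label; the function name `drop_singletons` and the input format are hypothetical, and the approach is a conservative convention rather than the method of the studies cited above, since it also removes genuinely rare taxa.

```python
from collections import Counter


def drop_singletons(otu_assignments):
    """Count reads per OTU label and keep only OTUs seen more than once.

    `otu_assignments` is an iterable with one OTU label per read
    (an assumed input format for this illustration).
    """
    counts = Counter(otu_assignments)
    return {otu: n for otu, n in counts.items() if n > 1}
```

The retained read counts should still be interpreted with care, given the caution expressed above about using 454 data to quantify fungal communities.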

The massive amounts of information provided by metagenomic studies are by far the most substantial source of fungal diversity data today. As mentioned above, only a small fraction of the planet's fungal diversity has been documented, and it has been suggested that sequences generated in environmental studies should be the basis for describing and naming new fungal species (Hibbett et al., 2011). The authors propose a protocol for describing fungi based on molecular sequence similarity, but stress that sequence data should be used alone only when no other sources of information are available. Although sequences from environmental sampling have limitations for taxonomy and phylogenetics (particularly when single markers are analysed), they are practical and easy to obtain, accessible through databases, well suited to automated approaches, and already used in phylogenetic studies. Formally naming fungal species from sequence data would require some radical changes in the procedure for species descriptions (see section 3.1 above); however, it would be a very effective way to accelerate the rate of fungal discovery.

The accessibility of genomics has also made possible a dramatic increase in the number of sequenced fungal genomes. Sequencing the genomes of ecologically and taxonomically relevant fungi provides, and will continue to provide, information not only on those specific species but also on genome structure, gene evolution, metabolic and regulatory pathways and life histories (Martin et al., 2011). The sequencing and analysis of fungal genomes is ongoing, mainly through the Fungal Genomics Program (FGP; http://genome.jgi-psf.org/programs/fungi/about-program.jsf), launched by the US Department of Energy Joint Genome Institute (JGI). This program will sequence the genomes of many species, including decomposer and mycorrhizal species, enabling comparative studies of the pathways and mechanisms involved in being a symbiont or a decomposer across the fungal tree of life. The genomes of species from lineages with no genomic information will also be sequenced, allowing further studies on fungal evolution (Martin, 2011).
