
Soil Science Society of America Journal - The 11th Dahlia Greidinger Memorial Symposium: Advanced Methods for Investigating Nutrient Dynamics in Soils and Ecosystems

The Potential of Metagenomic Approaches for Understanding Soil Microbial Processes


This article in SSSAJ

Vol. 78 No. 1, p. 3-10
Open access
Received: July 17, 2013
Published: November 7, 2013

* Corresponding author(s): david.myrold@oregonstate.edu

David D. Myrold *a, Lydia H. Zeglin a, and Janet K. Jansson b
a Dep. of Crop and Soil Science, Oregon State Univ., Corvallis, OR 97331
b Earth Sciences Division, Lawrence Berkeley National Lab., Berkeley, CA 94720


Technological advances in sequencing technologies and bioinformatics analysis tools now enable the generation of a metagenome from soil, although the ultimate goal of obtaining the entire complement of all genes of all organisms in a given sample of soil still lies in the future. The rich information obtained from a soil metagenome will undoubtedly provide new insights into the taxonomic and functional diversity of soil microorganisms; the question is whether it will also yield greater understanding of how C, N, and other nutrients cycle in soil. The purpose of this review is to describe the steps involved in producing a soil metagenome, including some of the potential pitfalls associated with its production and annotation. Possible solutions to some of these challenges are presented. Selected examples from published soil metagenomic studies are discussed, with an emphasis on clues that they have provided about biogeochemical cycling.


Abbreviations: bp, base pairs; DNA, deoxyribonucleic acid; mRNA, messenger ribonucleic acid; RNA, ribonucleic acid; rRNA, ribosomal ribonucleic acid

Soil microbial communities are known to be incredibly diverse, harboring tens of thousands of species of bacteria and thousands of species of fungi in a gram of soil. This knowledge is based primarily on recent advances in DNA sequencing, which have made it possible to generate millions of sequence reads quickly and economically. The initial application of this high-throughput sequencing technology explored the taxonomic diversity and composition of soil microbial communities using a polymerase chain reaction (PCR)-based approach that focused on phylogenetically informative ribosomal genes (Buée et al., 2009; Roesch et al., 2007). This has become known as pyrotagged sequencing and, with the incorporation of barcoded primers or tags (Hamady et al., 2008), has rapidly become the standard approach for describing the taxonomic composition of soil microbial communities (the soil “microbiome”). The pyrotagging approach has subsequently been extended to include targeted functional genes, such as nifH and amoA (Mao et al., 2011). Targeted pyrosequencing has the advantage of being able to focus on a gene, or a few genes, of specific interest; however, for some applications, such as the interaction among multiple community members or their collective response to environmental perturbations, a more comprehensive and complete inventory of microbial genes is desired: a “metagenome”.

Initial soil metagenomic studies relied on constructing libraries (e.g., plasmid, bacterial artificial chromosome [BAC], cosmid, fosmid) that were then sequenced with the intent of finding genes that encoded products of interest, such as antimicrobials or enzymes (Daniel, 2005). The first attempt to generate a comprehensive soil metagenome was reported by Tringe et al. (2005), who constructed it by Sanger sequencing of random lengths of DNA that were cloned into a virus (i.e., a phage library). Although the depth of sequencing was insufficient to achieve significant assembly of the DNA sequences into larger contiguous segments, or contigs, the individual reads were useful in comparing the gene content of the Minnesota farm soil metagenome with those obtained from other environments. Subsequent efforts to produce soil metagenomes have primarily used a shotgun sequencing approach that eliminates the need for making libraries and directly sequences the extracted DNA (Table 1).

Table 1. Summary of soil metagenome studies.

Location | Site description | Soil type | Experimental design | Biological replication | Sequencing platform | Sequencing depth | Assembly | Functional assignment† | Comments | Reference
Nunavut, Canada | tundra | permafrost | two depths: active layer and permafrost | no | 454 GS FLX Titanium | 853 Mbp (0.35–0.99 million reads/sample) | Phrap assembler (140 Mbp, 134,000 contigs) | not given | multiple displacement amplification (MDA) was needed to generate sufficient DNA; detected genes for methanogenesis | Yergeau et al. (2010)
Rothamsted, UK | Park Grass experiment (permanent grassland) | silty clay loam (Chromic Luvisol) | comparison of direct and indirect DNA extraction methods | no | 454 GS FLX Titanium | ~1 million reads | none | not given | extraction methods gave similar functional gene profiles | Delmont et al. (2011)
Pru Toh Daeng, Thailand | swamp forest | peat | pooled sample | no | 454 GS FLX | 45.9 Mbp (0.18 million reads) | GS De Novo Assembler (13.4 Mbp, 54,000 contigs averaging 248 bp) | SEED: 53 | searched mainly for polysaccharide degrading genes | Kanokratana et al. (2011)
Alaska | Hess Creek (black spruce forest) | permafrost | two depths: active layer and permafrost; before and 2 and 7 d after thawing | yes (n = 2) | Illumina GAII (2 × 113 bp) | 39.8 Gbp (176 million reads) | Velvet assembler (9.6 Mbp, 3758 contigs >1 kb) | KEGG: 11 | used emulsion polymerase chain reaction to generate sufficient DNA; draft genome of dominant methanogen obtained | Mackelprang et al. (2011)
New Hampshire | Harvard Forest (northern hardwood forest) | sandy loam (Typic Dystrochrept) | single composite from two soil cores (0–10 cm) | no | 454 GS FLX Titanium | 748 Mbp (1.4 million reads) | | | also companion metatranscriptome | Stewart et al. (2011)
São Paulo, Brazil | mangrove forest | submerged sediment | samples from four sites, one of which was impacted by oil contamination | no | 454 GS FLX Titanium | 215 Mbp (~0.25 million reads/site) | none | COG: 60; KEGG: 30 | metabolic reconstruction of C, N, and S cycles | Andreote et al. (2012)
Rothamsted, UK | Park Grass experiment (permanent grassland) | silty clay loam (Chromic Luvisol) | 13 samples differing by date, depth, and DNA extraction method | no | 454 GS FLX Titanium | 4.8 Gbp (12.5 million reads) | Newbler assembler (15.2 Mbp, 267,000 contigs) | SEED: 56 | extraction method showed more variation than date or depth of sampling | Delmont et al. (2012a)
Michigan and Minnesota | Kellogg Biological Station (cropped fields) and Cedar Creek Ecosystem Science Reserve (successional grasslands) | sandy loam (Hapludalf) and sand (Udipsamment) | low, medium, and high N addition at each site | yes (n = 3) | 454 GS FLX Titanium | 518 Mbp (1.35 million reads) | none | SEED: 25–35 | shift in functional capacity with N input; significant correlation of metagenome functional and phylogenetic composition | Fierer et al. (2012a)
Worldwide | cold and hot deserts, selected nondesert biomes | various | 16 total soils | no | Illumina GAIIx (2 × 100 bp) | 6.2 Gbp total (0.39–1.9 million reads per soil) | none | SEED: 13–23 | functional composition correlated with site characteristics | Fierer et al. (2012b)
Lucknow, India | dumpsite | not specified | three soils with a gradient of hexachlorocyclohexane contamination | no | 454 GS FLX Titanium | 1.2 Gbp (1.1–1.2 million reads per soil) | none | not given | draft genome of Chromohalobacter salaxigenes | Sangwan et al. (2012)
Nevada | Nevada Desert FACE Facility (shrubland) | loamy sand (Aridisol) | four pooled samples (ambient and elevated CO2 plots, with two locations [creosote bush and interspace]) | no | 454 GS FLX Titanium | 724 Mbp total (0.31–0.68 million reads per sample) | none | SEED: 36–40 | functional genes less discriminating than 16S rRNA genes for the effect of elevated CO2 | Steven et al. (2012)
Nunavut, Canada | tundra | permafrost | four pooled samples (control and three times for treated biopiles of oil-contaminated soil) | no | 454 GS FLX Titanium | 463 Mbp total (0.11–0.46 reads per sample) | none | not applicable | used BLAST to focus on hydrocarbon-degrading genes (found to be higher in treated than control) | Yergeau et al. (2012)
Breuil-Chenue, France | Norway spruce plantation | Alocrisol soil | closely spaced soil cores separated into organic and mineral horizons | yes (n = 3) | 454 GS FLX Titanium; Illumina HiSeq 2000 (1 × 75 bp) | 1.9 Gbp total of 454 (0.41–0.62 million reads per sample); 11.9 Gbp total of Illumina (23–29 million reads per sample) | unspecified assembler of Illumina data (0.77 Mbp, 3492 contigs) | SEED: 36–60 (454 data); ~25 (Illumina data) | differences in functional subsystems between organic and mineral horizons | Uroz et al. (2013)
† COG, Clusters of Orthologous Groups of proteins; KEGG, Kyoto Encyclopedia of Genes and Genomes; SEED, SEED subsystem hierarchy.

Most soil shotgun metagenomes have been obtained using the 454 GS FLX platform with Titanium chemistry, which generates reads of 400 to 500 bp in length. The amount of sequence generated in these studies has varied about 100-fold, with 0.1 to 1.0 million reads per soil sample being typical and a maximum metagenome size of about 0.5 Gbp. Assembly was attempted in about one-third of these metagenomic studies with some success, although most contigs were relatively short, <1000 bp (Delmont et al., 2012b; Kanokratana et al., 2011; Yergeau et al., 2010). A few studies have used the Illumina sequencing platform for generating shotgun metagenomes from soils (Fierer et al., 2012b; Mackelprang et al., 2011), and one has combined both sequencing platforms (Uroz et al., 2013). The soil metagenomes produced using the Illumina system have generally contained many more sequences (0.4–29 million reads per sample), with a maximum metagenome size of about 4.0 Gbp. Although a relatively small fraction of the short reads could be assembled into larger contigs, sufficient assembly was obtained in a permafrost soil to produce a draft genome of the dominant methanogen (Mackelprang et al., 2011).
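
As a rough check on the sequencing depths quoted above, total sequence is approximately the number of reads multiplied by the mean read length. The sketch below is illustrative only. The function names are hypothetical, the 450-bp mean read length is an assumed midpoint of the 400 to 500 bp range, and treating the 176 million Illumina reads listed in Table 1 as read pairs is an assumption that happens to reproduce the reported 39.8 Gbp.

```python
def total_bases(n_reads: float, read_length_bp: float) -> float:
    """Total sequenced bases (bp) = number of reads x mean read length."""
    return n_reads * read_length_bp


def to_gbp(bases: float) -> float:
    """Convert base pairs to gigabase pairs."""
    return bases / 1e9


# 454 Titanium: 1.0 million reads at an ASSUMED mean read length of 450 bp.
size_454_gbp = to_gbp(total_bases(1.0e6, 450))

# Illumina GAII, 2 x 113 bp: ASSUMING each of the 176 million "reads" is a
# read pair, every pair contributes 2 * 113 = 226 bp.
size_illumina_gbp = to_gbp(total_bases(176e6, 2 * 113))

print(f"454:      {size_454_gbp:.2f} Gbp")
print(f"Illumina: {size_illumina_gbp:.1f} Gbp")
```

Under these assumptions a 1-million-read 454 data set comes to a little under 0.5 Gbp, in line with the maximum size noted above.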

A major goal of soil metagenomic studies is to identify the functional potential of the complex microbial communities, whether using individual reads or assembled contigs. As might be expected, greater success in assigning functions has been obtained with the longer reads generated with 454 sequencing: 20 to 60% assignment depending on the databases used (Table 1). In contrast, only 10 to 25% of the shorter Illumina reads have been successfully assigned, although the 10-fold greater number of sequences resulted in more total functional gene assignments.
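
The trade-off between assignment rate and read number can be illustrated with simple arithmetic. In the sketch below, the read counts are assumed round numbers (1 million 454 reads versus 10 million Illumina reads) rather than values from any one study, and the function name is hypothetical; the assignment fractions are the ranges given above.

```python
def assigned_reads(n_reads: int, fraction_assigned: float) -> int:
    """Number of reads given a functional assignment, rounded to whole reads."""
    return round(n_reads * fraction_assigned)


# ASSUMED library sizes, chosen only to reflect a 10-fold difference.
reads_454 = 1_000_000
reads_illumina = 10_000_000

# Assignment ranges quoted in the text: 20-60% (454) and 10-25% (Illumina).
range_454 = (assigned_reads(reads_454, 0.20), assigned_reads(reads_454, 0.60))
range_illumina = (
    assigned_reads(reads_illumina, 0.10),
    assigned_reads(reads_illumina, 0.25),
)

print("454 assigned reads:     ", range_454)
print("Illumina assigned reads:", range_illumina)
```

With these assumed library sizes, even the lowest Illumina assignment rate yields more assigned reads than the highest 454 rate.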

Whether using individual reads or assembled contigs, the studies to date have been effective in understanding the functional potential of microbial communities in soil and in distinguishing among soils and treatments. For practical reasons, such as cost and computational constraints, many studies, particularly the earlier ones, have not had true biological replication.

Of course, many more soil metagenomic projects are in process: as of May 2013, 48 soil metagenomes were registered at www.genomesonline.org. Most of these are not yet published and not all are shotgun metagenomes. As part of our research on Mollisol soils of the Great Plains, several metagenomes have been obtained, including some in excess of 500 million Illumina reads (unpublished data). Assembly of such large datasets is challenging, however.


One goal of soil metagenomic studies is to gain insights into soil C, N, P, S, and other elemental cycles. Mackelprang et al. (2011) compared the response to thawing of the metagenome of microbial communities in the active (seasonally frozen) and permafrost layers of an Alaskan Gelisol. Thawing resulted in shifts in both microbial community structure and function. The increased rate of CH4 consumption was positively correlated with an increase in particulate methane monooxygenase genes and the 16S rRNA genes of type II, but not type I, methanotrophs. No change in functional genes associated with methanogenesis was observed. Thawing of the permafrost core also brought about an increase in genes associated with denitrification. Another interesting observation is that thawing resulted in changed abundances of functional genes of the permafrost in the two replicate cores, but this was not matched by a convergence of 16S rRNA gene composition, which suggests that microbial communities of different structure may have similar functions.

Andreote et al. (2012) generated metagenomes from submerged sediment of four mangrove forests in Brazil. Based on this metagenomic information, they constructed metabolic pathways associated with C, N, and S cycling. Not surprisingly, pathways for anaerobic metabolic processes (e.g., NO3 reduction and SO4 reduction) were dominant.

Fierer et al. (2012a) performed a comparative metagenomic study to focus on responses of soil metagenomes to N fertilization using two long-term experiments in the Upper Midwest of the United States. Several consistent responses to N fertilization were observed at the two sites, including: increased relative abundance of genes associated with respiration, protein metabolism, and nucleic acid metabolism; and decreased abundances of genes associated with urea decomposition and tricarboxylate transporters. They did not specifically report on changes in genes associated with classic N cycling processes of N2 fixation, nitrification, or denitrification. Based on assumptions that assign bacterial phyla into copiotrophic and oligotrophic life histories, they suggested that N fertilization causes a shift toward more copiotrophic phyla.


The initial call by Vogel et al. (2009b) to sequence the soil metagenome resulted in a number of commentaries, replies, and reviews about the efficacy of doing so (Baveye, 2009; Morales and Holben, 2011; Myrold and Nannipieri, 2013; Singh et al., 2009; Vogel et al., 2009a), but, as reported above, more than a dozen soil metagenomes have been published (Table 1) and more are in process. The challenges have not, however, disappeared (Thomas et al., 2012). Sequencing methods have improved and costs have diminished, but there are at least three areas of opportunity for advancing metagenomics.

Obtaining a Representative Sample

Collecting a sample that is representative of a soil is a long-standing challenge for any type of soil study because soils are by nature heterogeneous and vary both spatially and temporally. Stratified sampling, composite sampling, and replicated designs can often help in obtaining representative samples appropriate for addressing the objective of an experiment, at least when the cost of analyses is not a major limitation (Prosser, 2010; Wollum, 1994). Although some level of replication in soil metagenomic studies is now financially feasible, other issues remain (Knight et al., 2012; Lombard et al., 2011). Working with a soil from a single location, Delmont et al. (2012a) found that the choice of DNA extraction protocol produced greater variation in metagenomic composition than either soil depth or season of sampling, which was perhaps a surprising result but is consistent with their earlier observation using other molecular methods (Delmont et al., 2011). Because current metagenomic sequencing typically requires a few hundred nanograms of DNA, soils that yield small amounts of DNA, such as permafrost soils, may require additional DNA amplification, a step that may introduce some bias (Mackelprang et al., 2011; Yergeau et al., 2010). The difficulty in adopting standard methods in soil microbiology is not unique to DNA extraction (Philippot et al., 2012), and a perfect extraction method is unlikely to be developed. Nevertheless, it is likely that the community will select one as a common protocol, as has been done by the Earth Microbiome Project (http://press.igsb.anl.gov/earthmicrobiome/).

Metagenome Assembly

The lack of efficient and effective strategies for assembling and annotating shotgun metagenomic data is a major challenge facing bioinformaticians. The sheer magnitude of data bogs down even the fastest, largest computers using standard assembly algorithms. The result is that only a small fraction of soil metagenomic data can be assembled, usually into relatively small contigs (Table 1); however, a number of metagenome-specific assemblers have been developed in an attempt to overcome this hurdle (Scholz et al., 2012; Segata et al., 2013). Such new approaches to de novo assembly often normalize and partition the sequence reads, which leads to greater efficiency and generally more complete assembly (Dröge and McHardy, 2012; Howe et al., 2012; Jones et al., 2012). Although larger amounts of sequencing should increase the number of contigs that can be assembled, more reads did not increase contig length in one study (Delmont et al., 2012a).
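
To make the idea of read normalization concrete, the following is a minimal sketch of one simple form of coverage normalization: a read is kept only if the median count of its k-mers, among the reads kept so far, is below a cutoff. It is not the implementation used in any of the studies cited above. The function names, the tiny k-mer size, the cutoff, and the example reads are all assumptions chosen for illustration, and real tools rely on memory-efficient counting structures rather than an in-memory dictionary.

```python
from collections import Counter
from statistics import median


def kmers(read: str, k: int) -> list:
    """All overlapping substrings of length k in a read."""
    return [read[i:i + k] for i in range(len(read) - k + 1)]


def normalize_reads(reads, k=4, cutoff=2):
    """Keep a read only while its k-mers are still poorly covered.

    A read is retained if the median count of its k-mers (counted over
    the reads kept so far) is below `cutoff`; otherwise it is treated as
    redundant and dropped.
    """
    counts = Counter()
    kept = []
    for read in reads:
        read_kmers = kmers(read, k)
        if not read_kmers:  # read shorter than k
            continue
        if median(counts[x] for x in read_kmers) < cutoff:
            kept.append(read)
            counts.update(read_kmers)
    return kept


# Hypothetical reads: five copies of an abundant sequence and one rare one.
reads = ["ACGTTGCAAG"] * 5 + ["TTGGCCAATT"]
kept = normalize_reads(reads, k=4, cutoff=2)
print(kept)
```

In this toy input the abundant read is reduced from five copies to two, whereas the rare read is retained, which is the behavior that makes the subsequent assembly less demanding.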

It should be noted that even if significant assembly of sequences into contigs is not achieved, insights can be gained by comparing individual reads with gene sequences archived in annotated databases. The results of annotation tools, such as MG-RAST and IMG-ER (Markowitz et al., 2009; Meyer et al., 2008), are dependent on database matching parameters as well as the content and quality of the sequence databases. Current sequence databases have a number of limitations that are important to recognize. For example, gene sequences may not be annotated correctly and genes from soil microorganisms are often underrepresented in these databases. Attempts to address the latter shortcoming are the Genomic Encyclopedia of Bacteria and Archaea and 1000 Fungal Genomes projects being undertaken by the Department of Energy’s Joint Genome Institute (http://genome.jgi.doe.gov/programs/bacteria-archaea/GEBA.jsf and http://genome.jgi.doe.gov/programs/fungi/1000fungalgenomes.jsf), which should greatly expand coverage of the microbial world.
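
The dependence of annotation results on matching parameters can be shown with a small sketch. The hit records, thresholds, and names below are hypothetical and do not represent the output format or default settings of MG-RAST, IMG-ER, or any other tool; the point is only that the same set of database hits yields different assignment rates under different cutoffs.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    """Best database hit for one read (hypothetical record layout)."""
    read_id: str
    identity: float  # percent identity of the alignment
    evalue: float    # expectation value of the alignment
    function: str    # annotation of the matched database sequence


def fraction_assigned(hits, n_reads, min_identity, max_evalue):
    """Fraction of all reads with a hit passing both thresholds."""
    assigned = {
        h.read_id
        for h in hits
        if h.identity >= min_identity and h.evalue <= max_evalue
    }
    return len(assigned) / n_reads


# Invented example: 10 reads in total, only 4 of which have any hit.
hits = [
    Hit("r1", 92.0, 1e-30, "nifH"),
    Hit("r2", 61.0, 1e-6, "amoA"),
    Hit("r3", 45.0, 1e-3, "pmoA"),
    Hit("r4", 75.0, 1e-12, "narG"),
]
n_reads = 10

lenient = fraction_assigned(hits, n_reads, min_identity=40.0, max_evalue=1e-2)
strict = fraction_assigned(hits, n_reads, min_identity=70.0, max_evalue=1e-10)
print(f"lenient thresholds: {lenient:.0%} assigned")
print(f"strict thresholds:  {strict:.0%} assigned")
```

Reads without any hit, which here stand in for genes that are missing from the database, remain unassigned regardless of the thresholds chosen.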

Metagenome Interpretation

Even if well-assembled, unbiased soil metagenomes are obtained, there are challenges in how these data will be integrated and synthesized into a coherent representation of the potential functioning of the microbial community. A variety of analytical methods, such as clustering, ordination, and artificial neural networks, to name a few, have been used and are being developed to provide insights into these complex systems and their emergent properties (Cloots and Marchal, 2011; Faust and Raes, 2012; Gianoulis et al., 2009; Muller et al., 2013; Segata et al., 2013). Two recent examples for soil are the use of random matrix theory to infer the interaction networks of microbial communities and how these networks changed in response to elevated CO2 (Zhou et al., 2011) and the co-occurrence of microbial taxa in relation to their life-history strategies (Barberan et al., 2012). Another option would be to extend metabolic modeling of individual organisms (Henry et al., 2010) to soil microbial communities, as has been done for simpler systems (Stolyar et al., 2007; Zomorrodi and Maranas, 2012). Finally, it may be possible to incorporate metagenomic data into trait-based models of soil biogeochemical cycles (Allison, 2012; Bouskill et al., 2012).
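
As an illustration of how a co-occurrence analysis might begin, the sketch below links pairs of taxa whose abundances across samples are strongly rank-correlated. This is a generic correlation-threshold approach, not the random matrix theory method or the specific procedures of the studies cited above. The taxon names, abundances, and the 0.8 threshold are invented, and the ranking function assumes there are no tied values.

```python
from itertools import combinations
from math import sqrt


def ranks(values):
    """Rank values from 1..n (simplification: ties are not averaged)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        result[idx] = rank
    return result


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def cooccurrence_edges(abundance, threshold=0.8):
    """Pairs of taxa whose |Spearman rho| meets the threshold."""
    edges = []
    for a, b in combinations(sorted(abundance), 2):
        rho = spearman(abundance[a], abundance[b])
        if abs(rho) >= threshold:
            edges.append((a, b, round(rho, 2)))
    return edges


# Invented abundances of four taxa across five samples.
abundance = {
    "taxon_A": [10, 20, 30, 40, 50],
    "taxon_B": [12, 18, 33, 41, 60],
    "taxon_C": [50, 40, 30, 20, 10],
    "taxon_D": [30, 10, 50, 20, 40],
}
edges = cooccurrence_edges(abundance)
print(edges)
```

Positive edges suggest taxa that tend to occur together and negative edges suggest taxa that tend to exclude one another, although correlation alone cannot distinguish direct interaction from a shared response to the environment.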


A common goal of metagenomic studies has been to link the structure, or composition, of the microbial community with its function. Metagenomic data provide insights into the metabolic, or functional, potential of the microbial community, but other approaches are needed to assess and strengthen the connections with actual microbial activity. This may be especially important in the soil environment, where many organisms detected on the basis of DNA may be quiescent or not growing (Bottomley et al., 2012; Buerger et al., 2012; Lennon and Jones, 2011). The development of additional “omics” methods along the canonical DNA-RNA-protein continuum (Fig. 1) is one approach to explore the functional activity of microbial communities (Muller et al., 2013).

Fig. 1. The canonical continuum of transcription of DNA to RNA and translation of RNA to protein.



It is generally assumed that gene expression (the transcription of DNA into RNA) is indicative of microbial activity and reflects the response of microorganisms to environmental cues. The cellular rRNA content is roughly proportional to the growth rate, whereas the amount of the rRNA gene is relatively constant, leading to the use of the rRNA/DNA ratio as an indicator of relative microbial activity (Blazewicz et al., 2013; DeAngelis et al., 2010; Muttray et al., 2001). Several studies have also found that the composition of microbial communities based on RNA differs from that based on DNA and concluded that the active microbial community is just a subset of the potentially active microbial community (Anderson and Parkin, 2007; Baldrian et al., 2012; Griffiths et al., 2000). Such differences between genes and transcripts have also been observed when metagenomes and metatranscriptomes from marine or soil environments have been compared (Frias-Lopez et al., 2008; Gilbert et al., 2008; Urich et al., 2008).
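
The rRNA/DNA ratio described above can be computed per taxon from paired sequence counts. The sketch below is one simple way to do so, using relative abundances so that differences in library size cancel out. The taxon names and counts are invented, the function names are hypothetical, and the approach is not taken from the cited studies.

```python
def relative_abundance(counts):
    """Convert raw sequence counts per taxon to proportions of the library."""
    total = sum(counts.values())
    return {taxon: n / total for taxon, n in counts.items()}


def activity_ratio(rrna_counts, rdna_counts):
    """rRNA:rRNA-gene ratio of relative abundances for each taxon.

    Taxa absent from the DNA library are skipped because the ratio is
    undefined for them.
    """
    rna = relative_abundance(rrna_counts)
    dna = relative_abundance(rdna_counts)
    return {
        taxon: rna.get(taxon, 0.0) / dna[taxon]
        for taxon in dna
        if dna[taxon] > 0
    }


# Invented 16S counts for the same sample: rRNA genes (DNA) and rRNA (RNA).
rdna = {"taxon_A": 500, "taxon_B": 300, "taxon_C": 200}
rrna = {"taxon_A": 250, "taxon_B": 600, "taxon_C": 150}

ratios = activity_ratio(rrna, rdna)
for taxon, ratio in sorted(ratios.items()):
    print(f"{taxon}: {ratio:.2f}")
```

In this invented example, taxon_B has a ratio above 1 and would be interpreted as relatively more active than its gene abundance suggests, whereas taxon_A, although the most abundant in the DNA library, has a ratio below 1.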

There are just a few published soil metatranscriptomic studies (Table 2). Two have taken advantage of the poly-A tails of eukaryotic mRNA to selectively isolate this fraction for sequencing (Bailly et al., 2007; Damon et al., 2012). Shotgun metatranscriptomics, an approach sometimes known as RNA-Seq (Carvalhais et al., 2012; Croucher and Thomson, 2010), has been used in two studies. The first study (Urich et al., 2008) did not use any type of enrichment procedure to maximize the number of mRNA sequences, which made up about one-quarter of the total. A customized rRNA hybridization technique has been developed to enrich the fraction of mRNA sequences significantly (Stewart et al., 2011). Annotation of functional gene transcripts was about as successful as annotation of genes (Tables 1 and 2). The utility of soil metatranscriptomics has not yet been fully explored, but the approach has provided insights about microbial functions in other complex microbial systems (e.g., Poretsky et al., 2010).

Table 2. Summary of soil metatranscriptomes.

Location | Site description | Soil type | Experimental design | Biological replication | RNA preparation | Sequencing platform | Sequencing depth | Functional assignment† | Comments | Reference
Southwestern France | Pinus pinaster plantation on coastal sand dune | nutrient poor sand | single composite of 27 samples (0–20 cm) | no | polyT capture of eukaryotic mRNA, cloned into plasmid library | Sanger | 119 clones sequenced | COG: 68 | first metatranscriptome for soil | Bailly et al. (2007)
Am Rotböll, Germany | lawn | sandy | single composite of three samples (0–10 cm) | no | no mRNA enrichment | 454 GS20 | 260,000 sequences (75% rRNA) | SEED: 33 | first shotgun metatranscriptome from soil; noted database limitations for identifying transcripts based on sequenced genomes | Urich et al. (2008)
New Hampshire | Harvard Forest (northern hardwood forest) | sandy loam (Typic Dystrochrept) | single composite from two soil cores (0–10 cm) | no | mRNA enrichment with custom rRNA subtraction hybridization | 454 GS FLX Titanium | 1.2 million sequences, 458 Mbp (17% rRNA) | KEGG: 45 | observed lower relative abundances of transcripts than genes for motility and transport; higher abundances of transcripts than genes for carbohydrate and nucleotide metabolism; also companion metagenome | Stewart et al. (2011)
Central France | Breuil-Chenue forest (Picea abies and Fagus sylvatica plantations) | sandy clay | single composites from 14–16 soil cores in each stand (3–7 cm) | no | polyT capture of eukaryotic mRNA, cloned into plasmid library | Sanger | 8606 (Pinus), 7905 (Fagus) | Blast2GO: 32–39 | transcripts for enzymes that degrade plant constituent comprised just 0.5–0.8% of all functional transcripts | Damon et al. (2012)
† COG, Clusters of Orthologous Groups of proteins; KEGG, Kyoto Encyclopedia of Genes and Genomes; SEED, SEED subsystem hierarchy.


Microbial functions are mediated by proteins; thus, the measurement of microbial proteins should be the most direct “omics” estimator of the potential activity of the microbial community (Fig. 1). As with metagenomics and metatranscriptomics, there has been a steady evolution of the methodology for the extraction and analysis of proteins from soils (Bastida et al., 2009; Hettich et al., 2012; Siggins et al., 2012). Nonetheless, the extraction of proteins lags behind that of nucleic acids, in part because of the strong interactions between proteins and other organic molecules and inorganic minerals in soils (Adamczyk et al., 2008; Kleber et al., 2007). Although advances in protein extraction methods have been made (Chourey et al., 2010; Kanerva et al., 2013; Keiblinger et al., 2012; Taylor and Williams, 2010), a standard method is not currently available, and it is possible that soil-specific modifications will always be needed.

To date, soil proteomic studies have typically identified a relatively small number of proteins based on gel isolation methods (Table 3). The development of shotgun proteomics using more sophisticated two-dimensional liquid chromatography separation methods and subsequent tandem mass spectrometry analysis has increased that number to hundreds to thousands (Chourey et al., 2010; Keiblinger et al., 2012). This is far fewer identified molecules than for metagenomic or metatranscriptomic methods and may be a limitation of metaproteomic analyses, which identify only the most abundant proteins, those that are often associated with housekeeping activities rather than enzymes associated with specific biogeochemical processes.

Table 3. Summary of soil metaproteomes.

Location | Site description | Soil type | Experimental design | Biological replication | Extraction method | Protein analysis approach† | Comments | Reference
Waldstein, Germany | spruce forest, 145 yr old | Podzol | mineral (heavy, >1.6 g/cm3) fraction isolated | no | proteins released by dissolving minerals in 10% HF | SDS-PAGE followed by LC-MS/MS of tryptic digest | identified eight enzymes involved in C cycling | Schulze et al. (2005)
Leipzig, Germany | laboratory study | compost soil | column with and without addition of 2,4-D-degrading bacteria | no | NaOH and phenol | SDS-PAGE followed by LC-MS/MS of tryptic digest | identified four proteins | Benndorf et al. (2007)
Washington and California | laboratory study | | soils with and without addition of a known bacterial species; four extraction protocols | no | SDS and TCA (and others for comparison) | 2D LC-MS/MS of tryptic digest | shotgun proteomics identified 145 to 925 proteins; for unamended soil, 333 nonredundant proteins were identified | Chourey et al. (2010)
Mississippi | agricultural soil | Paleudult | 0–10 cm, control or with addition of toluene or glucose | no | indirect with phenol extraction | SDS-PAGE followed by MALDI-TOF/TOF | identified 16 proteins from toluene-amended soils, eight from glucose-amended soils | Williams et al. (2010)
Southeast, central, and southwest China | agricultural soils | | | no | SDS and phenol | 2D PAGE followed by MALDI-TOF/TOF | just rice rhizosphere analyzed; 122 proteins identified | Wang et al. (2011)
Henan, China | Rehmannia glutinosa cropped field | | fallow and 1- and 2-yr monoculture | no | SDS and phenol | 2D PAGE followed by MALDI-TOF/TOF | identified 103 proteins, with 33 differentially expressed among the cultural treatments | Wu et al. (2011)
Schottenwald, Austria | beech forest | Dystric Cambisol | 0–10 cm, four extraction procedures, also a potting soil mix | no | SDS, NaOH–phenol, SDS–phenol | 2D LC-MS/MS of tryptic digest | shotgun proteomics identified 226–494 proteins in forest soil and 80–237 in potting soil, depending on extraction method, with just 59 or seven in common | Keiblinger et al. (2012)
† LC-MS/MS, liquid chromatography–tandem mass spectrometry; MALDI-TOF/TOF, matrix-assisted laser desorption/ionization tandem time-of-flight mass spectrometry; PAGE, polyacrylamide gel electrophoresis; SDS, sodium dodecyl sulfate.


This is clearly an exciting time for soil microbial ecology as advances in sequencing technologies generate huge amounts of information about microbial communities. At present, the metagenomics of soils is transitioning from studies about how to do it to those that focus on what can be done with it. Soil metagenomics has already provided insight into the long-standing questions of “who’s there?” and is making inroads into the question of “what are they doing?” Progress will require further developments related to metagenomics (e.g., data reduction and analysis, improvements in functional gene annotation, and database curation) combined with knowledge about the spatial and temporal variability of the soil habitat and its influence on microbial activities. This, as well as the integration with other “omics” methods, promises to enhance our understanding about the functioning of soil ecosystems.


We thank Peter Bottomley for reviewing an early draft of this manuscript. This material is based on work supported by the National Science Foundation under Grant no. 1051481 and by the USDOE Contract no. DE-SC0004953. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation or the USDOE.



