Title
A statistical toolbox for metagenomics: assessing functional diversity in microbial communities.
Abstract
BACKGROUND: The 99% of bacteria in the environment that are recalcitrant to culturing have spurred the development of metagenomics, a culture-independent approach to sample and characterize microbial genomes. Massive datasets of metagenomic sequences have been accumulated, but analysis of these sequences has focused primarily on the descriptive comparison of the relative abundance of proteins that belong to specific functional categories. More robust statistical methods are needed to make inferences from metagenomic data. In this study, we developed and applied a suite of tools to describe and compare the richness, membership, and structure of microbial communities using peptide fragment sequences extracted from metagenomic sequence data.
RESULTS: Application of these tools to acid mine drainage, soil, and whale fall metagenomic sequence collections revealed groups of peptide fragments with a relatively high abundance and no known function. When combined with analysis of 16S rRNA gene fragments from the same communities, these tools enabled us to demonstrate that although there was no overlap in the types of 16S rRNA gene sequence observed, there was a core collection of operational protein families that was shared among the three environments.
CONCLUSION: The results of comparisons between the three habitats were surprising considering the relatively low overlap of membership and the distinctively different characteristics of the three habitats. These tools will facilitate the use of metagenomics to pursue statistically sound genome-based ecological analyses.
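The abstract speaks of comparing richness, membership, and structure but does not name the estimators used. As a minimal sketch only, the Python below illustrates the three ideas with commonly used measures chosen here as assumptions: bias-corrected Chao1 for richness, the Jaccard index for membership, and Bray-Curtis dissimilarity on relative abundances for structure. The family labels and counts are invented toy data, not results from the paper.

```python
from collections import Counter


def chao1(counts):
    """Bias-corrected Chao1 richness estimate from per-family counts."""
    observed = sum(1 for c in counts.values() if c > 0)
    singletons = sum(1 for c in counts.values() if c == 1)
    doubletons = sum(1 for c in counts.values() if c == 2)
    return observed + singletons * (singletons - 1) / (2 * (doubletons + 1))


def jaccard(a, b):
    """Membership similarity: shared families divided by all families seen."""
    members_a = {k for k, c in a.items() if c > 0}
    members_b = {k for k, c in b.items() if c > 0}
    union = members_a | members_b
    return len(members_a & members_b) / len(union) if union else 0.0


def bray_curtis(a, b):
    """Structure dissimilarity computed on relative abundances.

    Assumes both communities contain at least one observation.
    """
    total_a, total_b = sum(a.values()), sum(b.values())
    keys = set(a) | set(b)
    shared = sum(min(a.get(k, 0) / total_a, b.get(k, 0) / total_b) for k in keys)
    return 1.0 - shared


# Hypothetical counts of peptide fragments per protein family (toy data).
soil = Counter({"opf1": 5, "opf2": 1, "opf3": 1, "opf4": 2})
whale_fall = Counter({"opf1": 3, "opf4": 1, "opf5": 4})

print("Chao1 (soil):", chao1(soil))
print("Jaccard:", jaccard(soil, whale_fall))
print("Bray-Curtis:", bray_curtis(soil, whale_fall))
```

Two communities can share few members (low Jaccard) and still show a moderate Bray-Curtis value when the shared families are abundant in both, which is why membership and structure are treated as separate questions.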
Year: 2008
DOI: 10.1186/1471-2105-9-34
Venue: BMC Bioinformatics
Keywords: computational biology, microarrays, relative abundance, genomics, bacteria, algorithms, robust statistics, soil microbiology, microbial community, protein family, bioinformatics, genetic variation
Field: Species richness, Functional diversity, Biology, Toolbox, Microbial Genomes, Metagenomics, Genomics, Data sequences, Bioinformatics, DNA microarray
DocType:
Journal: BMC Bioinformatics
Volume: 9
Issue: 1
ISSN: 1471-2105
Citations: 34
PageRank: 0.87
References: 6
Authors: 2
Name, Order, Citations, PageRank
Patrick D. Schloss, 1, 99, 5.79
Jo Handelsman, 2, 44, 2.09