Title
Metagenome Analysis using Megan
Abstract
In metagenomics, the goal is to analyze the genomic content of a sample of organisms collected from a common habitat. One approach is to apply large-scale random shotgun sequencing techniques to obtain a collection of DNA reads from the sample. This data is then compared against databases of known sequences, such as NCBI-nr or NCBI-nt, in an attempt to identify the taxonomic content of the sample. We introduce a new software tool called MEGAN (Meta Genome ANalyzer) that generates species profiles from such sequencing data by assigning reads to taxa of the NCBI taxonomy using a straightforward assignment algorithm. The approach is illustrated by application to a number of datasets obtained using both sequencing-by-synthesis and Sanger sequencing technology, including metagenomic data from a mammoth bone, a portion of the Sargasso Sea data set, and several complete microbial test genomes used for validation purposes.

Genomics is the study of the genome sequence of individual organisms. Most genome sequences available in databases today were obtained by "Sanger sequencing", using a shotgun approach that involves cloning small inserts of DNA and then determining their sequence using fluorescent dideoxynucleotides for termination and electrophoresis for measurement [7]. The NCBI website (www.ncbi.nlm.nih.gov) lists hundreds of bacterial, tens of archaeal and about one hundred eukaryotic genomes as being completely sequenced, or in the process of being sequenced.

Metagenomics has been defined as "the genomic analysis of microorganisms by direct extraction and cloning of DNA from an assemblage of microorganisms", and its importance stems from the fact that 99% or more of all microbes are deemed unculturable. If we take a genome to be the entire genetic information of a single organism, then a metagenome can be defined as the entire genetic information of an ensemble of organisms living in a common habitat.
The aim of metagenomics is to understand the genetic diversity of a metagenome, ideally by identifying the species present and their relative abundances. Metagenomics promises to lead to the discovery of new genes that have useful applications in biotechnology and medicine [10]. One main technique in metagenomics is to apply large-scale random shotgun sequencing. A number of recent projects use Sanger sequencing to create datasets in this way, for
Year: 2007
Venue: APBC
DocType: Conference
Citations: 1
PageRank: 0.49
References: 1
Authors: 4

Name                  Order  Citations  PageRank
Daniel H. Huson       1      765        91.20
Alexander F. Auch     2      114        7.38
Qi Ji                 3      1          0.83
Stephan C. Schuster   4      120        11.47