Title
Optimization of alignment-based methods for taxonomic binning of metagenomics reads.
Abstract
Motivation: Alignment-based taxonomic binning for metagenome characterization proceeds in two steps: read mapping against a reference database (RDB) and taxonomic assignment according to the best hits. Beyond the sequencing technology and the completeness of the RDB, selecting the optimal workflow configuration (in particular the mapper parameters and the best hit selection threshold) to achieve the highest binning performance remains largely empirical. Results: We developed a statistical framework to perform this optimization at minimal computational cost. Using an optimization experimental design and simulated datasets for three sequencing technologies, we built accurate prediction models for five performance indicators and then derived the parameter configuration providing the optimal performance. Regardless of the mapper and the dataset, we observed that the optimal configuration yielded better performance than the default configuration and that the best hit selection threshold had a large impact on performance. Finally, on a reference dataset from the Human Microbiome Project, we confirmed that the optimized configuration increased performance compared with the default configuration. Availability and implementation: Not applicable.
Year
2016
DOI
10.1093/bioinformatics/btw040
Venue
BIOINFORMATICS
Field
Data mining, Performance indicator, Human Microbiome Project, Computer science, Reference database, Metagenomics, Hit selection, Bioinformatics, Predictive modelling, Completeness (statistics), Workflow
DocType
Journal
Volume
32
Issue
12
ISSN
1367-4803
Citations
0
PageRank
0.34
References
10
Authors
4
Name                      Order  Citations  PageRank
Magali Jaillard           1      7          1.59
Maud Tournoud             2      0          0.68
Faustine Meynier          3      0          0.34
Jean-Baptiste Veyrieras   4      34         4.19