Title
Mapping reads on a genomic sequence: an algorithmic overview and a practical comparative analysis.
Abstract
Mapping short reads against a reference genome is classically the first step of many next-generation sequencing data analyses, and it should be as accurate as possible. Because of the large number of reads to handle, numerous sophisticated algorithms have been developed in the last 3 years to tackle this problem. In this article, we first review the underlying algorithms used in most of the existing mapping tools, and then we compare the performance of nine of these tools on a well-controlled benchmark built for this purpose. We built a set of reads that occur in single or multiple copies in a reference genome without any mismatch, and a set of reads with three mismatches. As reference genomes, we considered both the human genome and a concatenation of all complete bacterial genomes. On each dataset, we quantified the capacity of the different tools to retrieve all the occurrences of the reads in the reference genome. Special attention was paid to reads uniquely reported and to reads with multiple hits.
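The benchmark idea described in the abstract (knowing every true occurrence of a read, then checking how many of them a tool reports) can be illustrated with a small sketch. This is not the authors' code or any mapper's algorithm: it is a naive Hamming-distance scan on a toy sequence, and the names `hamming_hits`, `classify`, and `recall` are hypothetical helpers introduced only for illustration.

```python
def hamming_hits(reference: str, read: str, max_mismatches: int = 0) -> list[int]:
    """Return 0-based start positions where `read` matches `reference`
    with at most `max_mismatches` substitutions (no insertions/deletions).
    Naive O(n*m) scan, meant as ground truth for toy data only."""
    hits = []
    n, m = len(reference), len(read)
    for start in range(n - m + 1):
        mismatches = 0
        for ref_base, read_base in zip(reference[start:start + m], read):
            if ref_base != read_base:
                mismatches += 1
                if mismatches > max_mismatches:
                    break  # too many substitutions at this position
        else:
            # Loop finished without exceeding the mismatch budget.
            hits.append(start)
    return hits


def classify(hits: list[int]) -> str:
    """Label a read by its number of occurrences in the reference."""
    if not hits:
        return "unmapped"
    return "unique" if len(hits) == 1 else "multiple"


def recall(expected: set[int], reported: set[int]) -> float:
    """Fraction of true occurrences that a tool reported."""
    if not expected:
        return 1.0
    return len(expected & reported) / len(expected)


# Toy reference and read (assumed data, not from the paper).
reference = "ACGTACGTTTACGA"
read = "ACGT"

exact = hamming_hits(reference, read, max_mismatches=0)
relaxed = hamming_hits(reference, read, max_mismatches=1)
print(exact, classify(exact))   # [0, 4] multiple
print(relaxed)                  # [0, 4, 10]
print(recall(set(exact), {0}))  # 0.5: a tool reporting only position 0
```

A real evaluation would parse each tool's output (for example, alignment positions from its report file) into the `reported` set. The scan above only stands in for the known ground truth on small inputs.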
Year: 2012
DOI: 10.1089/cmb.2012.0022
Venue: JOURNAL OF COMPUTATIONAL BIOLOGY
Keywords: NGS, benchmarking, short read alignment, Burrows-Wheeler Transform, suffix tree, suffix array, hashing, spaced seeds
Field: Data mining, Hybrid genome assembly, Burrows–Wheeler transform, Genomics, Suffix array, Concatenation, Human genome, Bioinformatics, Bacterial genome size, Mathematics, Reference genome
DocType:
Journal
Volume: 19
Issue: 6
ISSN: 1066-5277
Citations: 16
PageRank: 0.73
References: 13
Authors (6)
Name                  Order  Citations  PageRank
S Schbath             1      303        40.02
Véronique Martin      2      29         3.10
Matthias Zytnicki     3      166        8.92
Julien Fayolle        4      21         1.84
Valentin Loux         5      16         2.09
Jean-François Gibrat  6      126        6.08