Title | ||
---|---|---|
GateKeeper: a new hardware architecture for accelerating pre-alignment in DNA short read mapping. |
Abstract | ||
---|---|---|
Motivation: High throughput DNA sequencing (HTS) technologies generate an excessive number of small DNA segments - called short reads- that cause significant computational burden. To analyze the entire genome, each of the billions of short reads must be mapped to a reference genome based on the similarity between a read and 'candidate' locations in that reference genome. The similarity measurement, called alignment, formulated as an approximate string matching problem, is the computational bottleneck because: (i) it is implemented using quadratic-time dynamic programming algorithms and (ii) the majority of candidate locations in the reference genome do not align with a given read due to high dissimilarity. Calculating the alignment of such incorrect candidate locations consumes an overwhelming majority of a modern read mapper's execution time. Therefore, it is crucial to develop a fast and effective filter that can detect incorrect candidate locations and eliminate them before invoking computationally costly alignment algorithms. Results: We propose GateKeeper, a new hardware accelerator that functions as a pre-alignment step that quickly filters out most incorrect candidate locations. GateKeeper is the first design to accelerate pre-alignment using Field-Programmable Gate Arrays (FPGAs), which can perform pre-alignment much faster than software. When implemented on a single FPGA chip, GateKeeper maintains high accuracy (on average >96%) while providing, on average, 90-fold and 130-fold speedup over the state-of-the-art software pre-alignment techniques, Adjacency Filter and Shifted Hamming Distance (SHD), respectively. The addition of GateKeeper as a pre-alignment step can reduce the verification time of the mrFAST mapper by a factor of 10. |
Year | DOI | Venue |
---|---|---|
2017 | 10.1093/bioinformatics/btx342 | BIOINFORMATICS |
Field | DocType | Volume |
Bottleneck,Computer science,Hamming distance,Software,Approximate string matching,Hardware acceleration,Bioinformatics,Reference genome,Hardware architecture,Speedup | Journal | 33 |
Issue | ISSN | Citations |
21 | 1367-4803 | 12 |
PageRank | References | Authors |
0.48 | 25 | 6 |
Name | Order | Citations | PageRank |
---|---|---|---|
Mohammed H. Alser | 1 | 52 | 5.34 |
Hasan Hassan | 2 | 352 | 17.76 |
Hongyi Xin | 3 | 97 | 5.13 |
Oguz Ergin | 4 | 424 | 25.84 |
Onur Mutlu | 5 | 9446 | 357.40 |
Can Alkan | 6 | 312 | 26.92 |