Title | ||
---|---|---|
SeGraM: a universal hardware accelerator for genomic sequence-to-graph and sequence-to-sequence mapping |
Abstract | ||
---|---|---|
A critical step of genome sequence analysis is the mapping of sequenced DNA fragments (i.e., reads) collected from an individual to a known linear reference genome sequence (i.e., sequence-to-sequence mapping). Recent works replace the linear reference sequence with a graph-based representation of the reference genome, which captures the genetic variations and diversity across many individuals in a population. Mapping reads to the graph-based reference genome (i.e., sequence-to-graph mapping) results in notable quality improvements in genome analysis. Unfortunately, while sequence-to-sequence mapping is well studied with many available tools and accelerators, sequence-to-graph mapping is a more difficult computational problem, with a much smaller number of practical software tools currently available. We analyze two state-of-the-art sequence-to-graph mapping tools and reveal four key issues. We find that there is a pressing need to have a specialized, high-performance, scalable, and low-cost algorithm/hardware co-design that alleviates bottlenecks in both the seeding and alignment steps of sequence-to-graph mapping. Since sequence-to-sequence mapping can be treated as a special case of sequence-to-graph mapping, we aim to design an accelerator that is efficient for both linear and graph-based read mapping. To this end, we propose SeGraM, a universal algorithm/hardware co-designed genomic mapping accelerator that can effectively and efficiently support both <u>se</u>quence-to-<u>gra</u>ph <u>m</u>apping and sequence-to-sequence mapping, for both short and long reads. To our knowledge, SeGraM is the first algorithm/hardware co-design for accelerating sequence-to-graph mapping. SeGraM consists of two main components: (1) MinSeed, the first <u>min</u>imizer-based <u>seed</u>ing accelerator, which finds the candidate locations in a given genome graph; and (2) BitAlign, the first <u>bit</u>vector-based sequence-to-graph <u>align</u>ment accelerator, which performs alignment between a given read and the subgraph identified by MinSeed. We couple SeGraM with high-bandwidth memory to exploit low latency and highly-parallel memory access, which alleviates the memory bottleneck. We demonstrate that SeGraM provides significant improvements for multiple steps of the sequence-to-graph (i.e., S2G) and sequence-to-sequence (i.e., S2S) mapping pipelines. First, SeGraM outperforms state-of-the-art S2G mapping tools by 5.9×/3.9× and 106×/- 742× for long and short reads, respectively, while reducing power consumption by 4.1×/4.4× and 3.0×/3.2×. Second, BitAlign outperforms a state-of-the-art S2G alignment tool by 41×-539× and three S2S alignment accelerators by 1.2×-4.8×. We conclude that SeGraM is a high-performance and low-cost universal genomics mapping accelerator that efficiently supports both sequence-to-graph and sequence-to-sequence mapping pipelines. |
Year | DOI | Venue |
---|---|---|
2022 | 10.1145/3470496.3527436 | ISCA: International Symposium on Computer Architecture |
Keywords | DocType | ISSN |
genomics, genome analysis, genome graphs, read mapping, algorithm/hardware co-design, hardware accelerator, read alignment, seeding, minimizer, bitvector | Conference | 1063-6897 |
Citations | PageRank | References |
4 | 0.36 | 13 |
Authors | ||
18 |
Name | Order | Citations | PageRank |
---|---|---|---|
Damla Senol Cali | 1 | 34 | 3.32 |
Konstantinos Kanellopoulos | 2 | 28 | 2.72 |
Joel Lindegger | 3 | 4 | 0.36 |
Zülal Bingöl | 4 | 19 | 1.54 |
Gurpreet S. Kalsi | 5 | 4 | 0.36 |
Ziyi Zuo | 6 | 4 | 0.36 |
Can Firtina | 7 | 9 | 1.74 |
Meryem Banu Cavlak | 8 | 4 | 0.36 |
Jeremie Kim | 9 | 4 | 0.69 |
Nika Mansouri Ghiasi | 10 | 40 | 2.38 |
Gagandeep Singh | 11 | 4 | 1.03 |
Juan Gómez-Luna | 12 | 22 | 3.88 |
Nour Almadhoun Alserr | 13 | 4 | 1.03 |
Mohammed Alser | 14 | 17 | 3.19 |
Sreenivas Subramoney | 15 | 4 | 0.69 |
Can Alkan | 16 | 4 | 0.69 |
Saugata Ghose | 17 | 718 | 36.45 |
Onur Mutlu | 18 | 9446 | 357.40 |