Title | ||
---|---|---|
GenStore: In-Storage Filtering of Genomic Data for High-Performance and Energy-Efficient Genome Analysis |
Abstract | ||
---|---|---|
Genome sequence analysis, which analyzes the DNA sequences of organisms, is important for many applications in personalized medicine [1]–[8], outbreak tracing [9]–[14], and evolutionary studies [15]–[21]. The information of an organism's DNA is converted to digital data via a process called sequencing. A sequencing machine extracts the sequences of DNA molecules from the organism's sample in the form of strings consisting of four base pairs (bps), denoted by A, C, G, and T. No current sequencing technology has the capability to read a human DNA molecule in its entirety. Instead, state-of-the-art sequencing machines generate randomly sampled, inexact sub-strings of the original genome, called reads. The information about the corresponding location of each read in the complete genome is lost during sequencing in most technologies. State-of-the-art sequencing machines produce one of two kinds of reads. 1) Short read sequencing technologies, such as Illumina [22], [23], produce reads that are highly accurate (99-99.9%) [24]–[26], but short (e.g., up to a few hundred DNA base pairs [24], [27], [28]). 2) Long read sequencing technologies, such as Pacific Biosciences (PacBio) [29] and Oxford Nanopore Technologies (ONT) [30], produce reads that are less accurate (85-90%) [27], [31]–[33], but long (e.g., lengths ranging from thousands to millions of base pairs [34]). |
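
The abstract's description of reads as randomly sampled, inexact sub-strings of the genome can be illustrated with a small simulation. This is a minimal sketch and not part of GenStore: the function name `sample_read`, the synthetic genome, and the chosen read lengths and error rates are illustrative assumptions, and only substitution errors are modeled (real sequencers also introduce insertions and deletions).

```python
import random

BASES = "ACGT"


def sample_read(genome, read_len, error_rate, rng):
    """Return (start, read): a randomly placed substring with substitution errors.

    Hypothetical helper for illustration only; models substitutions, not indels.
    """
    start = rng.randrange(len(genome) - read_len + 1)
    read = []
    for base in genome[start:start + read_len]:
        if rng.random() < error_rate:
            # Replace the base with one of the three other bases.
            base = rng.choice([b for b in BASES if b != base])
        read.append(base)
    return start, "".join(read)


rng = random.Random(0)
# Synthetic stand-in for a reference genome (assumption, not real data).
genome = "".join(rng.choice(BASES) for _ in range(10_000))

# Short read: a few hundred bases, roughly 99.9% accurate.
short_start, short_read = sample_read(genome, 150, 0.001, rng)
# Long read: thousands of bases, roughly 90% accurate.
long_start, long_read = sample_read(genome, 5_000, 0.10, rng)
```

The start position is returned here only so the simulation can be checked; as the abstract notes, most sequencing technologies lose that location, which is why reads must later be mapped back to a reference.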
Year | DOI | Venue |
---|---|---|
2022 | 10.1109/ISVLSI54635.2022.00062 | 2022 IEEE Computer Society Annual Symposium on VLSI (ISVLSI) |
Keywords | DocType | ISSN |
---|---|---|
Near Data Processing,Read Mapping,Filtering,Genomics,Storage | Conference | 2159-3469 |
ISBN | Citations | PageRank |
---|---|---|
978-1-6654-6606-6 | 0 | 0.34 |
References | Authors |
---|---|
44 | 14 |
Name | Order | Citations | PageRank |
---|---|---|---|
Nika Mansouri Ghiasi | 1 | 40 | 2.38 |
Jisung Park | 2 | 40 | 6.96 |
Harun Mustafa | 3 | 0 | 0.34 |
Jeremie Kim | 4 | 4 | 0.69 |
Ataberk Olgun | 5 | 14 | 3.47 |
Arvid Gollwitzer | 6 | 0 | 0.34 |
Damla Senol Cali | 7 | 0 | 0.34 |
Can Firtina | 8 | 0 | 0.34 |
Haiyu Mao | 9 | 0 | 0.68 |
Nour Almadhoun Alserr | 10 | 4 | 1.03 |
Rachata Ausavarungnirun | 11 | 780 | 29.88 |
Nandita Vijaykumar | 12 | 146 | 7.55 |
Mohammed Alser | 13 | 17 | 3.19 |
Onur Mutlu | 14 | 9446 | 357.40 |