GenStore: In-Storage Filtering of Genomic Data for High-Performance and Energy-Efficient Genome Analysis - Citegraph

Paper Info

Title
GenStore: In-Storage Filtering of Genomic Data for High-Performance and Energy-Efficient Genome Analysis

Abstract
Genome sequence analysis, which analyzes the DNA sequences of organisms, is important for many applications in personalized medicine [1]–[8], outbreak tracing [9]–[14], and evolutionary studies [15]–[21]. The information of an organism's DNA is converted to digital data via a process called sequencing. A sequencing machine extracts the sequences of DNA molecules from the organism's sample in the form of strings consisting of four base pairs (bps), denoted by A, C, G, and T. No current sequencing technology has the capability to read a human DNA molecule in its entirety. Instead, state-of-the-art sequencing machines generate randomly sampled, inexact sub-strings of the original genome, called reads. The information about the corresponding location of each read in the complete genome is lost during sequencing in most technologies. State-of-the-art sequencing machines produce one of two kinds of reads. 1) Short read sequencing technologies, such as Illumina [22], [23], produce reads that are highly accurate (99-99.9%) [24]–[26], but short (e.g., up to a few hundred DNA base pairs [24], [27], [28]). 2) Long read sequencing technologies, such as Pacific Biosciences (PacBio) [29] and Oxford Nanopore Technologies (ONT) [30], produce reads that are less accurate (85-90%) [27], [31]–[33], but long (e.g., lengths ranging from thousands to millions of base pairs [34]).

Year	DOI	Venue
2022	10.1109/ISVLSI54635.2022.00062	2022 IEEE Computer Society Annual Symposium on VLSI (ISVLSI)
Keywords	DocType	ISSN
Near Data Processing,Read Mapping,Filtering,Genomics,Storage	Conference	2159-3469
ISBN	Citations	PageRank
978-1-6654-6606-6	0	0.34
References	Authors
44	14

Authors (14 rows)

Name	Order	Citations	PageRank
Nika Mansouri Ghiasi	1	40	2.38
Jisung Park	2	40	6.96
Harun Mustafa	3	0	0.34
Jeremie Kim	4	4	0.69
Ataberk Olgun	5	14	3.47
Arvid Gollwitzer	6	0	0.34
Damla Senol Cali	7	0	0.34
Can Firtina	8	0	0.34
Haiyu Mao	9	0	0.68
Nour Almadhoun Alserr	10	4	1.03
Rachata Ausavarungnirun	11	780	29.88
Nandita Vijaykumar	12	146	7.55
Mohammed Alser	13	17	3.19
Onur Mutlu	14	9446	357.40
