Title
GPU-accelerated adaptive compression framework for genomics data
Abstract
Genomics data is being produced at an unprecedented rate, driven in particular by clinical applications and grand-challenge research questions. Genomics research involves various types of data, most of which are stored as plain-text tables. This paper introduces a data compression framework tailored to this file type, combining generic compression algorithms, GPU acceleration, and column-major storage. The approach is the first to achieve both compression and decompression rates of around 100 MB/s on commodity hardware without compromising compression ratio. By selecting an appropriate compression scheme for each column of data, the framework exploits data redundancy efficiently while remaining applicable to a wide range of formats. The GPU-accelerated implementation also exploits the parallelism inherent in the compression algorithms. Finally, the paper presents a novel transformation based on a first-order Markov model, with evidence that it is at least as effective as the Burrows-Wheeler and Move-To-Front transformations in some contexts.
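The column-major, per-column scheme selection described in the abstract can be illustrated with a minimal CPU-only sketch. It is not the paper's implementation: the codec set (zlib, bz2, lzma from the Python standard library), the "smallest output wins" selection rule, and the function names `to_columns`, `compress_table`, and `decompress_table` are all assumptions made for illustration, and GPU acceleration and the Markov-model transformation are not modelled.

```python
import bz2
import lzma
import zlib

# Hypothetical codec table: these generic codecs stand in for whatever
# schemes the framework actually uses; the abstract does not name them.
CODECS = {
    "zlib": (zlib.compress, zlib.decompress),
    "bz2": (bz2.compress, bz2.decompress),
    "lzma": (lzma.compress, lzma.decompress),
}


def to_columns(text, sep="\t"):
    """Split a non-empty plain-text table into one byte string per column."""
    rows = [line.split(sep) for line in text.splitlines()]
    width = len(rows[0])
    if any(len(row) != width for row in rows):
        raise ValueError("all rows must have the same number of fields")
    # zip(*rows) transposes row-major fields into column-major order.
    return ["\n".join(col).encode("utf-8") for col in zip(*rows)]


def compress_table(text, sep="\t"):
    """Compress each column with whichever codec gives the smallest output."""
    packed = []
    for col in to_columns(text, sep):
        name, blob = min(
            ((n, comp(col)) for n, (comp, _) in CODECS.items()),
            key=lambda item: len(item[1]),
        )
        packed.append((name, blob))
    return packed


def decompress_table(packed, sep="\t"):
    """Invert compress_table: decode each column, then re-interleave rows."""
    cols = [
        CODECS[name][1](blob).decode("utf-8").split("\n")
        for name, blob in packed
    ]
    return "\n".join(sep.join(row) for row in zip(*cols))
```

Storing each column separately groups similar values together (for example, repeated chromosome names or slowly increasing positions), which is what lets a per-column codec choice pay off. Calling `decompress_table(compress_table(text))` on a rectangular tab-separated table with newline-terminated rows and no trailing newline returns the original text.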
Year: 2013
DOI: 10.1109/BigData.2013.6691572
Venue: BigData Conference
Keywords: parallelism, graphics processing unit, compression ratio, commodity hardware, gpu, column-major storage, compression algorithms, data compression framework, big data, decompression rate, genomics data, markov model, genomics, gpu-accelerated adaptive compression framework, graphics processing units, generic compression algorithms, gpu acceleration, data compression, genomics research, first-order markov model based transformation, biology computing, parallel algorithm, burrows-wheeler transformation, markov processes, compression rate, move-to-front transformation
Field: File format, Data mining, Lossy compression, Computer science, S3 Texture Compression, Theoretical computer science, Data redundancy, Compression ratio, Data type, Data compression, Computer engineering, Lossless compression
DocType: Conference
Volume: null
Issue: null
ISSN: 2639-1589
Citations: 4
PageRank: 0.47
References: 12
Authors: 8
Name              Order   Citations   PageRank
Guo GuiXin        1       4           0.47
Shuang Qiu        2       32          7.78
Ye Zhiqiang       3       6           1.89
Bingqiang Wang    4       125         10.45
Fang Lin          5       4           0.81
Mian Lu           6       656         29.18
See Simon         7       4           0.47
Rui Mao           8       368         41.23