Title
Fast Set Intersection through Run-Time Bitmap Construction over PForDelta-Compressed Indexes.
Abstract
Set intersection is a fundamental operation for evaluating conjunctive queries in the context of scientific data analysis. The state-of-the-art approach in performing set intersection, compressed bitmap indexing, achieves high computational efficiency because of cheap bitwise operations; however, overall efficiency is often nullified by the HPC I/O bottleneck, because compressed bitmap indexes typically exhibit a heavy storage footprint. Conversely, the recently-presented PForDelta-compressed index has been demonstrated to be storage-lightweight, but has limited performance for set intersection. Thus, a more effective set intersection approach should be efficient in both computation and I/O. Therefore, we propose a fast set intersection approach that couples the storage light-weight PForDelta indexing format with computationally-efficient bitmaps through a specialized on-the-fly conversion. The resultant challenge is to ensure this conversion process is fast enough to maintain the performance gains from both PForDelta and the bitmaps. To this end, we contribute two key enhancements to PForDelta, BitRun and BitExp, which improve bitmap conversion through bulk bit-setting and a more streamlined PForDelta decoding process, respectively. Our experimental results show that our integrated PForDelta-bitmap method speeds up conjunctive queries by up to 7.7x versus the state-of-the-art approach, while using indexes that require 15%-60% less storage in most cases.
Year
DOI
Venue
2014
10.1007/978-3-319-09873-9_56
Lecture Notes in Computer Science
Field
DocType
Volume
Inverted index,Intersection (set theory),Bottleneck,Bitmap index,Conjunctive query,Bitwise operation,Computer science,Parallel computing,Bitmap,Computation
Conference
8632
ISSN
Citations 
PageRank 
0302-9743
3
0.38
References 
Authors
11
7
Name
Order
Citations
PageRank
Xiaocheng Zou1645.90
Sriram Lakshminarasimhan218710.01
David A. Boyuka II3825.52
Stephen Ranshous4142.69
Houjun Tang55315.97
Scott Klasky6154799.00
Nagiza F. Samatova786174.04