Title
Multi-level Layout Optimization for Efficient Spatio-temporal Queries on ISABELA-compressed Data
Abstract
The size and scope of cutting-edge scientific simulations are growing much faster than the I/O subsystems of their runtime environments, not only making I/O the primary bottleneck, but also consuming space that pushes the storage capacities of many computing facilities. These problems are exacerbated by the need to perform data-intensive analytics applications, such as querying the dataset by variable and spatio-temporal constraints, for which current database technologies commonly build query indices larger than the raw data. To help solve these problems, we present a parallel query-processing engine that can handle both range queries and queries with spatio-temporal constraints on B-spline compressed data with user-controlled accuracy. Our method adapts to widening gaps between computation and I/O performance by querying on compressed metadata separated into bins by variable values, utilizing Hilbert space-filling curves to optimize for spatial constraints, and aggregating data access to improve locality of per-bin stored data, substantially reducing the false positive rate and latency-bound I/O operations (such as seeks). We show our method to be efficient with respect to storage, computation, and I/O compared to existing database technologies optimized for query processing on scientific data.
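The two layout ideas named in the abstract, binning by variable value and ordering each bin along a Hilbert space-filling curve, can be illustrated with a minimal sketch. This is not the authors' implementation: it works on a small uncompressed 2D grid, omits the B-spline (ISABELA) compression and the parallel engine entirely, and the names `hilbert_index`, `build_layout`, `query`, and `bin_width` are hypothetical, chosen only for illustration.

```python
from collections import defaultdict


def hilbert_index(order, x, y):
    """Distance of cell (x, y) along a Hilbert curve on a 2**order x 2**order grid."""
    n = 1 << order
    d = 0
    s = n >> 1
    while s > 0:
        rx = 1 if (x & s) else 0
        ry = 1 if (y & s) else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate/flip the quadrant so the sub-curve has the standard orientation.
        if ry == 0:
            if rx == 1:
                x = n - 1 - x
                y = n - 1 - y
            x, y = y, x
        s >>= 1
    return d


def build_layout(points, bin_width, order):
    """Group (x, y, value) points into equal-width value bins (an assumption),
    then sort each bin by Hilbert index so spatially close points sit together."""
    bins = defaultdict(list)
    for x, y, value in points:
        bins[int(value // bin_width)].append((hilbert_index(order, x, y), x, y, value))
    for entries in bins.values():
        entries.sort()
    return dict(bins)


def query(layout, bin_width, vmin, vmax, xmin, xmax, ymin, ymax):
    """Visit only bins overlapping [vmin, vmax], then filter candidates exactly."""
    hits = []
    for b in range(int(vmin // bin_width), int(vmax // bin_width) + 1):
        for _, x, y, value in layout.get(b, []):
            if vmin <= value <= vmax and xmin <= x <= xmax and ymin <= y <= ymax:
                hits.append((x, y, value))
    return hits
```

In this sketch, a value range query touches only the bins that overlap the range, and the Hilbert ordering inside each bin keeps spatially neighbouring cells close together in storage, which is the locality property the abstract relies on to reduce seeks.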
Year
2012
DOI
10.1109/IPDPS.2012.83
Venue
IPDPS
Keywords
multi-level layout optimization, existing database technology, I/O performance, aggregating data access, ISABELA-compressed data, scientific data, current database technology, cutting-edge scientific simulation, efficient spatio-temporal queries, I/O operation, spatio-temporal constraint, raw data, I/O subsystems, database indexing, computational modeling, Hilbert spaces, layout, data compression, organizations, bandwidth, indexes
Field
Data mining, Metadata, Bottleneck, Locality, Computer science, Parallel computing, Range query (data structures), Database index, Data compression, Analytics, Data access, Distributed computing
DocType
Conference
ISSN
1530-2075
Citations
3
PageRank
0.39
References
0
Authors
9
Name                      Order  Citations  PageRank
Zhenhuan Gong             1      351        13.71
Sriram Lakshminarasimhan  2      187        10.01
John Jenkins              3      3          0.39
Hemanth Kolla             4      250        17.13
Stephane Ethier           5      291        31.10
Jackie Chen               6      80         4.62
Robert Ross               7      2717       173.13
Scott Klasky              8      1547       99.00
Nagiza F. Samatova        9      861        74.04