Title | ||
---|---|---|
Multi-level Layout Optimization for Efficient Spatio-temporal Queries on ISABELA-compressed Data |
Abstract | ||
---|---|---|
The size and scope of cutting-edge scientific simulations are growing much faster than the I/O subsystems of their runtime environments, not only making I/O the primary bottleneck, but also consuming space that pushes the storage capacities of many computing facilities. These problems are exacerbated by the need to run data-intensive analytics, such as querying the dataset by variable and spatio-temporal constraints, for which current database technologies commonly build query indices larger than the raw data. To help solve these problems, we present a parallel query-processing engine that can handle both range queries and queries with spatio-temporal constraints, on B-spline compressed data with user-controlled accuracy. Our method adapts to widening gaps between computation and I/O performance by querying on compressed metadata separated into bins by variable values, utilizing Hilbert space-filling curves to optimize for spatial constraints, and aggregating data access to improve locality of per-bin stored data, substantially reducing both the false-positive rate and latency-bound I/O operations (such as seeks). We show our method to be efficient with respect to storage, computation, and I/O compared to existing database technologies optimized for query processing on scientific data. |
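
The two layout ideas named in the abstract, binning by variable value and ordering each bin along a Hilbert space-filling curve, can be illustrated with a minimal sketch. This is not the paper's implementation: it assumes a 2D grid whose side `n` is a power of two, stores uncompressed values (no ISABELA B-spline step), and the names `hilbert_index`, `build_layout`, and `range_query` are hypothetical.

```python
from bisect import bisect_right
from collections import defaultdict


def hilbert_index(n, x, y):
    """Map cell (x, y) of an n-by-n grid (n a power of two) to its
    position along a Hilbert curve, using the standard bit-by-bit method."""
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if (x & s) else 0
        ry = 1 if (y & s) else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate/flip the quadrant so the next level is oriented correctly.
        if ry == 0:
            if rx == 1:
                x = n - 1 - x
                y = n - 1 - y
            x, y = y, x
        s //= 2
    return d


def build_layout(points, bin_edges, n):
    """Group (x, y, value) points into value bins, then order each bin
    by Hilbert index so spatially close points sit close together."""
    bins = defaultdict(list)
    for x, y, value in points:
        b = bisect_right(bin_edges, value)  # bin id from sorted edges
        bins[b].append((hilbert_index(n, x, y), x, y, value))
    return {b: sorted(entries) for b, entries in bins.items()}


def range_query(layout, bin_edges, lo, hi):
    """Return points with lo <= value <= hi, scanning only the bins
    whose value interval can overlap [lo, hi]."""
    first = bisect_right(bin_edges, lo)
    last = bisect_right(bin_edges, hi)
    hits = []
    for b in range(first, last + 1):
        for _, x, y, value in layout.get(b, []):
            if lo <= value <= hi:  # filter false positives in edge bins
                hits.append((x, y, value))
    return hits


if __name__ == "__main__":
    # Toy 4x4 grid with made-up values; edges split values into 4 bins.
    pts = [(x, y, float(x + 4 * y)) for x in range(4) for y in range(4)]
    edges = [4.0, 8.0, 12.0]
    layout = build_layout(pts, edges, 4)
    print(range_query(layout, edges, 5.0, 9.0))
```

Only the bins overlapping the queried value range are scanned, and within a bin the Hilbert ordering keeps neighboring grid cells near each other in storage, which is the locality property the abstract relies on to cut seek-type operations.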
Year | DOI | Venue |
---|---|---|
2012 | 10.1109/IPDPS.2012.83 | IPDPS |
Keywords | Field | DocType |
multi-level layout optimization,existing database technology,i/o performance,aggregating data access,isabela-compressed data,scientific data,current database technology,cutting-edge scientific simulation,efficient spatio-temporal queries,i/o operation,spatio-temporal constraint,raw data,i/o subsystems,database indexing,computational modeling,hilbert spaces,layout,data compression,organizations,bandwidth,indexes | Data mining,Metadata,Bottleneck,Locality,Computer science,Parallel computing,Range query (data structures),Database index,Data compression,Analytics,Data access,Distributed computing | Conference |
ISSN | Citations | PageRank |
1530-2075 | 3 | 0.39 |
References | Authors | |
0 | 9 |
Name | Order | Citations | PageRank |
---|---|---|---|
Zhenhuan Gong | 1 | 351 | 13.71 |
Sriram Lakshminarasimhan | 2 | 187 | 10.01 |
John Jenkins | 3 | 3 | 0.39 |
Hemanth Kolla | 4 | 250 | 17.13 |
Stephane Ethier | 5 | 291 | 31.10 |
Jackie Chen | 6 | 80 | 4.62 |
Robert Ross | 7 | 2717 | 173.13 |
Scott Klasky | 8 | 1547 | 99.00 |
Nagiza F. Samatova | 9 | 861 | 74.04 |