Title
Autonomously improving query evaluations over multidimensional data in distributed hash tables
Abstract
The proliferation of observational devices and sensors with networking capabilities has led to growth in both the rates and sources of data that ultimately contribute to extreme-scale data volumes. Datasets generated in such settings are often multidimensional, with each dimension accounting for a feature of interest. We posit that efficient evaluation of queries over such datasets must account for both the distribution of data values and the patterns in the queries themselves. Configuring query evaluation by hand is infeasible given the data volumes, dimensionality, and the rates at which new data and queries arrive. In this paper, we describe our algorithm to autonomously improve query evaluations over voluminous, distributed datasets. Our approach autonomously tunes for the most dominant query patterns and distribution of values across a dimension. We evaluate our algorithm in the context of our system, Galileo, which is a hierarchical distributed hash table used for managing geospatial, time-series data. Our system strikes a balance between memory utilization, fast evaluations, and search space reductions. Empirical evaluations reported here are performed on a dataset that is multidimensional and comprises a billion files. The schemes described in this work are broadly applicable to any system that leverages distributed hash tables as a storage mechanism.
Year
2013
DOI
10.1145/2494621.2494638
Venue
CAC
Keywords
time-series data, multidimensional data, hash table, query evaluation, approach autonomously tune, dominant query pattern, data value, new data, dimension accounting, data volume, efficient evaluation
Field
Geospatial analysis, Data mining, Galileo (satellite navigation), Computer science, Curse of dimensionality, Hash function, Distributed hash table, Hash table
DocType
Conference
Citations
6
PageRank
0.48
References
17
Authors
3
Name | Order | Citations | PageRank
Matthew Malensek | 1 | 93 | 10.44
Sangmi Lee Pallickara | 2 | 170 | 24.46
Shrideep Pallickara | 3 | 837 | 92.72