Title
Adaptive Aggregation on Chip Multiprocessors
Abstract
The recent introduction of commodity chip multiprocessors requires that the design of core database operations be care- fully examined to take full advantage of on-chip parallelism. In this paper we examine aggregation in a multi-core en- vironment, the Sun UltraSPARC T1, a chip multiproces- sor with eight cores and a shared L2 cache. Aggregation is an important aspect of query processing that is seemingly easy to understand and implement. Our research, however, demonstrates that a chip multiprocessor adds new dimen- sions to understanding hash-based aggregation performance— concurrent sharing of aggregation data structures and con- tentious accesses to frequently used values. We also iden- tify a trade off between private data structures assigned to each thread versus shared data structures for aggrega- tion. Depending on input characteristics, different aggrega- tion strategies are optimal and choosing the wrong strategy can result in a performance penalty of over an order of mag- nitude. We provide a thorough explanation of the factors af- fecting aggregation performance on chip multiprocessors and identify three key input characteristics that dictate perfor- mance: (1) average run length of identical group-by values, (2) locality of references to the aggregation hash table, and (3) frequency of repeated accesses to the same hash table location. We then introduce an adaptive aggregation op- erator that performs lightweight sampling of the input to choose the correct aggregation strategy with high accuracy. Our experiments verify that our adaptive algorithm chooses the highest performing aggregation strategy on a number of common input distributions.
Year
Venue
Keywords
2007
VLDB
aggregation performance,adaptive aggregation operator,correct aggregation strategy,aggregation data structure,chip multiprocessors,hash-based aggregation performance,different aggregation strategy,aggregation strategy,chip multiprocessor,aggregation hash table,hash table,data structure,chip
Field
DocType
Citations 
Data structure,Locality,CPU cache,Computer science,Parallel computing,Multiprocessing,Hash function,Adaptive algorithm,Database,Distributed computing,Hash table,UltraSPARC
Conference
60
PageRank 
References 
Authors
2.93
17
2
Name
Order
Citations
PageRank
John Cieslewicz133519.95
Kenneth A. Ross24110750.80