Title
A self-adjusting data distribution mechanism for multidimensional load balancing in multiprocessor-based database systems
Abstract
With advances in microprocessor, memory, and communication technology, it has become economically feasible to build parallel database computer systems that improve the performance of database systems. Relations in such an environment are usually partitioned and distributed across computing units. To achieve optimal performance, it is essential for each unit to have a perfectly balanced load (i.e., an identical amount of data). However, fragment sizes may vary due to insertions to and deletions from a relation. To retain good performance, the system needs to periodically rebalance the load of the processors by redistributing data among computing units. Traditionally, the redistribution is performed by reshuffling tuples among processors through a relation repartitioning (e.g., rehashing) process, which operates at the tuple level. In this paper, we present a self-adjusting data distribution scheme which balances workload at the cell level (a coarser grain than the tuple) during query processing to minimize redistribution cost. The entire scheme is built on top of the popular grid file structure. The adaptivity of the scheme and its relevant features are discussed, and the cost of load rebalancing is estimated. The results show that, under our assumptions, it is always beneficial to rebalance workload before performing a join on skewed data.
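The difference between tuple-level and cell-level redistribution can be illustrated with a minimal sketch. This is not the scheme from the paper: it assumes each processor holds a set of grid-file cells with known tuple counts, and it uses a simple greedy rule (move one whole cell from the most loaded to the least loaded processor while that narrows the gap). The names `rebalance_cells` and `assignment` are hypothetical.

```python
from typing import Dict, List, Tuple


def rebalance_cells(assignment: Dict[int, Dict[str, int]]) -> List[Tuple[str, int, int]]:
    """Greedy cell-level rebalancing (illustrative assumption, not the paper's algorithm).

    `assignment` maps processor id -> {cell id: tuple count} and is updated in place.
    Returns the list of moves as (cell id, source processor, destination processor).
    """
    moves: List[Tuple[str, int, int]] = []
    while True:
        # Load of a processor = total number of tuples in the cells it holds.
        loads = {p: sum(cells.values()) for p, cells in assignment.items()}
        src = max(loads, key=loads.get)
        dst = min(loads, key=loads.get)
        gap = loads[src] - loads[dst]
        # Moving a cell of size s turns the gap into |gap - 2s|,
        # so only cells with 0 < s < gap reduce the imbalance.
        candidates = [(c, s) for c, s in assignment[src].items() if 0 < s < gap]
        if not candidates:
            return moves
        # Prefer the cell whose size is closest to half the gap.
        cell, size = min(candidates, key=lambda cs: abs(gap - 2 * cs[1]))
        del assignment[src][cell]
        assignment[dst][cell] = size
        moves.append((cell, src, dst))
```

Because whole cells are moved, the bookkeeping is per cell rather than per tuple, which is the cost saving the abstract points to. A perfectly even split is not guaranteed with this rule; it depends on the cell sizes.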
Year: 1994
DOI: 10.1016/0306-4379(94)90014-0
Venue: Inf. Syst.
Keywords: data skew, load balancing, multiprocessor-based database system, parallel query processing, self-adjusting data distribution mechanism, grid file, multidimensional load balancing, load balance, database system
Field: Data mining, Parallel database, Computer science, Grid file, Redistribution (cultural anthropology), Computation, Distributed computing, Tuple, Workload, Load balancing (computing), Parallel computing, Multiprocessing, Database
DocType: Journal
Volume: 19
Issue: 7
ISSN: Information Systems
Citations: 6
PageRank: 0.82
References: 19
Authors
2
Name
Order
Citations
PageRank
Chiang Lee 1 294149.40
Kien A Hua 2 2870425.79