Title
DBDesigner: A customizable physical design tool for Vertica Analytic Database.
Abstract
In this paper, we present Vertica's customizable physical design tool, called the DBDesigner (DBD), that produces designs optimized for various scenarios and applications. For a given workload and space budget, DBD automatically recommends a physical design that optimizes query performance, storage footprint, fault tolerance, and recovery to meet different customer requirements. Vertica is a distributed, massively parallel columnar database that physically organizes data into projections. Projections are attribute subsets from one or more tables, with tuples sorted by one or more attributes, that are either replicated or segmented (distributed) across cluster nodes. The key challenges in projection design are picking appropriate column sets, sort orders, cluster data distributions, and column encodings. To achieve the desired trade-off between query performance and storage footprint, DBD operates under one of three design policies: (a) load-optimized, (b) query-optimized, or (c) balanced. These policies indirectly control the number of projections proposed and the number of queries optimized. To cater to query workloads that evolve over time, DBD operates in both a comprehensive and an incremental design mode. In addition, DBD lets users override specific features of projection design based on their intimate knowledge of the data and query workloads. We present the complete physical design algorithm, describing in detail how projection candidates are efficiently explored and evaluated using the optimizer's cost and benefit model. Our experimental results show that DBD produces good physical designs that satisfy a variety of customer use cases.
Year
2014
DOI
10.1109/ICDE.2014.6816725
Venue
ICDE
Keywords
parallel databases, query processing, DBD, DBDesigner tool, Vertica analytic database, balanced design policy, customer requirements, customer use case, customizable physical design tool, distributed massively parallel columnar database, fault recovery, fault tolerance, incremental design mode, load-optimized design policy, projection candidates, projection design, query performance, query workloads, query-optimized design policy, storage footprint
Field
Data mining, Algorithm design, Use case, Computer science, Massively parallel, Tuple, Sort, Fault tolerance, Distributed database, Physical design, Database, Distributed computing
DocType
Conference
ISSN
1084-4627
Citations
1
PageRank
0.35
References
0
Authors
5
Name                     Order  Citations  PageRank
Ramakrishna Varadarajan  1      195        10.66
Vivek Bharathan          2      2          1.03
Ariel Cary               3      154        7.44
Jaimin Dave              4      12         2.25
Sreenath Bodagala        5      117        11.70