Abstract |
---|
Big data systems help organizations store, manipulate, and derive value from vast amounts of data. Relational databases and MapReduce are the two most prominent technologies for such systems. Organizations use them to perform complex analysis on diverse and unconventional data types with fast-growing data volumes. As more big data systems are deployed, the industry faces the challenge of developing representative benchmarks that can evaluate the capabilities of competing implementations. In this position paper, we argue for building future big data benchmarks using what we call a "functional workload model". This concept draws on combined experiences from standard benchmarks, exemplified by TPC-C. The functional workload model describes the functional goals that the system must achieve, the data access patterns, the load variations over time, and the computation required to achieve the functional goals. Abstracting functional workload models from empirical studies of MapReduce deployments represents the first step towards building truly representative big data benchmarks. |
Year | DOI | Venue |
---|---|---|
2012 | 10.1007/978-3-642-53974-9_4 | WBDB |

Field | DocType | Volume |
---|---|---|
Data science, Relational database, Workload, System deployment, Computer science, Implementation, Data type, Application domain, Data access, Big data, Database | Conference | 8163 |

ISSN | Citations | PageRank |
---|---|---|
0302-9743 | 12 | 0.74 |

References | Authors |
---|---|
10 | 3 |

Name | Order | Citations | PageRank |
---|---|---|---|
Yanpei Chen | 1 | 917 | 41.46 |
Francois Raab | 2 | 162 | 6.02 |
Randy H. Katz | 3 | 16819 | 3018.89 |