Indexing for summary queries: Theory and practice - Citegraph

Paper Info

Title
Indexing for summary queries: Theory and practice

Abstract
Database queries can be broadly classified into two categories: reporting queries and aggregation queries. The former retrieves a collection of records from the database that match the query's conditions, while the latter returns an aggregate, such as count, sum, average, or max (min), of a particular attribute of these records. Aggregation queries are especially useful in business intelligence and data analysis applications where users are interested not in the actual records, but some statistics of them. They can also be executed much more efficiently than reporting queries, by embedding properly precomputed aggregates into an index. However, reporting and aggregation queries provide only two extremes for exploring the data. Data analysts often need more insight into the data distribution than what those simple aggregates provide, and yet certainly do not want the sheer volume of data returned by reporting queries. In this article, we design indexing techniques that allow for extracting a statistical summary of all the records in the query. The summaries we support include frequent items, quantiles, and various sketches, all of which are of central importance in massive data analysis. Our indexes require linear space and extract a summary with the optimal or near-optimal query cost. We illustrate the efficiency and usefulness of our designs through extensive experiments and a system demonstration.
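The abstract's contrast between reporting and aggregation queries can be made concrete with a small sketch. The Python below is illustrative only and is not the index construction from the article: it assumes a static set of (key, value) records, uses a plain segment tree, and the names `AggregateIndex`, `report`, and `aggregate` are hypothetical. Each tree node stores a precomputed (count, sum), so a range aggregate combines O(log n) nodes instead of scanning every matching record.

```python
from bisect import bisect_left, bisect_right


class AggregateIndex:
    """Static segment tree over records sorted by key.

    Hypothetical illustration: each node stores a precomputed
    (count, sum) for the records in its subtree.
    """

    def __init__(self, records):
        # records: iterable of (key, value) pairs
        self.records = sorted(records)
        self.keys = [k for k, _ in self.records]
        n = len(self.records)
        self.size = 1
        while self.size < max(n, 1):
            self.size *= 2
        self.count = [0] * (2 * self.size)
        self.total = [0] * (2 * self.size)
        # Leaves hold single records; unused leaves stay at (0, 0).
        for i, (_, value) in enumerate(self.records):
            self.count[self.size + i] = 1
            self.total[self.size + i] = value
        # Internal nodes hold the aggregate of their two children.
        for node in range(self.size - 1, 0, -1):
            self.count[node] = self.count[2 * node] + self.count[2 * node + 1]
            self.total[node] = self.total[2 * node] + self.total[2 * node + 1]

    def report(self, lo, hi):
        """Reporting query: return every record with lo <= key <= hi."""
        start = bisect_left(self.keys, lo)
        stop = bisect_right(self.keys, hi)
        return self.records[start:stop]

    def aggregate(self, lo, hi):
        """Aggregation query: (count, sum) over lo <= key <= hi.

        Combines precomputed node aggregates; output size is constant
        no matter how many records match.
        """
        left = bisect_left(self.keys, lo) + self.size
        right = bisect_right(self.keys, hi) + self.size  # exclusive bound
        cnt = tot = 0
        while left < right:
            if left & 1:  # left is a right child: take it and move on
                cnt += self.count[left]
                tot += self.total[left]
                left += 1
            if right & 1:  # right - 1 is a left child: take it
                right -= 1
                cnt += self.count[right]
                tot += self.total[right]
            left //= 2
            right //= 2
        return cnt, tot


index = AggregateIndex([(1, 10), (3, 20), (5, 30), (7, 40), (9, 50)])
matching = index.report(3, 7)          # all matching records
count, total = index.aggregate(3, 7)   # only the aggregate
```

A summary query sits between these two extremes. One simple way to picture it, under the same assumptions, is to store a small mergeable summary (for example a frequent-items or quantile summary) at each node in place of the (count, sum) pair and merge the summaries of the nodes that cover the query range.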

Year	2014
DOI	10.1145/2508702
Venue	ACM Trans. Database Syst.
Keywords	business intelligence, Database query, massive data analysis, aggregation query, data analyst, actual record, near-optimal query cost, data analysis application, statistical summary, data distribution, summary query
DocType	Journal
Volume	39
Issue	1
ISSN	0362-5915
Citations	4
PageRank	0.39
References	32
Authors	3

Authors (3 rows)

Name	Order	Citations	PageRank
Ke Yi	1	1659	77.79
Lu Wang	2	33	1.91
Zhewei Wei	3	215	20.07
