Title
Load shedding in a data stream manager
Abstract
A Data Stream Manager accepts push-based inputs from a set of data sources, processes these inputs with respect to a set of standing queries, and produces outputs based on Quality-of-Service (QoS) specifications. When input rates exceed system capacity, the system will become overloaded and latency will deteriorate. Under these conditions, the system will shed load, thus degrading the answer, in order to improve the observed latency of the results. This paper examines a technique for dynamically inserting and removing drop operators into query plans as required by the current load. We examine two types of drops: the first drops a fraction of the tuples in a randomized fashion, and the second drops tuples based on the importance of their content. We address the problems of determining when load shedding is needed, where in the query plan to insert drops, and how much of the load should be shed at that point in the plan. We describe efficient solutions and present experimental evidence that they can bring the system back into the useful operating range with minimal degradation in answer quality.
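The two drop types and the "how much" question named in the abstract can be illustrated with a minimal sketch. This is not the paper's algorithm: the function names (`random_drop`, `semantic_drop`, `required_drop_fraction`), the per-tuple `utility` function, and the simple rate-versus-capacity model are all assumptions made for illustration.

```python
import random
from typing import Callable, Iterable, Iterator, TypeVar

T = TypeVar("T")


def random_drop(stream: Iterable[T], drop_fraction: float,
                rng: random.Random) -> Iterator[T]:
    """Hypothetical random drop: discard each tuple independently
    with probability drop_fraction, regardless of its content."""
    for tup in stream:
        # rng.random() is in [0.0, 1.0), so a fraction of 0.0 keeps
        # everything and a fraction of 1.0 drops everything.
        if rng.random() >= drop_fraction:
            yield tup


def semantic_drop(stream: Iterable[T], utility: Callable[[T], float],
                  threshold: float) -> Iterator[T]:
    """Hypothetical semantic drop: keep only tuples whose content is
    important enough, as judged by an assumed utility function."""
    for tup in stream:
        if utility(tup) >= threshold:
            yield tup


def required_drop_fraction(input_rate: float, capacity: float) -> float:
    """Assumed overload model: shed just enough of the input so that the
    remaining rate fits the capacity. Returns 0.0 when not overloaded."""
    if input_rate <= capacity:
        return 0.0
    return 1.0 - capacity / input_rate


if __name__ == "__main__":
    readings = [3, 18, 7, 42, 11, 29]
    fraction = required_drop_fraction(input_rate=200.0, capacity=50.0)
    kept_random = list(random_drop(readings, fraction, random.Random(0)))
    kept_semantic = list(semantic_drop(readings, utility=float, threshold=15.0))
    print(fraction, kept_random, kept_semantic)
```

In this sketch the random drop trades answer completeness uniformly across all tuples, while the semantic drop concentrates the loss on tuples the utility function rates as least important. Where in a query plan such an operator is placed is not modeled here.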
Year
2003
DOI
10.1016/B978-012722442-8/50035-5
Venue
VLDB
Keywords
current load, efficient solution, observed latency, system capacity, data stream manager, answer quality, query plan, drop operator, load shed, data source
Field
Data stream management system, Latency (engineering), Data stream, Computer science, Tuple, Quality of service, Real-time computing, Operator (computer programming), Database, Query plan, Load Shedding
DocType
Conference
ISSN
Proceedings 2003 VLDB Conference
ISBN
0-12-722442-4
Citations
324
PageRank
13.67
References
20
Authors
5
Name | Order | Citations and PageRank (digits run together in the source)
Nesime Tatbul | 1 | 3415239.74
Ugur Çetintemel | 2 | 57934.26
Stanley B. Zdonik | 3 | 91861660.15
Mitch Cherniack | 4 | 4128293.66
Michael Stonebraker | 5 | 124634310.17