Title
Conditioning and aggregating uncertain data streams: going beyond expectations
Abstract
Uncertain data streams are increasingly common in real-world deployments, and monitoring applications require the evaluation of complex queries on such streams. In this paper, we consider complex queries involving conditioning (e.g., selections and group-bys) and aggregation operations on uncertain data streams. To characterize the uncertainty of answers to these queries, one generally has to compute the full probability distribution of each operation used in the query. Computing distributions of aggregates given conditioned tuple distributions is a hard, unsolved problem. Our work employs a new evaluation framework that includes a general data model, approximation metrics, and approximate representations. Within this framework, we design fast data-stream algorithms, both deterministic and randomized, that return approximate distributions with bounded errors as answers to those complex queries. Our experimental results demonstrate the accuracy and efficiency of our approximation techniques and offer insights into the strengths and limitations of deterministic and randomized algorithms.
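To make the problem statement concrete, the following is a minimal sketch of what computing the full distribution of an aggregate over conditioned tuples means. It is not the paper's method: it computes the exact distribution of a SUM by brute-force convolution, which is the expensive baseline that approximation algorithms aim to avoid. The function names, the independence of tuples, the discrete value distributions, and the convention that a SUM over no qualifying tuples is 0 are all assumptions made for illustration.

```python
from collections import defaultdict


def condition(dist, predicate):
    """Apply a selection to one uncertain tuple (hypothetical helper).

    dist maps value -> probability. The result maps value -> probability
    for values that satisfy the predicate, and None -> probability that
    the tuple is filtered out.
    """
    out = defaultdict(float)
    for value, p in dist.items():
        out[value if predicate(value) else None] += p
    return dict(out)


def sum_distribution(conditioned):
    """Exact distribution of SUM over independent conditioned tuples.

    Assumption: a filtered-out tuple (None) contributes 0 to the sum.
    The support can grow exponentially with the number of tuples.
    """
    total = {0: 1.0}
    for dist in conditioned:
        nxt = defaultdict(float)
        for s, ps in total.items():
            for v, pv in dist.items():
                nxt[s + (0 if v is None else v)] += ps * pv
        total = dict(nxt)
    return total


# Two uncertain tuples; the selection keeps values >= 2.
stream = [{1: 0.5, 3: 0.5}, {2: 0.25, 4: 0.75}]
conditioned = [condition(d, lambda v: v >= 2) for d in stream]
result = sum_distribution(conditioned)
mean = sum(s * p for s, p in result.items())

print(result)  # {2: 0.125, 4: 0.375, 5: 0.125, 7: 0.375}
print(mean)    # 5.0
```

In this toy case the expected SUM is 5.0, yet the answer equals 5 with probability only 0.125, which illustrates why an expectation alone can be a poor summary of the answer.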
Year
2010
DOI
10.14778/1920841.1921001
Venue
PVLDB
Keywords
randomized algorithm, approximation technique, approximation metrics, uncertain data stream, new evaluation framework, computing distribution, approximate distribution, general data model, complex query, approximate representation, data model, probability distribution
Field
Uncertain data streams, Randomized algorithm, Data mining, Tuple, Computer science, Conditioning, Theoretical computer science, Probability distribution, Data model, Database, Bounded function
DocType
Journal
Volume
3
Issue
1-2
ISSN
2150-8097
Citations
13
PageRank
0.60
References
17
Authors
5
Name               Order  Citations  PageRank
Thanh T. L. Tran   1      206        8.09
Andrew McGregor    2      1340       64.31
Yanlei Diao        3      2234       108.95
Liping Peng        4      107        7.50
Anna Liu           5      441        34.75