Title
Deterministically Estimating Data Stream Frequencies
Abstract
We consider updates to an n -dimensional frequency vector of a data stream, that is, the vector f is updated coordinate-wise by means of insertions or deletions in any arbitrary order. A fundamental problem in this model is to recall the vector approximately, that is to return an estimate $\hat{f}$ of f such that where *** is an accuracy parameter and p is the index of the *** p norm used to calculate the norm . This problem, denoted by , is fundamental in data stream processing and is used to solve a number of other problems, such as heavy hitters, approximating range queries and quantiles, approximate histograms, etc.. Suppressing poly-logarithmic factors in n and , for p = 1 the problem is known to have ${\it \tilde{\Theta}}(1/\epsilon)$ randomized space complexity [2,4] and ${\it \tilde{\Theta}}(1/\epsilon^2)$ deterministic space complexity [6,7]. However, the deterministic space complexity of this problem for any value of p 1 is not known. In this paper, we show that the deterministic space complexity of the problem is ${\it \tilde{ \Theta}}(n^{2-2/p}/\epsilon^2)$ for 1 p p *** 2.
Year
DOI
Venue
2009
10.1007/978-3-642-02026-1_28
COCOA
Keywords
Field
DocType
approximate histogram,deterministically estimating data stream,p p,accuracy parameter,randomized space complexity,p norm,data stream processing,fundamental problem,deterministic space complexity,dimensional frequency vector,data stream,space complexity
Discrete mathematics,Data stream processing,Combinatorics,Data stream,Range query (data structures),Quantile,Norm (mathematics),Mathematics
Conference
Volume
ISSN
Citations 
5573
0302-9743
3
PageRank 
References 
Authors
0.38
14
1
Name
Order
Citations
PageRank
Sumit Ganguly1813236.01