Title
On Estimating the First Frequency Moment of Data Streams
Abstract
Estimating the first moment of a data stream defined as $F_1 = \sum_{i \in \{1, 2, \ldots, n\}} \abs{f_i}$ to within $1 \pm \epsilon$-relative error with high probability is a basic and influential problem in data stream processing. A tight space bound of $O(\epsilon^{-2} \log (mM))$ is known from the work of [Kane-Nelson-Woodruff-SODA10]. However, all known algorithms for this problem require per-update stream processing time of $\Omega(\epsilon^{-2})$, with the only exception being the algorithm of [Ganguly-Cormode-RANDOM07] that requires per-update processing time of $O(\log^2(mM)(\log n))$ albeit with sub-optimal space $O(\epsilon^{-3}\log^2(mM))$. In this paper, we present an algorithm for estimating $F_1$ that achieves near-optimality in both space and update processing time. The space requirement is $O(\epsilon^{-2}(\log n + (\log \epsilon^{-1})\log(mM)))$ and the per-update processing time is $O( (\log n)\log (\epsilon^{-1}))$.
Year
Venue
Keywords
2010
Clinical Orthopaedics and Related Research
data structure,relative error,stream processing
Field
DocType
Volume
Binary logarithm,Discrete mathematics,Combinatorics,Data stream mining,Data stream processing,Omega,Moment (mathematics),Mathematics
Journal
abs/1005.0
Citations 
PageRank 
References 
0
0.34
11
Authors
2
Name
Order
Citations
PageRank
Sumit Ganguly1813236.01
Purushottam Kar237922.55