Title
The Value of Multiple Read/Write Streams for Approximating Frequency Moments
Abstract
We consider the read/write streams model, an extension of the standard data stream model in which an algorithm can create and manipulate multiple read/write streams in addition to its input data stream. Like the data stream model, the most important parameter for this model is the amount of internal memory used by such an algorithm. The other key parameters are the number of streams the algorithm uses and the number of passes it makes on these streams. We consider how the addition of multiple streams impacts the ability of algorithms to approximate the frequency moments of the input stream. We show that any randomized read/write stream algorithm with a fixed number of streams and a sublogarithmic number of passes that produces a constant factor approximation of the k-th frequency moment Fk of an input sequence of length of at most N from {1,..., N} requires space Ω(N 1−4/k−δ) for any δ > 0. For comparison, it is known that with a single read-only one-pass data stream there is a randomized constant-factor approximation for Fk using Õ(N1−2/k) space, and that by sorting, which can be done deterministically in O(log N) passes using 3 read/write streams, Fk can be computed exactly. Therefore, although the ability to manipulate multiple read/write streams can add substantial power to the data stream model, with a sublogarithmic number of passes this does not significantly improve the ability to approximate higher frequency moments efficiently. We obtain our results by showing a new connection between the read/write streams model and the multiparty number-in-hand communication model.
Year
DOI
Venue
2012
10.1145/2077336.2077339
TOCT
Keywords
Field
DocType
streams model,input stream,write streams,multiple read,data stream model,multiparty number-in-hand communication model,multiple stream,input data stream,standard data stream model,sublogarithmic number,approximating frequency moments,stream algorithm,data streams,streaming algorithm,communication complexity,communication model,external memory algorithms
Binary logarithm,Discrete mathematics,Data stream mining,Frequency moments,Data stream,Computer science,Parallel computing,Algorithm,Sorting,Communication complexity,Out-of-core algorithm,STREAMS
Journal
Volume
Issue
Citations 
3
2
11
PageRank 
References 
Authors
0.91
19
2
Name
Order
Citations
PageRank
Paul Beame12234176.07
Trinh Huynh2442.53