Title
External Memory Stream Sampling
Abstract
This paper aims to understand the I/O-complexity of maintaining a big sample set---whose size exceeds the internal memory's capacity---on a data stream. We study this topic in a new computation model, named the external memory stream (EMS) model, that naturally extends the standard external memory model to stream environments. A suite of EMS-indigenous techniques are presented to prove matching lower and upper bounds for with-replacement (WR) and without-replacement (WoR) sampling on append-only and time-based sliding window streams, respectively. Our results imply that, compared to RAM, the EMS model is perhaps a more suitable computation model for studying stream sampling, because the new model separates different problems by their hardness in ways that could not be observed in RAM.
Year
DOI
Venue
2015
10.1145/2745754.2745757
ACM SIGMOD Conference on Principles of DB Systems
Keywords
Field
DocType
Stream, Sampling, I/O-Efficient Algorithms, Lower Bound
Sliding window protocol,Suite,Data stream,Computer science,Upper and lower bounds,Algorithm,Theoretical computer science,Sampling (statistics),STREAMS,Computation,Auxiliary memory
Conference
Citations 
PageRank 
References 
0
0.34
14
Authors
3
Name
Order
Citations
PageRank
Xiaocheng Hu100.34
Miao Qiao233.44
Yufei Tao37231316.71