Title
Counting distinct items over update streams
Abstract
In data streaming applications, data arrives at rapid rates and in high volume, thus making it essential to process each stream update very efficiently in terms of both time and space. A data stream is a sequence of data records that must be processed continuously in an online fashion using sub-linear space and sub-linear processing time. We consider the problem of tracking the number of distinct items over data streams that allow insertion and deletion operations. We present two algorithms that improve on the space and time complexity of existing algorithms.
Year
DOI
Venue
2007
10.1016/j.tcs.2007.02.031
Theor. Comput. Sci.
Keywords
Field
DocType
update stream,data stream,stream update,online algorithm,distinct items,negative dependence,high volume,distinct item,time complexity,online fashion,randomized algorithm,sub-linear space,sub-linear processing time,deletion operation,data record
Data stream mining,Algorithm complexity,Computer science,Spacetime,Algorithm,Theoretical computer science,STREAMS,Time complexity,Distributed computing
Journal
Volume
Issue
ISSN
378
3
Theoretical Computer Science
ISBN
Citations 
PageRank 
3-540-30935-7
19
0.84
References 
Authors
15
1
Name
Order
Citations
PageRank
Sumit Ganguly1813236.01