Title
Fully decentralized computation of aggregates over data streams
Abstract
In several emerging applications, data is collected in massive streams at several distributed points of observation. A basic and challenging task is to allow every node to monitor a neighbourhood of interest by issuing continuous aggregate queries on the streams observed in its vicinity. This class of algorithms is fully decentralized and diffusive in nature: collecting all data at few central nodes of the network is unfeasible in networks of low capability devices or in the presence of massive data sets. The main difficulty in designing diffusive algorithms is to cope with duplicate detections. These arise both from the observation of the same event at several nodes of the network and/or receipt of the same aggregated information along multiple paths of diffusion. In this paper, we consider fully decentralized algorithms that answer locally continuous aggregate queries on the number of distinct events, total number of events and the second frequency moment in the scenario outlined above. The proposed algorithms use in the worst case or on realistic distributions sublinear space at every node. We also propose strategies that minimize the communication needed to update the aggregates when new events are observed. We finally present experimental analysis providing evidence for the efficiency and accuracy of our algorithms on realistic simulated scenarios.
Year
DOI
Venue
2011
10.1145/1964897.1964919
ACM SIGKDD Explorations Newsletter
Keywords
Field
DocType
decentralized computation,total number,continuous aggregate query,realistic simulated scenario,diffusive algorithm,massive stream,central node,decentralized algorithm,realistic distribution,massive data set,data stream,aggregated information
Sublinear function,Data mining,Data stream mining,Data set,Activity recognition,Computer science,Accelerometer,Real-time computing,Artificial intelligence,Machine learning,Computation,Distributed computing
Journal
Volume
Issue
Citations 
12
2
1
PageRank 
References 
Authors
0.48
22
4
Name
Order
Citations
PageRank
Luca Becchetti194555.75
Ilaria Bordino262928.81
Stefano Leonardi341128.87
Adi Rosen4132.08