Abstract | ||
---|---|---|
The recently proposed Geometric Monitoring (GM) method has provided a general tool for the distributed monitoring of arbitrary non-linear queries over streaming data observed by a collection of remote sites, with numerous practical applications. Unfortunately, GM-based techniques can suffer from serious scalability issues with increasing numbers of remote sites. In this paper, we propose novel techniques that effectively tackle the aforementioned scalability problems by exploiting a carefully designed sample of the remote sites for efficient approximate query tracking. Our novel sampling-based scheme utilizes a sample of cardinality proportional to $\sqrt{N}$ (compared to $N$ for the original GM), where $N$ is the number of sites in the network, to perform the monitoring process. Our experimental evaluation over a variety of real-life data streams demonstrates that our sampling-based techniques can significantly reduce the communication cost during distributed monitoring with controllable, predefined accuracy guarantees. |
Year | DOI | Venue |
---|---|---|
2016 | 10.1145/2882903.2915225 | SIGMOD Conference |
Field | DocType | Citations |
---|---|---|
Data mining, Data stream mining, Computer science, Cardinality, Streaming data, Sampling (statistics), Database, Scalability | Conference | 3 |
PageRank | References | Authors |
---|---|---|
0.37 | 31 | 3 |
Name | Order | Citations | PageRank |
---|---|---|---|
Nikos Giatrakos | 1 | 176 | 14.94 |
Antonios Deligiannakis | 2 | 828 | 48.19 |
Minos Garofalakis | 3 | 4904 | 664.22 |