Abstract | ||
---|---|---|
The growth of online services has created the need for duplicate elimination in high-volume streams of events. The sheer volume of data in applications such as pay-per-click clickstream processing, RSS feed syndication and notification services in social sites such as Twitter and Facebook makes traditional centralized solutions hard to scale. In this paper, we propose an approach based on distributed filtering. To this end, we introduce a suite of distributed Bloom filters that exploit different ways of partitioning the event space. To address the continuous nature of event delivery, the filters are extended to support sliding window semantics. Moreover, we examine locality-related tradeoffs and propose a tree-based architecture to allow for duplicate elimination across geographic locations. We cast the design space and present experimental results that demonstrate the pros and cons of our various solutions in different settings. |
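
The idea of Bloom-filter-based duplicate elimination over a sliding window can be illustrated with a minimal single-node sketch. This is not the paper's design: the class names, the parameters, and the choice of approximating the window by a fixed number of sub-window filters (dropping the oldest when a new one starts) are all assumptions made for illustration. Distribution across nodes and the tree-based architecture are not shown.

```python
import hashlib
from collections import deque


class BloomFilter:
    """Plain Bloom filter; sizes are illustrative, not tuned."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for simplicity

    def _positions(self, item):
        # Derive num_hashes positions by salting SHA-256 with the hash index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos] = 1

    def __contains__(self, item):
        return all(self.bits[pos] for pos in self._positions(item))


class WindowedDeduplicator:
    """Hypothetical helper: approximates a sliding window with sub-windows.

    Each sub-window has its own filter. When the newest sub-window is full,
    a fresh filter is appended and the deque discards the oldest one, so
    events seen long ago stop being reported as duplicates.
    """

    def __init__(self, num_subwindows=4, events_per_subwindow=100):
        self.events_per_subwindow = events_per_subwindow
        self.filters = deque([BloomFilter()], maxlen=num_subwindows)
        self.count = 0  # events stored in the newest filter

    def is_duplicate(self, event_id):
        # A hit in any live filter means "probably seen within the window".
        if any(event_id in f for f in self.filters):
            return True
        if self.count == self.events_per_subwindow:
            self.filters.append(BloomFilter())  # oldest filter is dropped
            self.count = 0
        self.filters[-1].add(event_id)
        self.count += 1
        return False
```

As with any Bloom filter, this sketch can report a false duplicate but never misses an event that is still recorded in a live filter; the window boundary is only as precise as the sub-window size.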
Year | DOI | Venue |
---|---|---|
2011 | 10.1145/2063576.2063643 | CIKM |
Keywords | Field | DocType |
rss feed syndication,bloom filter,different way,continuous nature,design space,different setting,duplicate elimination,event delivery,event space,geographic location,difference set,sliding window | Bloom filter,Data mining,Sliding window protocol,Suite,Clickstream,Computer science,Exploit,RSS,Semantics,Web syndication | Conference |
Citations | PageRank | References |
7 | 0.42 | 17 |
Authors | ||
4 |
Name | Order | Citations | PageRank |
---|---|---|---|
Georgia Koloniari | 1 | 220 | 16.49 |
Nikos Ntarmos | 2 | 219 | 15.40 |
Evaggelia Pitoura | 3 | 1968 | 321.56 |
Dimitris Souravlias | 4 | 48 | 4.34 |