Abstract | ||
---|---|---|
Flow cardinality estimation is the problem of estimating the number of distinct elements in a data flow, often with a stringent memory constraint. It has wide applications in network traffic measurement and in database systems. The virtual HyperLogLog (vHLL) algorithm proposed by Xiao, Chen, Chen and Ling [1] estimates the cardinalities of a large number of flows with a compact memory. This paper explores two new estimation algorithms based on the same compact memory used in [1]. Firstly, we propose and investigate a family of estimators that generalizes the original vHLL estimator. Secondly, we derive an approximate maximum-likelihood estimator. Empirical evidence suggests the near-optimality of the original vHLL estimator for per-flow estimation, analogous to the near-optimality of the HyperLogLog estimator for single-flow estimation. We also propose weighted square error, a single-value metric that quantifies the performance of an estimator. |
Year | DOI | Venue |
---|---|---|
2019 | 10.1109/CISS.2019.8692884 | 2019 53rd Annual Conference on Information Sciences and Systems (CISS) |
DocType | Volume | ISBN |
Conference | abs/1812.03040 | 978-1-7281-1151-3 |
Citations | PageRank | References |
1 | 0.35 | 0 |
Authors | ||
2 |
Name | Order | Citations | PageRank |
---|---|---|---|
Zeyu Zhou | 1 | 3 | 1.08 |
Bruce Hajek | 2 | 154 | 17.84 |