Title
Towards optimal sampling for flow size estimation
Abstract
The flow size distribution is a useful metric for traffic modeling and management. It is well known, however, that its estimation based on sampled data is problematic. Previous work has shown that flow sampling (FS) offers enormous statistical benefits over packet sampling; however, it suffers from high resource requirements and is not currently used in routers. In this paper we present Dual Sampling (DS), which can to a large extent provide flow-sampling-like statistical performance for packet-sampling-like computational cost. Our work is grounded in a Fisher-information-based approach recently used to evaluate a number of sampling schemes for TCP flows, though not FS. We show how to revise and extend the approach to include FS as well as DS and others, and how to make rigorous and fair comparisons. We show that DS significantly outperforms other packet-based methods, but also prove that DS is inferior to FS. However, since DS is a two-parameter family of methods that includes FS as a special case, DS can be used to approach FS continuously. We then describe a packet-sampling-based implementation of DS and analyze its key computational costs to show that router implementation is feasible. Our approach offers insights into many issues, including how the notions of 'flow quality' and 'packet gain' can be used to understand the relative performance of methods, and how the problem of optimal sampling can be formulated. Our work is theoretical, with some simulation support and a case study on Internet data.
Year
2008
DOI
10.1145/1452520.1452550
Venue
Internet Measurement Conference
Keywords
flow size distribution,packet sampling,flow size estimation,internet data,previous work,flow quality,sampling scheme,case study,packet gain,optimal sampling,flow sampling,fisher information,sampling
Field
Computer science,Flow (psychology),Network packet,Computer network,Fisher information,Sampling (statistics),Router,Packet sampling,Special case,The Internet
DocType
Conference
Citations
23
PageRank
1.40
References
9
Authors
2
Name,Order,Citations,PageRank
Paul Tune,1,83,8.83
Darryl Veitch,2,903,84.47