Title
Hierarchical IP flow clustering
Abstract
The analysis of flow traces can help to understand a network's usage patterns. We present a hierarchical clustering algorithm for network flow data that can summarize terabytes of IP traffic into a parsimonious tree model. The method automatically finds an appropriate scale of aggregation so that each cluster represents a local maximum of the traffic density from a block of source addresses to a block of destination addresses. We apply this clustering method to NetFlow data from an enterprise network, find the largest traffic clusters, and analyze their stationarity across time. The existence of heavy-volume clusters that persist over long time scales can help network operators perform usage-based accounting, capacity provisioning, and traffic engineering. In addition, changes in the layout of the hierarchical clusters can facilitate the detection of anomalies and significant changes in the network workload.
Year | DOI | Venue
2017 | 10.1145/3098593.3098598 | Big-DAMA@SIGCOMM
Keywords | DocType | Volume
Flow clustering, Hierarchical clustering, NetFlow, Unsupervised Machine Learning | Journal | 47
Issue | ISSN | ISBN
5 | 0146-4833 | 978-1-4503-5054-9
Citations | PageRank | References
1 | 0.34 | 11
Authors: 3
Name | Order | Citations | PageRank
Kamal Shadi | 1 | 1 | 0.68
Preethi Natarajan | 2 | 221 | 12.49
Constantine Dovrolis | 3 | 4047 | 290.24