Title
Programmable Event Detection for In-Band Network Telemetry
Abstract
In-Band Network Telemetry (INT) is a novel framework for collecting telemetry items and switch-internal state information from the data plane at line rate. With the support of programmable data planes and the P4 programming language, switches parse telemetry instruction headers and determine which telemetry items to attach using custom metadata. At the network edge, the telemetry information is removed and the original packets are forwarded, while telemetry reports are sent to a distributed stream processor for further processing by a network monitoring platform. To avoid excessive load on the stream processor, telemetry items should not be sent for every packet but only when certain events are triggered. In this paper, we develop a programmable INT event detection mechanism in P4 that allows the control plane to customize, on a per-flow basis, which events are reported to the monitoring system. At the stream processor, we implement a fast INT report collector using the kernel bypass technique AF_XDP; it parses telemetry reports and streams them to a distributed Kafka cluster, where machine learning, visualization, and further monitoring tasks can be applied. In our evaluation, we use real-world traces from different data center workloads and show that our approach is highly scalable and significantly reduces network overhead and stream processor load thanks to effective event pre-filtering inside the switch data plane. While the INT report collector can process around 3 Mpps of telemetry reports per core, event pre-filtering increases the capacity by 10-15x.
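The idea of per-flow event pre-filtering can be illustrated with a small Python sketch. This is not the paper's P4 implementation: the abstract does not say which event types are supported, so the change-threshold rule below, the `EventFilter` class, and all of its method names are assumptions made purely for illustration. The sketch reports a telemetry value for a flow only when it differs from the last reported value by at least a threshold, and lets a caller (standing in for the control plane) override that threshold per flow.

```python
from dataclasses import dataclass, field
from typing import Dict, Hashable, Iterable, Tuple


@dataclass
class EventFilter:
    """Hypothetical per-flow pre-filter: report only on significant change."""

    default_threshold: int = 10
    # Per-flow overrides, standing in for rules installed by a control plane.
    thresholds: Dict[Hashable, int] = field(default_factory=dict)
    # Last value that was actually reported for each flow.
    last_reported: Dict[Hashable, int] = field(default_factory=dict)

    def set_threshold(self, flow: Hashable, threshold: int) -> None:
        """Customize the reporting threshold for one flow."""
        self.thresholds[flow] = threshold

    def should_report(self, flow: Hashable, value: int) -> bool:
        """Return True if this sample counts as an event for its flow."""
        threshold = self.thresholds.get(flow, self.default_threshold)
        previous = self.last_reported.get(flow)
        # The first sample of a flow is always reported; later samples only
        # when they moved at least `threshold` away from the last report.
        if previous is None or abs(value - previous) >= threshold:
            self.last_reported[flow] = value
            return True
        return False


def count_reports(event_filter: EventFilter,
                  samples: Iterable[Tuple[Hashable, int]]) -> int:
    """Count how many (flow, value) samples would generate a report."""
    return sum(event_filter.should_report(flow, value) for flow, value in samples)
```

With a default threshold of 10, the six samples `("a", 100), ("a", 103), ("a", 111), ("a", 112), ("b", 5), ("b", 6)` produce three reports instead of six, which is the kind of reduction in collector load that pre-filtering aims for.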
Year: 2019
DOI: 10.1109/CloudNet47604.2019.9064137
Venue: 2019 IEEE 8th International Conference on Cloud Networking (CloudNet)
Keywords: programmable event detection, internal state information, programmable data planes, parse telemetry instruction headers, telemetry information, distributed stream processor, network monitoring platform, programmable INT event detection mechanism, network overhead, switch data plane, event pre-filtering, data center workloads, machine learning, distributed Kafka cluster, kernel bypass technique, control plane, custom metadata, programming language P4, in-band network telemetry, INT report collector
DocType: Conference
ISSN: 2374-3239
ISBN: 978-1-7281-4833-5
Citations: 1
PageRank: 0.37
References: 9
Authors: 6
Name                  Order  Citations  PageRank
Jonathan Vestin       1      21         3.76
Andreas Kassler       2      329        42.96
Bhamare Deval         3      1          0.37
Karl-Johan Grinnemo   4      143        21.42
Andersson Jan-Olof    5      1          0.37
Gergely Pongrácz      6      68         16.25