Title
Automatic failure recovery for software-defined networks
Abstract
Tolerating and recovering from link and switch failures are fundamental requirements of most networks, including Software-Defined Networks (SDNs). However, instead of traditional behaviors such as network-wide routing re-convergence, failure recovery in an SDN is determined by the specific software logic running at the controller. While this admits more freedom to respond to a failure event, it ultimately means that each controller application must include its own recovery logic, which makes the code more difficult to write and potentially more error-prone. In this paper, we propose a runtime system that automates failure recovery and enables network developers to write simpler, failure-agnostic code. To this end, upon detecting a failure, our approach first spawns a new controller instance that runs in an emulated environment consisting of the network topology excluding the failed elements. Then, it quickly replays inputs observed by the controller before the failure occurred, leading the emulated network into the forwarding state that accounts for the failed elements. Finally, it recovers the network by installing the difference ruleset between emulated and current forwarding states.
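The three recovery steps in the abstract (run a fresh controller on the topology without the failed elements, replay the recorded inputs, install the difference between the emulated and current forwarding states) can be illustrated with a minimal sketch. It is not the authors' implementation: `ToyController`, `recover`, the `(src, dst)` flow-request event format, and the `{switch: {destination: next_hop}}` rule format are all hypothetical simplifications chosen for illustration, and the "emulated environment" is reduced to a second in-memory controller object.

```python
from collections import deque


class ToyController:
    """Hypothetical failure-agnostic controller: it only knows how to
    install shortest-path rules for the topology it was started with."""

    def __init__(self, links):
        self.adj = {}
        for a, b in links:
            self.adj.setdefault(a, set()).add(b)
            self.adj.setdefault(b, set()).add(a)
        self.rules = {}  # switch -> {destination: next_hop}

    def handle(self, event):
        """Process one input; here an input is an assumed (src, dst) request."""
        src, dst = event
        path = self._shortest_path(src, dst)
        if path is None:
            return
        for hop, nxt in zip(path, path[1:]):
            self.rules.setdefault(hop, {})[dst] = nxt

    def _shortest_path(self, src, dst):
        parent = {src: None}
        queue = deque([src])
        while queue:
            node = queue.popleft()
            if node == dst:
                path = []
                while node is not None:
                    path.append(node)
                    node = parent[node]
                return path[::-1]
            for nb in sorted(self.adj.get(node, ())):
                if nb not in parent:
                    parent[nb] = node
                    queue.append(nb)
        return None


def recover(links, failed_links, event_log, current_rules):
    """Return (to_install, to_remove) that move the live network to the
    state a fresh controller reaches on the pruned topology."""
    failed = {frozenset(link) for link in failed_links}
    pruned = [link for link in links if frozenset(link) not in failed]

    # Steps 1 and 2: new controller instance on the pruned topology, then replay.
    shadow = ToyController(pruned)
    for event in event_log:
        shadow.handle(event)

    # Step 3: difference between emulated and current forwarding state.
    to_install, to_remove = [], []
    for switch in set(current_rules) | set(shadow.rules):
        old = current_rules.get(switch, {})
        new = shadow.rules.get(switch, {})
        for dst, nxt in new.items():
            if old.get(dst) != nxt:
                to_install.append((switch, dst, nxt))
        for dst in old:
            if dst not in new:
                to_remove.append((switch, dst))
    return sorted(to_install), sorted(to_remove)
```

In this sketch the controller contains no failure-handling code at all; recovery comes entirely from re-running it on the pruned topology. For a diamond topology s1-s2-s4 / s1-s3-s4 with one recorded request from s1 to s4, failing the s2-s4 link would yield rules to install on s1 and s3 and one stale rule to remove on s2.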
Year
2013
DOI
10.1145/2491185.2491218
Venue
HotSDN
Keywords
automatic failure recovery, own recovery logic, failure event, failure recovery, failed element, network topology, controller application, switch failure, software-defined network, current forwarding state, new controller instance, network developer, software defined network, fault tolerance, computer science
Field
Control theory, Computer science, Computer network, Installation, Network topology, Fault tolerance, Software, Software-defined networking, Runtime system, Distributed computing
DocType
Conference
Citations
17
PageRank
0.92
References
7
Authors
5
Name              Order  Citations  PageRank
Maciej Kuzniar    1      207        11.47
Peter Perešíni    2      375        21.98
Nedeljko Vasić    3      326        13.59
Marco Canini      4      857        60.21
Dejan Kostic      5      1707       119.11