Title
FChain: Toward Black-Box Online Fault Localization for Cloud Systems
Abstract
Distributed applications running inside cloud systems are prone to performance anomalies caused by factors such as resource contention, software bugs, and hardware failures. One major challenge in diagnosing an abnormal distributed application is pinpointing the faulty components. In this paper, we present FChain, a black-box online fault localization system that can pinpoint faulty components immediately after a performance anomaly is detected. FChain first discovers the onset time of abnormal behavior at each component by distinguishing the abnormal change point from the many change points caused by normal workload fluctuations. It then pinpoints the faulty components based on the abnormal change propagation patterns and inter-component dependency relationships. FChain also performs runtime validation to further filter out false alarms. We have implemented FChain on top of the Xen platform and tested it using several benchmark applications (RUBiS, Hadoop, and IBM System S). Our experimental results show that FChain can pinpoint the faulty components with high accuracy within a few seconds. FChain can achieve up to 90% higher precision and 20% higher recall than existing schemes. FChain is non-intrusive and lightweight, imposing less than 1% overhead on the cloud system.
Year: 2013
DOI: 10.1109/ICDCS.2013.26
Venue: ICDCS
Keywords: ibm system, change point, cloud system, black-box online fault localization, faulty component, higher precision, abnormal change propagation pattern, abnormal change point, cloud systems, higher recall, abnormal behavior, performance anomaly, cloud computing, benchmark testing, web servers, software bugs, accuracy, software fault tolerance, distributed applications, measurement
Field: Black box (phreaking), IBM, Cloud systems, Computer science, Workload, Software bug, Software fault tolerance, Real-time computing, Localization system, Cloud computing, Distributed computing
DocType: Conference
ISSN: 1063-6927
Citations: 25
PageRank: 0.89
References: 24
Authors: 4
Name            Order   Citations   PageRank
Hiep Nguyen     1       277         10.84
Zhiming Shen    2       482         16.67
Yongmin Tan     3       169         7.58
Xiaohui Gu      4       1975        103.57