Title
CloudRanger: Root Cause Identification for Cloud Native Systems
Abstract
As more and more systems migrate to cloud environments, cloud native systems have become a trend. This paper presents the challenges and implications of diagnosing root causes in cloud native systems by analyzing real incidents that occurred in IBM Bluemix (a large commercial cloud). To tackle these challenges, we propose CloudRanger, a novel system dedicated to cloud native systems. To make our system more general, we propose a dynamic causal relationship analysis approach that constructs impact graphs among applications without being given the topology. A heuristic investigation algorithm based on a second-order random walk is proposed to identify the culprit services responsible for cloud incidents. Experimental results in both a simulation environment and the IBM Bluemix platform show that CloudRanger outperforms some state-of-the-art approaches with a 10% improvement in accuracy. It offers fast identification of culprit services when an anomaly occurs. Moreover, the system can be deployed rapidly and easily in multiple kinds of cloud native systems without any predefined knowledge.
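To make the idea of a second-order random walk over an impact graph more concrete, the following is a minimal sketch and not the authors' algorithm. It assumes a hypothetical impact graph stored as a dictionary whose edges point from an affected service toward a likely cause, with made-up correlation weights. The names `second_order_walk` and `rank_services`, the bias parameters `p` and `q` (borrowed from the common node2vec-style formulation), and the rule of staying put at a dead end are all illustrative assumptions.

```python
import random
from collections import Counter


def second_order_walk(graph, start, steps, p=1.0, q=1.0, rng=None):
    """Walk `steps` hops; each hop depends on both the current and the previous node."""
    rng = rng or random.Random()
    visits = Counter()
    prev, cur = None, start
    for _ in range(steps):
        neighbors = graph.get(cur, {})
        if not neighbors:
            # Assumption: with no outgoing edge, the walker stays on this service.
            visits[cur] += 1
            continue
        candidates, weights = [], []
        for nxt, w in neighbors.items():
            if prev is None:
                bias = 1.0          # first hop: no history yet
            elif nxt == prev:
                bias = 1.0 / p      # stepping back to the previous node
            elif nxt in graph.get(prev, {}):
                bias = 1.0          # staying close to the previous node
            else:
                bias = 1.0 / q      # moving further away
            candidates.append(nxt)
            weights.append(w * bias)
        nxt = rng.choices(candidates, weights=weights, k=1)[0]
        prev, cur = cur, nxt
        visits[cur] += 1
    return visits


def rank_services(graph, start, walks=200, steps=10, seed=0):
    """Aggregate visit counts over many walks; frequently visited services rank first."""
    rng = random.Random(seed)
    total = Counter()
    for _ in range(walks):
        total += second_order_walk(graph, start, steps, rng=rng)
    return total.most_common()


# Hypothetical impact graph: edge weights stand in for correlation strength.
impact_graph = {
    "frontend": {"api": 0.9},
    "api": {"db": 0.9, "cache": 0.1},
    "db": {},
    "cache": {},
}
ranking = rank_services(impact_graph, "frontend")
```

Here `ranking` lists services by how often the walker visited them, starting from the service where the anomaly was observed. In this toy graph the strongly correlated downstream service accumulates the most visits, which is the intuition behind ranking culprit candidates by visit frequency.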
Year
2018
DOI
10.1109/CCGRID.2018.00076
Venue
2018 18th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGRID)
Keywords
Cloud native systems, Micro service architecture, Root cause analysis, Anomaly detection, Causality
Field
Anomaly detection, IBM, Heuristic, Random walk, Computer science, Root cause analysis, Throughput, Root cause, Distributed computing, Cloud computing
DocType
Conference
ISBN
978-1-5386-5816-1
Citations
2
PageRank
0.37
References
11
Authors
7
Name, Order, Citations, PageRank
Ping Wang, 1, 149, 14.37
Jing Min Xu, 2, 67, 10.98
Meng Ma, 3, 82, 12.29
Weilan Lin, 4, 4, 2.11
Disheng Pan, 5, 3, 1.74
Yuan Wang, 6, 86, 9.67
Pengfei Chen, 7, 13, 2.66