Title
DriftInsight: Detecting Anomalous Behaviors in Large-Scale Cloud Platform
Abstract
Detecting anomalous behaviors of cloud platforms is one of critical tasks for cloud providers. Every anomalous behavior potentially causes incidents, especially some unaware and/or unknown issues, which severely harm their SLA (Service Level Agreement). Existing solutions generally monitor cloud platform at different layers and then detect anomalies based on rules or learning algorithms on monitoring metrics. However, complexity of nowadays cloud platforms, high dynamics of cloud workloads and thousands of various types of metrics make anomalous behavior detection more challenging to be applied in production, especially in large scale cloud production environments. In this paper, we present a practical cloud anomalous behavior detection system called DriftInsight. It firstly analyzes multi-denominational metrics of each component and identifies a set of representative steady components based on the convergences of their states. Then it generates a state model and a state transit model for each steady cloud component. Finally, it detects behavior anomalies of these steady components in near real-time and meanwhile evolve behavior models on the fly. The evaluation results of this approach in a commercial large-scale PaaS (Platform-as-a-Service) cloud are demonstrated its capability and efficiency.
Year
DOI
Venue
2017
10.1109/CLOUD.2017.37
2017 IEEE 10th International Conference on Cloud Computing (CLOUD)
Keywords
Field
DocType
cloud computing,behavior model,anomaly detection,unsupervised clustering
Anomaly detection,Computer science,Service-level agreement,On the fly,Real-time computing,State model,Cloud testing,Distributed computing,Cloud computing,Anomalous behavior
Conference
ISSN
ISBN
Citations 
2159-6182
978-1-5386-1994-0
2
PageRank 
References 
Authors
0.38
12
4
Name
Order
Citations
PageRank
Fan Jing Meng1255.02
xiao zhang210330.75
Pengfei Chen3132.66
Jing Min Xu46710.98