Title
Utility Cloud: A Novel Approach for Diagnosis and Self-healing Based on the Uncertainty in Anomalous Metrics
Abstract
Diagnosing and self-healing anomalies is an important aspect of operations in data centers and of computing in utility clouds. Data centers are growing in size and host heterogeneous, complex applications, software, and workloads, so anomaly and fault detection must operate automatically, with self-healing at runtime. Moreover, it must do so without requiring prior knowledge of normal or anomalous performance. Diagnosis must function at different levels of abstraction, such as hardware and software, and across multiple time-series metrics to be usable in environments like cloud computing. Classifying and analyzing qualitative metrics by their distribution, instead of simply applying individual metric thresholds, can offer new methods for anomaly and fault diagnosis. Categorical counterparts (binning) and the normal distribution are used as a measurement that captures how concentrated or dispersed the distributions are, so that raw metric data gathered across the cloud stack can be turned into a time series of entropy (e.g., symptoms). These symptoms can be organized hierarchically and across many cloud ecosystems to increase scalability. They can also address the uncertainty in anomaly data. This contribution is possible because the multi-value decision diagram (MDD) for the structure of Multiple-Valued Logic (MVL) is shown to be highly efficient in terms of rapid analytical performance and is adapted to integrate the effects of measurements for multiple values. A Naive Bayes Classifier (NBC) with an influence diagram (ID) is used to create a time-series diagnosis technique that categorizes and detects anomalies and faults at runtime and approximates the influence of each self-healing system component on system functioning and reliability. Experimental results for utility cloud service scenarios show the viability of the presented approach.
The presented approach outperforms other methods, with an average improvement of 0.89% in anomaly diagnosis accuracy over threshold methods and an average improvement of 0.04% in false alarm rate relative to the near-optimum threshold method.
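The abstract describes turning raw metric samples into a time series of entropy by binning them and measuring how concentrated or dispersed the resulting distribution is. The following is only a minimal sketch of that idea, not the authors' implementation: it assumes equal-width bins, Shannon entropy in bits, and non-overlapping windows, and the names `bin_index`, `shannon_entropy`, and `entropy_series` are hypothetical.

```python
import math
from collections import Counter


def bin_index(value, lo, hi, n_bins):
    """Map a raw metric value to an equal-width bin index in [0, n_bins - 1].

    Assumption: equal-width bins over a known range [lo, hi]; values
    outside the range are clamped to the first or last bin.
    """
    if hi <= lo:
        return 0
    idx = int((value - lo) / (hi - lo) * n_bins)
    return min(max(idx, 0), n_bins - 1)


def shannon_entropy(symbols):
    """Shannon entropy (in bits) of the empirical distribution of symbols."""
    counts = Counter(symbols)
    total = len(symbols)
    # sum of p * log2(1 / p) over the observed bins
    return sum((c / total) * math.log2(total / c) for c in counts.values())


def entropy_series(samples, window, n_bins, lo, hi):
    """Entropy of the binned samples over consecutive non-overlapping windows."""
    binned = [bin_index(v, lo, hi, n_bins) for v in samples]
    return [
        shannon_entropy(binned[i:i + window])
        for i in range(0, len(binned) - window + 1, window)
    ]


if __name__ == "__main__":
    # Hypothetical CPU-utilisation samples (percent): a steady window
    # followed by a widely dispersed one.
    cpu = [10, 10, 10, 10, 0, 30, 60, 90]
    print(entropy_series(cpu, window=4, n_bins=4, lo=0, hi=100))
```

In this sketch a steady window falls into a single bin and yields low entropy, while a window whose samples spread across all bins yields high entropy; a downstream classifier would then consume such a series as its symptoms.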
Year
2017
DOI
10.1145/3034950.3034967
Venue
Proceedings of the 2017 International Conference on Management Engineering, Software Engineering and Service Sciences
DocType
Conference
ISBN
978-1-4503-4834-8
Citations
0
PageRank
0.34
References
1
Authors
3
Name            Order  Citations  PageRank
Ameen Alkasem   1      1          0.70
Hongwei Liu     2      376        63.93
De-Cheng Zuo    3      86         18.87