Title | ||
---|---|---|
Online anomaly detection framework for spark systems via stage-task behavior modeling |
Abstract | ||
---|---|---|
With the rapid growth of Big Data, Apache Spark has come into widespread use. However, as system scale grows, application delays caused by abnormal tasks or nodes have become a common problem in Spark systems. In this paper, we propose an anomaly detection approach based on stage-task behavior modeling. First, we assume that the abnormal behavior of tasks reflects the abnormal state of the node on which they run. Then, from the collected Spark runtime logs, we extract a four-dimensional feature vector related to task execution status and classify task behaviors as normal or abnormal; the distribution of abnormal tasks is then used to identify abnormal nodes. We build the online framework on Spark Streaming, and it can integrate offline learning methods such as logistic regression, a simple and powerful classifier for low-dimensional feature vectors. Our experiments show that the accuracy of real-time anomaly detection reaches about 91%, and the presented cases show that the framework is effective at detecting abnormal nodes. |
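
The pipeline outlined in the abstract (classify each task from a four-dimensional feature vector, then locate abnormal nodes from the distribution of abnormal tasks) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not name the four features, so the ones used here (duration, GC time, shuffle read and input size, each relative to the stage median) are hypothetical, as are the toy training data, the `min_share` threshold and all function names. It uses plain Python with a hand-written logistic regression classifier rather than Spark Streaming, to stay self-contained.

```python
import math
from collections import Counter

# ASSUMPTION: hypothetical features per task, each relative to its stage median:
# (duration ratio, GC-time ratio, shuffle-read ratio, input-size ratio).
NORMAL = [(1.0, 0.1, 1.0, 1.0), (0.9, 0.2, 1.1, 1.0),
          (1.1, 0.1, 0.9, 1.0), (1.0, 0.15, 1.0, 0.9)]
ABNORMAL = [(3.0, 0.8, 1.0, 1.0), (2.5, 0.9, 1.2, 1.1),
            (4.0, 0.7, 0.9, 1.0), (3.5, 0.6, 1.0, 1.0)]


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def score(features, weights, bias):
    """Predicted probability that a task is abnormal."""
    return sigmoid(sum(w * x for w, x in zip(weights, features)) + bias)


def train_logistic_regression(samples, labels, lr=0.2, epochs=5000):
    """Offline step: batch gradient descent on the mean log loss."""
    dim, n = len(samples[0]), len(samples)
    weights, bias = [0.0] * dim, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * dim, 0.0
        for x, y in zip(samples, labels):
            err = score(x, weights, bias) - y
            for j in range(dim):
                grad_w[j] += err * x[j]
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def is_abnormal(features, weights, bias, threshold=0.5):
    return score(features, weights, bias) >= threshold


def abnormal_nodes(tasks, weights, bias, min_share=0.5):
    """Online step: flag nodes whose share of abnormal tasks is >= min_share.

    `tasks` is an iterable of (node_name, features); `min_share` is an
    illustrative threshold, not a value taken from the paper.
    """
    total, abnormal = Counter(), Counter()
    for node, features in tasks:
        total[node] += 1
        if is_abnormal(features, weights, bias):
            abnormal[node] += 1
    return sorted(n for n in total if abnormal[n] / total[n] >= min_share)


weights, bias = train_logistic_regression(
    NORMAL + ABNORMAL, [0] * len(NORMAL) + [1] * len(ABNORMAL))

# One hypothetical micro-batch of finished tasks, tagged with their executor node.
batch = [("node-1", NORMAL[0]), ("node-1", NORMAL[1]),
         ("node-2", ABNORMAL[0]), ("node-2", ABNORMAL[3]), ("node-2", NORMAL[3]),
         ("node-3", NORMAL[2])]
flagged = abnormal_nodes(batch, weights, bias)
print(flagged)
```

In a streaming deployment, `abnormal_nodes` would presumably be applied to each micro-batch of parsed log records, with the weights loaded from the offline training run.
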
Year | DOI | Venue |
---|---|---|
2018 | 10.1145/3203217.3203265 | Computing Frontiers Conference |
Keywords | Field | DocType |
Spark system, Realtime anomaly detection, Offline logical regression, Feature extraction | Offline learning, Data mining, Anomaly detection, Feature vector, Spark (mathematics), Computer science, Abnormality, Feature extraction, Real-time computing, Classifier (linguistics), Big data | Conference |
Citations | PageRank | References |
1 | 0.36 | 3 |
Authors | ||
---|---|---|
3 |
Name | Order | Citations | PageRank |
---|---|---|---|
Rui Ren | 1 | 39 | 6.66 |
Shuai Tian | 2 | 1 | 0.36 |
Lei Wang | 3 | 577 | 46.85 |