Abstract | ||
---|---|---|
Cloud co-location is now widely used in data centers to improve the utilization of computing resources. However, batch jobs in a Co-location Datacenter (CLD) are vulnerable to failure because they compete with online service jobs for limited resources. Failed batch jobs are rescheduled and fail repeatedly, which wastes computing resources and destabilizes the computing clusters. We therefore propose a method to accurately predict potential batch job failures in a CLD. The core of the proposed method is STLF (SMOTE Tomek and LightGBM [5] Framework), which consists of three parts. First, we use a co-feature extraction method to generate a Co-located Feature Dataset (CLFD). Then, SMOTE Tomek is used to oversample the CLFD so that the classifier can learn more features of the minority class. Finally, we use a LightGBM classifier to predict batch job failures. Performance experiments conducted on the Ali Trace 2018 dataset show that the proposed STLF significantly outperforms existing popular classifiers in terms of the ROC curve, the area under the ROC curve (AUC), precision, and recall. |
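
The resampling step named in the abstract can be illustrated with a minimal, standard-library-only sketch. It is not the authors' implementation: the function names (`nearest`, `smote`, `remove_tomek_links`), the toy 2-D data, and the choice to drop both members of each Tomek link are assumptions made for illustration. In practice one would more likely use a maintained implementation such as `SMOTETomek` from imbalanced-learn and then train a LightGBM classifier on the resampled features.

```python
import math
import random


def nearest(idx, points, candidates, k=1):
    """Return the k indices in `candidates` whose points are closest to points[idx]."""
    others = [j for j in candidates if j != idx]
    others.sort(key=lambda j: math.dist(points[idx], points[j]))
    return others[:k]


def smote(X, y, minority, k=3, seed=0):
    """Add synthetic minority samples (binary labels) until both classes are equal in size."""
    rng = random.Random(seed)
    min_idx = [i for i, label in enumerate(y) if label == minority]
    n_new = (len(y) - len(min_idx)) - len(min_idx)
    X_new, y_new = list(X), list(y)
    for _ in range(max(n_new, 0)):
        i = rng.choice(min_idx)                    # pick a minority sample
        j = rng.choice(nearest(i, X, min_idx, k))  # and one of its minority neighbours
        gap = rng.random()                         # interpolate between the two
        X_new.append(tuple(a + gap * (b - a) for a, b in zip(X[i], X[j])))
        y_new.append(minority)
    return X_new, y_new


def remove_tomek_links(X, y):
    """Drop both members of every Tomek link (mutual nearest neighbours with different labels)."""
    all_idx = range(len(X))
    nn = [nearest(i, X, all_idx)[0] for i in all_idx]
    drop = {i for i in all_idx if nn[nn[i]] == i and y[i] != y[nn[i]]}
    keep = [i for i in all_idx if i not in drop]
    return [X[i] for i in keep], [y[i] for i in keep]


# Toy data (hypothetical): 6 "successful" jobs (label 0) and 3 "failed" jobs (label 1).
X = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (0.3, 0.3), (0.2, 0.4), (0.4, 0.1),
     (2.0, 2.0), (2.1, 2.2), (2.2, 2.0)]
y = [0] * 6 + [1] * 3

X_bal, y_bal = smote(X, y, minority=1)
X_clean, y_clean = remove_tomek_links(X_bal, y_bal)

# A separate toy case in which points 1 and 2 form a Tomek link.
X_link, y_link = remove_tomek_links([(0.0, 0.0), (1.0, 0.0), (1.1, 0.0), (5.0, 0.0)],
                                    [0, 0, 1, 1])
```

SMOTE creates each synthetic point on the segment between a minority sample and one of its minority neighbours, so the new points stay inside the region already occupied by the minority class. The Tomek link pass then removes pairs of opposite-class samples that sit directly against each other on the class boundary.
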
Year | DOI | Venue |
---|---|---|
2020 | 10.1109/ICPADS51040.2020.00080 | 2020 IEEE 26th International Conference on Parallel and Distributed Systems (ICPADS) |
Keywords | DocType | ISSN |
---|---|---|
cloud computing, co-located datacenter, failure prediction, resource efficiency, datacenter | Conference | 1521-9097 |
ISBN | Citations | PageRank |
---|---|---|
978-1-7281-8382-4 | 0 | 0.34 |
References | Authors | |
---|---|---|
0 | 6 |
Name | Order | Citations | PageRank |
---|---|---|---|
Yurui Li | 1 | 0 | 0.34 |
Weiwei Lin | 2 | 8 | 1.85 |
Keqin Li | 3 | 2778 | 242.13 |
James Z. Wang | 4 | 0 | 0.34 |
Fagui Liu | 5 | 23 | 6.06 |
Jie Liu | 6 | 199 | 22.56 |