Title
SEFEE: lightweight storage error forecasting in large-scale enterprise storage systems
Abstract
With the rapid growth in scale and complexity, today's enterprise storage systems must deal with a significant number of errors. Existing proactive methods mainly rely on machine learning models trained on SMART measurements. However, such methods are usually expensive to use in practice and can only be applied to a limited set of error types at a limited scale. We collected more than 23 million storage events over two years from 87 deployed NetApp ONTAP systems managing 14,371 disks, and we propose SEFEE, a lightweight, training-free storage error forecasting method. SEFEE employs tensor decomposition to analyze storage error-event logs directly and to perform online error prediction for all error types on all storage nodes. SEFEE exploits hidden spatio-temporal information that is deeply embedded in the global scale of storage systems to achieve record-breaking error forecasting accuracy with minimal prediction overhead.
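The abstract does not spell out the decomposition itself, so the following is only a minimal sketch of the general idea, not SEFEE's actual algorithm. It assumes the error-event log has been aggregated into a 3-way count tensor (storage node x error type x time window), fits a plain CP decomposition with alternating least squares, and forecasts the next window by averaging the most recent temporal factors. The names `cp_als` and `forecast_next`, the tensor layout, the rank, and the averaging rule are all hypothetical choices made for illustration, and the data is synthetic.

```python
import numpy as np

def khatri_rao(a, b):
    """Column-wise Kronecker product of a (I x R) and b (J x R) -> (I*J x R)."""
    rank = a.shape[1]
    return (a[:, None, :] * b[None, :, :]).reshape(-1, rank)

def unfold(x, mode):
    """Mode-n unfolding: bring `mode` to the front, flatten the other axes."""
    return np.moveaxis(x, mode, 0).reshape(x.shape[mode], -1)

def cp_als(x, rank, n_iter=200, seed=0):
    """Fit a rank-`rank` CP model to a 3-way tensor by alternating least squares."""
    rng = np.random.default_rng(seed)
    factors = [rng.random((dim, rank)) for dim in x.shape]
    for _ in range(n_iter):
        for mode in range(3):
            # Hold the other two factors fixed and solve least squares for this one.
            first, second = [factors[m] for m in range(3) if m != mode]
            kr = khatri_rao(first, second)
            gram = (first.T @ first) * (second.T @ second)
            factors[mode] = unfold(x, mode) @ kr @ np.linalg.pinv(gram)
    return factors

def forecast_next(nodes_f, types_f, time_f, window=3):
    """Naive forecast (an assumption): reuse the mean of the last temporal rows."""
    next_row = time_f[-window:].mean(axis=0)
    return np.einsum("ir,jr,r->ij", nodes_f, types_f, next_row)

# Synthetic rank-2 "event count" tensor standing in for real logs.
rng = np.random.default_rng(42)
n_nodes, n_types, n_windows, true_rank = 6, 5, 12, 2
true = [rng.random((d, true_rank)) for d in (n_nodes, n_types, n_windows)]
events = np.einsum("ir,jr,kr->ijk", *true)

nodes_f, types_f, time_f = cp_als(events, rank=2)
approx = np.einsum("ir,jr,kr->ijk", nodes_f, types_f, time_f)
rel_err = np.linalg.norm(events - approx) / np.linalg.norm(events)
forecast = forecast_next(nodes_f, types_f, time_f)  # one score per (node, error type)
```

Each entry of `forecast` is a score for one (node, error type) pair in the next window. No model is trained ahead of time: the factors are refit directly from the logged counts, which is the sense in which a decomposition-based approach can be called training-free.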
Year
2020
DOI
10.5555/3433701.3433786
Venue
The International Conference for High Performance Computing, Networking, Storage, and Analysis
DocType
Conference
Citations
0
PageRank
0.34
References
0
Authors
4
Name               Order  Citations  PageRank
Amirhessam Yazdi   1      0          0.34
Xing Lin           2      0          0.68
Lei Yang           3      0          0.34
Feng Yan           4      40         7.98