Title
"Memory loss" in commodity hardware?: predicting DIMM failures with machine learning.
Abstract
Failures of memory modules have been a concern for a long time, as they are costly both in terms of hardware replacement and service disruption. These failures can be preceded by correctable (soft) and then uncorrectable (hard) errors, which accumulate over time. Valuable large scale studies of DIMM errors in the wild [2, 1] analyze in depth hard and soft errors and their correlations with specific sensors. However, little has been reported on how these findings could be used to automatically predict future DIMM failures. We show that by understanding which factors drive such failures, we can build intelligent predictive models with off-the-shelf machine learning techniques to predict DIMM failures ahead of time with high accuracy. Such models not only provide early signs of failures, but also allow administrators to proactively replace DIMMs at risk weeks in advance, thus avoiding "memory loss" of their commodity hardware.
Year
2017
DOI
10.1145/3078468.3078486
Venue
SYSTOR
Keywords
Failure prediction, Memory systems, Machine learning
Field
DIMM, Computer science, Artificial intelligence, Memory systems, Commodity hardware, Machine learning
DocType
Conference
Citations
0
PageRank
0.34
References
0
Authors
3
Name
Order
Citations
PageRank
Ioana Giurgiu 1 21314.09
Dorothea Wiesmann 2 414.30
John Bird 3 50.78