Title
Mining Multivariate Discrete Event Sequences for Knowledge Discovery and Anomaly Detection
Abstract
Modern physical systems deploy large numbers of sensors that record, at different timestamps, the status of different system components via measurements such as temperature, pressure, and speed, as well as each component's categorical state. Depending on the measurement values, there are two kinds of sequences: continuous and discrete. For continuous sequences, there is a host of state-of-the-art anomaly detection algorithms based on time-series analysis, but effective methodologies tailored specifically to discrete event sequences are lacking. This paper proposes an analytics framework for knowledge discovery and anomaly detection on discrete event sequences. During the training phase, the framework extracts pairwise relationships among discrete event sequences using a neural machine translation model, viewing each discrete event sequence as a "natural language". The relationship between two sequences is quantified by how well one discrete event sequence is "translated" into the other. These pairwise relationships are aggregated into a multivariate relationship graph that clusters the structural knowledge of the underlying system and essentially discovers the hidden relationships among discrete sequences. This graph quantifies system behavior during normal operation. During testing, if one or more pairwise relationships are violated, an anomaly is detected. The proposed framework is evaluated on two real-world datasets: a proprietary dataset collected from a physical plant, where it is shown to be effective in extracting pairwise sensor relationships for knowledge discovery and anomaly detection, and a public hard disk drive dataset, where its ability to predict upcoming disk failures is illustrated.
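The train-then-test flow described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the pairwise translation scores have already been computed by some translation model and normalized to [0, 1], and the function names, the `min_score` cutoff, and the `tolerance` margin are all hypothetical choices made for illustration.

```python
# Illustrative sketch only: scores are assumed to be precomputed translation
# quality values in [0, 1]; the thresholds below are hypothetical.

def build_relationship_graph(train_scores, min_score=0.8):
    """Keep directed edges (source, target) whose training score is high.

    train_scores maps (source_sensor, target_sensor) -> score observed
    during normal operation.
    """
    return {pair: s for pair, s in train_scores.items() if s >= min_score}


def broken_relationships(graph, test_scores, tolerance=0.2):
    """Return edges whose test score fell more than `tolerance` below training.

    A pair missing from test_scores is treated as a score of 0.0.
    """
    return sorted(
        pair
        for pair, trained in graph.items()
        if test_scores.get(pair, 0.0) < trained - tolerance
    )


def is_anomalous(graph, test_scores, tolerance=0.2, min_broken=1):
    """Flag an anomaly when at least `min_broken` relationships are violated."""
    return len(broken_relationships(graph, test_scores, tolerance)) >= min_broken


# Hypothetical scores for three sensors s1, s2, s3.
train_scores = {("s1", "s2"): 0.95, ("s2", "s1"): 0.90, ("s1", "s3"): 0.40}
graph = build_relationship_graph(train_scores)

test_scores = {("s1", "s2"): 0.93, ("s2", "s1"): 0.50}
violations = broken_relationships(graph, test_scores)
anomaly = is_anomalous(graph, test_scores)
```

In this sketch the weak pair `("s1", "s3")` never enters the graph, so only relationships that held during normal operation can later be reported as violated.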
Year
2020
DOI
10.1109/DSN48063.2020.00067
Venue
2020 50th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN)
Keywords
anomaly detection, categorical event sequences, discrete event sequences, rare events, unsupervised learning, physical plant failures, disk failures
DocType
Conference
ISSN
1530-0889
ISBN
978-1-7281-5810-5
Citations
0
PageRank
0.34
References
18
Authors
5
Name              Order   Citations   PageRank
Bin Nie           1       27          5.56
Jianwu Xu         2       21          2.18
Jacob Alter       3       0           0.34
Haifeng Chen      4       761         64.79
Evgenia Smirni    5       1857        161.97