Title
A local multiscale probabilistic graphical model for data validation and reconstruction, and its application in industry.
Abstract
The detection and subsequent reconstruction of incongruent data in time series by observing statistically related information is a recurrent issue in data validation. Unlike outliers, incongruent observations are not necessarily confined to the extremes of the data distribution. Instead, these rogue observations are values that are unlikely in light of statistically related information. This paper proposes a multiresolution Bayesian network model for the detection of rogue values and the subsequent reconstruction of the erroneous samples in non-stationary time series. Our method builds local Bayesian network models that best fit segments of the data in order to achieve a finer discretization and hence improve data reconstruction. Our local multiscale approach is compared against its single-scale global predecessor (taken as our gold standard) in terms of predictive power; both error detection and error reconstruction capabilities are assessed. The parameterization and verification of the model are evaluated over three synthetic data source topologies. The algorithm is then further tested on real data from the steel industry, where the aforementioned problem characteristics are met but the ground truth is unknown. The proposed local multiscale approach was found to deal better with increasing complexity in data topologies.
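The intuition that fitting models to segments permits a finer discretization can be illustrated with a minimal sketch. This is not the paper's Bayesian network model. It only compares one global equal-width discretization against per-segment discretizations of a synthetic non-stationary series, using the same number of bins in both cases. The function names, the synthetic series, the bin count, and the segment length are all assumptions chosen for illustration.

```python
import math


def quantize(values, lo, hi, n_bins):
    """Replace each value by the centre of its equal-width bin on [lo, hi]."""
    width = (hi - lo) / n_bins
    out = []
    for v in values:
        # Clamp so that v == hi falls into the last bin.
        idx = min(int((v - lo) / width), n_bins - 1)
        out.append(lo + (idx + 0.5) * width)
    return out


def mean_abs_error(a, b):
    """Mean absolute difference between two equal-length sequences."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


# Hypothetical non-stationary series: a sinusoid on a rising trend.
series = [0.1 * t + math.sin(t / 5.0) for t in range(200)]

n_bins = 8        # assumed number of discrete states per model
segment_len = 50  # assumed segment length for the local models

# Global: one discretization spanning the range of the whole series.
global_rec = quantize(series, min(series), max(series), n_bins)

# Local: the same number of bins, but spanning only each segment's range.
local_rec = []
for start in range(0, len(series), segment_len):
    seg = series[start:start + segment_len]
    local_rec.extend(quantize(seg, min(seg), max(seg), n_bins))

global_err = mean_abs_error(series, global_rec)
local_err = mean_abs_error(series, local_rec)
print(f"global error: {global_err:.3f}, local error: {local_err:.3f}")
```

Because each segment covers a narrower value range than the full series, its bins are narrower, so replacing a value by its bin centre loses less information in this toy setting.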
Year
2018
DOI
10.1016/j.engappai.2018.01.001
Venue
Engineering Applications of Artificial Intelligence
Keywords
Bayesian networks, Data validation, Multiscale approach, Outlier detection, Probabilistic graphical models
Field
Data mining, Data validation, Computer science, Outlier, Error detection and correction, Ground truth, Synthetic data, Bayesian network, Artificial intelligence, Probabilistic logic, Graphical model, Machine learning
DocType
Journal
Volume
70
ISSN
0952-1976
Citations
0
PageRank
0.34
References
6
Authors
7