Title
Temporal Dynamic Matrix Factorization for Missing Data Prediction in Large Scale Coevolving Time Series.
Abstract
Missing data occur frequently in collections of time series in practical applications and are a major obstacle to precise data analysis. However, most existing methods are either infeasible or inefficient for predicting missing values in large-scale coevolving time series. The evolution of the time series must also be handled properly to adapt to their temporal characteristics, and data are now generated in many areas at greater volume than ever before. In this paper, we take up the challenge of missing data prediction in coevolving time series by employing temporal dynamic matrix factorization techniques. First, our approaches are designed to exploit both the internal patterns of each time series and the information shared across time series from multiple sources to build an initial model. Based on this idea, we impose hybrid regularization terms to constrain the objective functions of matrix factorization. Then, temporal dynamic matrix factorization is proposed to effectively update the already trained initial models. During dynamic matrix factorization, batch updating and fine-tuning strategies are also employed to build an effective and efficient model. Extensive experiments on real-world data sets and a synthetic data set demonstrate that the proposed approaches can effectively improve the performance of missing data prediction. Even when the missing ratio reaches as high as 90%, our proposed methods still show low prediction errors. The dynamic performance results show that the methods achieve satisfactory effectiveness and efficiency. Furthermore, we demonstrate how to take advantage of the high processing power of Apache Spark to perform missing data prediction in large-scale coevolving time series.
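The general idea in the abstract, factorizing a partially observed matrix of coevolving series under a combined (hybrid) regularizer, can be illustrated with a small sketch. This is not the authors' algorithm: the function names (`factorize`, `predict`), the L2 plus temporal-smoothness penalty, the plain SGD loop, and all hyperparameter values are assumptions chosen for illustration, and the dynamic updating and Spark parts of the paper are not covered.

```python
import math
import random


def factorize(X, rank=2, epochs=500, lr=0.01, lam=0.01, mu=0.05, seed=0):
    """Fit X ~ U V^T using observed entries only (None marks a missing value).

    Illustrative penalty (an assumption, not the paper's objective):
    lam * L2 norm of the factors + mu * smoothness of the time factors V.
    """
    rng = random.Random(seed)
    n, T = len(X), len(X[0])
    U = [[rng.gauss(0, 0.1) for _ in range(rank)] for _ in range(n)]
    V = [[rng.gauss(0, 0.1) for _ in range(rank)] for _ in range(T)]
    observed = [(i, t) for i in range(n) for t in range(T) if X[i][t] is not None]
    for _ in range(epochs):
        rng.shuffle(observed)
        for i, t in observed:
            err = X[i][t] - sum(U[i][k] * V[t][k] for k in range(rank))
            for k in range(rank):
                u, v = U[i][k], V[t][k]
                # Temporal term: pull the factor at time t toward its neighbours.
                smooth = 0.0
                if t > 0:
                    smooth += v - V[t - 1][k]
                if t < T - 1:
                    smooth += v - V[t + 1][k]
                U[i][k] += lr * (err * v - lam * u)
                V[t][k] += lr * (err * u - lam * v - mu * smooth)
    return U, V


def predict(U, V, i, t):
    """Reconstruct entry (i, t) from the learned factors."""
    return sum(a * b for a, b in zip(U[i], V[t]))


# Toy data (hypothetical): four series that are scaled copies of one smooth signal.
base = [math.sin(0.3 * t) + 2.0 for t in range(20)]
scales = [1.0, 2.0, 0.5, 1.5]
truth = [[s * b for b in base] for s in scales]

# Hide a few entries, fit on the rest, then predict the hidden ones.
hidden = [(0, 5), (1, 10), (2, 3), (3, 15), (1, 11)]
X = [row[:] for row in truth]
for i, t in hidden:
    X[i][t] = None

U, V = factorize(X)
for i, t in hidden:
    print((i, t), round(truth[i][t], 3), round(predict(U, V, i, t), 3))
```

Because the series share one underlying signal, the time factors learned from the observed series at a given time step carry over to the series whose value is hidden there. This is the cross-series information the abstract refers to, while the smoothness term stands in for the within-series temporal pattern.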
Year: 2016
DOI: 10.1109/ACCESS.2016.2606242
Venue: IEEE ACCESS
Keywords: Matrix factorization, missing data prediction, time series, Apache Spark
Field: Time series, Data modeling, Data mining, Data set, Spark (mathematics), Computer science, Matrix decomposition, Synthetic data, Linear programming, Missing data
DocType: Journal
Volume: 4
ISSN: 2169-3536
Citations: 3
PageRank: 0.39
References: 13
Authors: 7
Name            Order  Citations  PageRank
WeiWei Shi      1      51         12.26
Yongxin Zhu     2      466        58.07
Philip S. Yu    3      30670      3474.16
Tian Huang      4      53         7.40
Chang Wang      5      33         12.55
Yishu Mao       6      3          0.39
Yufeng Chen     7      50         5.37