Title
A Nested Two-Stage Clustering Method For Structured Temporal Sequence Data
Abstract
Mining patterns in temporal sequence data is an important problem across many disciplines. With appropriate preprocessing, a structured temporal sequence can be represented as a probability measure or as a time series, and each representation can reveal distinctive temporal pattern characteristics. In this paper, we propose a nested two-stage clustering method that integrates optimal transport and dynamic time warping distances to learn distributional and dynamic shape-based dissimilarity, respectively, at its two stages. The proposed clustering algorithm preserves both the distribution and shape patterns present in the data, which are critical for datasets composed of structured temporal sequences. We evaluate the method against existing agglomerative and K-shape-based clustering algorithms on Monte Carlo simulated synthetic datasets, comparing performance using several cluster validation metrics. Furthermore, we apply the method to real-world datasets from three domains: temporal dietary records, online retail sales, and smart meter energy profiles. The expressiveness of the resulting cluster and subcluster centroid patterns shows that the method holds significant promise for mining structured temporal sequence data.
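The nested two-stage idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not say which clustering algorithm runs at each stage, so average-linkage agglomerative clustering is assumed here. Stage 1 is assumed to use the 1-D optimal transport (Wasserstein-1) distance between histograms on a shared grid, and stage 2 is assumed to use a plain dynamic time warping distance inside each stage-1 cluster. All function names and the toy data are hypothetical.

```python
from itertools import combinations


def wasserstein_1d(p, q):
    """W1 between two histograms on the same unit-spaced grid (each sums to 1).

    In one dimension, W1 equals the L1 distance between the two CDFs.
    """
    cp = cq = total = 0.0
    for a, b in zip(p, q):
        cp += a
        cq += b
        total += abs(cp - cq)
    return total


def dtw(x, y):
    """Classic dynamic time warping with absolute-difference local cost."""
    inf = float("inf")
    prev = [0.0] + [inf] * len(y)
    for xi in x:
        cur = [inf]
        for j, yj in enumerate(y, start=1):
            cur.append(abs(xi - yj) + min(prev[j], prev[j - 1], cur[j - 1]))
        prev = cur
    return prev[-1]


def agglomerate(items, dist, k):
    """Average-linkage agglomerative clustering (an assumed choice).

    Returns k clusters, each a list of indices into `items`.
    """
    n = len(items)
    d = {(i, j): dist(items[i], items[j]) for i, j in combinations(range(n), 2)}
    clusters = [[i] for i in range(n)]

    def link(a, b):
        return sum(d[min(i, j), max(i, j)] for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > k:
        a, b = min(combinations(range(len(clusters)), 2),
                   key=lambda ab: link(clusters[ab[0]], clusters[ab[1]]))
        clusters[a] += clusters.pop(b)  # a < b, so index a stays valid
    return clusters


def nested_two_stage(measures, series, k_outer, k_inner):
    """Stage 1: cluster by OT distance. Stage 2: split each cluster by DTW."""
    result = []
    for members in agglomerate(measures, wasserstein_1d, k_outer):
        sub = agglomerate([series[i] for i in members], dtw,
                          min(k_inner, len(members)))
        result.append([[members[i] for i in s] for s in sub])
    return result


# Hypothetical toy data: each sequence has a histogram over four time bins
# (its distributional view) and a short time series (its shape view).
measures = [
    [0.7, 0.2, 0.1, 0.0], [0.6, 0.3, 0.1, 0.0], [0.7, 0.1, 0.2, 0.0],
    [0.0, 0.1, 0.2, 0.7], [0.0, 0.1, 0.3, 0.6], [0.1, 0.0, 0.2, 0.7],
]
series = [
    [0, 1, 3, 1, 0], [0, 1, 1, 3, 1, 0], [3, 0, 3, 0, 3],
    [0, 1, 3, 1, 0], [3, 0, 3, 0, 3], [3, 0, 0, 3, 0, 3],
]

result = nested_two_stage(measures, series, k_outer=2, k_inner=2)
print(result)
```

In this toy setup the first stage separates sequences whose mass sits early in the grid from those whose mass sits late, and the second stage separates single-peak from oscillating shapes within each group. DTW tolerates the differing series lengths, which is why a stretched copy of a shape stays in the same subcluster as the original.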
Year: 2021
DOI: 10.1007/s10115-021-01578-0
Venue: KNOWLEDGE AND INFORMATION SYSTEMS
Keywords: Clustering, Optimal transport, Dynamic time warping, Structured temporal sequence
DocType: Journal
Volume: 63
Issue: 7
ISSN: 0219-1377
Citations: 0
PageRank: 0.34
References: 0
Authors: 5
Name                Order  Citations  PageRank
Liang Wang          1      0          0.34
Vignesh Narayanan   2      10         5.19
Yao-Chi Yu          3      0          0.34
Yikyung Park        4      0          0.34
Shin Li Jr.         5      112        19.45