Abstract | ||
---|---|---|
In a data stream environment, most conventional clustering algorithms are not sufficiently efficient, since large volumes of data arrive continuously and the data points evolve over time. The problems of clustering time-evolving metric data and time-evolving categorical data have each been well explored in recent years, but clustering mixed-type time-evolving data remains challenging because metric and categorical attributes differ in structure. In this paper, we devise a generalized framework, termed Equi-Clustream, to dynamically cluster mixed-type time-evolving data. It comprises three algorithms: a Hybrid Drifting Concept Detection Algorithm that detects the drifting concept between the current and previous sliding windows; a Hybrid Data Labeling Algorithm that assigns an appropriate cluster label to each data vector of the current non-drifting window based on the clustering result of the previous sliding window; and a visualization algorithm that analyses the relationship between the clusters at different timestamps and visualizes their evolving trends. The efficacy of the proposed framework is shown by experiments on synthetic and real-world datasets. |
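
The abstract does not give the details of the Hybrid Drifting Concept Detection Algorithm, so the following is only a generic sketch of the underlying idea: comparing two consecutive sliding windows of mixed-type data. It is not the Equi-Clustream method. The function names, the `scales` normalisation, the use of means and total variation distance, and the `threshold` value are all assumptions made for illustration.

```python
from collections import Counter


def summarize(window, numeric_keys, categorical_keys):
    """Return per-attribute means (numeric) and relative frequencies (categorical)."""
    n = len(window)
    means = {k: sum(row[k] for row in window) / n for k in numeric_keys}
    freqs = {}
    for k in categorical_keys:
        counts = Counter(row[k] for row in window)
        freqs[k] = {value: c / n for value, c in counts.items()}
    return means, freqs


def window_distance(prev, curr, numeric_keys, categorical_keys, scales):
    """Average per-attribute dissimilarity between two windows, in [0, 1].

    Assumption: numeric attributes are compared by the shift in their means,
    divided by a user-supplied scale and capped at 1; categorical attributes
    are compared by the total variation distance between value frequencies.
    """
    prev_means, prev_freqs = summarize(prev, numeric_keys, categorical_keys)
    curr_means, curr_freqs = summarize(curr, numeric_keys, categorical_keys)

    parts = []
    for k in numeric_keys:
        shift = abs(prev_means[k] - curr_means[k]) / scales[k]
        parts.append(min(1.0, shift))
    for k in categorical_keys:
        values = set(prev_freqs[k]) | set(curr_freqs[k])
        tv = 0.5 * sum(
            abs(prev_freqs[k].get(v, 0.0) - curr_freqs[k].get(v, 0.0))
            for v in values
        )
        parts.append(tv)
    return sum(parts) / len(parts)


def is_drifting(prev, curr, numeric_keys, categorical_keys, scales, threshold=0.3):
    """Flag the current window as drifting when the distance exceeds the threshold.

    The threshold of 0.3 is an arbitrary illustrative value.
    """
    distance = window_distance(prev, curr, numeric_keys, categorical_keys, scales)
    return distance > threshold
```

In a pipeline of the kind the abstract describes, a window flagged as non-drifting would have its points labeled from the previous window's clusters, while a drifting window would trigger re-clustering.
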
Year | DOI | Venue |
---|---|---|
2018 | 10.1007/s11634-018-0316-3 | Adv. Data Analysis and Classification |
Keywords | Field | DocType |
Clustering, Data streams, Time-evolving data, Data mining, 62-07, 62H30 | Data point, Data stream mining, Sliding window protocol, Pattern recognition, Data stream, Visualization, Categorical variable, Timestamp, Artificial intelligence, Cluster analysis, Mathematics | Journal |
Volume | Issue | ISSN |
12 | 4 | 1862-5347 |
Citations | PageRank | References |
1 | 0.34 | 30 |
Authors | ||
2 |
Name | Order | Citations | PageRank |
---|---|---|---|
Ravi Sankar | 1 | 656 | 55.66 |
H. Om | 2 | 99 | 17.56 |