Title
Classification of Evolving Data Streams with Infinitely Delayed Labels
Abstract
Most classification algorithms for evolving data streams assume that the actual labels of predicted examples become available immediately after each prediction. However, because of high labeling costs, dependence on an expert, limitations in data transmission, or restrictions imposed by the nature of the problem, many real-world applications have infinitely delayed labels, that is, the actual labels never become available. Such cases require algorithms that do not follow the traditional process of monitoring the error rate to detect changes in the data distribution and using the most recent labeled data to update the classification model. In this paper, we propose MClassification, a method for classifying evolving data streams with infinitely delayed labels. Our method is inspired by the Micro-Cluster representation used in online clustering algorithms. To handle incremental drifts, our approach uses a distance-based strategy to keep the Micro-Clusters' positions up to date. An evaluation on several synthetic and real datasets shows that MClassification achieves accuracy competitive with state-of-the-art methods at an adequate computational cost. The main advantage of the proposed method is that, unlike rival methods, it has no critical parameters that require prior knowledge from the user.
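The general idea of classifying with labeled micro-clusters and keeping their positions current without new labels can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's MClassification algorithm: the class `MicroCluster`, the function `classify_and_update`, and the `max_radius` threshold are hypothetical names introduced here, and the abstract does not specify the actual update rule. Each micro-cluster stores the usual cluster-feature summary (count, linear sum, squared sum) plus a class label; a new example takes the label of the nearest micro-cluster and is then absorbed into it, which shifts the centroid and lets the model follow incremental drift.

```python
import math


class MicroCluster:
    """Cluster-feature summary (n, linear sum, squared sum) plus a class label."""

    def __init__(self, point, label):
        self.n = 1
        self.ls = list(point)               # per-dimension linear sum
        self.ss = [x * x for x in point]    # per-dimension squared sum
        self.label = label

    def centroid(self):
        return [s / self.n for s in self.ls]

    def radius_if_added(self, point):
        """Radius (root of summed per-dimension variance) after absorbing point."""
        n = self.n + 1
        var = 0.0
        for s, q, x in zip(self.ls, self.ss, point):
            mean = (s + x) / n
            var += max((q + x * x) / n - mean * mean, 0.0)
        return math.sqrt(var)

    def add(self, point):
        self.n += 1
        self.ls = [s + x for s, x in zip(self.ls, point)]
        self.ss = [q + x * x for q, x in zip(self.ss, point)]


def classify_and_update(clusters, point, max_radius=1.0):
    """Predict with the nearest micro-cluster, then update the model unsupervised.

    max_radius is an assumption of this sketch, not a parameter from the paper.
    """
    nearest = min(clusters, key=lambda mc: math.dist(mc.centroid(), point))
    label = nearest.label
    if nearest.radius_if_added(point) <= max_radius:
        nearest.add(point)                           # centroid moves toward point
    else:
        clusters.append(MicroCluster(point, label))  # new cluster, predicted label
    return label


# Two initial labeled micro-clusters, then an unlabeled stream.
clusters = [MicroCluster([0.0, 0.0], "a"), MicroCluster([5.0, 5.0], "b")]
stream = [[0.2, 0.1], [0.4, 0.3], [4.8, 5.1], [0.6, 0.5]]
labels = [classify_and_update(clusters, p) for p in stream]
```

After the stream is processed, the centroid of the first micro-cluster has moved away from its initial position toward the recent examples, which is the behavior a distance-based update is meant to provide when true labels never arrive.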
Year
2015
DOI
10.1109/ICMLA.2015.174
Venue
2015 IEEE 14th International Conference on Machine Learning and Applications (ICMLA)
Keywords
data stream, classification, delayed labels, extreme verification latency, online clustering
Field
Data mining, Data stream mining, Data transmission, Data stream, Computer science, Artificial intelligence, Labeled data, Cluster analysis, Data stream clustering, Pattern recognition, Word error rate, Statistical classification, Machine learning
DocType
Conference
Citations
5
PageRank
0.43
References
8
Authors
4
Name                       Order  Citations  PageRank
Vinícius M. A. de Souza    1      33         6.14
Diego F. Silva             2      148        14.29
Gustavo E. Batista         3      1928       92.83
João Gama                  4      3785       271.37