Title
A RANDOM DECISION TREE ENSEMBLE FOR MINING CONCEPT DRIFTS FROM NOISY DATA STREAMS
Abstract
Detecting concept drifts and reducing the impact of noise in real-world data stream applications is challenging but valuable for inductive learning, especially when time and space overheads must be kept low. Although many inductive learning algorithms based on ensemble classification models have been proposed for handling concept-drifting data streams, little attention has been paid to simultaneously detecting the different types of concept drift and the influence of noise in data streams. Motivated by this, this article presents a new lightweight inductive algorithm for concept drift detection based on an ensemble of random decision trees (named CDRDT), which distinguishes various types of concept drift in noisy data streams. We use variably small data chunks to generate random decision trees incrementally. We also use the Hoeffding bound inequality and the principle of statistical quality control to detect the different types of concept drift and noise. Extensive studies on synthetic and real streaming data demonstrate that CDRDT can detect concept drifts in noisy streaming data effectively and efficiently. Our algorithm therefore provides a feasible reference framework for classifying concept-drifting data streams with noise.
Year
2010
DOI
10.1080/08839514.2010.499500
Venue
Applied Artificial Intelligence
Keywords
detecting concept drift, mining concept drifts, random decision tree ensemble, noisy data stream, noisy data streams, ensemble classification model, random decision tree, variably small data chunk, new light-weighted inductive algorithm, inductive learning, concept drift, data stream, ensemble model, decision tree, statistical quality control
Field
Data mining, Decision tree, Data stream mining, Noisy data, Small data, Ensemble forecasting, Computer science, Streaming data, Artificial intelligence, Statistical process control, STREAMS, Machine learning
DocType
Journal
Volume
24
Issue
7
ISSN
0883-9514
Citations
7
PageRank
0.45
References
31
Authors
5
Name            Order   Citations   PageRank
Peipei Li       1       140         17.30
Xindong Wu      2       8830        503.63
Xuegang Hu      3       442         44.50
Qianhui Liang   4       275         20.24
Yunjun Gao      5       862         89.71