Title
Ambiguous decision trees for mining concept-drifting data streams
Abstract
In real-world situations, explanations for the same observations may differ depending on perceptions or contexts. They may also change over time, especially when concept drift occurs. This phenomenon gives rise to ambiguities. It is therefore useful for an algorithm to learn to reflect these ambiguities and select the best decision according to the context or situation. Based on this viewpoint, we study the problem of deriving ambiguous decision trees from data streams to cope with concept drift. CVFDT (Concept-adapting Very Fast Decision Tree) is one of the best-known streaming data mining methods that can learn decision trees incrementally. In this paper, we establish a method called ambiguous CVFDT (aCVFDT), which integrates ambiguities into CVFDT by exploring multiple options at each node whenever a node is to be split. When aCVFDT is used to make class predictions, it is guaranteed that the best and newest knowledge is used. When old concepts recur, aCVFDT can immediately relearn them by using the corresponding options recorded at each node. Furthermore, CVFDT does not automatically detect occurrences of concept drift and only scans trees periodically, whereas aCVFDT uses an automatic concept-drift detection mechanism. In our experiments, the hyperplane problem and two benchmark problems from the UCI KDD Archive, namely Network Intrusion and Forest CoverType, are used to validate the performance of aCVFDT. The experimental results show that aCVFDT obtains significantly improved results over traditional CVFDT.
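The abstract does not spell out how a node decides when to split. CVFDT inherits the Hoeffding-bound split test from VFDT, which can be sketched as follows. This is a minimal illustration of that standard test only, not the aCVFDT option mechanism or its drift detector; the function names and the default values for `delta` and `tie_threshold` are assumptions chosen for the example.

```python
import math


def hoeffding_bound(value_range: float, delta: float, n: int) -> float:
    """Hoeffding bound epsilon after n observations of a variable
    whose values span `value_range`, at confidence 1 - delta."""
    if n <= 0:
        raise ValueError("n must be positive")
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))


def should_split(best_gain: float, second_gain: float, n: int,
                 num_classes: int = 2, delta: float = 1e-7,
                 tie_threshold: float = 0.05) -> bool:
    """VFDT-style split test (hypothetical helper, assumed defaults).

    Split when the best attribute beats the runner-up by more than
    epsilon, or when epsilon is so small that the two are effectively
    tied and waiting longer would not change the choice.
    """
    # Information gain lies in [0, log2(num_classes)].
    value_range = math.log2(num_classes)
    eps = hoeffding_bound(value_range, delta, n)
    return (best_gain - second_gain) > eps or eps < tie_threshold
```

With 1,000 examples at a two-class node, the bound is roughly 0.09, so a gain difference of 0.05 is not yet enough to split while a difference of 0.20 is. Under the description in the abstract, aCVFDT would keep several candidate attributes as options at this point rather than committing to a single one.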
Year: 2009
DOI: 10.1016/j.patrec.2009.07.017
Venue: Pattern Recognition Letters
Keywords: data streams, data mining, traditional cvfdt, automatic concept, data mining method, acvfdt obtains, best decision, decision trees incrementally, concept drift, ambiguous decision tree, concept-drifting data stream, ambiguous cvfdt, benchmark problem, ambiguous decision trees, incremental learning, decision trees, decision tree
Field: Decision tree, Data mining, Data processing, Data stream mining, Computer science, Artificial intelligence, Hyperplane, Intrusion, Pattern recognition, Concept drift, Streaming data, Concept drifting, Machine learning
DocType: Journal
Volume: 30
Issue: 15
ISSN: 0167-8655
Citations: 13
PageRank: 0.62
References: 19
Authors: 3
Name, Order, Citations, PageRank
Jing Liu, 1, 1043, 115.54
Xue Li, 2, 2196, 186.96
Weicai Zhong, 3, 381, 26.14