Title | ||
---|---|---|
A Bayes Classifier-Based OVFDT Algorithm for Massive Stream Data Mining on Big Data Platform. |
Abstract | ||
---|---|---|
Online incremental data mining has recently become a rapidly growing area of stream data mining research. The VFDT algorithm, an incremental decision tree classification algorithm, is widely used in online data mining. To optimize VFDT, a dynamic tie-breaking threshold strategy and a pre-pruning mechanism are used to reduce the size of the decision tree. Furthermore, a Bayes classifier is applied to the leaf nodes of the Hoeffding decision tree, which improves classification accuracy. In this paper, the improved algorithm is called OVFDT (Optimized VFDT). To improve the performance of OVFDT for massive streaming data processing, an implementation scheme of OVFDT on the MapReduce platform is proposed. Considering the need for real-time computing, an implementation scheme on the Storm platform is also designed. Three comparison experiments are designed to compare the size, the classification accuracy, and the execution time of the decision trees generated by the three algorithms. The simulation results reveal that, compared with the C4.5 and VFDT algorithms, OVFDT effectively reduces the size of the decision tree and also improves classification accuracy. |
Year | DOI | Venue |
---|---|---|
2017 | 10.1007/978-3-319-61566-0_49 | COMPLEX, INTELLIGENT, AND SOFTWARE INTENSIVE SYSTEMS, CISIS-2017 |
Field | DocType | Volume |
---|---|---|
Data mining, Decision tree, Data stream mining, Computer science, Algorithm, Stream data, Streaming data, Execution time, Big data, Bayes classifier, Incremental decision tree | Conference | 611 |
ISSN | Citations | PageRank |
---|---|---|
2194-5357 | 0 | 0.34 |
References | Authors | |
---|---|---|
12 | 4 | |
Name | Order | Citations | PageRank |
---|---|---|---|
Liangde Li | 1 | 0 | 0.68 |
Peng Li | 2 | 25 | 4.13 |
He Xu | 3 | 36 | 22.25 |
Fangzhou Chen | 4 | 0 | 1.01 |