Abstract | ||
---|---|---|
The data imbalance problem hampers the classification task. In streaming environments, this becomes even more cumbersome as the proportion of classes can vary over time. Approaches based on misclassification costs can be used to mitigate this problem. In this paper, we present the Cost-sensitive Adaptive Random Forest (CSARF) and compare it to the Adaptive Random Forest (ARF) and ARF with Resampling (ARFRE) on six real-world and six synthetic data sets with different class ratios. The empirical study analyzes two misclassification cost strategies of the CSARF and shows that the CSARF obtained statistically superior results w.r.t. average recall and average F1 when compared to ARF. |
Year | DOI | Venue |
---|---|---|
2020 | 10.1145/3341105.3373949 | SAC '20: The 35th ACM/SIGAPP Symposium on Applied Computing, Brno, Czech Republic, March 2020 |
Keywords | DocType | ISBN |
cost-sensitive, ensemble, data stream, imbalanced datasets, adaptive random forest | Conference | 978-1-4503-6866-7 |
Citations | PageRank | References |
0 | 0.34 | 0 |
Authors | ||
4 |
Name | Order | Citations | PageRank |
---|---|---|---|
Lucas Loezer | 1 | 0 | 0.34 |
Fabrício Enembreck | 2 | 274 | 38.42 |
Jean Paul Barddal | 3 | 140 | 16.77 |
Alceu Britto | 4 | 94 | 18.30 |