Title
IIvotes ensemble for imbalanced data
Abstract
In this paper we present IIvotes, a new framework for constructing an ensemble of classifiers from imbalanced data. IIvotes incorporates the SPIDER method for selective data pre-processing into the adaptive Ivotes ensemble. This integration aims to improve the balance between sensitivity and specificity for the minority class, as evaluated by the G-mean measure, in comparison with single classifiers that are also combined with SPIDER. Using SPIDER to pre-process specific learning samples inside the ensemble improves the sensitivity of the derived component classifiers. At the same time, the controlling mechanism of IIvotes ensures that overall accuracy, and thus specificity, is kept at a reasonable level. The proposed IIvotes ensemble was thoroughly evaluated in a series of experiments in which we tested it with symbolic (decision trees and rules) and non-symbolic (Naive Bayes) component classifiers. The results confirmed that combining SPIDER with an ensemble improved performance in terms of the G-mean measure in comparison to a single classifier with SPIDER, for all tested types of classifiers and for both SPIDER pre-processing options, weak and strong amplification. These advantages were especially evident for decision trees and rules, where the differences between single and ensemble classifiers with SPIDER were more significant for both pre-processing options than for Naive Bayes. Moreover, the results demonstrated the advantages of using a special abstaining classification strategy inside IIvotes rule ensembles, where component rule-based classifiers may refrain from predicting a class when in doubt. Abstaining rule ensembles performed much better with regard to G-mean than their non-abstaining variants.
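The G-mean measure named in the abstract is commonly defined as the geometric mean of sensitivity (recall on the minority class) and specificity (recall on the majority class). The following is a minimal sketch of that standard definition for a two-class problem, not code from the paper; the function name `g_mean`, the `minority` parameter, and the toy labels are illustrative assumptions.

```python
import math


def g_mean(y_true, y_pred, minority=1):
    """Geometric mean of sensitivity and specificity (binary case).

    `minority` is the label treated as the positive (minority) class;
    this name and its default value are assumptions for illustration.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == minority and p == minority)
    fn = sum(1 for t, p in pairs if t == minority and p != minority)
    tn = sum(1 for t, p in pairs if t != minority and p != minority)
    fp = sum(1 for t, p in pairs if t != minority and p == minority)

    # Guard against empty classes to avoid division by zero.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return math.sqrt(sensitivity * specificity)


# Toy example: 2 minority and 4 majority examples.
y_true = [1, 1, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 1]
score = g_mean(y_true, y_pred)  # sensitivity 0.5, specificity 0.75
```

A classifier that always predicts the majority class has zero sensitivity and therefore a G-mean of zero, however high its overall accuracy. This is why the measure suits imbalanced data.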
Year: 2012
DOI: 10.3233/IDA-2012-0551
Venue: Intell. Data Anal.
Keywords: spider pre-processing option, spider method, single classifier, g-mean measure, adaptive ivotes ensemble, new proposed iivotes ensemble, imbalanced data, ensemble classifier, present iivotes, iivotes rule ensemble, abstaining rule ensemble
Field: Decision tree, Pattern recognition, Naive Bayes classifier, Spider, Random subspace method, Computer science, Cascading classifiers, Artificial intelligence, Classifier (linguistics), Ensemble learning, Machine learning
DocType: Journal
Volume: 16
Issue: 5
ISSN: 1088-467X
Citations: 5
PageRank: 0.41
References: 28
Authors (4)
Name                 Order  Citations  PageRank
Jerzy Błaszczyński   1      277        13.20
Magdalena Deckert    2      46         2.71
Jerzy Stefanowski    3      1653       139.25
Szymon Wilk          4      461        40.94