Title
Selective Pre-processing of Imbalanced Data for Improving Classification Performance
Abstract
In this paper we discuss the problems of constructing classifiers from imbalanced data. We describe a new approach to selective pre-processing of imbalanced data that combines local over-sampling of the minority class with filtering of difficult examples from the majority classes. In experiments focused on rule-based and tree-based classifiers, we compare our approach with two related pre-processing methods, NCR and SMOTE. The results show that NCR is too strongly biased toward the minority class and leads to deteriorated specificity and overall accuracy, while SMOTE and our approach do not exhibit this behavior. An analysis of the degree to which the original class distribution has been modified also reveals that our approach does not introduce changes as extensive as those introduced by SMOTE.
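The abstract names two ingredients, local over-sampling of the minority class and filtering of difficult majority examples, without giving details. The following is a minimal sketch of how such a combination could look, not the authors' algorithm. It assumes that an example is "difficult" when the majority vote of its k nearest neighbours disagrees with its own label, and that over-sampling means replicating a difficult minority example once per majority-class neighbour. The names `k_nearest` and `selective_preprocess`, the choice k=3, and the toy data are all hypothetical.

```python
import math
from collections import Counter


def k_nearest(idx, X, k):
    """Return indices of the k nearest neighbours of X[idx] (Euclidean)."""
    dists = [(math.dist(X[idx], X[j]), j) for j in range(len(X)) if j != idx]
    dists.sort()
    return [j for _, j in dists[:k]]


def selective_preprocess(X, y, minority, k=3):
    """Illustrative sketch: drop difficult majority examples and
    replicate difficult minority examples (assumed definitions)."""
    # An example counts as "safe" if its neighbours vote for its own label.
    neighbours = [k_nearest(i, X, k) for i in range(len(X))]
    safe = []
    for i, nbrs in enumerate(neighbours):
        votes = Counter(y[j] for j in nbrs)
        safe.append(votes.most_common(1)[0][0] == y[i])

    new_X, new_y = [], []
    for i in range(len(X)):
        if y[i] != minority and not safe[i]:
            continue  # filtering: skip a difficult majority example
        new_X.append(X[i])
        new_y.append(y[i])
        if y[i] == minority and not safe[i]:
            # Local over-sampling: one extra copy per majority neighbour.
            copies = sum(1 for j in neighbours[i] if y[j] != minority)
            new_X.extend([X[i]] * copies)
            new_y.extend([y[i]] * copies)
    return new_X, new_y


# Toy 1-D data: one minority point inside the majority cluster and
# one majority point inside the minority cluster.
X = [(0.0,), (0.1,), (0.2,), (0.3,), (0.15,), (5.0,), (5.1,), (5.2,), (5.05,)]
y = ["maj", "maj", "maj", "maj", "min", "min", "min", "min", "maj"]
new_X, new_y = selective_preprocess(X, y, minority="min")
print(Counter(new_y))
```

Unlike an approach that synthesizes new points by interpolation, this sketch only copies or removes existing examples, so changes to the class distribution stay confined to the regions where the classes overlap.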
Year
2008
DOI
10.1007/978-3-540-85836-2_27
Venue
DaWaK
Keywords
related pre-processing method, minority class, imbalanced data, original class distribution, selective pre-processing, majority class, difficult example, extensive change, deteriorated specificity, new approach, improving classification performance, behavior analysis, rule based
Field
Data mining, Computer science, Filter (signal processing), Artificial intelligence, Machine learning
DocType
Conference
Volume
5182
ISSN
0302-9743
Citations
42
PageRank
1.50
References
8
Authors
2
Name                 Order    Citations    PageRank
Jerzy Stefanowski    1        1653         139.25
Szymon Wilk          2        461          40.94