Title
Learning from imbalanced data in presence of noisy and borderline examples
Abstract
In this paper we studied re-sampling methods for learning classifiers from imbalanced data. We carried out a series of experiments on artificial data sets to explore the impact of noisy and borderline examples from the minority class on classifier performance. The results showed that if the data were sufficiently disturbed by these factors, the focused re-sampling methods (NCR and our SPIDER2) strongly outperformed the oversampling methods. They were also better for real-life data, where PCA visualizations suggested the possible existence of noisy examples and large overlapping areas between classes.
Year
2010
DOI
10.1007/978-3-642-13529-3_18
Venue
RSCTC
Keywords
oversampling method,minority class,real-life data,pca visualization,noisy example,artificial data,large overlapping area,imbalanced data,classifier performance,borderline example,sampling methods
Field
Data set,Oversampling,Pattern recognition,Computer science,Artificial intelligence,Classifier (linguistics),Machine learning
DocType
Conference
Volume
6086
ISSN
0302-9743
ISBN
3-642-13528-5
Citations
59
PageRank
1.60
References
6
Authors
3
Name, Order, Citations, PageRank
Krystyna Napierała, 1, 71, 3.24
Jerzy Stefanowski, 2, 1653, 139.25
Szymon Wilk, 3, 461, 40.94