Title
Undersampling Near Decision Boundary for Imbalance Problems
Abstract
Undersampling the dataset to rebalance the class distribution is an effective way to handle class imbalance problems. However, randomly removing majority examples under a uniform distribution may cause unnecessary information loss, which degrades the performance of classifiers trained on the rebalanced dataset. On the other hand, examples differ in their sensitivity to class imbalance: a higher sensitivity means that an example is more easily affected by class imbalance, and this can be used to guide the selection of examples for rebalancing the class distribution and boosting classifier performance. Therefore, in this paper, we propose a novel undersampling method, UnderSampling using Sensitivity (USS), based on the sensitivity of each majority example. Examples with low sensitivities are noisy or safe examples, whereas examples with high sensitivities are borderline examples. In USS, majority examples with higher sensitivities are more likely to be selected. Experiments on 20 datasets confirm the superiority of USS over one baseline method and five resampling methods.
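The abstract does not spell out the paper's exact sensitivity measure, so the sketch below is only an illustration of the general idea: score each majority example by a hypothetical proxy sensitivity (here, the fraction of minority examples among its k nearest neighbours, so borderline examples score high and safe examples score near zero), then undersample the majority class with selection probability proportional to that score. The function names `sensitivity` and `uss_undersample` are assumptions for this sketch, not the authors' implementation.

```python
import math
import random

def sensitivity(x, majority, minority, k=5):
    # Proxy sensitivity (an assumption, not the paper's exact measure):
    # fraction of minority examples among the k nearest neighbours of x.
    # Borderline majority examples score high; safe ones score near 0.
    labelled = [(p, 0) for p in majority] + [(p, 1) for p in minority]
    labelled.sort(key=lambda pl: math.dist(x, pl[0]))
    neighbours = [label for _, label in labelled[1:k + 1]]  # skip x itself
    return sum(neighbours) / k

def uss_undersample(majority, minority, rng=None):
    # Keep as many majority examples as there are minority examples,
    # drawing without replacement with probability proportional to
    # sensitivity (a small epsilon keeps safe examples selectable).
    rng = rng or random.Random(0)
    weights = [sensitivity(x, majority, minority) + 1e-6 for x in majority]
    pool, kept = list(range(len(majority))), []
    while len(kept) < len(minority) and pool:
        total = sum(weights[i] for i in pool)
        r = rng.random() * total
        acc = 0.0
        for i in pool:
            acc += weights[i]
            if acc >= r:
                kept.append(majority[i])
                pool.remove(i)
                break
    return kept + minority

# Toy 2-D data: majority cluster near the minority class and one far away.
majority = [(0, 0), (0, 1), (1, 0), (1, 1), (5, 5), (5, 6), (6, 5), (6, 6)]
minority = [(0.5, 0.5), (1.2, 0.8), (0.2, 1.1)]
balanced = uss_undersample(majority, minority)
```

Because selection probability follows sensitivity, the borderline majority points near (1, 1) are far more likely to survive than the distant cluster around (6, 6), so the decision boundary region is preserved while redundant safe examples are discarded.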
Year
2019
DOI
10.1109/ICMLC48188.2019.8949290
Venue
2019 International Conference on Machine Learning and Cybernetics (ICMLC)
Keywords
Sensitivity, Undersampling, Imbalance learning, USS
Field
Information loss, Pattern recognition, Computer science, Uniform distribution (continuous), Undersampling, Imbalance problems, Artificial intelligence, Classifier (linguistics), Resampling, Decision boundary, Machine learning
DocType
Conference
ISSN
2160-133X
ISBN
978-1-7281-2817-7
Citations
1
PageRank
0.35
References
24
Authors
5
Name             Order  Citations  PageRank
Jianjun Zhang    1      9          3.48
Ting Wang        2      7251       20.28
Wing W. Y. Ng    3      5285       6.12
Shuai Zhang      4      1          0.35
Chris D. Nugent  5      11501      28.39