Title
MILES: multiclass imbalanced learning in ensembles through selective sampling.
Abstract
Imbalanced learning is the problem of learning from datasets when the class proportions are highly imbalanced. Imbalanced datasets are increasingly seen in many domains and pose a challenge to traditional classification techniques. Learning from imbalanced multiclass data (three or more classes) creates additional complexities. Studies suggest that ensemble learners can be trained to emphasize different segments of data pertaining to different classes and thereby produce more accurate results than regular imbalance learning techniques. Thus, we propose a new approach to building ensembles of classifiers for multiclass imbalanced datasets, called Multiclass Imbalance Learning in Ensembles through Selective Sampling (MILES). Each member of MILES is trained with the data selectively sampled from the bands around cluster centroids in a way that diversity is aggressively encouraged within the ensemble. Resampling techniques are utilized to balance the distribution of the data that comes from each cluster. We performed several experiments applying our approach to different datasets demonstrating improved performance for recognizing minority class examples and balancing the G-mean and Mean Area Under the Curve (MAUC). We further applied MILES to classify prolonged emergency department (ED) stays with consistently higher performance as compared to existing methods.
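The abstract reports results in terms of G-mean. As a minimal sketch, assuming the common multiclass definition (the geometric mean of per-class recalls; the paper's exact variant is not stated in this record), the metric can be computed as follows. The function names and the toy labels are hypothetical and serve only as illustration.

```python
import math
from collections import Counter


def per_class_recall(y_true, y_pred):
    """Return {class: recall} for every class that appears in y_true."""
    totals = Counter(y_true)
    # Count correct predictions per true class; absent classes count as 0.
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {c: hits[c] / totals[c] for c in totals}


def g_mean(y_true, y_pred):
    """Geometric mean of per-class recalls (assumed definition).

    The score is 0.0 whenever at least one class is never recovered,
    which is why it penalizes classifiers that ignore minority classes.
    """
    recalls = list(per_class_recall(y_true, y_pred).values())
    return math.prod(recalls) ** (1.0 / len(recalls))


# Hypothetical toy labels: class 2 is the minority class.
y_true = [0, 0, 0, 0, 1, 1, 1, 2, 2]
y_pred = [0, 0, 0, 0, 1, 1, 0, 2, 0]

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(f"accuracy = {accuracy:.3f}, G-mean = {g_mean(y_true, y_pred):.3f}")
```

On these toy labels, plain accuracy is dominated by the majority class, while the G-mean drops because the recalls of classes 1 and 2 are lower. This is the usual motivation for reporting G-mean on imbalanced data.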
Year
2017
DOI
10.1145/3019612.3019667
Venue
Symposium on Applied Computing
Field
Data mining, Ensembles of classifiers, Computer science, Artificial intelligence, Sampling (statistics), Resampling, Ensemble learning, Machine learning, Centroid, Multiclass classification
DocType
Conference
Citations
0
PageRank
0.34
References
16
Authors
3
Name               Order  Citations  PageRank
Ali Azari          1      0          0.34
Vandana P. Janeja  2      141        18.93
Scott R. Levin     3      13         3.90