Abstract |
---|
The paper presents a novel approach to resampling imbalanced datasets, aimed at improving classifier performance. The method uses two self-organizing maps to determine the clusters of majority and minority data. Cluster centroids are used to select the samples for which under-sampling or over-sampling is more convenient, while the optimal resampling rates are determined by a genetic algorithm that maximizes classifier performance. The algorithm is tested on several datasets from both the UCI repository and real industrial applications, and is compared with other widely used resampling methods. |
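
The abstract does not state how the centroids drive sample selection, so the following is only a minimal sketch of one plausible reading, not the authors' method. It assumes the centroids are already available (for example from a trained self-organizing map), that majority samples closest to a centroid are the most redundant and are removed first, and that new minority samples are interpolated toward the nearest centroid. The function names, the selection criteria, and the fixed `rate` arguments are all hypothetical; in the paper the rates are chosen by a genetic algorithm, which is not reproduced here.

```python
import math
import random


def nearest_centroid_distance(sample, centroids):
    """Euclidean distance from `sample` to its closest centroid."""
    return min(math.dist(sample, c) for c in centroids)


def undersample_majority(majority, centroids, rate):
    """Remove a fraction `rate` of the majority samples.

    Assumption: samples closest to a centroid are treated as the most
    redundant and are dropped first.
    """
    n_remove = int(len(majority) * rate)
    ranked = sorted(
        majority, key=lambda s: nearest_centroid_distance(s, centroids)
    )
    return ranked[n_remove:]


def oversample_minority(minority, centroids, rate, rng):
    """Append a fraction `rate` of synthetic minority samples.

    Assumption: each synthetic point is a random interpolation between an
    existing minority sample and its nearest centroid.
    """
    synthetic = []
    for _ in range(int(len(minority) * rate)):
        s = rng.choice(minority)
        c = min(centroids, key=lambda cen: math.dist(s, cen))
        t = rng.random()
        synthetic.append(tuple(si + t * (ci - si) for si, ci in zip(s, c)))
    return minority + synthetic


# Toy data; the centroids are hand-picked stand-ins for SOM outputs.
majority = [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0), (2.0, 2.0)]
minority = [(5.0, 5.0), (6.0, 6.0)]

kept = undersample_majority(majority, [(0.0, 0.0)], rate=0.5)
print(kept)  # [(1.0, 1.0), (2.0, 2.0)]

grown = oversample_minority(minority, [(5.5, 5.5)], rate=1.0, rng=random.Random(0))
print(len(grown))  # 4
```

In a fuller version, the two `rate` values would be the genes of a candidate solution, and the fitness would be the score of a classifier trained on the resampled data.
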
Year | DOI | Venue |
---|---|---|
2019 | 10.1007/978-3-030-20257-6_34 | Communications in Computer and Information Science |
Keywords | Field | DocType |
Imbalanced datasets, Resampling, Self organizing maps, Genetic Algorithms | Pattern recognition, Computer science, Self-organizing map, Artificial intelligence, Classifier (linguistics), Resampling, Genetic algorithm, Centroid | Conference |
Volume | ISSN | Citations |
1000 | 1865-0929 | 0 |
PageRank | References | Authors |
0.34 | 0 | 2 |
Name | Order | Citations | PageRank |
---|---|---|---|
Marco Vannucci | 1 | 94 | 15.60 |
Valentina Colla | 2 | 159 | 29.50 |