Title
Predicting The Influence Of Additional Training Data On Classification Performance For Imbalanced Data
Abstract
It is desirable to predict the influence of additional training data on classification performance because generating samples is often costly. Current methods can only predict performance as measured by accuracy, which is not suitable if one class is much rarer than another. We propose an approach that can also predict other measures, such as G-mean and F-measure, which are used in cases of imbalanced data. Using a highly imbalanced real-world dataset of scanning electron microscopy images of nanoparticles, we show that our method leads to more correct decisions about whether to generate more training samples.
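To illustrate why accuracy can mislead when one class is rare, the following minimal sketch computes accuracy, G-mean, and F-measure from binary confusion counts. It uses the standard textbook definitions (G-mean as the geometric mean of sensitivity and specificity, F-measure as the harmonic mean of precision and recall); it is not the prediction method proposed in the paper. The function name `imbalance_metrics` and the sample counts are hypothetical and chosen only for illustration.

```python
import math


def imbalance_metrics(tp, fn, fp, tn):
    """Return (accuracy, G-mean, F-measure) from binary confusion counts.

    The rare class is treated as the positive class. Ratios with a zero
    denominator are defined as 0.0 here, which is an assumption of this sketch.
    """
    total = tp + fn + fp + tn
    accuracy = (tp + tn) / total
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0  # recall on the rare class
    specificity = tn / (tn + fp) if (tn + fp) else 0.0  # recall on the common class
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    g_mean = math.sqrt(sensitivity * specificity)
    denom = precision + sensitivity
    f_measure = 2 * precision * sensitivity / denom if denom else 0.0
    return accuracy, g_mean, f_measure


# Hypothetical counts: 10 rare-class samples among 1000, and a classifier
# that always predicts the common class.
acc, g, f = imbalance_metrics(tp=0, fn=10, fp=0, tn=990)
print(f"accuracy={acc:.2f} G-mean={g:.2f} F-measure={f:.2f}")
# prints: accuracy=0.99 G-mean=0.00 F-measure=0.00
```

In this made-up case the classifier never detects the rare class, yet accuracy is 0.99, while G-mean and F-measure both drop to 0. This is the kind of discrepancy that motivates predicting measures other than accuracy for imbalanced data.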
Year
2014
DOI
10.1007/978-3-319-11752-2_30
Venue
PATTERN RECOGNITION, GCPR 2014
Field
Training set, Pattern recognition, Computer science, Artificial intelligence, Machine learning
DocType
Conference
Volume
8753
ISSN
0302-9743
Citations
0
PageRank
0.34
References
4
Authors
3
Name                 Order  Citations  PageRank
Stephen Kockentiedt  1      0          0.34
Klaus D. Tönnies     2      215        44.39
Erhardt Gierke       3      0          1.35