Abstract
---
The properties of bootstrap ensembles, such as bagging or random forest, are utilized to detect and handle label noise in classification problems. The first observation is that subsampling is a regularization mechanism that can be used to render bootstrap ensembles more robust to this type of noise. Furthermore, appropriate values of the sampling rate can be estimated using out-of-bag data. A second observation is that the ensemble classifiers tend to make more errors on incorrectly labeled instances. Thus, instances for which a sufficiently large fraction of ensemble predictors err are marked as noisy. Suitable values of this threshold, which are problem dependent, are determined by cross-validation using a wrapper method. Instances identified as noisy can then be either filtered (i.e. discarded for training) or cleaned by correcting their class labels. Finally, an ensemble is built afresh on these cleansed training data. Extensive experiments on classification problems from different areas of application show that this procedure is effective for building accurate ensembles, even in the presence of high levels of class-label noise. (C) 2017 Elsevier B.V. All rights reserved.
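The detection step described in the abstract can be sketched in a few lines: build a subsampled bagging ensemble, record how often each training instance is misclassified by the members for which it is out-of-bag, and flag instances whose out-of-bag error fraction exceeds a threshold. This is a minimal illustration, not the paper's implementation: it uses a toy nearest-centroid base learner in place of trees, synthetic one-dimensional data, and fixed values for the sampling rate, ensemble size, and threshold (which the authors instead estimate via out-of-bag data and cross-validation).

```python
import random
from statistics import mean

random.seed(0)

# Synthetic data: two well-separated classes, then flip a few labels.
X = [random.gauss(0.0, 1.0) for _ in range(10)] + \
    [random.gauss(10.0, 1.0) for _ in range(10)]
y = [0] * 10 + [1] * 10
noisy_idx = {2, 15}                 # deliberately mislabeled instances
for i in noisy_idx:
    y[i] = 1 - y[i]

def fit_centroids(xs, ys):
    """Toy base learner: per-class mean (a stand-in for a decision tree)."""
    return {c: mean(x for x, t in zip(xs, ys) if t == c)
            for c in set(ys)}

def predict(cents, x):
    """Assign the class whose centroid is nearest."""
    return min(cents, key=lambda c: abs(x - cents[c]))

# Bagging with subsampling (rate < 1), tracking out-of-bag (OOB) votes.
n, B, rate, threshold = len(X), 51, 0.5, 0.5   # illustrative values
errors = [0] * n                    # OOB misclassifications per instance
counts = [0] * n                    # times each instance was OOB

for _ in range(B):
    sample = [random.randrange(n) for _ in range(int(rate * n))]
    cents = fit_centroids([X[i] for i in sample], [y[i] for i in sample])
    if len(cents) < 2:
        continue                    # degenerate subsample: skip this member
    for i in set(range(n)) - set(sample):
        counts[i] += 1
        if predict(cents, X[i]) != y[i]:
            errors[i] += 1

# Flag instances misclassified by a large fraction of their OOB predictors.
flagged = {i for i in range(n)
           if counts[i] and errors[i] / counts[i] > threshold}
print(sorted(flagged))
```

On this cleanly separated toy data the flagged set coincides with the deliberately flipped labels; the flagged instances would then either be discarded (filtering) or have their labels corrected (cleaning) before retraining the ensemble.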
Year | DOI | Venue
---|---|---
2018 | 10.1016/j.neucom.2017.11.012 | NEUROCOMPUTING

Keywords | Field | DocType
---|---|---
Noise detection, Ensemble learning, Subsampling, Robust classification, Random forest | Training set, Pattern recognition, Computer science, Sampling (signal processing), Regularization (mathematics), Artificial intelligence, Noise detection, Random forest, Ensemble learning, Machine learning, Bootstrapping (electronics) | Journal

Volume | ISSN | Citations
---|---|---
275 | 0925-2312 | 5

PageRank | References | Authors
---|---|---
0.42 | 13 | 3
Name | Order | Citations | PageRank |
---|---|---|---|
Maryam Sabzevari | 1 | 10 | 2.57 |
Gonzalo Martínez-Muñoz | 2 | 524 | 23.76 |
Alberto Suárez | 3 | 487 | 22.33 |