Title
Improving Label Noise Filtering by Exploiting Unlabeled Data.
Abstract
With the significant growth in the scale of data, an increasing amount of training data is available in many machine learning tasks. However, it is difficult to ensure perfect labeling for a large volume of training data. Some labels can be incorrect, and the resulting label noise can degrade learning performance. A common way to address label noise is to apply noise filtering techniques that identify and remove mislabeled samples prior to learning. Multiple noise filtering approaches have been proposed. However, almost all existing works focus only on the mislabeled training data and ignore the existence of unlabeled data. In fact, unlabeled data are common in many applications, and their value has been extensively studied and recognized. Therefore, in this paper, we explore the effective use of unlabeled data to improve noise filtering performance. To this end, we propose a novel noise filtering algorithm called enhanced soft majority voting by exploiting unlabeled data (ESMVU), an ensemble-learning-based filter that adopts a soft majority voting strategy. ESMVU provides a systematic way to measure the value of unlabeled data by considering different aspects, such as label confidence and the sample distribution. Finally, the effectiveness of the proposed method is confirmed by experiments and comparison with other methods.
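The abstract does not spell out how ESMVU works, so the following is only a minimal sketch of the general idea behind a soft majority voting noise filter, not the authors' algorithm. It assumes each ensemble member already reports the probability it assigns to a sample's observed label; the function names, the 0.5 threshold, and the toy numbers are all hypothetical, and the use of unlabeled data is not modeled here.

```python
def soft_majority_vote_filter(prob_given_label, threshold=0.5):
    """Flag samples whose observed label gets low average support.

    prob_given_label[m][i] is the probability that ensemble member m
    assigns to the observed (possibly noisy) label of sample i.
    The threshold is an illustrative assumption, not a value from the paper.
    """
    n_models = len(prob_given_label)
    n_samples = len(prob_given_label[0])
    flagged = []
    for i in range(n_samples):
        # Soft vote: average the members' confidence in the observed label.
        mean_conf = sum(prob_given_label[m][i] for m in range(n_models)) / n_models
        if mean_conf < threshold:
            flagged.append(i)
    return flagged


def hard_majority_vote_filter(prob_given_label, threshold=0.5):
    """Baseline for comparison: each member casts a binary vote.

    A member "disagrees" with the observed label when its probability for
    that label is below the threshold (a simplifying assumption).
    """
    n_models = len(prob_given_label)
    n_samples = len(prob_given_label[0])
    flagged = []
    for i in range(n_samples):
        disagree = sum(1 for m in range(n_models) if prob_given_label[m][i] < threshold)
        # Flag the sample when more than half of the members disagree.
        if disagree > n_models / 2:
            flagged.append(i)
    return flagged


# Toy example: 3 ensemble members, 4 samples (made-up numbers).
probs = [
    [0.9, 0.45, 0.2, 0.6],
    [0.8, 0.45, 0.1, 0.4],
    [0.7, 0.90, 0.3, 0.3],
]
print(soft_majority_vote_filter(probs))
print(hard_majority_vote_filter(probs))
```

In this toy input, two members are only slightly unsure about sample 1 while the third is highly confident in its label. A hard vote counts two disagreements and flags the sample, whereas the soft vote averages the confidences and keeps it, which illustrates why soft voting can be less aggressive about discarding borderline samples.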
Year
2018
DOI
10.1109/ACCESS.2018.2807779
Venue
IEEE ACCESS
Keywords
Label noise, noise filtering, unlabeled data, soft majority voting
Field
Training set, Sampling distribution, Data modeling, Noise measurement, Computer science, Filter (signal processing), Prediction algorithms, Artificial intelligence, Majority rule, Machine learning, Distributed computing
DocType
Journal
Volume
6
ISSN
2169-3536
Citations
0
PageRank
0.34
References
0
Authors
7
Name                  Order  Citations  PageRank
Donghai Guan          1      348        48.29
Hongqiang Wei         2      0          0.34
Yuan Wei Wei          3      312        29.13
Guangjie Han          4      1890       172.76
Yuan Tian             5      270        21.90
Mohammed Al-Dhelaan   6      27         4.95
Abdullah Al-Dhelaan   7      523        39.77