Title
Improving semi-supervised learning through optimum connectivity.
Abstract
The annotation of large data sets by a classifier becomes more challenging as the number of labeled samples used to train the classifier decreases relative to the number of unlabeled samples. In this context, semi-supervised learning methods aim at discovering and labeling informative samples among the unlabeled ones, such that their addition to the correct class in the training set can improve classification performance. We present a semi-supervised learning approach that connects unlabeled and labeled samples as nodes of a minimum-spanning tree and partitions the tree into an optimum-path forest rooted at the labeled nodes. It is suitable when most samples from the same class are more closely connected through sequences of nearby samples than samples from distinct classes, which is usually the case in data sets with a reasonable relation between the number of samples and the feature space dimension. The proposed solution is validated by using several data sets and state-of-the-art methods as baselines.
Highlights
A new algorithm for semi-supervised learning based on optimum-path forest.
The algorithm provides significant improvements in accuracy and efficiency.
Labels are propagated from labeled to unlabeled training samples with fewer errors.
The novel classifier can be more accurate than other state-of-the-art methods.
A fast and effective algorithm suitable for developing active learning methods.
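The label propagation described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the record does not state the distance function or the path-cost function, so the code below assumes Euclidean distance and a path cost equal to the largest edge weight along the path. The function names `minimum_spanning_tree` and `propagate_labels` are hypothetical and chosen only for this example.

```python
import heapq
import math


def minimum_spanning_tree(points):
    """Prim's algorithm on the complete graph of samples (assumed Euclidean).

    Returns an adjacency list: adj[u] holds (neighbor, edge_weight) pairs.
    """
    n = len(points)
    in_tree = [False] * n
    best = [math.inf] * n   # cheapest known edge linking each node to the tree
    parent = [-1] * n
    adj = [[] for _ in range(n)]
    best[0] = 0.0
    for _ in range(n):
        # Pick the node outside the tree with the cheapest connecting edge.
        u = min((i for i in range(n) if not in_tree[i]), key=best.__getitem__)
        in_tree[u] = True
        if parent[u] >= 0:
            adj[u].append((parent[u], best[u]))
            adj[parent[u]].append((u, best[u]))
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best[v]:
                    best[v], parent[v] = d, u
    return adj


def propagate_labels(points, labels):
    """Partition the tree into a forest rooted at the labeled samples.

    labels[i] is a class label, or None for an unlabeled sample.
    Assumption: the cost of a path is its largest edge weight, and each
    unlabeled node takes the label of the root that reaches it most cheaply.
    """
    if not points:
        return []
    adj = minimum_spanning_tree(points)
    cost = [math.inf] * len(points)
    out = list(labels)
    heap = []
    for i, lab in enumerate(labels):
        if lab is not None:
            cost[i] = 0.0           # labeled samples are roots with zero cost
            heap.append((0.0, i))
    heapq.heapify(heap)
    while heap:
        c, u = heapq.heappop(heap)
        if c > cost[u]:
            continue                # stale heap entry
        for v, w in adj[u]:
            new_cost = max(c, w)    # assumed path-cost function
            if new_cost < cost[v]:
                cost[v] = new_cost
                out[v] = out[u]     # label travels along the optimum path
                heapq.heappush(heap, (new_cost, v))
    return out


# Toy data: two well-separated groups, one labeled sample in each.
points = [(0.0, 0.0), (0.1, 0.0), (0.2, 0.1), (5.0, 5.0), (5.1, 5.0), (5.2, 5.1)]
labels = ["a", None, None, "b", None, None]
print(propagate_labels(points, labels))
```

On this toy data the single long tree edge joining the two groups is never part of a cheapest path, so each group takes the label of its own root. The published method may include further steps, such as training a classifier on the resulting labeled set, that this sketch does not cover.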
Year
2016
DOI
10.1016/j.patcog.2016.04.020
Venue
Pattern Recognition
Keywords
Semi-supervised learning, Optimum-path forest classifiers
Field
Training set, Feature vector, Data set, Active learning, Semi-supervised learning, Annotation, Pattern recognition, Unsupervised learning, Artificial intelligence, Classifier (linguistics), Machine learning, Mathematics
DocType
Journal
Volume
60
Issue
C
ISSN
0031-3203
Citations
12
PageRank
0.54
References
30
Authors
4
Name                             Order  Citations  PageRank
Willian Paraguassu Amorim        1      24         4.52
Alexandre X. Falcão              2      1877       132.30
João P. Papa                     3      689        46.87
Marcelo Henriques de Carvalho    4      12         0.88