Title
A Novel Uncertainty Sampling Algorithm For Cost-Sensitive Multiclass Active Learning
Abstract
Active learning is a setup that allows the learning algorithm to iteratively and strategically query the labels of some instances in order to reduce human labeling effort. One fundamental strategy, called uncertainty sampling, measures the uncertainty of each instance when making querying decisions. Traditional active learning algorithms focus on binary or multiclass classification, but few works have studied active learning for cost-sensitive multiclass classification (CSMCC), which allows charging different costs for different types of misclassification errors. These few works generally calculate the uncertainty of each instance by probability estimation, and can suffer when the estimates are inaccurate. In this paper, we propose a novel active learning algorithm that relies on a different way of calculating the uncertainty. The algorithm is based on our newly proposed cost embedding approach (CE) for CSMCC. CE embeds the cost information in the distance measure of a special hidden space using non-metric multidimensional scaling, and handles both symmetric and asymmetric cost information through our carefully designed mirroring trick. The embedding allows the proposed algorithm, active learning with cost embedding (ALCE), to define a cost-sensitive uncertainty measure from the distance in the hidden space. Extensive experimental results demonstrate that ALCE selects more useful instances by taking the cost information into account through the embedding and is superior to existing cost-sensitive active learning algorithms.
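The idea of reading uncertainty off a distance in a hidden space can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the `class_anchors` array (one hidden-space point per class, assumed to have been placed beforehand so that distances reflect misclassification costs), and the rule "uncertainty = distance to the nearest class point" are all assumptions made for illustration. The embedding step itself (non-metric multidimensional scaling and the mirroring trick) is not shown.

```python
import numpy as np

def nearest_class_uncertainty(hidden_points, class_anchors):
    """Hypothetical helper: predicted class and uncertainty per instance.

    hidden_points: (n, d) array, assumed predicted positions of the
                   unlabeled instances in the hidden space.
    class_anchors: (k, d) array, assumed embedded point of each class.
    """
    # Pairwise Euclidean distances between instances and class points: (n, k).
    diffs = hidden_points[:, None, :] - class_anchors[None, :, :]
    dists = np.linalg.norm(diffs, axis=2)
    # Predict the nearest class; treat that distance as the uncertainty.
    return dists.argmin(axis=1), dists.min(axis=1)

def select_query(hidden_points, class_anchors):
    """Uncertainty sampling: pick the index of the most uncertain instance."""
    _, uncertainty = nearest_class_uncertainty(hidden_points, class_anchors)
    return int(uncertainty.argmax())

# Toy data (made up for illustration): three classes, three pool instances.
anchors = np.array([[0.0, 0.0], [4.0, 0.0], [0.0, 3.0]])
pool = np.array([[0.1, 0.0], [2.0, 1.5], [3.9, 0.2]])

preds, unc = nearest_class_uncertainty(pool, anchors)
query_index = select_query(pool, anchors)
```

In this toy setup the second pool instance lies far from every class point, so it is the one selected for labeling; the other two sit close to a class point and are treated as confidently classified.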
Year
2016
DOI
10.1109/ICDM.2016.131
Venue
2016 IEEE 16th International Conference on Data Mining (ICDM)
Field
Data mining, Active learning (machine learning), Multidimensional scaling, Computer science, Artificial intelligence, Multiclass classification, Algorithm design, Active learning, Embedding, Algorithm, Measurement uncertainty, Sampling (statistics), Machine learning
DocType
Conference
ISSN
1550-4786
Citations
0
PageRank
0.34
References
0
Authors
2
Name              Order  Citations  PageRank
Kuan-Hao Huang    1      14         2.38
Hsuan-Tien Lin    2      829        74.77