Submodularity in Data Subset Selection and Active Learning - Citegraph

Paper Info

Title
Submodularity in Data Subset Selection and Active Learning

Abstract
We study the problem of selecting a subset of big data to train a classifier while incurring minimal performance loss. We show the connection of submodularity to the data likelihood functions for Naïve Bayes (NB) and Nearest Neighbor (NN) classifiers, and formulate the data subset selection problems for these classifiers as constrained submodular maximization. Furthermore, we apply this framework to active learning and propose a novel scheme called filtered active submodular selection (FASS), where we combine the uncertainty sampling method with a submodular data subset selection framework. We extensively evaluate the proposed framework on text categorization and handwritten digit recognition tasks with four different classifiers, including deep neural network (DNN) based classifiers. Empirical results indicate that the proposed framework yields significant improvement over the state-of-the-art algorithms on all classifiers.
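
The constrained submodular maximization mentioned in the abstract can be illustrated with the standard greedy algorithm under a cardinality constraint. The sketch below is not the authors' implementation: the facility-location objective is a generic stand-in for a nearest-neighbor-style coverage function, and the names `facility_location`, `greedy_select`, `sim`, and `budget` are hypothetical. It assumes a nonnegative similarity matrix, in which case the objective is monotone submodular and the greedy solution is known to be within a factor of 1 - 1/e of the optimum.

```python
from typing import List, Sequence


def facility_location(sim: Sequence[Sequence[float]], subset: List[int]) -> float:
    """f(S) = sum over all items i of max_{j in S} sim[i][j]; f(empty) = 0.

    Assumption: sim is nonnegative, so f is monotone submodular.
    """
    if not subset:
        return 0.0
    return sum(max(row[j] for j in subset) for row in sim)


def greedy_select(sim: Sequence[Sequence[float]], budget: int) -> List[int]:
    """Greedily pick up to `budget` indices, each time adding the item
    with the largest marginal gain f(S + [j]) - f(S)."""
    n = len(sim)
    selected: List[int] = []
    current = 0.0
    for _ in range(min(budget, n)):
        best_gain, best_j = -1.0, -1
        for j in range(n):
            if j in selected:
                continue
            gain = facility_location(sim, selected + [j]) - current
            if gain > best_gain:  # strict '>' keeps the lowest index on ties
                best_gain, best_j = gain, j
        selected.append(best_j)
        current += best_gain
    return selected


# Toy similarity matrix (illustrative data only): two clusters, {0, 1, 2}
# and {3, 4, 5}, with items 1 and 4 as the respective cluster centers.
sim = [
    [1.0, 0.75, 0.5, 0.0, 0.0, 0.0],
    [0.75, 1.0, 0.75, 0.0, 0.0, 0.0],
    [0.5, 0.75, 1.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 1.0, 0.5, 0.25],
    [0.0, 0.0, 0.0, 0.5, 1.0, 0.5],
    [0.0, 0.0, 0.0, 0.25, 0.5, 1.0],
]
print(greedy_select(sim, 2))  # → [1, 4]
```

With a budget of 2, the greedy pass picks one representative per cluster rather than two near-duplicates, which is the diminishing-returns behavior that makes submodular objectives a natural fit for subset selection. This naive version re-evaluates the objective for every candidate; a practical implementation would cache the per-item maxima or use lazy evaluation.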

Year	Venue	Field
2015	International Conference on Machine Learning	k-nearest neighbors algorithm, Active learning, Pattern recognition, Naive Bayes classifier, Computer science, Random subspace method, Submodular set function, Artificial intelligence, Artificial neural network, Classifier (linguistics), Big data, Machine learning
DocType	Citations	PageRank
Conference	31	0.98
References	Authors
36	3

Authors (3 rows)

Name	Order	Citations	PageRank
Kai Wei	1	143	9.34
Rishabh K. Iyer	2	88	6.04
Jeff A. Bilmes	3	278	16.88