Title
Online Optimization Methods for the Quantification Problem
Abstract
The estimation of class prevalence, i.e., of the fraction of a population that belongs to a certain class, is an important task in data analytics, and finds applications in many domains such as the social sciences, market research, epidemiology, and others. For example, in sentiment analysis the goal is often not to estimate whether a specific text conveys a positive or a negative sentiment, but rather to estimate the overall distribution of positive and negative sentiments, e.g., in a certain time frame. A popular way of performing the above task, often dubbed quantification, is to use supervised learning in order to train a prevalence estimator from labeled data. In the literature there are several performance metrics for measuring the success of such prevalence estimators. In this paper we propose the first online stochastic algorithms for directly optimizing these quantification-specific performance measures. We also provide algorithms that optimize hybrid performance measures that seek to balance quantification and classification performance. Our algorithms present a significant advancement in the theory of multivariate optimization; we show, via a rigorous theoretical analysis, that they exhibit optimal convergence. We also report extensive experiments on benchmark and real data sets which demonstrate that our methods significantly outperform existing optimization techniques used for these performance measures.
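As a rough illustration of the task the abstract describes, the sketch below estimates class prevalence by counting positive predictions and scores the estimate against the true prevalence. It is not the paper's algorithm. The counting baseline, the choice of KL divergence as the quantification measure (the abstract does not name the measures it optimizes), and the function names `classify_and_count` and `kl_divergence` are all assumptions made for this example.

```python
import math


def classify_and_count(predictions):
    """Estimate positive-class prevalence as the fraction of positive predictions.

    `predictions` is a sequence of 0/1 labels from any classifier (hypothetical input).
    """
    if not predictions:
        raise ValueError("predictions must be non-empty")
    return sum(1 for p in predictions if p == 1) / len(predictions)


def kl_divergence(true_prev, est_prev, eps=1e-12):
    """Binary KL divergence between the true and estimated positive prevalence.

    Both prevalences are clipped to [eps, 1 - eps] to avoid log(0).
    """
    p = min(max(true_prev, eps), 1 - eps)
    q = min(max(est_prev, eps), 1 - eps)
    return p * math.log(p / q) + (1 - p) * math.log((1 - p) / (1 - q))


# Toy data: the classifier labels 3 of 4 items positive, while the true prevalence is 0.5.
estimated = classify_and_count([1, 0, 1, 1])
error = kl_divergence(0.5, estimated)
```

A score of zero means the estimated distribution matches the true one; the individual predictions may still be wrong, which is why quantification and classification performance can diverge.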
Year
2016
DOI
10.1145/2939672.2939832
Venue
KDD
Field
Convergence (routing), Data mining, Population, Data set, Stochastic optimization, Data analysis, Sentiment analysis, Computer science, Supervised learning, Artificial intelligence, Machine learning, Estimator
DocType
Conference
Citations
7
PageRank
0.69
References
24
Authors
5
Name                      Order  Citations  PageRank
Purushottam Kar           1      379        22.55
Shuai Li                  2      192        23.09
Narasimhan, Harikrishna   3      161        17.48
Sanjay Chawla             4      1372       105.09
Fabrizio Sebastiani       5      6724       395.14