Title
Maximum Likelihood in Cost-Sensitive Learning: Model Specification, Approximations, and Upper Bounds
Abstract
The presence of asymmetry in the misclassification costs or class prevalences is a common occurrence in the pattern classification domain. While much interest has been devoted to the study of cost-sensitive learning techniques, the relationship between cost-sensitive learning and the specification of the model set in a parametric estimation framework remains somewhat unclear. To that end, we differentiate between the case of the model including the true posterior, and that in which the model is misspecified. In the former case, it is shown that thresholding the maximum likelihood (ML) estimate is an asymptotically optimal solution to the risk minimization problem. On the other hand, under model misspecification, it is demonstrated that thresholded ML is suboptimal and that the risk-minimizing solution varies with the misclassification cost ratio. Moreover, we analytically show that the negative weighted log likelihood (Elkan, 2001) is a tight, convex upper bound of the empirical loss. Coupled with empirical results on several real-world data sets, we argue that weighted ML is the preferred cost-sensitive technique.
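The two techniques the abstract contrasts can be illustrated with a minimal sketch. It is not the authors' code: the function names, the cost symbols `c_fp` (false positive) and `c_fn` (false negative), and the choice of base-2 logarithms are assumptions made for illustration. Under those assumptions, thresholded ML compares an estimated posterior against `c_fp / (c_fp + c_fn)`, while weighted ML scales each sample's log-likelihood term by the cost of misclassifying it. With base-2 logarithms and a decision threshold of 0.5, the negative weighted log likelihood is never smaller than the cost-weighted empirical loss.

```python
import math

def cost_threshold(c_fp, c_fn):
    """Posterior threshold that minimizes expected cost: predict 1 when p > c_fp / (c_fp + c_fn)."""
    return c_fp / (c_fp + c_fn)

def decide(posteriors, threshold):
    """Threshold estimated posteriors P(y=1|x) into hard 0/1 decisions."""
    return [1 if p > threshold else 0 for p in posteriors]

def empirical_cost(labels, decisions, c_fp, c_fn):
    """Average misclassification cost over the sample."""
    total = 0.0
    for y, d in zip(labels, decisions):
        if y == 1 and d == 0:
            total += c_fn  # missed positive
        elif y == 0 and d == 1:
            total += c_fp  # false alarm
    return total / len(labels)

def weighted_nll_bits(labels, posteriors, c_fp, c_fn):
    """Average negative weighted log likelihood in bits (base-2 logs, an assumption of this sketch)."""
    total = 0.0
    for y, p in zip(labels, posteriors):
        if y == 1:
            total += -c_fn * math.log2(p)
        else:
            total += -c_fp * math.log2(1.0 - p)
    return total / len(labels)

if __name__ == "__main__":
    # Hypothetical labels and posterior estimates, chosen only for illustration.
    labels = [1, 0, 1, 0]
    posteriors = [0.3, 0.1, 0.8, 0.6]
    c_fp, c_fn = 1.0, 4.0

    # Thresholded ML: shift the decision threshold according to the costs.
    shifted = decide(posteriors, cost_threshold(c_fp, c_fn))
    print("cost with shifted threshold:", empirical_cost(labels, shifted, c_fp, c_fn))

    # Weighted ML objective versus the weighted empirical loss at threshold 0.5.
    plain = decide(posteriors, 0.5)
    print("weighted empirical loss:", empirical_cost(labels, plain, c_fp, c_fn))
    print("weighted NLL (bits):", weighted_nll_bits(labels, posteriors, c_fp, c_fn))
```

The bound holds term by term: a positive sample misclassified at threshold 0.5 has p ≤ 0.5, so −log2(p) ≥ 1, and the same argument applies to 1 − p for negative samples. The sketch only evaluates the objective for fixed posteriors; fitting a parametric model by minimizing it is left out.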
Year
2010
DOI
10.5555/1756006.1953037
Venue
Journal of Machine Learning Research
Keywords
empirical result, former case, model specification, weighted ml, upper bounds, thresholded ml, cost-sensitive learning, asymptotically optimal solution, maximum likelihood, preferred cost-sensitive technique, model misspecification, empirical loss, upper bound
Field
Data set, Upper and lower bounds, Maximum likelihood, Regular polygon, Artificial intelligence, Thresholding, Specification, Statistics, Asymmetry, Asymptotically optimal algorithm, Machine learning, Mathematics
DocType
Journal
Volume
11
ISSN
1532-4435
Citations
13
PageRank
0.75
References
15
Authors
3
Name                 Order  Citations  PageRank
Jacek P. Dmochowski  1      95         11.17
Paul Sajda           2      651        89.86
Lucas C. Parra       3      928        88.98