Title
What do different evaluation metrics tell us about saliency models?
Abstract
How best to evaluate a saliency model's ability to predict where humans look in images is an open research question. The choice of evaluation metric depends on how saliency is defined and how the ground truth is represented. Metrics differ in how they rank saliency models, and this results from how false positives and false negatives are treated, whether viewing biases are accounted for, whether spatial deviations are factored in, and how the saliency maps are pre-processed. In this paper, we provide an analysis of 8 different evaluation metrics and their properties. With the help of systematic experiments and visualizations of metric computations, we add interpretability to saliency scores and more transparency to the evaluation of saliency models. Building on the differences in metric properties and behaviors, we make recommendations for metric selection under specific assumptions and for specific applications.
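To make the metric differences the abstract describes concrete, here is a minimal sketch (in Python with NumPy; the function names, shapes, and normalization choices are illustrative assumptions, not the authors' reference implementation) of two of the evaluation metrics the paper analyzes: Normalized Scanpath Saliency (NSS), which reads the saliency map out at discrete fixation locations, and Pearson's Correlation Coefficient (CC), which compares the map against a continuous fixation density.

```python
import numpy as np

def nss(saliency_map: np.ndarray, fixation_map: np.ndarray) -> float:
    """Normalized Scanpath Saliency (location-based): mean of the
    z-scored saliency values at the ground-truth fixation locations."""
    z = (saliency_map - saliency_map.mean()) / saliency_map.std()
    return float(z[fixation_map.astype(bool)].mean())

def cc(saliency_map: np.ndarray, fixation_density: np.ndarray) -> float:
    """Pearson's Correlation Coefficient (distribution-based): linear
    correlation between the saliency map and a fixation density map."""
    s = (saliency_map - saliency_map.mean()) / saliency_map.std()
    f = (fixation_density - fixation_density.mean()) / fixation_density.std()
    return float((s * f).mean())

# Toy usage with random maps (shapes and values are placeholders).
rng = np.random.default_rng(0)
sal = rng.random((32, 32))           # model saliency map
fix = rng.random((32, 32)) > 0.98    # binary fixation map
dens = rng.random((32, 32))          # continuous fixation density
print(nss(sal, fix), cc(sal, dens))
```

The contrast illustrates one of the paper's themes: because NSS z-scores the whole map before sampling it at fixated pixels, diffuse false positives lower the score even though only fixated locations are read out, whereas CC compares the two maps everywhere and so treats false positives and false negatives symmetrically.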
Year
2019
DOI
10.1109/TPAMI.2018.2815601
Venue
IEEE Transactions on Pattern Analysis and Machine Intelligence
Keywords
Measurement, Computational modeling, Analytical models, Visualization, Benchmark testing, Observers, Task analysis
DocType
Journal
Volume
abs/1604.03605
Issue
3
ISSN
0162-8828
Citations
68
PageRank
2.34
References
46
Authors
5
Name               Order  Citations  PageRank
Zoya Gavrilov      1      287        16.20
Tilke Judd         2      1046       39.52
Aude Oliva         3      5121       298.19
Antonio Torralba   4      14607      956.27
Frédo Durand       5      8625       414.94