Abstract |
---|
What jumps out at a single glance at an image differs from what you might notice after closer inspection. Yet conventional models of visual saliency produce predictions at an arbitrary, fixed viewing duration, offering a limited view of the rich interactions between image content and gaze location. In this paper we propose to capture gaze as a series of snapshots by generating population-level saliency heatmaps for multiple viewing durations. We collect the CodeCharts1K dataset, which contains multiple distinct heatmaps per image, corresponding to 0.5, 3, and 5 seconds of free-viewing. We develop an LSTM-based model of saliency that trains simultaneously on data from multiple viewing durations. Our Multi-Duration Saliency Excited Model (MD-SEM) achieves competitive performance on the LSUN 2017 Challenge with 57% fewer parameters than comparable architectures. It is the first model that produces heatmaps at multiple viewing durations, enabling applications where multi-duration saliency can be used to prioritize visual content to keep, transmit, and render. |
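The abstract only sketches the multi-duration idea at a high level. As a rough illustration (not the authors' released MD-SEM code), the snippet below shows one way an LSTM can be unrolled over viewing durations so that a shared image feature map yields a separate saliency heatmap for each duration (e.g. 0.5, 3, and 5 seconds). The module name, tensor shapes, and PyTorch framing are all assumptions made for this sketch.

```python
import torch
import torch.nn as nn


class MultiDurationSaliencyHead(nn.Module):
    """Hypothetical sketch: a per-pixel LSTM unrolled over viewing
    durations, producing one saliency heatmap per duration.
    Not the authors' MD-SEM implementation."""

    def __init__(self, feat_channels=256, hidden_channels=64, num_durations=3):
        super().__init__()
        self.num_durations = num_durations
        # Collapse backbone features into a compact per-pixel embedding.
        self.reduce = nn.Conv2d(feat_channels, hidden_channels, kernel_size=1)
        # Recurrence runs over the duration axis (0.5s -> 3s -> 5s).
        self.lstm = nn.LSTM(hidden_channels, hidden_channels, batch_first=True)
        # 1x1 readout turns each hidden state into a single-channel heatmap.
        self.readout = nn.Conv2d(hidden_channels, 1, kernel_size=1)

    def forward(self, feats):
        # feats: (B, C, H, W) image features from any convolutional backbone.
        b, _, h, w = feats.shape
        x = self.reduce(feats)                                   # (B, hidden, H, W)
        pix = x.flatten(2).permute(0, 2, 1).reshape(b * h * w, 1, -1)
        # Feed the same per-pixel feature at every duration step.
        seq = pix.repeat(1, self.num_durations, 1)               # (B*H*W, T, hidden)
        out, _ = self.lstm(seq)                                  # per-pixel recurrence over durations
        out = out.reshape(b, h * w, self.num_durations, -1)
        out = out.permute(0, 2, 3, 1).reshape(b * self.num_durations, -1, h, w)
        maps = torch.sigmoid(self.readout(out))                  # (B*T, 1, H, W)
        return maps.reshape(b, self.num_durations, 1, h, w)


# Example: three heatmaps (one per viewing duration) for a batch of two feature maps.
head = MultiDurationSaliencyHead()
heatmaps = head(torch.randn(2, 256, 30, 40))  # -> (2, 3, 1, 30, 40)
```

Training such a head jointly on ground-truth maps from all durations (as the abstract describes for CodeCharts1K) would simply sum a per-duration loss over the second output dimension.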
Year | DOI | Venue
---|---|---
2020 | 10.1109/CVPR42600.2020.00453 | 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

DocType | ISSN | Citations
---|---|---
Conference | 1063-6919 | 0

PageRank | References | Authors
---|---|---
0.34 | 27 | 7
Name | Order | Citations | PageRank
---|---|---|---
Camilo Fosco | 1 | 2 | 0.70
Anelise Newman | 2 | 2 | 0.70
Pat Sukhum | 3 | 2 | 0.70
Yun Bin Zhang | 4 | 0 | 0.34 |
Nanxuan Zhao | 5 | 18 | 2.95 |
Aude Oliva | 6 | 5121 | 298.19 |
Zoya Gavrilov | 7 | 287 | 16.20 |