Abstract |
---|
David Marr famously defined vision as "knowing what is where by seeing". In the framework described here, attention is the inference process that solves the visual recognition problem of what is where. The theory proposes a computational role for attention and leads to a model that performs well in recognition tasks and that predicts some of the main properties of attention at the level of psychophysics and physiology. We propose an algorithmic implementation, a Bayesian network, that can be mapped into the basic functional anatomy of attention involving the ventral stream and the dorsal stream. This description integrates bottom-up, feature-based, and spatial (context-based) attentional mechanisms. We show that the Bayesian model accurately predicts human eye fixations (considered as a proxy for shifts of attention) in natural scenes and can improve accuracy in object recognition tasks involving cluttered real-world images. In both cases, we found that the proposed model predicts human performance better than existing bottom-up and top-down computational models. |
Year | DOI | Venue |
---|---|---|
2011 | 10.1117/12.876734 | HUMAN VISION AND ELECTRONIC IMAGING XVI |
Keywords | DocType | Volume |
Attention, Bayesian inference, Eye-movements | Conference | 7865 |
ISSN | Citations | PageRank |
0277-786X | 0 | 0.34 |
References | Authors |
0 | 4 |
Name | Order | Citations | PageRank |
---|---|---|---|
Sharat Chikkerur | 1 | 839 | 37.21 |
Thomas Serre | 2 | 510 | 64.62 |
Cheston Tan | 3 | 155 | 15.27 |
Tomaso Poggio | 4 | 13488 | 3380.01 |