Title
Where to Look: Focus Regions for Visual Question Answering
Abstract
We present a method that learns to answer visual questions by selecting image regions relevant to the text-based query. Our method maps textual queries and visual features from various regions into a shared space where they are compared for relevance with an inner product. Our method exhibits significant improvements in answering questions such as "what color," where it is necessary to evaluate a specific location, and "what room," where it selectively identifies informative image regions. Our model is tested on the recently released VQA [1] dataset, which features free-form human-annotated questions and answers.
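The abstract's core mechanism is scoring each image region against the question by projecting both modalities into a shared space and taking an inner product. The NumPy sketch below illustrates that idea in isolation; the dimensions, the projection matrices `W_q` and `W_r`, and the softmax weighting over regions are illustrative assumptions for this sketch, not the paper's exact architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed sizes: text embedding, CNN region feature, shared space, #regions.
d_text, d_img, d_shared, n_regions = 300, 4096, 512, 100

# Hypothetical learned projections into the shared space (random here).
W_q = rng.standard_normal((d_shared, d_text)) * 0.01
W_r = rng.standard_normal((d_shared, d_img)) * 0.01

q = rng.standard_normal(d_text)              # question embedding
R = rng.standard_normal((n_regions, d_img))  # one feature row per candidate region

# Map both modalities into the shared space.
q_s = W_q @ q        # (d_shared,)
R_s = R @ W_r.T      # (n_regions, d_shared)

# Relevance of each region to the query: inner product in the shared space.
scores = R_s @ q_s   # (n_regions,)

# Softmax turns raw scores into region-selection weights (an assumption of
# this sketch; it yields a relevance-weighted average over regions).
weights = np.exp(scores - scores.max())
weights /= weights.sum()

v = weights @ R      # attended visual feature, (d_img,)
print("most relevant region:", weights.argmax(), "| attended feature shape:", v.shape)
```

In a trained model, `W_q` and `W_r` would be learned end-to-end so that regions informative for the question (e.g., the queried object for a "what color" question) receive high weight.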
Year
2015
DOI
10.1109/CVPR.2016.499
Venue
2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
Field
Question answering, Information retrieval, Pattern recognition, Computer science, Artificial intelligence
DocType
Journal
Volume
abs/1511.07394
Issue
1
ISSN
1063-6919
Citations
87
PageRank
2.28
References
19
Authors
3
Name            Order  Citations  PageRank
Kevin J. Shih   1      183        8.77
Saurabh Singh   2      860        33.24
Derek Hoiem     3      4998       302.66