Title
Large Scale Retrieval and Generation of Image Descriptions
Abstract
What is the story of an image? What is the relationship between pictures, language, and information we can extract using state-of-the-art computational recognition systems? In an attempt to address both of these questions, we explore methods for retrieving and generating natural language descriptions for images. Ideally, we would like our generated textual descriptions (captions) both to sound like a person wrote them and to remain true to the image content. To do this we develop data-driven approaches for image description generation, using retrieval-based techniques to gather either: (a) whole captions associated with a visually similar image, or (b) relevant bits of text (phrases) from a large collection of image + description pairs. In the case of (b), we develop optimization algorithms to merge the retrieved phrases into valid natural language sentences. The end result is two simple, but effective, methods for harnessing the power of big data to produce image captions that are altogether more general, relevant, and human-like than previous attempts.
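Approach (a) in the abstract, transferring a whole caption from a visually similar image, can be illustrated with a minimal nearest-neighbor sketch. Everything concrete here is an assumption made for illustration: the three-dimensional feature vectors, the toy captions, the choice of cosine similarity, and the function names `cosine_similarity` and `transfer_caption` are hypothetical. The abstract does not state which image descriptors or similarity measure the authors use.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat an all-zero vector as matching nothing
    return dot / (norm_a * norm_b)


def transfer_caption(query_features, collection):
    """Return the caption of the most visually similar image.

    collection is a list of (features, caption) pairs; the features are
    assumed to come from some image descriptor not specified here.
    """
    if not collection:
        raise ValueError("collection must not be empty")
    _, best_caption = max(
        collection,
        key=lambda pair: cosine_similarity(query_features, pair[0]),
    )
    return best_caption


# Toy image + description pairs, invented for this example.
collection = [
    ([0.9, 0.1, 0.0], "A dog running on the beach."),
    ([0.1, 0.8, 0.1], "A red car parked on the street."),
    ([0.0, 0.2, 0.9], "A bowl of fruit on a table."),
]
query = [0.8, 0.2, 0.1]  # hypothetical features of the query image

print(transfer_caption(query, collection))
```

With a real collection, the linear scan would typically be replaced by an approximate nearest-neighbor index, but the principle of borrowing a human-written caption from the closest match stays the same.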
Year
2016
DOI
10.1007/s11263-015-0840-y
Venue
International Journal of Computer Vision
Keywords
Retrieval, Image description, Data driven, Big data, Natural language processing
Field
Image description, Data-driven, Information retrieval, Computer science, Image retrieval, Image content, Natural language, Artificial intelligence, Natural language processing, Merge (version control), Big data, Visual Word
DocType
Journal
Volume
119
Issue
1
ISSN
0920-5691
Citations
15
PageRank
0.75
References
49
Authors
14
Name                 Order  Citations  PageRank
Vicente Ordonez      1      1418       69.65
Xufeng Han           2      419        15.28
Polina Kuznetsova    3      293        15.86
Girish Kulkarni      4      324        17.49
Margaret Mitchell    5      1450       65.37
Kota Yamaguchi       6      691        28.85
Karl Stratos         7      328        21.07
Amit Goyal           8      302        18.09
Jesse Dodge          9      472        19.28
Alyssa Mensch        10     163        8.05
Hal Daumé, III       11     3673       200.05
Alexander C. Berg    12     10554      630.24
Yejin Choi           13     2239       153.18
Tamara L. Berg       14     3221       225.32