Title: Learning Words by Drawing Images
Abstract: We propose a framework for learning through drawing. Our goal is to learn the correspondence between spoken words and abstract visual attributes from a dataset of spoken descriptions of images. Building on recent findings that GAN representations can be manipulated to edit semantic concepts in the generated output, we propose a new method that uses GAN-generated images to train a model with a triplet loss. To apply the method, we develop Audio CLEVRGAN, a new dataset of audio descriptions of GAN-generated CLEVR images, and we describe a training procedure that creates a curriculum of GAN-generated images, focusing training on image pairs that differ in a specific, informative way. Training requires no supervision beyond the spoken captions and the GAN itself. We find that exploiting GAN-edited examples improves the model's ability to learn attributes over previous results, and that the proposed framework yields models that can associate spoken words with abstract visual concepts such as color and size.
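To make the training objective concrete, below is a minimal sketch of a triplet loss over a spoken caption and a GAN-edited image pair, written in PyTorch-style Python. The function name, margin, embedding size, and the random tensors standing in for encoder outputs are all illustrative assumptions, not the authors' implementation.

import torch
import torch.nn.functional as F

def caption_image_triplet_loss(audio_emb, pos_img_emb, neg_img_emb, margin=1.0):
    # Hinge triplet loss: pull the caption embedding toward the image it
    # describes and push it away from a GAN-edited copy of that image in
    # which a single attribute (e.g., color or size) has been changed.
    d_pos = F.pairwise_distance(audio_emb, pos_img_emb)
    d_neg = F.pairwise_distance(audio_emb, neg_img_emb)
    return F.relu(margin + d_pos - d_neg).mean()

# Random tensors stand in for the outputs of an audio encoder and an image
# encoder (a batch of 8 triplets with 512-dimensional embeddings).
audio    = torch.randn(8, 512)
positive = torch.randn(8, 512)  # embedding of the image the caption describes
negative = torch.randn(8, 512)  # embedding of the attribute-edited GAN image
loss = caption_image_triplet_loss(audio, positive, negative)

In the procedure described above, the negative example would come from editing the GAN's representation so that the pair differs in exactly one informative attribute, which is what lets the curriculum focus training on that attribute.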
Year: 2019
DOI: 10.1109/CVPR.2019.00213
Venue: 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2019)
Field: Computer vision, Computer science, Human–computer interaction, Artificial intelligence
DocType: Conference
ISSN: 1063-6919
Citations: 1
PageRank: 0.34
References: 0
Authors: 6
Name               Order   Citations   PageRank
Didac Suris        1       1           0.68
Adrià Recasens     2       74          6.55
David Bau          3       149         9.18
David F. Harwath   4       63          8.34
James Glass        5       3123        413.63
Antonio Torralba   6       14607       956.27