Title
Semantic Speech Retrieval With a Visually Grounded Model of Untranscribed Speech.
Abstract
There is a growing interest in models that can learn from unlabelled speech paired with visual context. This setting is relevant for low-resource speech processing, robotics, and human language acquisition research. Here, we study how a visually grounded speech model, trained on images of scenes paired with spoken captions, captures aspects of semantics. We use an external image tagger to generate...
Year
2019
DOI
10.1109/TASLP.2018.2872106
Venue
IEEE/ACM Transactions on Audio, Speech, and Language Processing
Keywords
Semantics, Visualization, Task analysis, Predictive models, Analytical models, Speech processing, Data models
DocType
Journal
Volume
27
Issue
1
ISSN
2329-9290
Citations
4
PageRank
0.40
References
71
Authors
3
Name | Order | Citations | PageRank
Herman Kamper | 1 | 150 | 20.70
Gregory Shakhnarovich | 2 | 1579 | 106.33
Karen Livescu | 3 | 1254 | 71.43