Name: DAVID F. HARWATH
Affiliation: MIT, Lincoln Lab, 244 Wood St, Lexington, MA 02173 USA
Papers: 21
Collaborators: 40
Citations: 63
PageRank: 8.34
Referers: 119
Referees: 323
References: 137
Search Limit: 100323
Title | Citations | PageRank | Year
Everything at Once – Multi-modal Fusion Transformer for Video Retrieval | 0 | 0.34 | 2022
Cascaded Multilingual Audio-Visual Learning from Videos. | 0 | 0.34 | 2021
Spoken Moments: Learning Joint Audio-Visual Representations from Video Descriptions | 0 | 0.34 | 2021
Multimodal Clustering Networks for Self-supervised Learning from Unlabeled Videos. | 0 | 0.34 | 2021
Pair Expansion for Learning Multilingual Semantic Embeddings Using Disjoint Visually-Grounded Speech Audio Datasets. | 0 | 0.34 | 2020
Trilingual Semantic Embeddings of Visually Grounded Speech with Self-Attention Mechanisms | 0 | 0.34 | 2020
Jointly Discovering Visual Objects and Spoken Words from Raw Sensory Input | 2 | 0.39 | 2020
Learning Hierarchical Discrete Linguistic Units from Visually-Grounded Speech | 1 | 0.35 | 2020
Towards Bilingual Lexicon Discovery From Visually Grounded Speech Audio | 1 | 0.34 | 2019
Towards Visually Grounded Sub-Word Speech Unit Discovery | 1 | 0.34 | 2019
Grounding Spoken Words in Unlabeled Video. | 1 | 0.35 | 2019
Transfer Learning from Audio-Visual Grounding to Speech Recognition | 1 | 0.34 | 2019
Learning Words By Drawing Images | 1 | 0.34 | 2019
Learning modality-invariant representations for speech and images | 4 | 0.39 | 2017
Learning Word-Like Units From Joint Audio-Visual Analysis | 14 | 0.61 | 2017
Unsupervised Learning of Spoken Language with Visual Context. | 4 | 0.40 | 2016
Look, listen, and decode: Multimodal speech recognition with images | 1 | 0.34 | 2016
On the Use of Acoustic Unit Discovery for Language Recognition. | 3 | 0.37 | 2016
Deep multimodal semantic embeddings for speech and images | 16 | 0.73 | 2015
Speech recognition without a lexicon - bridging the gap between graphemic and phonetic systems. | 2 | 0.41 | 2014
Zero Resource Spoken Audio Corpus Analysis | 11 | 0.61 | 2013