Title
Bidirectional Joint Representation Learning with Symmetrical Deep Neural Networks for Multimodal and Crossmodal Applications.
Abstract
Common approaches to problems involving multiple modalities (classification, retrieval, hyperlinking, etc.) are early fusion of the initial modalities and crossmodal translation from one modality to the other. Recently, deep neural networks, especially deep autoencoders, have proven promising both for crossmodal translation and for early fusion via multimodal embedding. In this work, we propose a flexible crossmodal deep neural network architecture for multimodal and crossmodal representation. By tying the weights of two deep neural networks, symmetry is enforced in the central hidden layers, yielding a multimodal representation space common to the two original representation spaces. The proposed architecture is evaluated on multimodal query expansion and multimodal retrieval tasks within the context of video hyperlinking. Our method demonstrates improved crossmodal translation capabilities and produces a multimodal embedding that significantly outperforms multimodal embeddings obtained by deep autoencoders, resulting in an absolute increase of 14.14 points in precision at 10 on a video hyperlinking task.
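The core idea described in the abstract, two networks translating in opposite directions between modalities while sharing (tied) weights around a common central embedding, can be sketched as follows. This is a minimal illustrative sketch, not the paper's implementation: all dimensions, the tanh activation, and the exact tying scheme (reusing each modality's projection matrix transposed for the reverse direction) are assumptions made for illustration.

```python
# Illustrative sketch of a tied-weight bidirectional crossmodal network:
# modality A -> shared embedding -> modality B, and the symmetric path
# B -> shared embedding -> A, where each direction reuses the other
# modality's projection matrix transposed (the "tied weights").
# Dimensions and initialization are arbitrary assumptions.
import math
import random

random.seed(0)

def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def transpose(W):
    return [list(col) for col in zip(*W)]

def tanh_vec(v):
    return [math.tanh(a) for a in v]

DIM_A, DIM_H, DIM_B = 4, 3, 5  # assumed toy dimensions

# One projection matrix per modality into the shared hidden layer;
# tying means the reverse translation reuses its transpose.
W_a = [[random.uniform(-0.1, 0.1) for _ in range(DIM_A)] for _ in range(DIM_H)]
W_b = [[random.uniform(-0.1, 0.1) for _ in range(DIM_B)] for _ in range(DIM_H)]

def embed_a(x):
    """Modality A -> shared multimodal embedding."""
    return tanh_vec(matvec(W_a, x))

def embed_b(y):
    """Modality B -> shared multimodal embedding."""
    return tanh_vec(matvec(W_b, y))

def translate_a_to_b(x):
    """A -> shared embedding -> B, decoding with tied weights W_b^T."""
    return matvec(transpose(W_b), embed_a(x))

def translate_b_to_a(y):
    """B -> shared embedding -> A, decoding with tied weights W_a^T."""
    return matvec(transpose(W_a), embed_b(y))

x = [1.0, 0.5, -0.3, 0.2]
print(len(translate_a_to_b(x)))  # → 5 (B-dimensional output)
print(len(embed_a(x)))           # → 3 (shared embedding dimension)
```

Because both translation directions pass through the same central layers, either modality can be mapped into the one shared embedding space, which is what enables the multimodal retrieval use described in the abstract.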
Year
2016
DOI
10.1145/2911996.2912064
Venue
ICMR
Keywords
neural networks, deep learning, representation, embedding, multimodal, crossmodal, retrieval, video retrieval, video hyperlinking, image and text, autoencoder, bidirectional learning, tied weights, shared weights
Field
Crossmodal, Embedding, Autoencoder, Pattern recognition, Query expansion, Computer science, Hyperlink, Artificial intelligence, Deep learning, Artificial neural network, Machine learning, Feature learning
DocType
Conference
Citations
8
PageRank
0.53
References
7
Authors
3
Name              Order  Citations  PageRank
Vedran Vukotic    1      29         4.59
Christian Raymond 2      118        13.80
Guillaume Gravier 3      1413       127.38