Title
Learning visual and textual representations for multimodal matching and classification.
Abstract
• A unified network for image-text matching and classification.
• Seamlessly incorporating the matching and classification components.
• A multi-stage training algorithm by combining the matching and classification loss.
• Comprehensive study on the effectiveness of the proposed approach.
• Comparisons on four well-known multimodal benchmarks.
Year: 2018
DOI: 10.1016/j.patcog.2018.07.001
Venue: Pattern Recognition
Keywords: Vision and language, Multimodal matching, Multimodal classification, Deep learning
Field: Embedding, Pattern recognition, Artificial intelligence, Unified Model, Multimodal learning, Discriminative model, Machine learning, Mathematics
DocType: Journal
Volume: 84
Issue: 1
ISSN: 0031-3203
Citations: 4
PageRank: 0.42
References: 46
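Authors: 4
Name: Yu Liu; Order: 1; Citations: 198; PageRank: 25.45
Name: Li Liu; Order: 2; Citations: 733; PageRank: 50.04
Name: Yanming Guo; Order: 3; Citations: 128; PageRank: 13.06
Name: Michael S. Lew; Order: 4; Citations: 2742; PageRank: 166.02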