Title | ||
---|---|---|
Learning visual and textual representations for multimodal matching and classification. |
Abstract | ||
---|---|---|
• A unified network for image-text matching and classification. • Seamlessly incorporating the matching and classification components. • A multi-stage training algorithm by combining the matching and classification loss. • Comprehensive study on the effectiveness of the proposed approach. • Comparisons on four well-known multimodal benchmarks. |
Year | DOI | Venue |
---|---|---|
2018 | 10.1016/j.patcog.2018.07.001 | Pattern Recognition |
Keywords | Field | DocType |
Vision and language,Multimodal matching,Multimodal classification,Deep learning | Embedding,Pattern recognition,Artificial intelligence,Unified Model,Multimodal learning,Discriminative model,Machine learning,Mathematics | Journal |
Volume | Issue | ISSN |
84 | 1 | 0031-3203 |
Citations | PageRank | References |
4 | 0.42 | 46 |
Authors | ||
4 |
Name | Order | Citations | PageRank |
---|---|---|---|
Yu Liu | 1 | 198 | 25.45 |
Li Liu | 2 | 733 | 50.04 |
Yanming Guo | 3 | 128 | 13.06 |
Michael S. Lew | 4 | 2742 | 166.02 |