Title
An angle-based method for measuring the semantic similarity between visual and textual features
Abstract
The main challenge in most image–text tasks, such as zero-shot learning, is measuring the semantic similarity between visual and textual feature vectors. The common solution is to map the image feature vectors and text feature vectors into a Hilbert space and then rank similarity by the inner product between feature vectors. In this paper, we learn feature representations of images and their sentence descriptions with separate deep neural networks to capture the inter-modal correspondences between visual and language data. We then use a joint embedding structure based on angle calculation to measure the semantic similarity between visual and textual features. In the proposed method, a constant factor b keeps the similarities of positive samples and negative samples at a certain distance. Since the proposed cosine similarity method involves both normalization and vector computation, we also develop a learning algorithm on neural networks for expressing the semantic features of texts and images. We applied the angle-based method to the challenging Caltech-UCSD Birds and Oxford-102 Flowers datasets. The experiments demonstrate good performance on both recognition and retrieval tasks.
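The abstract describes ranking image–text pairs by the angle (cosine similarity) between their feature vectors, with a constant b holding positive and negative pairs apart. A minimal NumPy sketch of that idea follows; the function names and the margin value b=0.2 are illustrative assumptions, not the paper's actual implementation or hyperparameters.

```python
import numpy as np

def cosine_similarity(u, v):
    """Cosine of the angle between two feature vectors (angle-based similarity)."""
    return np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))

def margin_ranking_loss(img, txt_pos, txt_neg, b=0.2):
    """Hinge-style ranking loss: push the similarity of the positive
    image–text pair at least a margin b above the negative pair.
    (b=0.2 is an illustrative value, not taken from the paper.)"""
    s_pos = cosine_similarity(img, txt_pos)
    s_neg = cosine_similarity(img, txt_neg)
    return max(0.0, b - (s_pos - s_neg))
```

In training, a loss of this shape would be minimized over image embeddings and matching/non-matching sentence embeddings, so that matched pairs end up separated from mismatched pairs by at least the margin b.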
Year: 2019
DOI: 10.1007/s00500-018-3051-y
Venue: Soft Computing
Keywords: Semantic similarity measurement, Joint embedding structure, Angle-based method, Image–text tasks, Deep neural network
Field: Semantic similarity, Feature vector, Normalization (statistics), Embedding, Pattern recognition, Cosine similarity, Computer science, Artificial intelligence, Artificial neural network, Sentence, Machine learning, Computation
DocType: Journal
Volume: 23
Issue: 12
ISSN: 1433-7479
Citations: 1
PageRank: 0.35
References: 28
Authors: 4
Order  Name           Citations/PageRank
1      Chenwei Tang   33.09
2      Jian Cheng Lv  33754.52
3      Yao Chen       249.82
4      Jixiang Guo    135.29