Title
Are You Watching Closely? Content-based Retrieval of Hand Gestures
Abstract
Gestures play an important role in our daily communication. However, recognizing and retrieving gestures in the wild is a challenging task that has not been thoroughly explored in the literature. In this paper, we explore the problem of identifying and retrieving gestures in a large-scale video dataset provided by the computer vision community, based on queries recorded in the wild. Our proposed pipeline, I3DEF, extracts spatio-temporal features from intermediate layers of an I3D network, a state-of-the-art network for action recognition, and fuses the feature maps obtained from RGB and optical flow input. The resulting embeddings are used to train a triplet network that captures the similarity between gestures. We further explore the effect of a person and body part masking step for improving both retrieval performance and recognition rate. Our experiments show that I3DEF can recognize and retrieve gestures similar to the queries independently of the depth modality. This performance holds both for queries taken from the test data and for queries recorded from different people performing relevant gestures in a different setting.
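The fusion, triplet training objective, and retrieval step named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the RGB and optical flow features are fused or which loss variant and margin are used, so concatenation of L2-normalized vectors, Euclidean distance, and a margin of 0.2 are assumptions made here for illustration. The function names (`fuse`, `triplet_loss`, `retrieve`) are hypothetical, and the short lists stand in for real I3D features.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length (left unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def fuse(rgb_feat, flow_feat):
    """Assumed fusion: concatenate the normalized RGB and flow features."""
    return l2_normalize(rgb_feat) + l2_normalize(flow_feat)


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Hinge-style triplet loss: zero once the negative is at least
    `margin` farther from the anchor than the positive is."""
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


def retrieve(query, gallery, k=1):
    """Return the labels of the k gallery items closest to the query."""
    ranked = sorted(gallery, key=lambda item: euclidean(query, item[1]))
    return [label for label, _ in ranked[:k]]


# Toy features standing in for I3D outputs (assumed values).
anchor = fuse([1.0, 0.0], [0.0, 1.0])
positive = fuse([2.0, 0.0], [0.0, 3.0])   # same gesture, different magnitude
negative = fuse([0.0, 1.0], [1.0, 0.0])   # different gesture

print(triplet_loss(anchor, positive, negative))
print(retrieve(anchor, [("wave", positive), ("point", negative)]))
```

In this toy setup the positive sample collapses onto the anchor after normalization, so the loss is zero, and retrieval ranks the matching gesture first. A trained triplet network would learn the embedding rather than rely on fixed normalization.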
Year: 2020
DOI: 10.1145/3372278.3390723
Venue: ICMR '20: International Conference on Multimedia Retrieval, Dublin, Ireland, June 2020
DocType: Conference
ISBN: 978-1-4503-7087-5
Citations: 0
PageRank: 0.34
References: 0
Authors: 4
Name                  Order  Citations  PageRank
Mahnaz Amiri Parian   1      0          2.03
Luca Rossetto         2      92         21.00
H. Schuldt            3      98         20.60
Stéphane Dupont       4      134        26.78