Title
A Transformer-Based Network for Dynamic Hand Gesture Recognition
Abstract
Transformer-based neural networks represent a successful self-attention mechanism that achieves state-of-the-art results in language understanding and sequence modeling. However, their application to visual data and, in particular, to the dynamic hand gesture recognition task has not yet been deeply investigated. In this paper, we propose a transformer-based architecture for the dynamic hand gesture recognition task. We show that the employment of a single active depth sensor, specifically the usage of depth maps and the surface normals estimated from them, achieves state-of-the-art results, overcoming all the methods available in the literature on two automotive datasets, namely NVidia Dynamic Hand Gesture and Briareo. Moreover, we test the method with other data types available with common RGB-D devices, such as infrared and color data. We also assess the performance in terms of inference time and number of parameters, showing that the proposed framework is suitable for an online in-car infotainment system.
Year
2020
DOI
10.1109/3DV50981.2020.00072
Venue
2020 International Conference on 3D Vision (3DV)
Keywords
dynamic hand gesture recognition, depth images, surface normals, transformer, automotive, human computer interaction, gesture recognition
DocType
Conference
ISSN
2378-3826
ISBN
978-1-7281-8129-5
Citations
0
PageRank
0.34
References
0
Authors
6
Name                Order  Citations  PageRank
Andrea D'Eusanio    1      1          1.70
Alessandro Simoni   2      1          2.38
Stefano Pini        3      5          4.55
Guido Borghi        4      19         8.16
Roberto Vezzani     5      2          3.08
Rita Cucchiara      6      26         7.62