Abstract |
---|
In this paper, we focus on isolated gesture recognition from RGB-D videos. Our main idea is to design an algorithm that can extract global and local information from multi-modality inputs. To this end, we propose a novel attention-based method with a 3D convolutional neural network (CNN) for isolated gesture recognition. It consists of two parts. The first is a global and local spatial-attention network (GLSANet), which extracts efficient features from multi-modality inputs simultaneously by combining global information, which captures the context of the frame, with local information, which captures the hand/arm actions of the person. The second is an adaptive model fusion strategy that fuses the predicted probabilities from the multi-modality inputs. Experiments demonstrate that the proposed method achieves state-of-the-art performance on the IsoGD dataset. |
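
The abstract does not specify how the fusion weights are chosen, so the following is only a minimal sketch of score-level (late) fusion under stated assumptions: each modality yields a class-probability vector, and a per-modality weight (hypothetically taken from something like validation accuracy) controls its contribution. The names `fuse_probabilities`, `predict`, and the modality keys are illustrative and do not come from the paper.

```python
from typing import Dict, List, Sequence


def fuse_probabilities(
    probs: Dict[str, Sequence[float]],
    weights: Dict[str, float],
) -> List[float]:
    """Weighted average of per-modality class probabilities.

    Assumption: `weights` holds one non-negative score per modality;
    how the paper derives its adaptive weights is not stated in the abstract.
    """
    if not probs:
        raise ValueError("at least one modality is required")
    total = sum(weights[name] for name in probs)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")

    num_classes = len(next(iter(probs.values())))
    fused = [0.0] * num_classes
    for name, p in probs.items():
        if len(p) != num_classes:
            raise ValueError("all modalities must score the same classes")
        w = weights[name] / total  # normalise so the weights sum to 1
        for i, value in enumerate(p):
            fused[i] += w * value
    return fused


def predict(probs: Dict[str, Sequence[float]], weights: Dict[str, float]) -> int:
    """Return the index of the class with the highest fused probability."""
    fused = fuse_probabilities(probs, weights)
    return max(range(len(fused)), key=fused.__getitem__)


# Hypothetical two-class example with RGB and depth streams.
example_probs = {"rgb": [0.6, 0.4], "depth": [0.2, 0.8]}
example_weights = {"rgb": 1.0, "depth": 3.0}
fused_example = fuse_probabilities(example_probs, example_weights)
```

Because the weights are normalised, the fused vector remains a valid probability distribution whenever each input vector is one. Shifting weight toward a modality moves the final decision toward that modality's prediction.
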
Year | DOI | Venue |
---|---|---|
2019 | 10.1007/978-3-030-31456-9_10 | BIOMETRIC RECOGNITION (CCBR 2019) |
Keywords | DocType | Volume |
Gesture recognition,Fusion strategy,RGB-D video | Conference | 11818 |
ISSN | Citations | PageRank |
0302-9743 | 0 | 0.34 |
References | Authors |
0 | 8 |
Name | Order | Citations | PageRank |
---|---|---|---|
Qi Yuan | 1 | 1 | 0.69 |
Jun Wan | 2 | 255 | 22.37 |
Chi Lin | 3 | 6 | 1.51 |
Yunan Li | 4 | 17 | 2.68 |
Qiguang Miao | 5 | 355 | 49.69 |
Stan Z. Li | 6 | 8951 | 535.26 |
Lihua Wang | 7 | 6 | 5.75 |
Yunxiang Lu | 8 | 0 | 0.34 |