Title
Multi-Localized Sensitive Autoencoder-Attention-LSTM For Skeleton-based Action Recognition
Abstract
One of the key challenges of skeleton-based action recognition (SAR) is the complex nature of human motion patterns. Variations in performers and viewpoints can reduce recognition accuracy. In this work, we propose the Multi-Localized Sensitive Autoencoder-Attention-LSTM (Multi-LiSAAL) for SAR. The Localized Stochastic Sensitive Autoencoder (LiSSA) encodes both spatial and temporal information and extracts meaningful features from different parts of the skeleton (four limbs and a trunk). The LiSSA is trained by minimizing the localized generalization error, which makes the autoencoders more robust by reducing their sensitivity to small variations in the inputs. We apply an attention mechanism that assigns different weights to different skeleton parts and focuses on the more informative ones. A backbone classifier network then takes the weighted features as inputs to differentiate actions. Experimental results on five public benchmark datasets show that the Multi-LiSAAL outperforms state-of-the-art methods.
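The part-level attention step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the part names, the dot-product scoring function, and the `score_vector` parameter are assumptions made for illustration only, and the abstract does not specify how the attention scores are computed. The sketch only shows the general idea of turning one score per skeleton part into softmax weights and scaling each part's feature vector by its weight.

```python
import math

# Hypothetical names for the five skeleton parts (four limbs and a trunk).
PARTS = ["left_arm", "right_arm", "left_leg", "right_leg", "trunk"]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(part_features, score_vector):
    """Scale each part's feature vector by a softmax attention weight.

    part_features: dict mapping part name -> list of floats (equal lengths).
    score_vector: list of floats; an assumed stand-in for learned parameters.
    Returns (weights, weighted_features).
    """
    # Assumed scoring rule: dot product of the part features with score_vector.
    scores = [
        sum(w * x for w, x in zip(score_vector, part_features[part]))
        for part in PARTS
    ]
    weights = softmax(scores)
    weighted = {
        part: [alpha * x for x in part_features[part]]
        for part, alpha in zip(PARTS, weights)
    }
    return weights, weighted
```

In a full pipeline the weighted part features would be passed on to the classifier network; here the function simply returns them so the weights, which always sum to one, can be inspected.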
Year
2022
DOI
10.1109/TMM.2021.3070127
Venue
IEEE TRANSACTIONS ON MULTIMEDIA
Keywords
Skeleton, Feature extraction, Joints, Hidden Markov models, Convolution, Task analysis, Bones, Skeleton-based action recognition (SAR), Localized Stochastic Sensitive Autoencoder (LiSSA)
DocType
Journal
Volume
24
ISSN
1520-9210
Citations
0
PageRank
0.34
References
0
Authors
3
Name              Order  Citations  PageRank
Wing W. Y. Ng     1      528        56.12
Mingyang Zhang    2      0          0.34
Ting Wang         3      36         9.43