Title
Learning Human Actions by Combining Global Dynamics and Local Appearance
Abstract
In this paper, we address the problem of human action recognition by combining global temporal dynamics with local visual spatio-temporal appearance features. In the global temporal dimension, we propose to model the motion dynamics with robust linear dynamical systems (LDSs) and use the model parameters as motion descriptors. Since LDSs live in a non-Euclidean space and the descriptors are in non-vector form, we propose a shift-invariant distance based on subspace angles to measure the similarity between LDSs. In the local visual dimension, we construct curved spatio-temporal cuboids along the trajectories of densely sampled feature points and describe them using histograms of oriented gradients (HOG). The distance between motion sequences is computed with the chi-squared histogram distance in the bag-of-words framework. Finally, we perform classification using the maximum margin distance learning method, which combines the global dynamic distances and the local visual distances. We evaluate our approach for action recognition on five short-clip data sets, namely Weizmann, KTH, UCF Sports, Hollywood2 and UCF50, as well as three long continuous data sets, namely VIRAT, ADL and CRIM13. Our results are competitive with current state-of-the-art methods.
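As an illustration of the local appearance distance mentioned in the abstract, the following is a minimal sketch of a chi-squared distance between two bag-of-words histograms. The abstract does not give the exact formula, so the symmetric form with a factor of 1/2 used here is an assumption, and the function names `chi2_distance` and `normalize` are hypothetical.

```python
def normalize(counts):
    """Scale raw visual-word counts so they sum to 1 (assumed preprocessing)."""
    total = float(sum(counts))
    if total <= 0:
        return [float(c) for c in counts]
    return [c / total for c in counts]


def chi2_distance(h, g, eps=1e-12):
    """Symmetric chi-squared distance: 0.5 * sum((h_i - g_i)^2 / (h_i + g_i)).

    This particular form is an assumption; the paper may use a variant.
    """
    if len(h) != len(g):
        raise ValueError("histograms must have the same number of bins")
    total = 0.0
    for a, b in zip(h, g):
        denom = a + b
        if denom > eps:  # skip bins that are empty in both histograms
            total += (a - b) ** 2 / denom
    return 0.5 * total


# Two toy 4-word histograms standing in for two video clips.
clip_a = normalize([4, 0, 2, 2])
clip_b = normalize([1, 3, 2, 2])
distance = chi2_distance(clip_a, clip_b)
```

With normalized histograms this distance lies between 0 (identical histograms) and 1 (no shared visual words), which makes it easy to combine with other distances in a later learning step.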
Year
2014
DOI
10.1109/TPAMI.2014.2329301
Venue
IEEE Trans. Pattern Anal. Mach. Intell.
Keywords
local visual dimension,global temporal dimension,global temporal dynamics,motion descriptors,model parameters,global dynamic distances,local visual spatio-temporal appearance features,learning (artificial intelligence),noneuclidean space,robust linear dynamical systems,densely-sampled feature point trajectories,kth data set,long-continuous data sets,short-clip data sets,action recognition,human action recognition,adl data set,motion dynamics,ucf sports data set,spatiotemporal phenomena,weizmann data set,maximum margin distance learning method,crim13 data set,motion estimation,shift invariant subspace angle-based distance,linear dynamical system,local visual distances,curved spatio-temporal cuboids,virat data set,image classification,non-vector descriptor,ucf50 data set,hollywood2 data set,image sequences,hog,lds,similarity measurement,distance learning,motion sequences,nonvector descriptors,classification,bag-of-words framework,human action learning,chi-squared histogram distance,histogram-of-oriented gradients,local spatio-temporal feature,visualization,feature extraction,robustness,hidden markov models,measurement,dynamics,histograms
Field
Linear dynamical system,Histogram,Computer vision,Data set,Pattern recognition,Computer science,Spatio-Temporal Analysis,Feature extraction,Invariant subspace,Robustness (computer science),Artificial intelligence,Hidden Markov model
DocType
Journal
Volume
36
Issue
12
ISSN
0162-8828
Citations
10
PageRank
0.50
References
57
Authors
6
Name | Order | Citations | PageRank
Guan Luo | 1 | 141 | 8.66
Shuang Yang | 2 | 17 | 2.06
Guodong Tian | 3 | 24 | 1.12
Chunfeng Yuan | 4 | 418 | 30.84
Weiming Hu | 5 | 5300 | 261.38
Stephen J. Maybank | 6 | 4105 | 493.12