Title
Rate-Invariant Analysis of Trajectories on Riemannian Manifolds with Application in Visual Speech Recognition
Abstract
In statistical analysis of video sequences for speech recognition, and more generally activity recognition, it is natural to treat temporal evolutions of features as trajectories on Riemannian manifolds. However, different evolution patterns result in arbitrary parameterizations of these trajectories. We investigate a recent framework from the statistics literature that handles this nuisance variability using a cost function/distance for temporal registration and for statistical summarization and modeling of trajectories. It is based on a mathematical representation of trajectories, termed the transported square-root vector field (TSRVF), and the L2 norm on the space of TSRVFs. We apply this framework to the problem of speech recognition using both audio and visual components. In each case, we extract features, form trajectories on the corresponding manifolds, and compute parametrization-invariant distances using TSRVFs for speech classification. On the OuluVS database, the classification performance under this metric increases significantly, by nearly 100%, under both modalities and for all choices of features. We obtain speaker-dependent classification rates of 70% and 96% for the visual and audio components, respectively.
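The property the abstract relies on can be illustrated with a minimal sketch. It assumes the flat case (trajectories in Euclidean space), where parallel transport is the identity and the TSRVF reduces to the square-root velocity field q(t) = α'(t) / sqrt(|α'(t)|). It does not perform the temporal registration step (the search over warpings), and it is not the authors' implementation. The names srvf, l2_distance, alpha, beta, and gamma are hypothetical and chosen only for illustration. The sketch checks numerically that the L2 distance between the square-root representations is unchanged when both trajectories are re-parameterized by the same warping, while the plain L2 distance between the trajectories is not.

```python
import numpy as np


def srvf(curve, t):
    """Square-root velocity field of a sampled curve in R^d.

    curve: array of shape (T, d), samples of the trajectory at times t.
    Flat-space assumption: no parallel transport is needed.
    """
    vel = np.gradient(curve, t, axis=0)          # finite-difference velocity
    speed = np.linalg.norm(vel, axis=1)
    return vel / np.sqrt(np.maximum(speed, 1e-12))[:, None]


def l2_distance(f1, f2, t):
    """L2 distance between two sampled vector-valued functions (trapezoid rule)."""
    sq = np.sum((f1 - f2) ** 2, axis=1)
    return float(np.sqrt(np.sum(0.5 * (sq[1:] + sq[:-1]) * np.diff(t))))


def alpha(s):
    # Illustrative planar trajectory.
    return np.stack([s, np.sin(2 * np.pi * s)], axis=1)


def beta(s):
    # A second illustrative planar trajectory.
    return np.stack([s, 0.5 * np.sin(4 * np.pi * s)], axis=1)


t = np.linspace(0.0, 1.0, 2001)
gamma = 0.5 * (t + t ** 2)   # increasing warping of [0, 1] with gamma' > 0

# Distances between square-root representations, before and after warping both curves.
d_srvf_orig = l2_distance(srvf(alpha(t), t), srvf(beta(t), t), t)
d_srvf_warp = l2_distance(srvf(alpha(gamma), t), srvf(beta(gamma), t), t)

# Plain L2 distances between the curves themselves, for comparison.
d_plain_orig = l2_distance(alpha(t), beta(t), t)
d_plain_warp = l2_distance(alpha(gamma), beta(gamma), t)

print("SRVF distance  original / warped:", d_srvf_orig, d_srvf_warp)
print("plain distance original / warped:", d_plain_orig, d_plain_warp)
```

The two SRVF distances should agree up to discretization error, whereas the two plain distances differ. On a curved manifold, the velocity vectors would additionally have to be parallel transported to a common reference point before scaling, which this sketch omits.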
Year
2014
DOI
10.1109/CVPR.2014.86
Venue
CVPR
Keywords
riemannian manifolds,visual speech recognition,statistical summarization,l2 norm,speech recognition,video sequences,speaker-dependent classification rate,statistical analysis,temporal registration,speech classification,rate-invariant analysis,audio components,transported square-root vector field,trajectories modeling,parametrization-invariant distances,feature extraction,image sequences,tsrvf,ouluvs database,temporal evolutions,visual components,arbitrary parameterizations,evolution patterns,vectors
Field
Computer science,Artificial intelligence,Representation (mathematics),Manifold,Automatic summarization,Computer vision,Activity recognition,Pattern recognition,Vector field,Speech recognition,Invariant (mathematics),Norm (mathematics),Classification rate
DocType
Conference
Volume
2014
Issue
1
ISSN
1063-6919
Citations
19
PageRank
0.71
References
14
Authors
4
Name, Order, Citations, PageRank
Jing-yong Su, 1, 156, 10.93
Anuj Srivastava, 2, 2853, 199.47
Fillipe Dias Moreira de Souza, 3, 55, 5.81
Sudeep Sarkar, 4, 2839, 317.68