Feature-Independent Action Spotting without Human Localization, Segmentation, or Frame-wise Tracking - Citegraph

Paper Info

Title
Feature-Independent Action Spotting without Human Localization, Segmentation, or Frame-wise Tracking

Abstract
In this paper, we propose an unsupervised framework for action spotting in videos that does not depend on any specific feature (e.g., HOG/HOF, STIP, silhouette, or bag-of-words). Furthermore, our solution requires no human localization, segmentation, or frame-wise tracking. This is achieved by treating the problem holistically as that of extracting the internal dynamics of video cuboids by modeling them in their natural form as multilinear tensors. To extract their internal dynamics, we devised a novel Two-Phase Decomposition (TP-Decomp) of a tensor that generates very compact and discriminative representations that are robust even to heavily perturbed data. Technically, a Rank-based Tensor Core Pyramid (Rank-TCP) descriptor is generated by combining multiple tensor cores under multiple ranks, which allows video cuboids to be represented in a hierarchical tensor pyramid. The problem then reduces to a template matching problem, which is solved efficiently by using two boosting strategies: (1) to reduce the search space, we filter the dense trajectory cloud extracted from the target video; and (2) to boost the matching speed, we perform matching in an iterative coarse-to-fine manner. Experiments on 5 benchmarks show that our method outperforms the current state of the art under various challenging conditions. We also created a challenging dataset called Heavily Perturbed Video Array (HPVA) to validate the robustness of our framework under heavily perturbed conditions.
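
The iterative coarse-to-fine matching mentioned in the abstract can be illustrated with a minimal sketch. This is not the paper's algorithm: it assumes each descriptor is already available as a list of pyramid levels (coarsest first, each level a flat vector), and the names `coarse_to_fine_match`, `l2`, and `keep_ratio` are hypothetical. It only shows the general idea of pruning candidates on cheap coarse levels before comparing the finer, more expensive ones.

```python
import math


def l2(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def coarse_to_fine_match(template, candidates, keep_ratio=0.5):
    """Return the index of the candidate that survives every pyramid level.

    template:   list of levels, coarsest first; each level is a flat vector.
    candidates: list of pyramids with the same layout as `template`.
    keep_ratio: hypothetical pruning parameter, the fraction kept per level.
    """
    if not candidates:
        raise ValueError("at least one candidate is required")
    alive = list(range(len(candidates)))
    for level, ref in enumerate(template):
        # Rank the surviving candidates by their distance at this level only.
        alive.sort(key=lambda i: l2(ref, candidates[i][level]))
        # Keep the best fraction, but never fewer than one candidate.
        keep = max(1, math.ceil(len(alive) * keep_ratio))
        alive = alive[:keep]
    return alive[0]


# Toy data: three-level pyramids with 1, 2, and 4 entries per level.
template = [[1.0], [1.0, 2.0], [1.0, 2.0, 3.0, 4.0]]
candidates = [
    [[5.0], [5.0, 5.0], [5.0, 5.0, 5.0, 5.0]],  # far at every level
    [[1.1], [1.0, 2.5], [1.0, 2.0, 3.0, 9.0]],  # close at coarse, poor at fine
    [[0.9], [1.0, 2.1], [1.0, 2.0, 3.0, 4.1]],  # close at every level
    [[4.0], [1.0, 2.0], [1.0, 2.0, 3.0, 4.0]],  # exact at fine, far at coarse
]
best = coarse_to_fine_match(template, candidates)
```

In this toy run the last candidate is discarded at the coarsest level even though its finer levels match exactly, which shows the trade-off of such pruning: speed is gained at the risk of dropping a candidate early.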

Year
2014

DOI
10.1109/CVPR.2014.344

Venue
CVPR

Keywords
matching speed, tp-decomp, image representation, rank-tcp descriptor, hpva, discriminative representations, iterative coarse-to-fine manner, image matching, dense trajectory cloud, rank-based tensor core pyramid descriptor, internal dynamics, unsupervised framework, video cuboids, compact representations, feature-independent action spotting, multiple tensor cores, heavily perturbed video array, two-phase decomposition, feature extraction, search space, multilinear tensors, hierarchical tensor pyramid, tensors, iterative methods, template matching problem, boosting, tensile stress, noise, robustness, vectors

Field
Template matching, Computer vision, Pattern recognition, Tensor, Silhouette, Computer science, Segmentation, Robustness (computer science), Artificial intelligence, Boosting (machine learning), Multilinear map, Discriminative model

DocType
Conference

ISSN
1063-6919

Citations
7

PageRank
0.40

References
25

Authors
3

Authors (3 rows)

Name	Order	Citations	PageRank
Chuan Sun	1	77	4.86
Marshall F. Tappen	2	1901	89.34
Hassan Foroosh	3	748	59.98
