Efficient Action Localization with Approximately Normalized Fisher Vectors - Citegraph

Paper Info

Title
Efficient Action Localization with Approximately Normalized Fisher Vectors

Abstract
The Fisher vector (FV) representation is a high-dimensional extension of the popular bag-of-word representation. Transformation of the FV by power and ℓ2 normalizations has been shown to significantly improve its performance, and has led to state-of-the-art results for a range of image and video classification and retrieval tasks. These normalizations, however, render the representation non-additive over local descriptors. Combined with its high dimensionality, this makes the FV computationally expensive for localization tasks. In this paper we first present approximations to both these normalizations, which yield significant improvements in the memory and computational costs of the FV when used for localization. Second, we show how these approximations can be used to define upper bounds on the score function that can be efficiently evaluated, which enables the use of branch-and-bound search as an alternative to exhaustive sliding window search. We present experimental evaluation results on classification and temporal localization of actions in videos. These show that our approximations lead to a speedup of at least one order of magnitude, while maintaining state-of-the-art action recognition and localization performance.
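The non-additivity mentioned in the abstract can be illustrated with a minimal sketch. It assumes the commonly used signed power normalization, sign(x)·|x|^α with α = 0.5, followed by ℓ2 normalization. The vectors `fv_a` and `fv_b` and all function names are hypothetical, and the sketch does not reproduce the paper's approximations, which are not described on this page.

```python
import math

def power_normalize(v, alpha=0.5):
    """Signed power normalization: sign(x) * |x|**alpha per dimension."""
    return [math.copysign(abs(x) ** alpha, x) for x in v]

def l2_normalize(v):
    """Scale v to unit Euclidean norm (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)

def normalize(v, alpha=0.5):
    """Power normalization followed by l2 normalization."""
    return l2_normalize(power_normalize(v, alpha))

def add(u, v):
    """Element-wise sum of two equal-length vectors."""
    return [a + b for a, b in zip(u, v)]

# Hypothetical unnormalized FVs aggregated over two adjacent temporal windows.
fv_a = [4.0, -1.0, 0.0]
fv_b = [0.0, -3.0, 9.0]

# Normalizing the merged window is not the same as summing normalized parts.
whole = normalize(add(fv_a, fv_b))
parts = add(normalize(fv_a), normalize(fv_b))
is_additive = all(math.isclose(x, y) for x, y in zip(whole, parts))
print(is_additive)
```

Because the normalized vector of a merged window cannot be obtained by adding the normalized vectors of its parts, a sliding-window search that applies these normalizations exactly has to renormalize the full high-dimensional vector for every candidate window.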

Year	DOI	Venue
2014	10.1109/CVPR.2014.326	Computer Vision and Pattern Recognition
Keywords	Field	DocType
image classification,image representation,tree searching,vectors,video retrieval,video signal processing,ℓ2 normalizations,FV representation,FV transformation,Fisher vector representation,action recognition,bag-of-word representation,branch-and-bound search,computational costs,high-dimensional extension,image classification tasks,image retrieval tasks,memory costs,temporal action localization performance,video classification tasks,video retrieval tasks,Fisher vectors,action classification,action localization,branch and bound,normalizations,sliding window	Computer vision,Branch and bound,Sliding window protocol,Normalization (statistics),Pattern recognition,Computer science,Feature extraction,Curse of dimensionality,Artificial intelligence,Score,Contextual image classification,Speedup	Conference
ISSN	Citations	PageRank
1063-6919	29	0.91
References	Authors
29	3

Authors (3 rows)

Name	Order	Citations	PageRank
Oneata, D.	1	29	0.91
J. J. Verbeek	2	3944	181.44
Cordelia Schmid	3	28581	1983.22
