Title
From Actemes to Action: A Strongly-Supervised Representation for Detailed Action Understanding
Abstract
This paper presents a novel approach for analyzing human actions in non-scripted, unconstrained video settings based on volumetric, x-y-t, patch classifiers, termed actemes. Unlike previous action-related work, the discovery of patch classifiers is posed as a strongly-supervised process. Specifically, keypoint labels (e.g., position) across spacetime are used in a data-driven training process to discover patches that are highly clustered in the spacetime keypoint configuration space. To support this process, a new human action dataset consisting of challenging consumer videos is introduced, where notably the action label, the 2D position of a set of keypoints, and their visibilities are provided for each video frame. On a novel input video, each acteme is used in a sliding volume scheme to yield a set of sparse, non-overlapping detections. These detections provide the intermediate substrate for segmenting out the action. For action classification, the proposed representation shows significant improvement over state-of-the-art low-level features, while providing spatiotemporal localization as additional output, which sheds further light on detailed action understanding.
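The sliding volume scheme mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the features, the classifier form, or the suppression rule, so the linear scoring, the greedy non-overlap selection, and every name below (`sliding_volume_detect`, `features`, `weights`, `threshold`, `stride`) are assumptions made purely for illustration.

```python
import numpy as np

def sliding_volume_detect(features, weights, bias=0.0, threshold=0.0, stride=(1, 1, 1)):
    """Score every t-y-x sub-volume with a linear patch classifier (an assumed
    stand-in for an acteme), then greedily keep the best non-overlapping hits."""
    T, H, W = features.shape
    dt, dh, dw = weights.shape
    st, sy, sx = stride
    candidates = []
    for t in range(0, T - dt + 1, st):
        for y in range(0, H - dh + 1, sy):
            for x in range(0, W - dw + 1, sx):
                window = features[t:t + dt, y:y + dh, x:x + dw]
                score = float(np.sum(window * weights) + bias)
                if score > threshold:
                    candidates.append((score, t, y, x))
    # Visit candidates from highest to lowest score.
    candidates.sort(key=lambda c: c[0], reverse=True)
    kept = []
    for score, t, y, x in candidates:
        # Equal-sized volumes overlap only if they are closer than one
        # volume extent along all three axes.
        overlaps = any(
            abs(t - kt) < dt and abs(y - ky) < dh and abs(x - kx) < dw
            for _, kt, ky, kx in kept
        )
        if not overlaps:
            kept.append((score, t, y, x))
    return kept

# Toy input: a 6-frame, 8x8 "video" with one bright 2x2x2 block.
video = np.zeros((6, 8, 8))
video[1:3, 2:4, 2:4] = 1.0
template = np.ones((2, 2, 2))
detections = sliding_volume_detect(video, template, threshold=3.5)
```

In this toy input, several shifted windows exceed the threshold, but all of them overlap the best-scoring volume, so a single detection remains. This mirrors the sparse, non-overlapping output the abstract describes.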
Year
2013
DOI
10.1109/ICCV.2013.280
Venue
ICCV
Keywords
supervised representation,video signal processing,action detection,data-driven training process,image representation,strongly-supervised representation,space time,action-related work,spatiotemporal localization,human action dataset,actemes,detailed action understanding,spacetime key-point configuration space,sliding volume scheme,unconstrained video settings,consumer video,sparse nonoverlapping detections,new human action dataset,action understanding,image classification,patch classifiers discovery,consumer videos,configuration space,action label,patch classifier,gesture recognition,key point,human action,action classification,human actions analyzing
Field
Space time,Computer vision,Market segmentation,Pattern recognition,Computer science,Image representation,Gesture recognition,Artificial intelligence,Contextual image classification,Configuration space
DocType
Conference
Volume
2013
Issue
1
ISSN
1550-5499
Citations
44
PageRank
1.07
References
20
Authors
3
Name, Order, Citations, PageRank
Weiyu Zhang, 1, 87, 12.67
Menglong Zhu, 2, 186, 8.21
Konstantinos G. Derpanis, 3, 431, 22.45