Title
ZSTAD: Zero-Shot Temporal Activity Detection
Abstract
An integral part of video analysis and surveillance is temporal activity detection, that is, simultaneously recognizing and localizing activities in long untrimmed videos. Currently, the most effective methods for temporal activity detection are based on deep learning, and they typically perform very well when large-scale annotated videos are available for training. However, these methods are limited in real applications because videos of certain activity classes are unavailable and data annotation is time-consuming. To address this problem, we propose a novel task setting called zero-shot temporal activity detection (ZSTAD), in which activities that have never been seen in training can still be detected. We design an end-to-end deep network based on R-C3D as the architecture for this solution. The proposed network is optimized with an innovative loss function that considers the embeddings of activity labels and their super-classes while learning the common semantics of seen and unseen activities. Experiments on both the THUMOS’14 and the Charades datasets show promising performance in detecting unseen activities.
Year: 2020
DOI: 10.1109/CVPR42600.2020.00096
Venue: 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
Keywords: unavailable videos, zero-shot temporal activity detection, video analysis, video surveillance, long untrimmed videos, deep learning, large scale annotated videos, time-consuming data annotation, unseen activities detection
DocType: Conference
ISSN: 1063-6919
ISBN: 978-1-7281-7169-2
Citations: 1
PageRank: 0.35
References: 36
Authors: 7
Name                      Order  Citations  PageRank
Lingling Zhang            1      276        45.79
Xiaojun Chang             2      1585       76.85
Jun Liu                   3      178        25.96
Minnan Luo                4      269        21.18
Sen Wang                  5      477        37.24
Zongyuan Ge               6      149        27.83
Alexander G. Hauptmann    7      7472       558.23