3D-TDC: A 3D temporal dilation convolution framework for video action recognition - Citegraph

Paper Info

Title
3D-TDC: A 3D temporal dilation convolution framework for video action recognition

Abstract
Video action recognition is a vital area of computer vision. By adding a temporal dimension to the convolution structure, a 3D convolutional neural network can extract spatio-temporal features from videos. However, due to computing constraints, it is hard to feed a whole video into the network at once, which limits the network's temporal receptive field. To address this issue, we propose a novel 3D temporal dilation convolution (3D-TDC) framework to extract spatio-temporal features of actions from videos. First, we deploy the 3D temporal dilation convolution as a shallow temporal compression layer, which captures spatio-temporal information over a larger time span at a reduced computational load. Then, we construct an action recognition framework by integrating two networks with different temporal receptive fields to balance long-term and short-term temporal differences. We conduct extensive experiments on three widely used public datasets (UCF-101, HMDB-51, and Kinetics-400), and the results demonstrate the effectiveness of the proposed framework for video action recognition at a low computational load.
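
As a rough illustration of why temporal dilation widens the receptive field without adding weights, the sketch below computes how many frames one dilated temporal kernel spans and how many time steps remain after the layer. It uses the standard convolution output-size arithmetic. The function names and every numeric setting (kernel size, dilation rates, stride, clip length) are assumptions chosen for illustration; none of them are taken from the paper.

```python
def effective_kernel_extent(kernel_size: int, dilation: int) -> int:
    """Number of input frames spanned by one dilated temporal kernel.

    The kernel still has `kernel_size` weights along time; dilation only
    spreads them out, so the span grows while the parameter count does not.
    """
    return dilation * (kernel_size - 1) + 1


def output_length(frames: int, kernel_size: int, dilation: int = 1,
                  stride: int = 1, padding: int = 0) -> int:
    """Temporal output length of a convolution applied along the time axis."""
    span = effective_kernel_extent(kernel_size, dilation)
    if frames + 2 * padding < span:
        raise ValueError("clip is shorter than the dilated kernel span")
    return (frames + 2 * padding - span) // stride + 1


if __name__ == "__main__":
    # Hypothetical settings: a 64-frame clip and a 3-tap temporal kernel.
    # The stride is set to the kernel span so the layer compresses time.
    for dilation in (1, 2, 4):
        span = effective_kernel_extent(kernel_size=3, dilation=dilation)
        out = output_length(frames=64, kernel_size=3,
                            dilation=dilation, stride=span)
        print(f"dilation={dilation}: kernel spans {span} frames, "
              f"64 frames -> {out} time steps")
```

With a fixed 3-tap kernel, a larger dilation covers more frames per output step, which is the sense in which a shallow dilated layer can compress a longer clip before the deeper layers see it.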

Year	DOI	Venue
2021	10.1016/j.neucom.2021.03.120	Neurocomputing
Keywords	DocType	Volume
3D convolution,Temporal dilation,Action recognition,Temporal compression	Journal	450
ISSN	Citations	PageRank
0925-2312	0	0.34
References	Authors
0	4

Authors (4 rows)

Name	Order	Citations	PageRank
Yue Ming	1	0	0.34
Fan Feng	2	2	1.52
Chao Li	3	0	0.34
Jing-Hao Xue	4	15	10.05
