Title
Deep Concept-wise Temporal Convolutional Networks for Action Localization
Abstract
Existing action localization approaches adopt shallow temporal convolutional networks (i.e., TCN) on a 1D feature map extracted from video frames. In this paper, we empirically find that stacking more conventional temporal convolution layers actually deteriorates action classification performance, possibly because all channels of the 1D feature map, which are generally highly abstract and can be regarded as latent concepts, are excessively recombined in temporal convolution. To address this issue, we introduce a novel concept-wise temporal convolution (CTC) layer as an alternative to the conventional temporal convolution layer for training deeper action localization networks. Instead of recombining latent concepts, a CTC layer deploys a number of temporal filters to each concept separately, with filter parameters shared across concepts. It can thus capture common temporal patterns of different concepts and significantly enrich representation ability. By stacking CTC layers, we propose a deep concept-wise temporal convolutional network (C-TCN), which boosts the state-of-the-art action localization performance on THUMOS'14 from 42.8 to 52.1 in terms of mAP (%), a relative improvement of 21.7%. Favorable results are also obtained on ActivityNet.
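The filter-sharing scheme described in the abstract can be made concrete. Below is a minimal sketch, assuming a PyTorch setting in which the latent concepts are the channels of a (batch, channels, time) feature map; the class name ConceptWiseTemporalConv, the filter count, and the fold-concepts-into-batch strategy are hypothetical illustrations, not the authors' released implementation.

```python
import torch
import torch.nn as nn


class ConceptWiseTemporalConv(nn.Module):
    """Sketch of a concept-wise temporal convolution (CTC) layer.

    Each channel ("concept") of a 1D feature map is convolved along
    time independently, using the same bank of K temporal filters, so
    concepts are never recombined and parameters are shared across
    concepts. Names and defaults here are illustrative assumptions.
    """

    def __init__(self, num_filters: int = 4, kernel_size: int = 3):
        super().__init__()
        # One shared filter bank: K temporal filters over a single
        # input channel, reused for every concept.
        self.filters = nn.Conv1d(
            in_channels=1,
            out_channels=num_filters,
            kernel_size=kernel_size,
            padding=kernel_size // 2,
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (N, C, T) -- C latent concepts over T temporal steps.
        n, c, t = x.shape
        # Fold the concept axis into the batch so each concept is
        # filtered separately by the shared parameters.
        out = self.filters(x.reshape(n * c, 1, t))  # (N*C, K, T)
        # Restore the concept axis; each concept now carries K
        # temporal responses.
        return out.reshape(n, c, -1, t)             # (N, C, K, T)


if __name__ == "__main__":
    ctc = ConceptWiseTemporalConv(num_filters=4, kernel_size=3)
    feats = torch.randn(2, 256, 100)  # batch of 1D feature maps
    print(ctc(feats).shape)           # torch.Size([2, 256, 4, 100])
```

Folding the concept axis into the batch makes the parameter sharing explicit: the same K filters slide over every concept, so no cross-concept recombination occurs, in contrast to a standard Conv1d mixing all C channels.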
Year
2020
DOI
10.1145/3394171.3413860
Venue
MM '20: The 28th ACM International Conference on Multimedia, Seattle, WA, USA, October 2020
DocType
Conference
ISBN
978-1-4503-7988-5
Citations
1
PageRank
0.35
References
0
Authors
10
Name          Order  Citations  PageRank
Xin Li        1      3          2.74
Tianwei Lin   2      54         6.67
Xiao Liu      3      2844       1.90
Wangmeng Zuo  4      82         8.01
Chao Li       5      3          1.74
Xiang Long    6      301        0.70
He, D.        7      331        3.67
Fu Li         8      3          2.42
Shilei Wen    9      791        3.59
Chuang Gan    10     2533       1.92