Title
Actionness-Guided Transformer for Anchor-Free Temporal Action Localization
Abstract
Temporal action localization, the task of detecting actions in untrimmed videos, has been widely studied with anchor-based approaches that first generate an excessive number of action proposals, i.e., temporal windows, and then evaluate and classify these proposals. To reduce the number of action proposals, recent studies use an anchor-free approach that leverages each time point, rather than a temporal window, to represent an action instance. However, this point representation, usually modeled by temporal convolutions, may have a fixed and limited receptive field that cannot cover an entire action. We therefore propose an Actionness-guided Transformer (Ag-Trans) model to learn representations for each point proposal. Ag-Trans first predicts the actionness, i.e., time sequences of the action starting, continuing, and ending phases, and then embeds the corresponding action phase to model the point representation. Experimental results show that the Ag-Trans model outperforms the CNN-based model under the same experimental settings, especially for long-duration actions.
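To make the anchor-free idea concrete, the following is a minimal sketch of how per-point predictions could be turned into action segments. It is not the Ag-Trans model: the function name `decode_point_proposals`, the inputs (distances from each time point to the action start and end, plus a confidence score), and the score threshold are all assumptions chosen for illustration, following the common anchor-free convention of regressing boundaries from each point.

```python
def decode_point_proposals(times, dist_to_start, dist_to_end, scores, threshold=0.5):
    """Convert per-point predictions into (start, end, score) segments.

    Hypothetical helper: each time point t proposes one segment
    [t - dist_to_start, t + dist_to_end]; low-scoring points are dropped.
    """
    segments = []
    for t, ds, de, s in zip(times, dist_to_start, dist_to_end, scores):
        if s < threshold:
            continue  # this point is unlikely to lie inside an action
        segments.append((t - ds, t + de, s))
    return segments


# Three time points (in seconds); the middle one falls below the threshold.
proposals = decode_point_proposals(
    times=[1.0, 2.0, 3.0],
    dist_to_start=[0.5, 1.0, 2.0],
    dist_to_end=[2.0, 1.0, 0.5],
    scores=[0.9, 0.2, 0.7],
)
print(proposals)
```

Under this scheme the number of proposals is bounded by the number of time points, rather than by the number of candidate windows placed at each point.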
Year: 2022
DOI: 10.1109/LSP.2021.3132287
Venue: IEEE Signal Processing Letters
Keywords: Temporal action localization, anchor-free, transformer
DocType: Journal
Volume: 29
ISSN: 1070-9908
Citations: 0
PageRank: 0.34
References: 8
Authors: 4
Name
Order
Citations
PageRank
Peisen Zhao102.03
Ling-Xi Xie242937.79
Ya Zhang3134091.72
Qi Tian46443331.75