Abstract
Temporal action localization, detecting actions in untrimmed videos, is widely studied by anchor-based approaches that first generate excessive action proposals, *i.e.*, temporal windows, and then evaluate and classify these proposals. To reduce the number of action proposals, recent studies adopt an anchor-free approach that leverages each time point, rather than a temporal window, to represent an action instance. However, this point representation, usually modeled by temporal convolutions, may have a fixed and limited receptive field and thus fail to cover an entire action. We therefore propose an Actionness-guided Transformer (Ag-Trans) model to learn representations for each point proposal. Ag-Trans first predicts the actionness, *i.e.*, time sequences of the action starting, continuing, and ending phases; the corresponding action phase can then be embedded to model the point representation. Experimental results show that the Ag-Trans model outperforms the CNN-based model under the same experimental settings, especially for long-duration actions.
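The abstract outlines the pipeline at a high level: predict per-time-step actionness (starting/continuing/ending phases), embed the predicted phase, and let a transformer give each point a global receptive field. Below is a minimal sketch of that idea, assuming a PyTorch-style setup; all names (`AgTransSketch`, `actionness_head`, `phase_embed`, the soft phase-mixing step, and the hyperparameters) are illustrative assumptions, not the authors' published implementation.

```python
# Hypothetical sketch of an actionness-guided point representation.
# Assumption: the phase embedding is mixed softly by predicted phase
# probabilities; the paper may combine these components differently.
import torch
import torch.nn as nn

class AgTransSketch(nn.Module):
    def __init__(self, feat_dim=256, num_phases=3, num_classes=20):
        super().__init__()
        # Per-time-step actionness logits: starting / continuing / ending.
        self.actionness_head = nn.Linear(feat_dim, num_phases)
        # One learnable embedding vector per action phase.
        self.phase_embed = nn.Embedding(num_phases, feat_dim)
        # A transformer encoder gives every point a global receptive
        # field, unlike fixed-kernel temporal convolutions.
        layer = nn.TransformerEncoderLayer(
            d_model=feat_dim, nhead=8, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        # Point-level classifier over action categories (+1 background).
        self.cls_head = nn.Linear(feat_dim, num_classes + 1)

    def forward(self, feats):  # feats: (B, T, feat_dim) snippet features
        phase_logits = self.actionness_head(feats)          # (B, T, 3)
        phase_probs = phase_logits.softmax(dim=-1)
        # Soft phase embedding: mix phase vectors by predicted probability.
        phase_vec = phase_probs @ self.phase_embed.weight   # (B, T, feat_dim)
        points = self.encoder(feats + phase_vec)            # (B, T, feat_dim)
        return self.cls_head(points), phase_logits

# Usage: 128 snippet features from one untrimmed video.
model = AgTransSketch()
cls_logits, phase_logits = model(torch.randn(1, 128, 256))
print(cls_logits.shape, phase_logits.shape)  # (1, 128, 21) (1, 128, 3)
```

The design choice the abstract motivates is visible in the last two lines of `forward`: each time point attends over the whole sequence, so long-duration actions are not bounded by a convolutional kernel size.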
Year | DOI | Venue
---|---|---
2022 | 10.1109/LSP.2021.3132287 | IEEE Signal Processing Letters

Keywords | DocType | Volume
---|---|---
Temporal action localization, anchor-free, transformer | Journal | 29

ISSN | Citations | PageRank
---|---|---
1070-9908 | 0 | 0.34

References | Authors
---|---
8 | 4
Name | Order | Citations | PageRank |
---|---|---|---
Peisen Zhao | 1 | 0 | 2.03 |
Ling-Xi Xie | 2 | 429 | 37.79 |
Ya Zhang | 3 | 1340 | 91.72 |
Qi Tian | 4 | 6443 | 331.75 |