Title
PcmNet: Position-sensitive context modeling network for temporal action localization
Abstract
Temporal action localization aims to locate the temporal regions where actions take place in untrimmed real-world videos and to recognize their classes, and it remains a challenging task. Video context is a critical cue for video understanding, and exploiting it has become an important strategy for improving localization performance. However, previous methods mainly explore semantic context, which captures feature similarity among frames or proposals. Temporal position context, which is also vital for temporal action localization, is less explored. In this paper, we propose a position-sensitive context modeling approach that fuses semantic and position context for more precise action localization. Specifically, we first propose a position encoding method tailored for temporal action localization at both the frame level and the proposal level, which ensures that the generated position representations model the distance and chronological relationships among frames or proposals. We then conduct attention-based context aggregation to produce discriminative features that support precise boundary detection and proposal evaluation. Our method achieves state-of-the-art performance on two widely used datasets, THUMOS-14 and ActivityNet-1.3, demonstrating its effectiveness and generalizability.
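The abstract does not give the position encoding or the aggregation formula, so the following is only a minimal illustrative sketch of the general idea, not the PcmNet formulation. It assumes a hypothetical additive bias on the attention scores that shrinks with temporal distance and differs by sign for earlier versus later positions; the names `position_bias`, `aggregate`, `scale`, and `order_shift` are invented for this example.

```python
import math


def position_bias(offset, scale=1.0, order_shift=0.5):
    """Hypothetical position term (an assumption, not taken from the paper).

    The bias decreases with the temporal distance |offset| and is shifted
    up for later positions and down for earlier ones, so that both distance
    and chronological order influence the attention score.
    """
    distance_term = -abs(offset) / scale
    if offset > 0:
        order_term = order_shift
    elif offset < 0:
        order_term = -order_shift
    else:
        order_term = 0.0
    return distance_term + order_term


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def aggregate(features, scale=1.0, order_shift=0.5):
    """Attention-style aggregation over frame (or proposal) features.

    Each score is a scaled dot product (semantic context) plus the
    hypothetical position bias (position context). Returns the aggregated
    features and the attention weights.
    """
    n = len(features)
    dim = len(features[0])
    outputs, weights = [], []
    for i in range(n):
        scores = []
        for j in range(n):
            semantic = sum(a * b for a, b in zip(features[i], features[j]))
            semantic /= math.sqrt(dim)
            scores.append(semantic + position_bias(j - i, scale, order_shift))
        w = softmax(scores)
        weights.append(w)
        outputs.append(
            [sum(w[j] * features[j][d] for j in range(n)) for d in range(dim)]
        )
    return outputs, weights
```

With identical features at every position, the semantic term is constant, so any difference in the attention weights comes from the position term alone. That makes it easy to inspect how distance and order affect the aggregation in this toy setting.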
Year: 2022
DOI: 10.1016/j.neucom.2022.08.040
Venue: Neurocomputing
Keywords: Temporal action localization, Position-sensitive, Context modeling
DocType: Journal
Volume: 510
ISSN: 0925-2312
Citations: 0
PageRank: 0.34
References: 0
Authors: 6
Name             Order  Citations  PageRank
Xin Qin          1      0          1.01
Hanbin Zhao      2      6          2.86
Guangchen Lin    3      0          0.34
Hao Zeng         4      0          0.68
Songcen Xu       5      92         8.91
Xi Li            6      1850       137.71