Title
Imitative Learning for Multi-Person Action Forecasting
Abstract
Multi-person action forecasting is an emerging task and a pivotal step towards video understanding. The major challenge lies in estimating a distribution that characterizes the upcoming actions of all individuals in a scene. State-of-the-art solutions attempt to solve this problem via a step-by-step prediction procedure, yet they fall short on several fronts: compounding errors, the inherent uncertainty of the future, and the spatio-temporal context. To address these challenges, we put forth a novel imitative learning framework built upon inverse reinforcement learning. Specifically, we learn a policy that models the aforementioned distribution up to a given horizon through an objective that naturally mitigates compounding errors. This policy is able to explore multiple plausible futures by extrapolating a series of latent variables and conditioning its predictions on them. The impact of these latent variables is further reinforced by optimizing the directed information. Moreover, we reason about the spatial context together with the temporal cues in a single pass using graph-structured data. Experimental results on two large-scale datasets show that our approach yields considerable improvements in both diversity and quality over recent leading studies.
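The sketch below illustrates, at a purely conceptual level, the kind of latent-conditioned policy rollout the abstract describes: sampled latent variables represent one plausible future, a graph-style message-passing step mixes spatial context across people, and predictions are produced step by step over a horizon. It is not the authors' implementation; all names (LatentConditionedPolicy, rollout), dimensions, and the random "learned" weights are hypothetical placeholders.

```python
# Minimal conceptual sketch of a latent-conditioned forecasting policy
# (hypothetical names and dimensions; NOT the paper's actual method).
import numpy as np

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

class LatentConditionedPolicy:
    """Predicts per-person action distributions step by step.

    state:  (num_people, state_dim)   per-person features at the current step
    latent: (num_people, latent_dim)  sampled latent variables (one plausible future)
    adj:    (num_people, num_people)  scene graph used to mix spatial context
    """
    def __init__(self, state_dim, latent_dim, num_actions):
        # Random weights stand in for learned parameters.
        self.W_msg = rng.normal(scale=0.1, size=(state_dim, state_dim))
        self.W_out = rng.normal(scale=0.1, size=(state_dim + latent_dim, num_actions))
        self.W_next = rng.normal(scale=0.1, size=(num_actions, state_dim))

    def step(self, state, latent, adj):
        # One round of graph message passing: aggregate neighbours' features.
        messages = adj @ state @ self.W_msg        # spatial context
        context = np.tanh(state + messages)        # fuse self + neighbour cues
        logits = np.concatenate([context, latent], axis=-1) @ self.W_out
        return softmax(logits)                     # per-person action distribution

    def rollout(self, state, adj, horizon, latent_dim):
        """Sample one plausible future: draw latents, then predict step by step."""
        num_people, _ = state.shape
        latent = rng.normal(size=(num_people, latent_dim))
        future = []
        for _ in range(horizon):
            probs = self.step(state, latent, adj)
            actions = np.array([rng.choice(probs.shape[1], p=p) for p in probs])
            future.append(actions)
            # Feed the prediction back in as a crude next-state update (placeholder).
            state = np.tanh(state + 0.1 * probs @ self.W_next)
        return np.stack(future)                    # (horizon, num_people)

# Toy usage: 3 people, fully connected scene graph, 5-step horizon.
policy = LatentConditionedPolicy(state_dim=8, latent_dim=4, num_actions=6)
state = rng.normal(size=(3, 8))
adj = np.ones((3, 3)) - np.eye(3)
print(policy.rollout(state, adj, horizon=5, latent_dim=4))
```

Sampling the latent variables again and re-running rollout yields a different but plausible future, which is the intuition behind forecasting diversity in the abstract.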
Year
2021
DOI
10.1145/3474085.3475187
Venue
International Multimedia Conference
DocType
Conference
Citations
0
PageRank
0.34
References
0
Authors
4
Name | Order | Citations | PageRank
Yu-Ke Li | 1 | 0 | 0.34
Wang, Pin | 2 | 6 | 3.06
Mang Ye | 3 | 304 | 25.92
Ching-Yao Chan | 4 | 0 | 0.34