Title
Learning Disentangled Representations of Videos with Missing Data
Abstract
Missing data poses significant challenges while learning representations of video sequences. We present Disentangled Imputed Video autoEncoder (DIVE), a deep generative model that imputes and predicts future video frames in the presence of missing data. Specifically, DIVE introduces a missingness latent variable, disentangles the hidden video representations into static and dynamic appearance, pose, and missingness factors for each object, while it imputes each object trajectory where data is missing. On a moving MNIST dataset with various missing scenarios, DIVE outperforms the state of the art baselines by a substantial margin. We also present comparisons for real-world MOTSChallenge pedestrian dataset, which demonstrates the practical value of our method in a more realistic setting.
Year
Venue
DocType
2020
NIPS 2020
Conference
Volume
Citations 
PageRank 
33
0
0.34
References 
Authors
0
5
Name
Order
Citations
PageRank
Armand Comas Massague100.34
Chi Zhang200.34
Zlatan Feric301.01
Octavia I Camps4493.23
Qi Yu518812.87