Title
Physical Representation Learning and Parameter Identification from Video Using Differentiable Physics
Abstract
Representation learning for video is gaining increasing attention in computer vision. For instance, video prediction models enable activity and scene forecasting as well as vision-based planning and control. In this article, we investigate the combination of differentiable physics and spatial transformers in a deep action-conditional video representation network. Through this combination, our model learns a physically interpretable latent representation and can identify physical parameters. We propose supervised and self-supervised learning methods for our architecture. In experiments, we consider simulated scenarios with pushing, sliding, and colliding objects, for which we also analyze the observability of the physical properties. We demonstrate that our network can learn to encode images and identify physical properties such as mass and friction from videos and action sequences. We evaluate the accuracy of our training methods and demonstrate the ability of our method to predict future video frames from input images and actions.
Year: 2022
DOI: 10.1007/s11263-021-01493-5
Venue: INTERNATIONAL JOURNAL OF COMPUTER VISION
Keywords: Physical scene understanding, Video representation learning, Differentiable physics
DocType: Journal
Volume: 130
Issue: 1
ISSN: 0920-5691
Citations: 0
PageRank: 0.34
References: 0
Authors: 4
Name                      Order  Citations  PageRank
Rama Krishna Kandukuri    1      0          0.34
Jan Achterhold            2      0          0.34
Michael Moller            3      2          3.10
Joerg Stueckler           4      0          0.34