Title | ||
---|---|---|
3D Human Pose Estimation in Video with Temporal Convolutions and Semi-Supervised Training |
Abstract | ||
---|---|---|
In this work, we demonstrate that 3D poses in video can be effectively estimated with a fully convolutional model based on dilated temporal convolutions over 2D keypoints. We also introduce back-projection, a simple and effective semi-supervised training method that leverages unlabeled video data. We start with predicted 2D keypoints for unlabeled video, then estimate 3D poses, and finally back-project to the input 2D keypoints. In the supervised setting, our fully convolutional model outperforms the previous best result from the literature by 6 mm mean per-joint position error on Human3.6M, corresponding to an error reduction of 11%, and the model also shows significant improvements on HumanEva-I. Moreover, experiments with back-projection show that it comfortably outperforms previous state-of-the-art results in semi-supervised settings where labeled data is scarce. Code and models are available at https://github.com/facebookresearch/VideoPose3D |
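
The two quantities the abstract relies on can be illustrated with a minimal sketch. Mean per-joint position error is the average Euclidean distance between predicted and ground-truth joints. The back-projection check below is a simplification that assumes an ideal pinhole camera with no lens distortion. The function names `mpjpe`, `project_pinhole`, and `back_projection_error`, and the `focal` parameter, are hypothetical and not taken from the VideoPose3D code.

```python
import math


def mpjpe(pred, target):
    """Mean per-joint position error: average Euclidean distance over joints."""
    if len(pred) != len(target):
        raise ValueError("pred and target must contain the same number of joints")
    return sum(math.dist(p, t) for p, t in zip(pred, target)) / len(pred)


def project_pinhole(joints_3d, focal=1.0):
    """Assumed ideal pinhole projection: (x, y, z) -> (f*x/z, f*y/z)."""
    return [(focal * x / z, focal * y / z) for x, y, z in joints_3d]


def back_projection_error(pred_3d, input_2d, focal=1.0):
    """Mean 2D distance between the reprojected 3D estimate and the input keypoints."""
    return mpjpe(project_pinhole(pred_3d, focal), input_2d)


# Two joints: the first matches exactly, the second is off by a 3-4-5 triangle.
pred = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
truth = [(0.0, 0.0, 0.0), (1.0, 3.0, 4.0)]
print(mpjpe(pred, truth))

# A 3D estimate that reprojects exactly onto the 2D keypoint it came from.
print(back_projection_error([(1.0, 2.0, 2.0)], [(0.5, 1.0)]))
```

In a semi-supervised setting, a reprojection error of this kind can serve as a training signal on unlabeled video, because it needs only the input 2D keypoints and no 3D ground truth.
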
Year | DOI | Venue |
---|---|---|
2018 | 10.1109/CVPR.2019.00794 | 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2019) |
Field | DocType | Volume |
Pattern recognition,Convolution,Computer science,Position error,Pose,Artificial intelligence,Supervised training,Labeled data | Journal | abs/1811.11742 |
ISSN | Citations | PageRank |
1063-6919 | 31 | 0.81 |
References | Authors | |
28 | 4 |
Name | Order | Citations | PageRank |
---|---|---|---|
Dario Pavllo | 1 | 46 | 3.14 |
Christoph Feichtenhofer | 2 | 519 | 20.44 |
David Grangier | 3 | 816 | 41.60 |
Michael Auli | 4 | 1061 | 53.54 |