Abstract | ||
---|---|---|
This paper proposes a universal framework, called OVE6D, for model-based 6D object pose estimation from a single depth image and a target object mask. Our model is trained using purely synthetic data rendered from ShapeNet, and, unlike most existing methods, it generalizes well to new real-world objects without any fine-tuning. We achieve this by decomposing the 6D pose into viewpoint, in-plane rotation around the camera optical axis, and translation, and by introducing novel lightweight modules for estimating each component in a cascaded manner. The resulting network contains fewer than 4M parameters while demonstrating excellent performance on the challenging T-LESS and Occluded LINEMOD datasets without any dataset-specific training. We show that OVE6D outperforms some contemporary deep learning-based pose estimation methods specifically trained for individual objects or datasets with real-world training data. The implementation is available at https://github.com/dingdingcai/OVE6D-pose. |
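
The pose decomposition named in the abstract can be illustrated geometrically. The following is a minimal sketch, not the OVE6D implementation: it assumes the convention R = R_z(θ) · R_view, where R_view is a canonical rotation determined by a viewpoint (azimuth, elevation) on the unit sphere and R_z(θ) is the in-plane rotation around the camera optical axis. All function names are hypothetical and do not come from the OVE6D repository, and the learned estimation modules are not reproduced here.

```python
import numpy as np


def rot_z(theta):
    """Rotation by theta (radians) about the camera optical (z) axis."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])


def viewpoint_rotation(azimuth, elevation):
    """Canonical out-of-plane rotation for a viewpoint (assumed convention).

    The third row is the optical axis expressed in the object frame.
    """
    z = np.array([np.cos(elevation) * np.cos(azimuth),
                  np.cos(elevation) * np.sin(azimuth),
                  np.sin(elevation)])
    x = np.cross([0.0, 0.0, 1.0], z)
    if np.linalg.norm(x) < 1e-8:
        # Viewing along the object's up axis: pick an arbitrary x axis.
        x = np.array([1.0, 0.0, 0.0])
    x = x / np.linalg.norm(x)
    y = np.cross(z, x)  # completes a right-handed frame
    return np.stack([x, y, z])


def compose_pose(azimuth, elevation, theta, translation):
    """Build a 4x4 pose from viewpoint, in-plane rotation, and translation."""
    T = np.eye(4)
    T[:3, :3] = rot_z(theta) @ viewpoint_rotation(azimuth, elevation)
    T[:3, 3] = translation
    return T


def decompose_pose(T):
    """Split a 4x4 pose back into (azimuth, elevation, theta, translation)."""
    R, t = T[:3, :3], T[:3, 3]
    z = R[2]  # third row is unaffected by the in-plane rotation
    azimuth = np.arctan2(z[1], z[0])
    elevation = np.arcsin(np.clip(z[2], -1.0, 1.0))
    # Remove the viewpoint part; what remains is a rotation about z.
    R_in = R @ viewpoint_rotation(azimuth, elevation).T
    theta = np.arctan2(R_in[1, 0], R_in[0, 0])
    return azimuth, elevation, theta, t.copy()
```

Under this assumed convention, `decompose_pose(compose_pose(az, el, theta, t))` recovers the inputs for elevations away from the poles, where the azimuth is not uniquely defined.
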
Year | DOI | Venue |
---|---|---|
2022 | 10.1109/CVPR52688.2022.00668 | IEEE Conference on Computer Vision and Pattern Recognition |

Keywords | DocType | Volume |
---|---|---|
Pose estimation and tracking; Recognition: detection, categorization, retrieval; RGBD sensors and analytics | Conference | 2022 |

Issue | Citations | PageRank |
---|---|---|
1 | 0 | 0.34 |

References | Authors |
---|---|
0 | 3 |

Name | Order | Citations | PageRank |
---|---|---|---|
Dingding Cai | 1 | 0 | 0.34 |
Janne Heikkilä | 2 | 2163 | 160.55 |
Esa Rahtu | 3 | 832 | 52.76 |