Abstract |
---|
Determining the viewpoint (pose) of rigid objects in monocular 2D images is a classic vision problem with applications to robotic grasping, augmented reality, semantic SLAM, autonomous navigation, and scene understanding in general. Using only 3D CAD models of an object class as input, we demonstrate the ability to accurately predict viewpoint in real-world images even in the presence of clutter and occlusion. We report results on eight datasets, one of which is new, in the hope of providing the community with new viewpoint prediction baselines. We show that deep representations (from convolutional networks) can bridge the large divide between purely synthetic training data and real-world test data to achieve near state-of-the-art results in viewpoint prediction without the need for labeled, real-world training data. Our general approach to viewpoint prediction is applicable to any object class where 3D models are available. |
Year | DOI | Venue |
---|---|---|
2016 | 10.1109/CRV.2016.58 | 2016 13th Conference on Computer and Robot Vision (CRV) |
Keywords | DocType | ISBN |
---|---|---|
computer vision, pattern recognition, neural networks, object viewpoint | Conference | 978-1-5090-2492-6 |
Citations | PageRank | References |
---|---|---|
0 | 0.34 | 16 |
Authors |
---|
3 |
Name | Order | Citations | PageRank |
---|---|---|---|
Andy Hess | 1 | 0 | 0.68 |
Nilanjan Ray | 2 | 541 | 55.39 |
Hong Zhang | 3 | 582 | 74.33 |