Abstract |
---|
User interaction provides useful information for solving challenging computer vision problems in practice. In this paper, we show that a very limited number of user clicks could greatly boost monocular depth estimation performance and overcome monocular ambiguities. We formulate this task as a deep structured model, in which the structured pixel-wise depth estimation has ordinal constraints introduced by user clicks. We show that the inference of the proposed model could be efficiently solved through a feed-forward network. We demonstrate the effectiveness of the proposed model on NYU Depth V2 and Stanford 2D-3D datasets. On both datasets, we achieve state-of-the-art performance when encoding user interaction into our deep models. |

Year | DOI | Venue |
---|---|---|
2018 | 10.1109/3DV.2018.00071 | 2018 International Conference on 3D Vision (3DV) |

Keywords | Field | DocType |
---|---|---|
Monocular depth estimation, deep structured models, ordinal constraints | Task analysis, Ordinal number, Inference, Computer science, Artificial intelligence, Artificial neural network, Monocular, Machine learning, Encoding (memory) | Conference |

ISSN | ISBN | Citations |
---|---|---|
2378-3826 | 978-1-5386-8426-9 | 0 |

PageRank | References | Authors |
---|---|---|
0.34 | 15 | 7 |

Name | Order | Citations | PageRank |
---|---|---|---|
Daniel Ron | 1 | 0 | 0.34 |
Kun Duan | 2 | 26 | 3.89 |
Chongyang Ma | 3 | 257 | 19.21 |
Ning Xu | 4 | 184 | 20.03 |
Shenlong Wang | 5 | 346 | 19.68 |
Sumant Hanumante | 6 | 0 | 0.34 |
Dhritiman Sagar | 7 | 0 | 0.34 |