Abstract |
---|
We present MVLayoutNet, a network for holistic 3D reconstruction from multi-view panoramas. Our core contribution is to seamlessly combine learned monocular layout estimation and multi-view stereo (MVS) for accurate layout reconstruction in both 3D and image space. We jointly train a layout module to produce an initial layout and a novel MVS module to obtain accurate layout geometry. Unlike the standard MVSNet, our MVS module takes as input a newly proposed layout cost volume, which aggregates multi-view costs at the same depth layer into corresponding layout elements. We additionally provide an attention-based scheme that guides the MVS module to focus on structural regions. This design considers both local pixel-level costs and global holistic information for better reconstruction. Experiments show that our method outperforms state-of-the-art methods in depth RMSE by 21.7% and 41.2% on the 2D-3D-S [1] and ZInD [4] datasets, respectively. For complex scenes with multiple rooms, our method can be applied to each layout element of a precomputed topology to accurately reconstruct a globally coherent layout geometry. |
Year | DOI | Venue |
---|---|---|
2022 | 10.1145/3503161.3548071 | International Multimedia Conference |
DocType | Citations | PageRank |
---|---|---|
Conference | 0 | 0.34 |
References | Authors |
---|---|
0 | 5 |
Name | Order | Citations | PageRank |
---|---|---|---|
Zhihua Hu | 1 | 0 | 0.34 |
Bo Duan | 2 | 0 | 0.34 |
Yanfeng Zhang | 3 | 170 | 15.56 |
Mingwei Sun | 4 | 0 | 0.34 |
Jingwei Huang | 5 | 0 | 0.34 |