Title
Learning to Reconstruct and Understand Indoor Scenes From Sparse Views
Abstract
This paper proposes a new method for simultaneous 3D reconstruction and semantic segmentation of indoor scenes. Unlike existing methods that require recording a video using a color camera and/or a depth camera, our method only needs a small number of (e.g., 35) color images from uncalibrated sparse views, which significantly simplifies data acquisition and broadens the range of applicable scenarios. To achieve promising 3D reconstruction from sparse views with limited overlap, our method first recovers the depth map and semantic information for each view, and then fuses the depth maps into a 3D scene. To this end, we design an iterative deep architecture, named IterNet, that estimates the depth map and semantic segmentation alternately. To obtain accurate alignment between views with limited overlap, we further propose a joint global and local registration method to reconstruct a 3D scene with semantic information. We also make available a new synthetic indoor dataset containing photorealistic high-resolution RGB images, accurate depth maps, and pixel-level semantic labels for thousands of complex layouts. Experimental results on public datasets and our dataset demonstrate that our method achieves more accurate depth estimation, smaller semantic segmentation errors, and better 3D reconstruction results than state-of-the-art methods.
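As a rough illustration of the alternating estimation scheme the abstract describes, the PyTorch-style sketch below shows how a depth prediction and a segmentation prediction might feed into each other across a few iterations. All module names, layer choices, and the iteration count are hypothetical placeholders for exposition; this is not the paper's actual IterNet architecture.

```python
import torch
import torch.nn as nn

class IterNetSketch(nn.Module):
    """Minimal sketch of alternating depth/segmentation estimation:
    each task is re-estimated conditioned on the other task's current
    prediction. Module names and sizes are illustrative assumptions."""

    def __init__(self, num_classes=13, iters=3):
        super().__init__()
        self.iters = iters
        self.num_classes = num_classes
        # Hypothetical sub-networks: each consumes the RGB image plus
        # the other task's current prediction as extra input channels.
        self.depth_net = nn.Conv2d(3 + num_classes, 1, kernel_size=3, padding=1)
        self.seg_net = nn.Conv2d(3 + 1, num_classes, kernel_size=3, padding=1)

    def forward(self, rgb):
        b, _, h, w = rgb.shape
        depth = torch.zeros(b, 1, h, w, device=rgb.device)
        seg = torch.zeros(b, self.num_classes, h, w, device=rgb.device)
        for _ in range(self.iters):
            # Estimate depth conditioned on the current segmentation...
            depth = self.depth_net(torch.cat([rgb, seg.softmax(dim=1)], dim=1))
            # ...then re-estimate segmentation conditioned on that depth.
            seg = self.seg_net(torch.cat([rgb, depth], dim=1))
        return depth, seg
```

In practice each sub-network would be a full encoder-decoder rather than a single convolution; the single layers here only keep the sketch self-contained and runnable.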
Year
2020
DOI
10.1109/TIP.2020.2986712
Venue
IEEE TRANSACTIONS ON IMAGE PROCESSING
Keywords
3D reconstruction, deep learning, semantic segmentation, indoor scenes, sparse views
DocType
Journal
Volume
29
Issue
1
ISSN
1057-7149
Citations
0
PageRank
0.34
References
22
Authors
8
Name | Order | Citations | PageRank
Jingyu Yang | 1 | 274 | 31.04
Ji Xu | 2 | 0 | 0.34
Kun Li | 3 | 9 | 3.33
Yu-Kun Lai | 4 | 1025 | 80.48
Huanjing Yue | 5 | 24 | 6.89
Jianzhi Lu | 6 | 0 | 0.34
Hao Wu | 7 | 0 | 0.34
Yebin Liu | 8 | 688 | 49.05