Title
Dimension Embeddings for Monocular 3D Object Detection
Abstract
Most existing deep learning-based approaches for monocular 3D object detection directly regress the dimensions of objects and overlook their importance in solving the ill-posed problem. In this paper, we propose a general method to learn appropriate embeddings for dimension estimation in monocular 3D object detection. Specifically, we consider two intuitive clues in learning the dimension-aware embeddings with deep neural networks. First, we constrain the pair-wise distance on the embedding space to reflect the similarity of corresponding dimensions so that the model can take advantage of inter-object information to learn more discriminative embeddings for dimension estimation. Second, we propose to learn representative shape templates on the dimension-aware embedding space. Through the attention mechanism, each object can interact with the learnable templates and obtain the attentive dimensions as the initial estimation, which is further refined by the combined features from both the object and the attentive templates. Experimental results on the well-established KITTI dataset demonstrate that the proposed dimension embeddings bring consistent improvements with negligible computational overhead. We achieve new state-of-the-art performance on the KITTI 3D object detection benchmark.
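The two clues named in the abstract can be illustrated with a small NumPy sketch. It is not the authors' implementation: the squared-gap loss, the scaled dot-product attention, and every name in it (`distance_consistency_loss`, `attentive_dimensions`, `template_keys`, `template_dims`) are assumptions chosen for illustration, and the refinement step mentioned in the abstract is left out.

```python
import numpy as np


def pairwise_dists(x):
    """Euclidean distance between every pair of rows of x (shape [n, d])."""
    diff = x[:, None, :] - x[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def distance_consistency_loss(embeddings, dims):
    """Mean squared gap between embedding distances and dimension distances.

    Assumption: an illustrative stand-in for the pair-wise constraint,
    not the loss defined in the paper. Needs at least two objects.
    """
    gap = pairwise_dists(embeddings) - pairwise_dists(dims)
    off_diagonal = ~np.eye(len(embeddings), dtype=bool)
    return float((gap[off_diagonal] ** 2).mean())


def attentive_dimensions(embedding, template_keys, template_dims):
    """Attend over templates and return (weights, weighted-average dimensions).

    Assumption: plain scaled dot-product attention with one key and one
    (height, width, length) triple per template.
    """
    scores = template_keys @ embedding / np.sqrt(embedding.shape[0])
    scores = scores - scores.max()  # subtract the max for numerical stability
    weights = np.exp(scores) / np.exp(scores).sum()
    return weights, weights @ template_dims


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    dims = np.array([[1.5, 1.6, 3.9], [1.5, 1.7, 4.2], [3.0, 2.5, 10.0]])
    embeddings = rng.normal(size=(3, 8))
    print("loss:", distance_consistency_loss(embeddings, dims))

    template_keys = rng.normal(size=(4, 8))
    template_dims = rng.uniform(1.0, 5.0, size=(4, 3))
    weights, initial = attentive_dimensions(embeddings[0], template_keys, template_dims)
    print("attention weights:", weights)
    print("initial dimension estimate:", initial)
```

Because the attention weights are non-negative and sum to one, the initial estimate in this sketch is a convex combination of the template dimensions, so it always lies within the range the templates span.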
Year
2022
DOI
10.1109/CVPR52688.2022.00164
Venue
IEEE Conference on Computer Vision and Pattern Recognition
Keywords
3D from single images; Recognition: detection, categorization, retrieval
DocType
Conference
Volume
2022
Issue
1
Citations
0
PageRank
0.34
References
0
Authors
7
Name            Order  Citations  PageRank
Yunpeng Zhang   1      0          0.34
Wenzhao Zheng   2      15         2.91
Zheng Zhu       3      1          1.71
Guan Huang      4      1          1.37
Dalong Du       5      459        15.78
Jie Zhou        6      2103       190.17
Jiwen Lu        7      3105       153.88