Title
Multi-view 3D object retrieval leveraging the aggregation of view and instance attentive features
Abstract
In multi-view 3D object retrieval tasks, it is pivotal to aggregate the visual features extracted from multiple view images into a discriminative representation of a 3D object. Existing multi-view convolutional neural networks employ view pooling for feature aggregation, which ignores both the local view-relevant discriminative information within each view image and the global correlative information across all view images. To leverage both types of information, we propose two self-attention modules, namely the View Attention Module and the Instance Attention Module, to learn view-attentive and instance-attentive features, respectively. The final representation of a 3D object is the aggregation of three features: the original, view-attentive, and instance-attentive features. Furthermore, we propose employing the ArcFace loss together with a cosine-distance-based triplet-center loss as the metric learning guidance for training our model. Because the cosine distance is used to rank the retrieval results, these angular metric learning losses keep the training and testing objectives consistent, thereby facilitating discriminative feature learning. Extensive experiments and ablation studies on four publicly available 3D object retrieval datasets demonstrate the superiority of the proposed method over multiple state-of-the-art methods.
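The abstract does not specify the exact formulation of the two attention modules or the fusion step, so the following is only a minimal PyTorch sketch of the general idea: per-view features are refined by self-attention within each object (view attention), pooled object descriptors are refined by self-attention across the batch (instance attention), and the original, view-attentive, and instance-attentive features are fused into the final descriptor. The class names, the scaled dot-product attention design, the max-based view pooling, and the sum fusion are all illustrative assumptions, not the paper's actual implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class ViewAttention(nn.Module):
    """Self-attention across the views of a single object (assumed design)."""
    def __init__(self, dim):
        super().__init__()
        self.query = nn.Linear(dim, dim)
        self.key = nn.Linear(dim, dim)
        self.value = nn.Linear(dim, dim)

    def forward(self, x):                       # x: (B, V, D) per-view features
        q, k, v = self.query(x), self.key(x), self.value(x)
        attn = F.softmax(q @ k.transpose(-2, -1) / x.size(-1) ** 0.5, dim=-1)
        return attn @ v                         # (B, V, D) view-attentive features


class InstanceAttention(nn.Module):
    """Self-attention across pooled object descriptors in a batch (assumed design)."""
    def __init__(self, dim):
        super().__init__()
        self.proj = nn.Linear(dim, dim)

    def forward(self, x):                       # x: (B, D) per-object descriptors
        h = self.proj(x)
        attn = F.softmax(h @ h.t() / x.size(-1) ** 0.5, dim=-1)
        return attn @ x                         # (B, D) instance-attentive features


def aggregate(view_feats, view_attn, inst_attn):
    """Fuse original, view-attentive, and instance-attentive features
    into one object descriptor. Max pooling over views and sum fusion
    are assumptions for illustration."""
    original = view_feats.max(dim=1).values                 # (B, D) view-pooled
    view_att = view_attn(view_feats).max(dim=1).values      # (B, D)
    inst_att = inst_attn(original)                          # (B, D)
    return original + view_att + inst_att                   # (B, D) final descriptor
```

In this sketch the fused descriptor would then be L2-normalized and compared by cosine distance at retrieval time, matching the angular metric learning objective described in the abstract.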
Year: 2022
DOI: 10.1016/j.knosys.2022.108754
Venue: Knowledge-Based Systems
Keywords: View-based 3D object retrieval, View attention module, Instance attention module, ArcFace loss, Cosine distance triplet-center loss
DocType: Journal
Volume: 247
ISSN: 0950-7051
Citations: 0
PageRank: 0.34
References: 0
Authors: 7
Name             Order  Citations  PageRank
Dongyun Lin      1      1          3.06
Yiqun Li         2      0          2.37
Yi Cheng         3      0          0.34
Shitala Prasad   4      0          3.04
Tin Lay Nwe      5      1          2.72
Sheng Dong       6      1          2.72
Aiyuan Guo       7      0          0.34