Paper Info

Title
AFT-VO: Asynchronous Fusion Transformers for Multi-View Visual Odometry Estimation.

Abstract
Motion estimation approaches typically employ sensor fusion techniques, such as the Kalman Filter, to handle individual sensor failures. More recently, deep learning-based fusion approaches have been proposed, increasing the performance and requiring less model-specific implementations. However, current deep fusion approaches often assume that sensors are synchronised, which is not always practical, especially for low-cost hardware. To address this limitation, in this work, we propose AFT-VO, a novel transformer-based sensor fusion architecture to estimate VO from multiple sensors. Our framework combines predictions from asynchronous multi-view cameras and accounts for the time discrepancies of measurements coming from different sources. Our approach first employs a Mixture Density Network (MDN) to estimate the probability distributions of the 6-DoF poses for every camera in the system. Then a novel transformer-based fusion module, AFT-VO, is introduced, which combines these asynchronous pose estimations, along with their confidences. More specifically, we introduce Discretiser and Source Encoding techniques which enable the fusion of multi-source asynchronous signals. We evaluate our approach on the popular nuScenes and KITTI datasets. Our experiments demonstrate that multi-view fusion for VO estimation provides robust and accurate trajectories, outperforming the state of the art in both challenging weather and lighting conditions.

Year	DOI	Venue
2022	10.1109/IROS47612.2022.9981835	IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)
DocType	Citations	PageRank
Conference	0	0.34
References	Authors
0	3

Authors (3 rows)

Name	Order	Citations	PageRank
Nimet Kaygusuz	1	0	0.34
Oscar Mendez	2	0	0.68
Richard Bowden	3	1840	118.50
