Title
MFAS: Multimodal Fusion Architecture Search
Abstract
We tackle the problem of finding good architectures for multimodal classification problems. We propose a novel and generic search space that spans a large number of possible fusion architectures. To find an optimal architecture for a given dataset in the proposed search space, we leverage an efficient sequential model-based exploration approach that is tailored to the problem. We demonstrate the value of posing multimodal fusion as a neural architecture search problem through extensive experimentation on a toy dataset and two real multimodal datasets. We discover fusion architectures that exhibit state-of-the-art performance on problems with different domains and dataset sizes, including the NTU RGB+D dataset, the largest multimodal action recognition dataset available.
Year
2019
DOI
10.1109/CVPR.2019.00713
Venue
2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2019)
Field
Architecture, Action recognition, Fusion, Artificial intelligence, RGB color model, Search problem, Sequential model, Machine learning, Mathematics
DocType
Journal
Volume
abs/1903.06496
ISSN
1063-6919
Citations
7
PageRank
0.40
References
0
Authors
5
Name | Order | Citations | PageRank
Juan-Manuel Perez-Rua | 1 | 17 | 1.54
Valentin Vielzeuf | 2 | 7 | 0.40
Stéphane Pateux | 3 | 24 | 2.06
Moez Baccouche | 4 | 45 | 2.14
Frédéric Jurie | 5 | 3924 | 235.82