Title
MILA: Multi-Task Learning from Videos via Efficient Inter-Frame Attention
Abstract
Prior work in multi-task learning has mainly focused on predictions from a single image. In this work, we present a new approach for multi-task learning from videos via efficient inter-frame local attention (MILA). Our approach contains a novel inter-frame attention module which allows learning of task-specific attention across frames. We embed the attention module in a "slow-fast" architecture, where the slow network runs on sparsely sampled keyframes and the fast shallow network runs on non-keyframes at a high frame rate. We also propose an effective adversarial learning strategy that encourages the slow and fast networks to learn similar features, so that keyframes and non-keyframes are well aligned. Our approach ensures low-latency multi-task learning while maintaining high-quality predictions. MILA obtains competitive accuracy compared to the state of the art on two multi-task learning benchmarks while reducing the number of floating point operations (FLOPs) by up to 70%. In addition, our attention-based feature propagation method (IIA) outperforms prior work in terms of task accuracy while also reducing FLOPs by up to 90%.
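The abstract describes a keyframe/non-keyframe split: a deep slow network processes sparse keyframes, a shallow fast network processes the remaining frames, and an inter-frame attention module fuses cached keyframe features into the fast path before the per-task heads. The sketch below only illustrates that control flow; it is a minimal PyTorch-style illustration under assumptions of mine (global rather than local windowed attention, toy backbones, placeholder task heads, and invented names such as `InterFrameAttention` and `SlowFastMultiTask`), not the authors' MILA implementation.

```python
# Illustrative sketch of a slow-fast multi-task video model with cross-frame
# attention, loosely following the abstract. All module names, sizes, and the
# attention formulation are assumptions, not the paper's implementation.
import torch
import torch.nn as nn


class InterFrameAttention(nn.Module):
    """Attends from non-keyframe (query) features to cached keyframe features."""

    def __init__(self, channels):
        super().__init__()
        self.q = nn.Conv2d(channels, channels, 1)
        self.k = nn.Conv2d(channels, channels, 1)
        self.v = nn.Conv2d(channels, channels, 1)

    def forward(self, fast_feat, key_feat):
        b, c, h, w = fast_feat.shape
        q = self.q(fast_feat).flatten(2).transpose(1, 2)   # (B, HW, C)
        k = self.k(key_feat).flatten(2)                     # (B, C, HW)
        v = self.v(key_feat).flatten(2).transpose(1, 2)     # (B, HW, C)
        attn = torch.softmax(q @ k / c ** 0.5, dim=-1)      # (B, HW, HW)
        out = (attn @ v).transpose(1, 2).reshape(b, c, h, w)
        return fast_feat + out                              # residual fusion


class SlowFastMultiTask(nn.Module):
    def __init__(self, channels=64, num_tasks=2):
        super().__init__()
        self.slow = nn.Sequential(  # deep backbone, keyframes only
            nn.Conv2d(3, channels, 3, padding=1), nn.ReLU(),
            nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(),
        )
        self.fast = nn.Sequential(  # shallow backbone, every non-keyframe
            nn.Conv2d(3, channels, 3, padding=1), nn.ReLU(),
        )
        self.attn = InterFrameAttention(channels)
        self.heads = nn.ModuleList(
            nn.Conv2d(channels, 1, 1) for _ in range(num_tasks)
        )
        self.key_feat = None  # keyframe feature cache; first frame must be a keyframe

    def forward(self, frame, is_keyframe):
        if is_keyframe:
            feat = self.slow(frame)
            self.key_feat = feat.detach()  # cache for subsequent non-keyframes
        else:
            feat = self.attn(self.fast(frame), self.key_feat)
        return [head(feat) for head in self.heads]  # one prediction per task
```

Usage would follow the sampling scheme in the abstract: run the model with `is_keyframe=True` on sparsely sampled frames and `is_keyframe=False` on the rest, so most frames only pay the cost of the shallow fast network plus attention.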
Year
2021
DOI
10.1109/ICCVW54120.2021.00251
Venue
2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION WORKSHOPS (ICCVW 2021)
Keywords
n/a
DocType
Conference
Volume
2021
Issue
1
ISSN
2473-9936
Citations
0
PageRank
0.34
References
0
Authors
8
Name               Order  Citations  PageRank
Donghyun Kim       1      0          1.69
Tian Lan           2      0          0.34
Chuhang Zou        3      0          0.68
Ning Xu            4      0          0.34
Bryan A. Plummer   5      76         8.15
Stan Sclaroff      6      5631       705.89
Jayan Eledath      7      1          1.02
Gérard G. Medioni  8      0          0.68