Abstract |
---|
Prior work in multi-task learning has mainly focused on predictions on a single image. In this work, we present a new approach for multi-task learning from videos via efficient inter-frame local attention (MILA). Our approach contains a novel inter-frame attention module which allows learning of task-specific attention across frames. We embed the attention module in a "slow-fast" architecture, where the slow network runs on sparsely sampled keyframes and the fast shallow network runs on non-keyframes at a high frame rate. We also propose an effective adversarial learning strategy to encourage the slow and fast networks to learn similar features, so that keyframes and non-keyframes are well aligned. Our approach ensures low-latency multi-task learning while maintaining high-quality predictions. MILA obtains competitive accuracy compared to the state-of-the-art on two multi-task learning benchmarks while reducing the number of floating point operations (FLOPs) by up to 70%. In addition, our attention-based feature propagation method (IIA) outperforms prior work in terms of task accuracy while also reducing FLOPs by up to 90%. |
Field | Value
---|---
Year | 2021
DOI | 10.1109/ICCVW54120.2021.00251
Venue | 2021 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW 2021)
Keywords | n/a
DocType | Conference
Volume | 2021
Issue | 1
ISSN | 2473-9936
Citations | 0
PageRank | 0.34
References | 0
Authors | 8
Name | Order | Citations | PageRank
---|---|---|---
Donghyun Kim | 1 | 0 | 1.69 |
Tian Lan | 2 | 0 | 0.34 |
Chuhang Zou | 3 | 0 | 0.68 |
Ning Xu | 4 | 0 | 0.34 |
Bryan A. Plummer | 5 | 76 | 8.15 |
Stan Sclaroff | 6 | 5631 | 705.89 |
Jayan Eledath | 7 | 1 | 1.02 |
Gérard G. Medioni | 8 | 0 | 0.68 |