Title
Cross-Modal Relation-Aware Networks for Audio-Visual Event Localization
Abstract
We address the challenging task of event localization, which requires a machine to localize an event and recognize its category in unconstrained videos. Most existing methods leverage only the visual information of a video while neglecting its audio information, which can nonetheless be highly informative for event localization. For example, humans often recognize an event by reasoning over the visual and audio content jointly. Moreover, the audio information can guide the model to pay more attention to the informative regions of visual scenes, which helps reduce interference from the background. Motivated by these observations, in this paper, we propose a relation-aware network that leverages both audio and visual information for accurate event localization. Specifically, to reduce background interference, we propose an audio-guided spatial-channel attention module that guides the model to focus on event-relevant visual regions. In addition, we build connections between the visual and audio modalities with a relation-aware module. In particular, we learn the representations of video and/or audio segments by aggregating information from the other modality according to the cross-modal relations. Finally, relying on the relation-aware representations, we conduct event localization by predicting an event-relevance score and a classification score. Extensive experimental results demonstrate that our method significantly outperforms state-of-the-art methods in both the supervised and weakly-supervised audio-visual event (AVE) localization settings.
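To make the two components described above more concrete, the following is a minimal, illustrative PyTorch-style sketch of (1) an audio-guided spatial-channel attention over visual feature maps and (2) a cross-modal relation-aware aggregation between audio and visual segment features. All module names, dimensions, and design details (e.g., AudioGuidedAttention, CrossModalRelation, the 512-d visual and 128-d audio features) are assumptions for illustration only, not the authors' implementation.

import torch
import torch.nn as nn


class AudioGuidedAttention(nn.Module):
    """Sketch: re-weight visual features along channel and spatial axes using audio cues."""

    def __init__(self, vis_dim=512, aud_dim=128):
        super().__init__()
        self.channel_fc = nn.Linear(aud_dim, vis_dim)       # audio -> per-channel gates
        self.spatial_fc = nn.Linear(aud_dim + vis_dim, 1)   # audio + visual -> spatial weights

    def forward(self, vis, aud):
        # vis: (B, HW, Dv) flattened visual feature map; aud: (B, Da) audio feature
        b, hw, _ = vis.shape
        chn = torch.sigmoid(self.channel_fc(aud)).unsqueeze(1)          # (B, 1, Dv) channel attention
        vis = vis * chn
        aud_tiled = aud.unsqueeze(1).expand(b, hw, -1)                  # (B, HW, Da)
        spa = torch.softmax(self.spatial_fc(torch.cat([vis, aud_tiled], dim=-1)), dim=1)  # (B, HW, 1)
        return (vis * spa).sum(dim=1)                                   # (B, Dv) attended visual feature


class CrossModalRelation(nn.Module):
    """Sketch: update one modality's segment features by attending over the other modality."""

    def __init__(self, dim=256):
        super().__init__()
        self.q, self.k, self.v = nn.Linear(dim, dim), nn.Linear(dim, dim), nn.Linear(dim, dim)

    def forward(self, x, y):
        # x, y: (B, T, D) segment features of two modalities; returns relation-aware x
        attn = torch.softmax(self.q(x) @ self.k(y).transpose(1, 2) / x.size(-1) ** 0.5, dim=-1)
        return x + attn @ self.v(y)                                     # residual cross-modal aggregation

In such a sketch, the relation-aware segment representations would then feed two small heads, one predicting an event-relevance score per segment and one predicting the event category, matching the two outputs described in the abstract.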
Year: 2020
DOI: 10.1145/3394171.3413581
Venue: MM '20: The 28th ACM International Conference on Multimedia, Seattle, WA, USA, October 2020
DocType: Conference
ISBN: 978-1-4503-7988-5
Citations: 0
PageRank: 0.34
References: 11
Authors: 5
Name          Order  Citations  PageRank
Haoming Xu    1      11         2.65
Runhao Zeng   2      29         3.51
Wu Qingyao    3      259        33.46
Mingkui Tan   4      501        38.31
Chuang Gan    5      253        31.92