Title
On Learning Disentangled Representation for Acoustic Event Detection
Abstract
Polyphonic Acoustic Event Detection (AED) is a challenging task because sounds from different events overlap in the mixture, and features extracted from the mixture do not match features computed from sounds in isolation, leading to suboptimal AED performance. In this paper, we propose a supervised β-VAE model for AED, which adds a novel event-specific disentangling loss to the objective function of disentangled learning. By incorporating either latent factor blocks or latent attention during disentangling, the supervised β-VAE learns a set of discriminative features for each event. Extensive experiments on benchmark datasets show that our approach outperforms the current state of the art (the top-1 performers in the Detection and Classification of Acoustic Scenes and Events (DCASE) 2017 AED challenge). The supervised β-VAE also performs well on challenging AED tasks with a large variety of events and imbalanced data.
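The objective described in the abstract can be illustrated with a minimal sketch: the standard β-VAE loss (reconstruction error plus a β-weighted KL term) extended with an event-specific disentangling penalty over per-event latent blocks. The per-event block partitioning, the squared-activation penalty on blocks of absent events, and all function names below are illustrative assumptions, not the paper's actual formulation.

```python
import numpy as np

def kl_diag_gaussian(mu, logvar):
    """Per-sample KL( N(mu, diag(exp(logvar))) || N(0, I) )."""
    return 0.5 * np.sum(np.exp(logvar) + mu**2 - 1.0 - logvar, axis=-1)

def beta_vae_loss(x, x_recon, mu, logvar, beta=4.0):
    """Standard beta-VAE objective: reconstruction + beta * KL."""
    recon = np.sum((x - x_recon) ** 2, axis=-1)  # Gaussian reconstruction term
    return np.mean(recon + beta * kl_diag_gaussian(mu, logvar))

def event_disentangle_loss(mu, labels, n_events, block_dim):
    """Hypothetical event-specific term: partition the latent mean into one
    block per event class and penalize the activation energy of blocks whose
    event is absent, pushing each block to encode only its own event.

    mu:     (batch, n_events * block_dim) latent means
    labels: (batch, n_events) binary event-presence labels
    """
    blocks = mu.reshape(mu.shape[0], n_events, block_dim)
    energy = np.sum(blocks ** 2, axis=-1)              # (batch, n_events)
    return np.mean(np.sum((1.0 - labels) * energy, axis=-1))
```

A supervised training loss would then combine the two, e.g. `beta_vae_loss(...) + lam * event_disentangle_loss(...)`, with the trade-off weight `lam` tuned on validation data.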
Year
2019
DOI
10.1145/3343031.3351086
Venue
Proceedings of the 27th ACM International Conference on Multimedia
Keywords
acoustic event detection, disentangled latent representation, supervised variational autoencoder
Field
Computer vision, Computer science, Speech recognition, Artificial intelligence, Acoustic event detection, Discriminative model
DocType
Conference
ISBN
978-1-4503-6889-6
Citations
0
PageRank
0.34
References
0
Authors
5
Order  Name                 Citations  PageRank
1      Lijian Gao           0          1.01
2      Qirong Mao           261        34.29
3      Ming Dong            849        49.17
4      Yu Jing              0          0.34
5      Ratna Babu Chinnam   210        18.59