DCAR: A Discriminative and Compact Audio Representation for Audio Processing. - Citegraph

Paper Info

Title
DCAR: A Discriminative and Compact Audio Representation for Audio Processing.

Abstract
This paper presents a novel two-phase method for audio representation, discriminative and compact audio representation (DCAR), and evaluates its performance at detecting events and scenes in consumer-produced videos. In the first phase of DCAR, each audio track is modeled using a Gaussian mixture model (GMM) that includes several components to capture the variability within that track. The second phase takes into account both global structure and local structure. In this phase, the components are rendered more discriminative and compact by formulating an optimization problem on a Grassmannian manifold. The learned components can effectively represent the structure of audio. Our experiments used the YLI-MED and DCASE Acoustic Scenes datasets. The results show that variants on the proposed DCAR representation consistently outperform four popular audio representations (mv-vector, i-vector, GMM, and HEM-GMM). The advantage is significant for both easier and harder discrimination tasks; we discuss how these performance differences across tasks follow from how each type of model leverages (or does not leverage) the intrinsic structure of the data.
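The first phase described in the abstract (one GMM per audio track) can be illustrated with a minimal sketch. This is not the authors' implementation: the diagonal covariances, the plain EM loop, the component count, and the synthetic frames standing in for MFCC features are all assumptions made for illustration, and `fit_diag_gmm` is a hypothetical helper name. The second phase (the optimization on a Grassmannian manifold) is not shown.

```python
import math
import random


def fit_diag_gmm(frames, n_components=2, n_iter=25, seed=0, var_floor=1e-3):
    """Fit a diagonal-covariance GMM to one track's frames with plain EM.

    frames: list of equal-length feature vectors (assumed MFCC-like frames).
    Returns (weights, means, variances), one entry per component.
    """
    rng = random.Random(seed)
    dim = len(frames[0])
    # Initialise means from randomly chosen frames, with unit variances.
    means = [list(f) for f in rng.sample(frames, n_components)]
    variances = [[1.0] * dim for _ in range(n_components)]
    weights = [1.0 / n_components] * n_components

    for _ in range(n_iter):
        # E-step: responsibility of each component for each frame.
        resp = []
        for x in frames:
            log_p = []
            for w, mu, var in zip(weights, means, variances):
                ll = math.log(w)
                for xd, md, vd in zip(x, mu, var):
                    ll -= 0.5 * (math.log(2 * math.pi * vd) + (xd - md) ** 2 / vd)
                log_p.append(ll)
            top = max(log_p)  # subtract the max for numerical stability
            p = [math.exp(v - top) for v in log_p]
            total = sum(p)
            resp.append([v / total for v in p])

        # M-step: re-estimate weights, means, and variances.
        for k in range(n_components):
            nk = max(sum(r[k] for r in resp), 1e-12)  # guard against empty components
            weights[k] = nk / len(frames)
            for d in range(dim):
                mu = sum(r[k] * x[d] for r, x in zip(resp, frames)) / nk
                var = sum(r[k] * (x[d] - mu) ** 2 for r, x in zip(resp, frames)) / nk
                means[k][d] = mu
                variances[k][d] = max(var, var_floor)
    return weights, means, variances


# Synthetic "track": 200 three-dimensional frames drawn around two centres.
data_rng = random.Random(1)
track = [[data_rng.gauss(c, 0.5) for _ in range(3)]
         for c in (0.0, 5.0) for _ in range(100)]
weights, means, variances = fit_diag_gmm(track, n_components=2)
```

Each track would then be summarised by its own `(weights, means, variances)` triple. How those components are made more discriminative and compact in the second phase is described in the paper itself, not on this page.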

Year	2017
DOI	10.1109/TMM.2017.2703939
Venue	IEEE Trans. Multimedia
Keywords	Event detection, Feature extraction, Mel frequency cepstral coefficient, Content-based retrieval, Covariance matrices, Gaussian mixture model
Field	Mel-frequency cepstrum, Pattern recognition, Computer science, Speech recognition, Feature extraction, Artificial intelligence, Grassmannian, Audio signal processing, Optimization problem, Discriminative model, Manifold, Mixture model
DocType	Journal
Volume	19
Issue	12
ISSN	1520-9210
Citations	1
PageRank	0.35
References	18
Authors	7

Authors (7 rows)

Name	Order	Citations	PageRank
Liping Jing	1	550	47.13
Bo Liu	2	521	84.67
Jae-Young Choi	3	783	110.19
Adam Janin	4	250	34.11
Julia Bernd	5	19	4.98
Michael W. Mahoney	6	3297	218.10
Gerald Friedland	7	1127	96.23
