Title
Audio concept classification with Hierarchical Deep Neural Networks
Abstract
Audio-based multimedia retrieval tasks may identify semantic information in audio streams, i.e., audio concepts (such as music, laughter, or a revving engine). Conventional Gaussian Mixture Models (GMMs) have had some success in classifying a reduced set of audio concepts. However, multi-class classification can benefit from context window analysis and the discriminating power of deeper architectures. Although deep learning has shown promise in applications such as speech and object recognition, it has not yet met expectations in other fields such as audio concept classification. This paper explores, for the first time, the potential of deep learning for classifying audio concepts in User-Generated Content videos. The proposed system comprises two cascaded neural networks in a hierarchical configuration that analyze short- and long-term context information. Our system outperforms a GMM approach by a relative 54%, a Neural Network by 33%, and a Deep Neural Network by 12% on the TRECVID-MED database.
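The cascade described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not give layer sizes, features, or window lengths, so every number below is a placeholder. The sketch also assumes that the second network consumes a long window of the first network's per-frame posteriors, which is one common way to build such a hierarchy. The weights are random and untrained, so only the data flow is meaningful.

```python
import numpy as np

def stack_context(frames, half_width):
    """Concatenate each frame with its +/- half_width neighbours.

    Edges are padded by repeating the first/last frame.
    """
    n_frames = frames.shape[0]
    padded = np.pad(frames, ((half_width, half_width), (0, 0)), mode="edge")
    windows = [padded[i:i + n_frames] for i in range(2 * half_width + 1)]
    return np.concatenate(windows, axis=1)

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)  # subtract row max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def init_mlp(sizes, rng):
    """Random (untrained) weights for a feed-forward network; illustration only."""
    return [(rng.normal(0.0, 0.1, (m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def mlp_forward(x, weights):
    """ReLU hidden layers followed by a softmax output layer."""
    for w, b in weights[:-1]:
        x = np.maximum(0.0, x @ w + b)
    w, b = weights[-1]
    return softmax(x @ w + b)

def hierarchical_forward(frames, net1, net2, short_ctx, long_ctx):
    # Stage 1: classify each frame from a short window of acoustic features.
    post1 = mlp_forward(stack_context(frames, short_ctx), net1)
    # Stage 2 (assumed design): re-classify from a long window of stage-1 posteriors.
    return mlp_forward(stack_context(post1, long_ctx), net2)

# Hypothetical sizes: 200 frames, 13-dim features, 10 audio concepts.
rng = np.random.default_rng(0)
n_frames, feat_dim, n_classes = 200, 13, 10
short_ctx, long_ctx = 5, 20  # half-widths in frames, chosen arbitrarily

frames = rng.normal(size=(n_frames, feat_dim))
net1 = init_mlp([feat_dim * (2 * short_ctx + 1), 64, n_classes], rng)
net2 = init_mlp([n_classes * (2 * long_ctx + 1), 64, n_classes], rng)

posteriors = hierarchical_forward(frames, net1, net2, short_ctx, long_ctx)
print(posteriors.shape)  # (200, 10)
```

Because the second stage works on low-dimensional posteriors rather than raw features, it can cover a much longer time span than the first stage without a proportional growth in input size.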
Year
2014
Venue
Signal Processing Conference
Keywords
Gaussian processes, audio signal processing, audio streaming, content-based retrieval, learning (artificial intelligence), mixture models, multimedia databases, neural nets, signal classification, GMM, Gaussian mixture models, TRECVID-MED database, audio concept classification, audio streams, audio-based multimedia retrieval, cascaded neural networks, context information, context window analysis, hierarchical deep neural networks, multiclass classification, object recognition, semantic information, speech recognition, user-generated content videos, TRECVID, audio concepts classification, deep neural networks
Field
Speech coding, Audio mining, Computer science, Speech recognition, Time delay neural network, Artificial intelligence, Deep learning, Artificial neural network, Deep neural networks, Machine learning, Cognitive neuroscience of visual object recognition, Acoustic model
DocType
Conference
Volume
abs/1710.04288
ISSN
EUSIPCO 2014
Citations
3
PageRank
0.39
References
13
Authors
4
Name               Order  Citations  PageRank
Mirco Ravanelli    1      185        17.87
Benjamin Elizalde  2      359        22.38
Karl Ni            3      3          0.39
Gerald Friedland   4      1127       96.23