Audio-Based Multimedia Event Detection with DNNs and Sparse Sampling - Citegraph

Paper Info

Title
Audio-Based Multimedia Event Detection with DNNs and Sparse Sampling

Abstract
This paper presents advances in analyzing audio content information to detect events in videos, such as a parade or a birthday party. We developed a set of tools for audio processing within the predominantly vision-focused deep neural network (DNN) framework Caffe. Using these tools, we show, for the first time, the potential of using only a DNN for audio-based multimedia event detection. Training DNNs for event detection using the entire audio track from each video causes a computational bottleneck. Here, we address this problem by developing a sparse audio frame-sampling method that improves event-detection speed and accuracy. We achieved a 10 percentage-point improvement in event classification accuracy, with a 200x reduction in the number of training input examples as compared to using the entire track. This reduction in input feature volume led to a 16x reduction in the size of the DNN architecture and a 300x reduction in training time. We applied our method using the recently released YLI-MED dataset and compared our results with a state-of-the-art system and with results reported in the literature for TRECVID MED. Our results show much higher MAP scores compared to a baseline i-vector system, at a significantly reduced computational cost. The speed improvement is relevant for processing videos on a large scale, and could enable more effective deployment in mobile systems.
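
The abstract does not say how the sparse frame sampling is done, so the following is only a minimal sketch of one possible reading: keep a small random subset of each track's frames, sized to match the reported 200x reduction in training examples. The function name `sample_frames_sparsely`, the `keep_every` parameter, uniform random selection, and the placeholder feature vectors are all assumptions made for illustration, not the authors' method.

```python
import random


def sample_frames_sparsely(frames, keep_every=200, seed=0):
    """Return a random subset holding roughly 1/keep_every of the frames.

    `frames` is any sequence of per-frame feature vectors (hypothetical input;
    the abstract does not specify the feature type). At least one frame is
    kept for any non-empty track so short tracks still contribute an example.
    """
    if keep_every < 1:
        raise ValueError("keep_every must be >= 1")
    n_keep = max(1, len(frames) // keep_every) if frames else 0
    rng = random.Random(seed)
    # Draw indices without replacement, then sort to preserve temporal order.
    indices = sorted(rng.sample(range(len(frames)), n_keep))
    return [frames[i] for i in indices]


# Toy track: 10,000 frames, each a 3-dimensional placeholder feature vector.
track = [[float(i), 0.0, 0.0] for i in range(10_000)]
subset = sample_frames_sparsely(track, keep_every=200)
print(len(track), "->", len(subset))  # prints "10000 -> 50"
```

With `keep_every=200`, a 10,000-frame track yields 50 training examples, which mirrors the 200x reduction in input volume that the abstract reports; the effect on accuracy claimed there is not something this sketch can reproduce.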

Year	DOI	Venue
2015	10.1145/2671188.2749396	ICMR
Keywords	Field	DocType
Multimedia Event Detection, Audio, Video, Deep Neural Networks, Caffe	Bottleneck,Software deployment,Computer science,Artificial intelligence,Audio signal processing,Artificial neural network,Deep neural networks,Pattern recognition,Caffè,Speech recognition,Sampling (statistics),Multimedia,Machine learning	Conference
Citations	PageRank	References
7	0.58	6
Authors
7

Authors (7 rows)

Name	Order	Citations	PageRank
Khalid Ashraf	1	8	0.98
Benjamin Elizalde	2	359	22.38
Forrest N. Iandola	3	352	17.25
Matthew W. Moskewicz	4	1743	102.93
Julia Bernd	5	19	4.98
Gerald Friedland	6	11	1.31
Kurt Keutzer	7	5040	801.67
