Title
Robust Speech Activity Detection in Movie Audio: Data Resources and Experimental Evaluation
Abstract
Speech activity detection in highly variable acoustic conditions is a challenging task. Many approaches to detecting speech activity in such conditions rely on inherent knowledge of the noise types involved. Movie audio offers an excellent research test-bed for developing speech activity models, and robust speech detection in movie audio is also a crucial step for subsequent content analyses such as audio diarization. Obtaining labels to supervise models on such data can be very expensive and may not scale. In this paper, we employ a simple yet effective approach to obtain speech labels for movie data by coarsely aligning subtitles with the movie audio. We compiled a dataset, called the Subtitle-Aligned Movie corpus (SAM), of nearly 23 hours of data labelled as speech from ninety-five Hollywood movies. We propose convolutional neural network architectures that use log-mel spectrograms as input features to predict speech at the segment level, as opposed to the frame level. We show that our models trained on SAM outperform existing baselines on two independent, publicly released movie speech datasets. We have made the SAM corpus and pretrained models publicly available for further research.
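As a rough illustration of the pipeline the abstract describes, the sketch below shows (a) how subtitle display times could serve as coarse speech labels and (b) a minimal CNN that scores fixed-length log-mel segments as speech or non-speech. This is not the authors' released model: the sampling rate, segment length, mel-band count, network depth, and the file names are assumptions made for the example.

# Hypothetical sketch of subtitle-based labelling and segment-level speech scoring.
# All parameters (16 kHz audio, 0.64 s segments, 64 mel bands) are assumptions.
import re
import numpy as np
import librosa
import torch
import torch.nn as nn

SR = 16000            # assumed sampling rate
SEG_SAMPLES = 10240   # 0.64 s segments (assumption)
N_MELS = 64

TIME = re.compile(r"(\d+):(\d+):(\d+),(\d+) --> (\d+):(\d+):(\d+),(\d+)")

def srt_intervals(path):
    """Read subtitle display times as coarse (start, end) speech intervals in seconds."""
    intervals = []
    with open(path, encoding="utf-8", errors="ignore") as f:
        for line in f:
            m = TIME.search(line)
            if m:
                h1, m1, s1, ms1, h2, m2, s2, ms2 = map(int, m.groups())
                intervals.append((3600 * h1 + 60 * m1 + s1 + ms1 / 1000,
                                  3600 * h2 + 60 * m2 + s2 + ms2 / 1000))
    return intervals

def logmel(segment):
    """Compute a log-mel spectrogram patch for one fixed-length audio segment."""
    mel = librosa.feature.melspectrogram(
        y=segment, sr=SR, n_fft=512, hop_length=160, n_mels=N_MELS)
    return torch.from_numpy(librosa.power_to_db(mel, ref=np.max)).float().unsqueeze(0)

class SegmentSAD(nn.Module):
    """Small CNN mapping a log-mel patch to a single speech/non-speech score."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.AdaptiveAvgPool2d(1))
        self.classifier = nn.Linear(32, 1)

    def forward(self, x):
        return torch.sigmoid(self.classifier(self.features(x).flatten(1)))

# Usage (file names are placeholders): label segments from the subtitles,
# then score each 0.64 s window of the movie audio with the CNN.
labels = srt_intervals("movie.srt")
audio, _ = librosa.load("movie.wav", sr=SR)
model = SegmentSAD()
segments = [audio[i:i + SEG_SAMPLES]
            for i in range(0, len(audio) - SEG_SAMPLES, SEG_SAMPLES)]
with torch.no_grad():
    scores = [model(logmel(s).unsqueeze(0)).item() for s in segments]

In the paper's setting, segment-level prediction means each fixed-length window receives one speech/non-speech decision rather than one per frame; the windowing loop above is one simple way to realize that.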
Year
2019
DOI
10.1109/icassp.2019.8682532
Venue
2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Keywords
Speech activity detection, movie audio, convolutional neural networks
Field
Data modeling, Pattern recognition, Task analysis, Convolutional neural network, Computer science, Voice activity detection, Spectrogram, Feature extraction, Speech recognition, Speaker diarisation, Artificial intelligence, Scalability
DocType
Conference
ISSN
1520-6149
Citations
0
PageRank
0.34
References
0
Authors
3
Name                 Order  Citations  PageRank
Rajat Hebbar         1      0          0.68
Krishna S.           2      9          8.31
Narayanan Shrikanth  3      5558       439.23