Title
Speech Representation Learning Using Unsupervised Data-Driven Modulation Filtering for Robust ASR
Abstract
The performance of an automatic speech recognition (ASR) system degrades severely in noisy and reverberant environments, in part due to the lack of robustness in the underlying representations used in the ASR system. On the other hand, auditory processing studies have shown the importance of modulation-filtered spectrogram representations in robust human speech recognition. Inspired by this evidence, we propose a speech representation learning paradigm using data-driven 2-D spectro-temporal modulation filter learning. In particular, multiple representations are derived from the input speech spectrogram in an unsupervised manner using the convolutional restricted Boltzmann machine (CRBM) model. A filter selection criterion based on the average number of active hidden units is also employed to select the representations for ASR. The experiments are performed on the Wall Street Journal (WSJ) Aurora-4 database with clean and multi-condition training setups. In these experiments, the ASR results obtained from the proposed modulation filtering approach show significant robustness to noise and channel distortions compared to other feature extraction methods (an average relative improvement of 19% over baseline features in clean training). Furthermore, the ASR experiments performed on reverberant speech data from the REVERB challenge corpus highlight the benefits of the proposed representation learning scheme for far-field speech recognition.
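The filter selection idea mentioned in the abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the abstract does not give the CRBM configuration, so the function names (`hidden_probabilities`, `select_filters`), the 0.5 activity threshold, the choice to keep the filters with the highest average activity, and the random stand-in filters are all assumptions made here for illustration. The hidden probabilities follow the standard CRBM form, a sigmoid of the filtered input plus a bias.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def hidden_probabilities(spectrogram, filters, biases):
    """CRBM-style hidden activation probabilities, shape (K, T', F').

    spectrogram: (T, F) array; filters: (K, kh, kw); biases: (K,).
    """
    kh, kw = filters.shape[1:]
    # All (kh x kw) patches of the spectrogram: shape (T', F', kh, kw).
    patches = sliding_window_view(spectrogram, (kh, kw))
    # Valid 2-D cross-correlation of every filter with every patch.
    pre_activation = np.einsum("tfij,kij->ktf", patches, filters)
    pre_activation = pre_activation + biases[:, None, None]
    return 1.0 / (1.0 + np.exp(-pre_activation))


def select_filters(probabilities, num_keep, threshold=0.5):
    """Rank filters by the average fraction of active hidden units.

    Assumption: a unit counts as active when its probability exceeds
    `threshold`, and the filters with the highest average are kept.
    """
    active = probabilities > threshold
    avg_active = active.reshape(active.shape[0], -1).mean(axis=1)
    order = np.argsort(-avg_active, kind="stable")
    return order[:num_keep], avg_active


# Toy data standing in for a log-mel spectrogram and learned filters.
rng = np.random.default_rng(0)
spectrogram = rng.standard_normal((50, 40))
filters = 0.1 * rng.standard_normal((4, 5, 5))
biases = np.array([-2.0, 0.0, 2.0, 0.5])

probs = hidden_probabilities(spectrogram, filters, biases)
selected, avg_active = select_filters(probs, num_keep=2)
print("average activity per filter:", np.round(avg_active, 3))
print("selected filter indices:", selected)
```

In an actual system the filters and biases would come from CRBM training on speech spectrograms, and the selected filter outputs would be passed on as features to the ASR back end.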
Year
2017
DOI
10.21437/Interspeech.2017-901
Venue
18TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2017), VOLS 1-6: SITUATED INTERACTION
Keywords
unsupervised learning, data-driven modulation filtering, convolutional restricted Boltzmann machine, speech recognition
Field
Data-driven, Pattern recognition, Computer science, Filter (signal processing), Speech recognition, Modulation, Unsupervised learning, Artificial intelligence, Feature learning
DocType
Conference
ISSN
2308-457X
Citations
0
PageRank
0.34
References
13
Authors
2
Name, Order, Citations, PageRank
Purvi Agrawal, 1, 2, 2.38
Sriram Ganapathy, 2, 252, 39.62