Title
Time-Frequency Masking For Large Scale Robust Speech Recognition
Abstract
Time-frequency mask estimation has shown considerable success recently. In this paper, we demonstrate its utility as a feature enhancement frontend for large vocabulary conversational speech recognition. Additionally, we investigate how masking compares with feature denoising, which directly reconstructs clean features from noisy ones. We train a mask estimator that predicts ideal ratio masks. Experimental results on Google voice search evaluation sets demonstrate that masking is superior to feature denoising, and that a lightweight masking frontend produces significant improvements over a strong baseline. We also show that masking improves the performance of a multi-condition trained (MTR) acoustic model.
Year
2015
Venue
16TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2015), VOLS 1-5
Keywords
Robust speech recognition, time-frequency masking, deep neural network, feature denoising
Field
Pattern recognition, Computer science, Speech recognition, Artificial intelligence, Time frequency masking
DocType
Conference
Citations
2
PageRank
0.36
References
6
Authors
3
Name            Order  Citations  PageRank
Yu-Xuan Wang    1      650        32.68
Ananya Misra    2      77         11.46
Kean K. Chin    3      42         3.49