BLSTM supported GEV beamformer front-end for the 3RD CHiME challenge - Citegraph

Paper Info

Title
BLSTM supported GEV beamformer front-end for the 3RD CHiME challenge

Abstract
We present a new beamformer front-end for Automatic Speech Recognition and apply it to the 3rd CHiME Speech Separation and Recognition Challenge. Without any further modification of the back-end, we achieve a 53% relative reduction of the word error rate over the best baseline enhancement system for the relevant test data set. Our approach leverages the power of a bi-directional Long Short-Term Memory network to robustly estimate soft masks for a subsequent beamforming step. The utilized Generalized Eigenvalue beamforming operation with an optional Blind Analytic Normalization does not rely on a Direction-of-Arrival estimate and can cope with multi-path sound propagation, while at the same time introducing only very limited speech distortions. Our quite simple setup exploits the possibilities provided by simulated training data while still generalizing well to the fairly different real data. Finally, combining our front-end with data augmentation and another language model yields a reduction of nearly 64% in the word error rate on the real data test set.
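
The mask-driven Generalized Eigenvalue (GEV) beamforming step described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names (`masked_psd`, `gev_weights`, `apply_beamformer`), the array layout (frequency, time, channel), the diagonal loading, and the synthetic data are all assumptions made for illustration. The masks here are hand-made "oracle" masks standing in for the BLSTM estimates, and the optional Blind Analytic Normalization is omitted.

```python
import numpy as np


def masked_psd(Y, mask, eps=1e-10):
    """Mask-weighted spatial covariance per frequency.

    Y:    complex STFT, shape (F, T, D)  [assumed layout]
    mask: real values in [0, 1], shape (F, T)
    Returns an array of shape (F, D, D).
    """
    # sum_t mask(f, t) * y(f, t) y(f, t)^H
    psd = np.einsum("ft,ftd,fte->fde", mask, Y, Y.conj())
    return psd / (mask.sum(axis=1)[:, None, None] + eps)


def gev_weights(phi_xx, phi_nn, diag_load=1e-6):
    """Principal generalized eigenvector of (phi_xx, phi_nn) per frequency.

    Solves phi_xx w = lambda phi_nn w by whitening with the Cholesky
    factor of phi_nn. The diagonal loading is an illustrative safeguard.
    """
    D = phi_nn.shape[-1]
    scale = np.trace(phi_nn, axis1=1, axis2=2).real / D
    phi_nn = phi_nn + diag_load * scale[:, None, None] * np.eye(D)

    L = np.linalg.cholesky(phi_nn)               # phi_nn = L L^H
    L_inv = np.linalg.inv(L)
    L_inv_h = L_inv.conj().transpose(0, 2, 1)    # L^{-H}
    C = L_inv @ phi_xx @ L_inv_h                 # whitened speech covariance
    C = 0.5 * (C + C.conj().transpose(0, 2, 1))  # enforce Hermitian symmetry

    _, vecs = np.linalg.eigh(C)                  # eigenvalues ascending
    v = vecs[..., -1]                            # largest eigenvalue, (F, D)
    return np.einsum("fde,fe->fd", L_inv_h, v)   # w = L^{-H} v


def apply_beamformer(w, Y):
    """Beamformer output w^H y for every time-frequency bin, shape (F, T)."""
    return np.einsum("fd,ftd->ft", w.conj(), Y)


# Synthetic demo: speech is active only in the first half of the frames.
rng = np.random.default_rng(0)
F, T, D = 3, 200, 4
steer = rng.standard_normal((F, D)) + 1j * rng.standard_normal((F, D))
speech = rng.standard_normal((F, T)) + 1j * rng.standard_normal((F, T))
speech[:, T // 2:] = 0.0
noise = 0.5 * (rng.standard_normal((F, T, D))
               + 1j * rng.standard_normal((F, T, D)))
Y = speech[:, :, None] * steer[:, None, :] + noise

speech_mask = np.zeros((F, T))
speech_mask[:, : T // 2] = 1.0
noise_mask = 1.0 - speech_mask

phi_xx = masked_psd(Y, speech_mask)
phi_nn = masked_psd(Y, noise_mask)
w = gev_weights(phi_xx, phi_nn)
enhanced = apply_beamformer(w, Y)
```

The sketch uses only the two mask-weighted covariance matrices, so it needs no Direction-of-Arrival estimate, which matches the property the abstract highlights. In a real system the masks would come from the trained network rather than from known speech activity.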

Year	DOI	Venue
2015	10.1109/ASRU.2015.7404829	2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU)
Keywords	Field	DocType
Robust Speech Recognition,Beamforming,Feature Enhancement,Neural Networks	Beamforming,Speech processing,Pattern recognition,Computer science,Voice activity detection,Word error rate,Speech recognition,Time delay neural network,Test data,Artificial intelligence,Test set,Acoustic model	Conference
Citations	PageRank	References
14	0.71	11
Authors
4

Authors (4 rows)

Name	Order	Citations	PageRank
Jahn Heymann	1	102	10.29
Lukas Drude	2	95	11.10
Aleksej Chinaev	3	22	3.05
Reinhold Haeb-Umbach	4	1487	211.71
