IMPROVEMENTS ON SPEECH RECOGNITION FOR FAST TALKERS - Citegraph

Paper Info

Title
IMPROVEMENTS ON SPEECH RECOGNITION FOR FAST TALKERS

Abstract
The accuracy of a speech recognition (SR) system depends on many factors, such as the presence of background noise, mismatches in microphone and language models, and variations in speaker, accent, and even speaking rate. In addition to naturally fast speakers, even normal speakers tend to speak faster when using a speech recognition system in order to get higher throughput. Unfortunately, state-of-the-art SR systems perform significantly worse on fast speech. In this paper, we present our efforts to make our system more robust to fast speech. We propose cepstrum length normalization, applied to the incoming testing utterances, which results in a 13% word error rate reduction on an independent evaluation corpus. Moreover, this improvement is additive to the contribution of Maximum Likelihood Linear Regression (MLLR) adaptation. Together with MLLR, a 23% error rate reduction was achieved.
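
The abstract does not say how cepstrum length normalization is carried out. As a rough illustration only, the sketch below assumes it amounts to stretching the sequence of cepstral frames along the time axis, so that a fast utterance covers roughly as many frames as normal-rate speech. The function name `stretch_cepstra`, the `factor` parameter, and the use of linear interpolation are all assumptions made for this example and are not taken from the paper.

```python
from typing import List

Frame = List[float]  # one cepstral vector (e.g. a handful of coefficients)


def stretch_cepstra(frames: List[Frame], factor: float) -> List[Frame]:
    """Resample a cepstral frame sequence in time by `factor`.

    Hypothetical helper: factor > 1 lengthens the utterance (fast speech),
    factor < 1 shortens it. Linear interpolation is an assumed choice.
    """
    if factor <= 0:
        raise ValueError("factor must be positive")
    n = len(frames)
    if n < 2:
        # Nothing to interpolate between; return a copy.
        return [list(f) for f in frames]

    m = max(2, round(n * factor))  # number of output frames
    out: List[Frame] = []
    for j in range(m):
        # Position of output frame j on the original time axis.
        pos = j * (n - 1) / (m - 1)
        lo = int(pos)
        hi = min(lo + 1, n - 1)
        w = pos - lo
        # Blend the two neighbouring input frames coefficient by coefficient.
        out.append([(1 - w) * a + w * b for a, b in zip(frames[lo], frames[hi])])
    return out


if __name__ == "__main__":
    # Toy 3-frame "utterance" with 2 coefficients per frame (made-up values).
    toy = [[0.0, 0.0], [2.0, 4.0], [4.0, 8.0]]
    stretched = stretch_cepstra(toy, 2.0)
    print(len(stretched))
```

In this sketch the first and last frames are preserved and the frames in between are blends of their neighbours. How the stretch factor would be chosen for a given utterance is not stated in the abstract and is left open here.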

Year	Venue	Keywords
1999	EUROSPEECH	speech recognition,word error rate,system performance,error rate,language model
Field	DocType	Citations
Normalization (statistics),Background noise,Pattern recognition,Computer science,Word error rate,Cepstrum,Speech recognition,Maximum likelihood linear regression,Artificial intelligence,Throughput,Microphone,Language model	Conference	12
PageRank	References	Authors
0.73	6	4

Authors (4 rows)

Name	Order	Citations	PageRank
Matthew Richardson	1	4655	411.67
M. Hwang	2	42	10.02
A. Acero	3	4390	478.73
Xuedong Huang	4	1390	283.19
