Paper Info

Title
Large-scale discriminative language model reranking for voice-search

Abstract
We present a distributed framework for large-scale discriminative language models that can be integrated within a large vocabulary continuous speech recognition (LVCSR) system using lattice rescoring. We intentionally use a weakened acoustic model in a baseline LVCSR system to generate candidate hypotheses for voice-search data; this allows us to utilize large amounts of unsupervised data to train our models. We propose an efficient and scalable MapReduce framework that uses a perceptron-style distributed training strategy to handle these large amounts of data. We report small but significant improvements in recognition accuracies on a standard voice-search data set using our discriminative reranking model. We also provide an analysis of the various parameters of our models including model size, types of features, size of partitions in the MapReduce framework with the help of supporting experiments.

Year	2012
Venue	WLM@NAACL-HLT
Keywords	standard voice-search data, unsupervised data, large vocabulary continuous speech, scalable mapreduce framework, voice-search data, discriminative reranking model, large-scale discriminative language model, mapreduce framework, model size, large amount
Field	Computer science, Speech recognition, Natural language processing, Artificial intelligence, Discriminative model, Vocabulary, Machine learning, Language model, Voice search, Scalability, Acoustic model
DocType	Conference
Citations	3
PageRank	0.47
References	11
Authors	4

Authors (4 rows)

Name	Order	Citations	PageRank
Preethi Jyothi	1	57	7.85
Leif Johnson	2	37	4.34
Ciprian Chelba	3	1055	111.19
Brian Strope	4	95	10.99