Title
Mixture of Mixture N-Gram Language Models
Abstract
This paper presents a language model adaptation technique for building a single static language model from a set of language models, each trained on a separate text corpus, so as to maximize the likelihood of an adaptation data set given as a development set of sentences. The proposed model can be viewed as a mixture of mixture language models. The mixture model at the top level is a sentence-level mixture model in which each sentence is assumed to be drawn from one of a discrete set of topic or task clusters. After a cluster is selected, each n-gram is assumed to be drawn from one of the given n-gram language models. We estimate the cluster mixture weights and the per-cluster n-gram language model mixture weights with the expectation-maximization (EM) algorithm, seeking the parameter estimates that maximize the likelihood of the development sentences. The resulting mixture of mixture models can be represented efficiently as a static n-gram language model using the previously proposed Bayesian language model interpolation technique. We show significant improvements with this technique, in both perplexity and word error rate (WER), over the standard one-level interpolation scheme.
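As a rough formalization of the model described in the abstract (a sketch in assumed notation, not necessarily the paper's own symbols: \(\lambda_c\) for the cluster weights, \(\mu_{c,m}\) for the component language model weights within cluster \(c\), \(P_m\) for the \(m\)-th given n-gram model, and \(h_i\) for the n-gram history of word \(w_i\)), the sentence likelihood and the EM objective can be written as:
\[
P(s) = \sum_{c=1}^{C} \lambda_c \prod_{i=1}^{|s|} \sum_{m=1}^{M} \mu_{c,m}\, P_m(w_i \mid h_i),
\qquad \sum_{c=1}^{C} \lambda_c = 1, \qquad \sum_{m=1}^{M} \mu_{c,m} = 1,
\]
\[
(\hat{\lambda}, \hat{\mu}) = \arg\max_{\lambda,\,\mu} \; \sum_{s \in \mathcal{D}} \log P(s),
\]
where \(\mathcal{D}\) denotes the development set of adaptation sentences.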
Year
2013
DOI
10.1109/ASRU.2013.6707701
Venue
2013 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU)
Keywords
language model, adaptation, interpolation, mixture models, Bayesian, speech recognition
Field
Perplexity, Pattern recognition, Computer science, Interpolation, Text corpus, Speech recognition, n-gram, Artificial intelligence, Sentence, Mixture model, Language model, Bayesian probability
DocType
Conference
Citations
0
PageRank
0.34
References
6
Authors
4
Name                Order  Citations  PageRank
Hasim Sak           1      690        39.56
Cyril Allauzen      2      690        47.64
Kaisuke Nakajima    3      8          1.34
Françoise Beaufays  4      341        27.76