Title: N-gram Approximation of Latent Words Language Models for Domain Robust Automatic Speech Recognition
Abstract: This paper aims to improve the domain robustness of language modeling for automatic speech recognition (ASR). To this end, we focus on applying the latent words language model (LWLM) to ASR. LWLMs are generative models whose structure is based on Bayesian soft class-based modeling with a vast latent variable space. Their flexible attributes help us to efficiently realize the effects of smoothing and dimensionality reduction and thus address the data sparseness problem; LWLMs constructed from limited domain data are expected to robustly cover multiple unknown domains in ASR. However, this flexibility greatly increases computational complexity. Rigorously computing the generative probability of an observed word sequence requires considering every possible latent word assignment. Since this is computationally impractical, some approximation is inevitable for ASR implementation. To solve this problem and apply the approach to ASR, this paper presents an n-gram approximation of the LWLM. The n-gram approximation represents the LWLM as a simple back-off n-gram structure, enabling robust LWLM-based one-pass ASR decoding. Our experiments verify the effectiveness of our approach by evaluating perplexity and ASR performance on not only in-domain but also out-of-domain data sets.
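One common way to realize the kind of approximation the abstract describes is to sample word sequences from the generative model and then estimate a back-off n-gram model from the sampled text, replacing the intractable sum over latent assignments with counts over generated data. The following is a minimal sketch of that idea, not the paper's actual LWLM: the latent structure here is a hypothetical two-class Markov chain (`TRANS`, `EMIT` are invented toy parameters), and the back-off scheme is a simple stupid-backoff-style score rather than the discounted back-off used in real ASR toolkits.

```python
import random
from collections import defaultdict

# Hypothetical toy latent-variable LM (NOT the paper's exact LWLM):
# a latent class sequence follows a bigram Markov chain, and each
# observed word is emitted from the current latent class.
TRANS = {  # P(next_class | class)
    "A": {"A": 0.7, "B": 0.3},
    "B": {"A": 0.4, "B": 0.6},
}
EMIT = {  # P(word | class)
    "A": {"the": 0.6, "cat": 0.4},
    "B": {"sat": 0.5, "mat": 0.5},
}

def _draw(dist, rng):
    """Sample one key from a {key: probability} dict."""
    r, acc = rng.random(), 0.0
    for key, p in dist.items():
        acc += p
        if r < acc:
            return key
    return key  # guard against floating-point round-off

def sample_sentence(length, rng):
    """Sample one word sequence from the latent-variable model."""
    cls = rng.choice(list(TRANS))
    words = []
    for _ in range(length):
        words.append(_draw(EMIT[cls], rng))
        cls = _draw(TRANS[cls], rng)
    return words

def ngram_approximation(num_samples, length, seed=0):
    """Estimate unigram/bigram counts from sampled text: the
    intractable marginal over latent assignments is replaced by
    counts over data generated from the model."""
    rng = random.Random(seed)
    uni, bi = defaultdict(int), defaultdict(int)
    for _ in range(num_samples):
        sent = sample_sentence(length, rng)
        for i, w in enumerate(sent):
            uni[w] += 1
            if i > 0:
                bi[(sent[i - 1], w)] += 1
    return uni, bi

def backoff_prob(w_prev, w, uni, bi, alpha=0.4):
    """Stupid-backoff-style score: use the bigram estimate when the
    bigram was observed, otherwise back off to the unigram estimate."""
    if bi.get((w_prev, w), 0) > 0:
        return bi[(w_prev, w)] / uni[w_prev]
    return alpha * uni[w] / sum(uni.values())

uni, bi = ngram_approximation(num_samples=2000, length=10)
```

The resulting count tables define an ordinary n-gram model, which is what makes one-pass decoding straightforward: standard WFST or lattice decoders consume back-off n-grams directly, whereas the original latent-variable model would require marginalization at every decoding step.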
Year: 2016
DOI: 10.1587/transinf.2016SLP0014
Venue: IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS
Keywords: language models, domain robustness, latent words language models, n-gram approximation, automatic speech recognition
Field: Cache language model, Pattern recognition, Computer science, Speech recognition, Natural language processing, Artificial intelligence, n-gram, Language model
DocType: Journal
Volume: E99D
Issue: 10
ISSN: 1745-1361
Citations: 0
PageRank: 0.34
References: 17
Authors: 6

Name                Order  Citations  PageRank
Ryo Masumura        1      252        8.24
Taichi Asami        2      22         10.49
Takanobu Oba        3      53         12.09
Hirokazu Masataki   4      18         9.21
Sumitaka Sakauchi   5      36         8.30
Satoshi Takahashi   6      2          4.09