Title
An environment structuring framework to facilitating suitable prior density estimation for MAPLR on robust speech recognition
Abstract
In this paper, we propose an environment structuring framework that facilitates suitable prior density estimation for maximum a posteriori linear regression (MAPLR) under adverse testing conditions. The framework is a two-stage hierarchical tree structure built by two algorithms: environment clustering and environment partitioning. The resulting structure characterizes detailed regional information about a variety of speakers and speaking environments. We incorporate this information into the prior density calculation for MAPLR and design three types of prior density: clustered prior, hierarchical prior, and integrated prior. We conduct experiments on the Aurora-2 task. The results first show that MAPLR with any of the three prior densities improves over both the baseline and maximum likelihood linear regression (MLLR). Moreover, the integrated prior density, which combines the advantages of the other two, gives MAPLR its best performance. With the best integrated prior density, MAPLR achieves a 10.72% word error rate reduction over the baseline result.
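The role a prior density plays in MAPLR can be illustrated with a small sketch. The code below is not the authors' method. It assumes a one-dimensional affine transform y ≈ a*x + b, unit observation variance, and an isotropic Gaussian prior centered on a prior mean with strength `tau`; the names `map_affine_1d`, `prior_mean_from_cluster`, and `tau` are hypothetical. Under these assumptions, `tau = 0` reduces to the maximum likelihood solution (the MLLR analogue), while `tau > 0` pulls the estimate toward the prior mean, which here is simply the average of transforms from environments assumed to share a cluster.

```python
def prior_mean_from_cluster(transforms):
    """Average the (b, a) transforms of environments in one cluster.

    Hypothetical stand-in for a 'clustered prior': the paper's actual
    prior density estimation is not specified in this abstract.
    """
    n = len(transforms)
    b0 = sum(t[0] for t in transforms) / n
    a0 = sum(t[1] for t in transforms) / n
    return b0, a0


def map_affine_1d(xs, ys, prior_mean, tau):
    """MAP estimate of (b, a) in y ~ a*x + b.

    Minimizes  sum_i (y_i - a*x_i - b)^2 + tau * ((b - b0)^2 + (a - a0)^2),
    i.e. a Gaussian likelihood with unit variance and an isotropic
    Gaussian prior centered at prior_mean = (b0, a0). These are
    simplifying assumptions made for illustration only.
    """
    b0, a0 = prior_mean
    n = len(xs)
    sx = sum(xs)
    sxx = sum(x * x for x in xs)
    sy = sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))

    # Regularized normal equations: [[g11, g12], [g12, g22]] [b, a]^T = [r1, r2]^T
    g11, g12, g22 = n + tau, sx, sxx + tau
    r1, r2 = sy + tau * b0, sxy + tau * a0

    det = g11 * g22 - g12 * g12  # zero when tau == 0 and data are too sparse
    b = (g22 * r1 - g12 * r2) / det
    a = (g11 * r2 - g12 * r1) / det
    return b, a


if __name__ == "__main__":
    # Prior mean from two previously seen environments (made-up values).
    prior = prior_mean_from_cluster([(0.5, 1.0), (-0.5, 1.0)])

    # Plenty of data, no prior: recovers the ML fit of y = 2x + 1.
    print(map_affine_1d([0, 1, 2, 3], [1, 3, 5, 7], prior, tau=0.0))

    # A single adaptation sample: ML is ill-posed, MAP stays well defined.
    print(map_affine_1d([1], [3], prior, tau=1.0))
```

With only one adaptation sample the unregularized normal equations are singular, so the prior is what makes the estimate usable. This is the general motivation for MAP-style adaptation when adaptation data are scarce.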
Year
2010
DOI
10.1109/ISCSLP.2010.5684880
Venue
ISCSLP
Keywords
smaplr,environment partitioning,maximum a posteriori linear regression,maplr,robust automatic speech recognition,pattern clustering,clustering algorithm,regression analysis,maximum likelihood estimation,prior density estimation,speaker recognition,environment clustering,asr,speaker information,aurora-2 task,two stage hierarchical tree structure,robust speech recognition,environment structuring framework,speech,hidden markov models,tree structure,linear regression,word error rate,testing,density estimation,automatic speech recognition,estimation,speech recognition
Field
Density estimation,Regression analysis,Computer science,Speaker recognition,Artificial intelligence,Tree structure,Cluster analysis,Pattern recognition,Word error rate,Speech recognition,Maximum a posteriori estimation,Hidden Markov model,Machine learning
DocType
Conference
ISBN
978-1-4244-6244-5
Citations
2
PageRank
0.38
References
12
Authors
4
Name               Order  Citations  PageRank
Yu Tsao            1      60         16.52
Ryosuke Isotani    2      38         10.60
Hisashi Kawai      3      250        54.04
Satoshi Nakamura   4      1099       194.59