Title
A sampling-based environment population projection approach for rapid acoustic model adaptation
Abstract
We propose an environment population projection (EPP) approach for rapid acoustic model adaptation that reduces environment mismatch when only a limited amount of adaptation data is available. The approach consists of two stages: population construction and projection. In the population construction stage, we apply a sampling scheme to the adaptation data to construct an environment population based on acoustic models prepared in the training phase. Through this sampling procedure, the environment samples in the population characterize the diverse acoustic information embedded in the adaptation data. The projection stage then estimates a function that maps the environment population onto a single set of acoustic models matching the testing condition. With a well-constructed environment population, a simple projection function enables EPP to characterize the testing environment accurately even with a small amount of adaptation data. To examine the rapid adaptation ability of EPP, we used only one adaptation utterance and tested performance in both supervised and unsupervised adaptation modes on the Aurora-2 and Aurora-2J tasks. EPP achieves satisfactory performance in both modes on both tasks. On the Aurora-2J task, for example, EPP achieves a 13.87% relative word error rate (WER) reduction (from 8.58% to 7.39%) over our baseline in the unsupervised adaptation mode.
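The 13.87% figure quoted in the abstract is a relative reduction, not an absolute difference between the two WERs. The short sketch below reproduces that arithmetic from the two reported WER values; the function name `relative_reduction` is a hypothetical helper introduced here for illustration and does not come from the paper.

```python
def relative_reduction(baseline_wer: float, adapted_wer: float) -> float:
    """Return the relative WER reduction, in percent, with respect to the baseline."""
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return 100.0 * (baseline_wer - adapted_wer) / baseline_wer


# WER values reported in the abstract for Aurora-2J, unsupervised mode.
baseline = 8.58  # baseline WER in percent
adapted = 7.39   # WER after EPP adaptation in percent

print(f"absolute reduction: {baseline - adapted:.2f} points")
print(f"relative reduction: {relative_reduction(baseline, adapted):.2f}%")
```

The absolute gap is 1.19 percentage points; dividing by the 8.58% baseline gives the relative reduction reported in the abstract.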
Year
2011
DOI
10.1109/ICASSP.2011.5947605
Venue
ICASSP
Keywords
speech recognition, environment mismatch reduction, diverse acoustic information, sampling-based epp approach, ensemble classification, aurora-2j task, asr, population projection stage, population construction stage, aurora-2 task, rapid acoustic model adaptation, acoustic model adaptation, stochastic matching, sampling methods, speech enhancement, sampling-based environment population projection approach, environment population projection, automatic speech recognition, hidden markov models, speech, indexing terms, signal to noise ratio, testing, word error rate, hidden markov model, acoustics
Field
Speech enhancement, Population, Pattern recognition, Computer science, Projection (set theory), Word error rate, Population projection, Sampling (statistics), Artificial intelligence, Hidden Markov model, Machine learning, Acoustic model
DocType
Conference
ISSN
1520-6149
ISBN
978-1-4577-0537-3
Citations
0
PageRank
0.34
References
10
Authors
6
Name                Order   Citations   PageRank
Yu Tsao             1       60          16.52
Shigeki Matsuda     2       134         18.52
Shinsuke Sakai      3       126         23.52
Ryosuke Isotani     4       38          10.60
Hisashi Kawai       5       250         54.04
Satoshi Nakamura    6       1099        194.59