Rapid Training Of Acoustic Models Using Graphics Processing Units

Paper Info

Title
Rapid Training Of Acoustic Models Using Graphics Processing Units

Abstract
Robust and accurate speech recognition systems can only be realized with adequately trained acoustic models. For common languages, state-of-the-art systems are now trained on thousands of hours of speech data. Even with a large cluster of machines, the entire training process can take many weeks. To overcome this development bottleneck, we propose a new framework for rapid training of acoustic models using highly parallel graphics processing units (GPUs). In this paper, we focus on Viterbi training and describe the optimizations required for effective throughput on GPU processors. Using a single NVIDIA GTX580 GPU, our proposed approach is shown to be 51x faster than a sequential CPU implementation, enabling a moderately sized acoustic model to be trained on 1000 hours of speech data in just over 9 hours. Moreover, we show that our implementation on a two-GPU system can perform 67% faster than a standard parallel reference implementation on a high-end 32-core Xeon server. Our GPU-based training platform empowers research groups to rapidly evaluate new ideas and build accurate and robust acoustic models on very large training corpora.
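
The figures in the abstract can be combined into a rough throughput estimate. The sketch below is illustrative arithmetic only: it rounds "just over 9 hours" down to 9, and it assumes the 51x speedup applies uniformly to the same 1000-hour workload, which the abstract does not state explicitly. The function and constant names are hypothetical.

```python
# Figures quoted in the abstract.
AUDIO_HOURS = 1000.0         # size of the training corpus, in hours of speech
GPU_TRAIN_HOURS = 9.0        # "just over 9 hours", rounded down here
SPEEDUP_VS_SEQUENTIAL = 51   # single GTX580 vs. sequential CPU implementation


def audio_hours_per_wall_hour(audio_hours, wall_hours):
    """Hours of speech processed per hour of wall-clock training time."""
    return audio_hours / wall_hours


def implied_sequential_hours(gpu_hours, speedup):
    """Wall-clock time of the sequential baseline.

    Assumes the speedup applies to the same workload; this is an
    inference from the abstract, not a figure it reports.
    """
    return gpu_hours * speedup


throughput = audio_hours_per_wall_hour(AUDIO_HOURS, GPU_TRAIN_HOURS)
cpu_hours = implied_sequential_hours(GPU_TRAIN_HOURS, SPEEDUP_VS_SEQUENTIAL)

print(f"GPU throughput: about {throughput:.0f} hours of speech per wall-clock hour")
print(f"Implied sequential CPU time: about {cpu_hours:.0f} hours "
      f"({cpu_hours / 24:.1f} days)")
```

Under these assumptions the single-GPU system processes roughly 111 hours of speech per wall-clock hour, and the sequential baseline would need about 459 hours (roughly 19 days), which is in line with the abstract's remark that training can take weeks. Because the actual training time is slightly more than 9 hours, the true throughput is slightly lower and the implied baseline slightly longer.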

Year	Venue	Keywords
2011	12th Annual Conference of the International Speech Communication Association 2011 (INTERSPEECH 2011), Vols 1-5	Continuous Speech Recognition, Acoustic Model Training, Graphics Processing Unit

Field	DocType	Citations
Computer graphics (images), Computer science, Speech recognition, Graphics processing unit	Conference	0

PageRank	References	Authors
0.34	1	3

Authors (3 rows)

Name	Order	Citations	PageRank
Senaka Buthpitiya	1	123	9.07
Ian R. Lane	2	259	33.64
Jike Chong	3	136	11.62
