Title
ON LATTICE-FREE BOOSTED MMI TRAINING OF HMM AND CTC-BASED FULL-CONTEXT ASR MODELS
Abstract
Hybrid automatic speech recognition (ASR) models are typically sequence-trained with the CTC or LF-MMI criterion. However, the two criteria have vastly different legacies and are usually implemented in different frameworks. In this paper, by decoupling the concepts of modeling units and label topologies and building proper numerator/denominator graphs accordingly, we establish a generalized framework for hybrid acoustic modeling (AM). Within this framework, we show that LF-MMI is a powerful training criterion applicable to both limited-context and full-context models, for wordpiece/mono-char/bi-char/chenone units, with either HMM or CTC topologies. Based on this framework, we propose three novel training schemes, chenone(ch)-CTC-bMMI, wordpiece(wp)-CTC-bMMI, and wordpiece(wp)-HMM-bMMI, each with distinct advantages in training performance, decoding efficiency, and decoding time-stamp accuracy. These advantages are evaluated comprehensively on Librispeech, and wp-CTC-bMMI and ch-CTC-bMMI are further evaluated on two real-world ASR tasks to demonstrate their effectiveness. In addition, we show that bi-char(bc) HMM-MMI models can serve as better alignment models than traditional non-neural GMM-HMMs.
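For context, the boosted MMI (bMMI) criterion named in the abstract takes the following standard form; this is a sketch of the well-known objective, not this paper's own notation:

```latex
\mathcal{F}_{\mathrm{bMMI}} = \sum_{u} \log
\frac{ p(\mathbf{X}_u \mid \mathbb{M}_{w_u})^{\kappa}\, P(w_u) }
     { \sum_{w} p(\mathbf{X}_u \mid \mathbb{M}_{w})^{\kappa}\, P(w)\, e^{-b\, A(w,\, w_u)} }
```

Here $u$ indexes training utterances, $\mathbf{X}_u$ is the acoustic sequence, $w_u$ the reference transcript, $\mathbb{M}_w$ the composed graph for word sequence $w$, $\kappa$ the acoustic scale, $b$ the boosting factor, and $A(w, w_u)$ a raw accuracy of hypothesis $w$ against the reference. In the lattice-free (LF) setting, the denominator sum is computed over a full n-gram denominator graph rather than per-utterance lattices.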
Year: 2021
DOI: 10.1109/ASRU51503.2021.9688056
Venue: 2021 IEEE AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING WORKSHOP (ASRU)
Keywords: LF-MMI, CTC, HMM, modeling units, boost
DocType: Conference
Citations: 0
PageRank: 0.34
References: 0
Authors: 10
Name            Order  Citations  PageRank
Xiaohui Zhang   1      194        19.81
Vimal Manohar   2      54         7.99
David Zhang     3      73653      60.85
Frank Zhang     4      10         6.00
Yangyang Shi    5      6          4.47
Nayan Singhal   6      0          0.68
Julian Chan     7      12         3.27
Fuchun Peng     8      0          2.03
Yatharth Saraf  9      0          0.68
Mike Seltzer    10     0          0.34