Title
Enriching Mandarin Speech Recognition By Incorporating A Hierarchical Prosody Model
Abstract
This paper presents a new probabilistic framework for Mandarin speech recognition that incorporates a sophisticated hierarchical prosody model into a conventional HMM-based system. The prosody model describes the relations among linguistic cues at various levels, the break types and prosodic states that represent the hierarchical structure of prosody, and prosody-related acoustic features. Besides producing the recognized word sequence, the system also decodes further information, including the part-of-speech of each word, punctuation marks, inter-syllable break types, and the prosodic states of syllables. Experimental results on the TCC300 corpus, which consists of paragraph-length utterances, showed that the proposed system significantly outperformed the baseline system. The word and character error rates decreased from 24.4% and 18.1% to 20.7% and 14.4%, respectively (relative improvements of 15.2% and 20.4%).
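The relative improvements quoted in the abstract can be checked from the absolute error rates. The following is a minimal sketch of that arithmetic, assuming "relative improvement" means (baseline - proposed) / baseline; the function name is illustrative and does not come from the paper.

```python
def relative_reduction(baseline: float, proposed: float) -> float:
    """Return the relative error-rate reduction in percent.

    Assumes the usual definition: (baseline - proposed) / baseline.
    """
    return (baseline - proposed) / baseline * 100.0


# Absolute error rates (in percent) as reported in the abstract.
wer = relative_reduction(24.4, 20.7)  # word error rate
cer = relative_reduction(18.1, 14.4)  # character error rate

print(f"Word error rate, relative reduction: {wer:.1f}%")
print(f"Character error rate, relative reduction: {cer:.1f}%")
```

Both rates drop by the same 3.7 points in absolute terms; the character error rate shows the larger relative gain only because its baseline is lower.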
Year
2011
DOI
10.1109/ICASSP.2011.5947492
Venue
2011 IEEE International Conference on Acoustics, Speech, and Signal Processing
Keywords
Hierarchical prosody model, Mandarin speech recognition
Field
Prosody, Pragmatics, Computer science, Speech recognition, Natural language processing, Artificial intelligence, Baseline system, Hidden Markov model, Speech recognition hidden Markov models, Punctuation, Mandarin speech recognition, Probabilistic framework
DocType
Conference
ISSN
1520-6149
Citations
1
PageRank
0.35
References
6
Authors
6
Name               Order   Citations   PageRank
Jyh-Her Yang       1       9           1.68
Ming-Chieh Liu     2       7           1.28
Hao-Hsiang Chang   3       6           1.03
Chen-Yu Chiang     4       31          11.55
Yih-Ru Wang        5       237         34.68
Sin-Horng Chen     6       273         39.86