Title |
---|
Improved Tonal Language Speech Recognition By Integrating Spectro-Temporal Evidence And Pitch Information With Properly Chosen Tonal Acoustic Units |
Abstract |
---|
We propose an improved Tandem system for tonal language speech recognition. Three types of features (cepstral, spectro-temporal, and pitch) are integrated to model tone and phoneme variation simultaneously. Tonal phonemes (or tonemes) are used for MLP posterior estimation, and tonal acoustic units for HMM recognition. In experiments on Mandarin broadcast news, a 19.3% relative CER reduction was achieved over the conventional MFCC Tandem baseline. Using different training acoustic units, we analyze the complementarity among the three types of features in tone, phoneme, and toneme classification. |
Year | Venue | Keywords |
---|---|---|
2011 | 12th Annual Conference of the International Speech Communication Association 2011 (INTERSPEECH 2011), Vols 1-5 | spectro-temporal features, pitch, Tandem system, LVCSR |
Field | DocType | Citations |
---|---|---|
Language speech, Computer science, Speech recognition, Natural language processing, Artificial intelligence | Conference | 1 |
PageRank | References | Authors |
---|---|---|
0.35 | 1 | 4 |
Name | Order | Citations | PageRank |
---|---|---|---|
Shangwen Li | 1 | 32 | 6.54 |
Yow-Bang Wang | 2 | 46 | 3.81 |
Liang-Che Sun | 3 | 36 | 3.43 |
Lin-shan Lee | 4 | 1525 | 182.03 |