Abstract | ||
---|---|---|
The major difficulty of prosody modeling and automatic tone recognition of continuous Mandarin speech is the complex interaction of tones and prosody/intonation on F0 contours. In this study, we propose a latent prosody model (LPM) that jointly models the effects of tone and prosody state on F0. The purposes are twofold: (1) automatic prosody state labeling and (2) improving tone recognition accuracy. The basic idea is to introduce latent prosody state variables into an additive statistical model of F0 that already accounts for the affecting factors of tone and speaker. Experiments on the Tree-Bank corpus showed that LPM not only gave meaningful prosody state labeling results but also improved the average tone recognition rate from the 80.86% of a multi-layer perceptron (MLP) baseline to 82.55%. |
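
The additive idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes one F0 value per syllable (for example, a log-F0 mean), a residual that is Gaussian with the same variance for every state, and independent syllables, so that the most likely latent prosody state is simply the one whose offset is closest to what tone and speaker leave unexplained. The function name `label_prosody_states` and all effect values below are hypothetical.

```python
import numpy as np

def label_prosody_states(f0, tone_ids, tone_effect, speaker_effect, state_effect):
    """Pick, per syllable, the prosody state that best explains the observed F0.

    Assumed additive model (illustrative only):
        f0 = tone_effect[tone] + speaker_effect + state_effect[state] + noise
    With equal-variance Gaussian noise, the maximum-likelihood state is the
    one whose offset is nearest to the residual.
    """
    f0 = np.asarray(f0, dtype=float)
    tone_ids = np.asarray(tone_ids, dtype=int)
    tone_effect = np.asarray(tone_effect, dtype=float)
    state_effect = np.asarray(state_effect, dtype=float)

    # Part of F0 not explained by tone and speaker.
    residual = f0 - (tone_effect[tone_ids] + speaker_effect)

    # Squared error against every candidate state offset,
    # shape (n_syllables, n_states).
    err = (residual[:, None] - state_effect[None, :]) ** 2
    return err.argmin(axis=1)

# Hypothetical parameters: four tone offsets, one speaker offset,
# and three prosody states (low, mid, high).
tone_effect = [0.0, 1.0, -1.0, 0.5]
speaker_effect = 5.0
state_effect = [-0.5, 0.0, 0.5]

# Three syllables carrying tones 0, 1 and 2.
states = label_prosody_states(
    f0=[5.4, 6.0, 3.6],
    tone_ids=[0, 1, 2],
    tone_effect=tone_effect,
    speaker_effect=speaker_effect,
    state_effect=state_effect,
)
print(states.tolist())
```

A full model would also estimate the effect parameters from data and could add dependencies between neighboring states; the sketch only shows the labeling step for fixed, made-up parameters.
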
Year | DOI | Venue |
---|---|---|
2007 | 10.1109/ICASSP.2007.366990 | ICASSP (4) |
Keywords | Field | DocType |
speech processing,continuous mandarin speech,speech recognition,multi-layer perceptron,tone recognition,latent prosody model,multilayer perceptrons,automatic prosody state labeling,additive statistic model,automatic tone recognition,tree-bank corpus,gaussian distribution,natural languages,statistical model,labeling,context modeling,multi layer perceptron,statistics,automatic speech recognition,recurrent neural networks | Prosody,Speech processing,Statistic,Pattern recognition,Computer science,Tone recognition,Speech recognition,Artificial intelligence,Natural language processing,State variable,Perceptron,Mandarin Chinese | Conference |
Volume | ISSN | ISBN |
4 | 1520-6149 | 1-4244-0727-3 |
Citations | PageRank | References |
1 | 0.39 | 2 |
Authors | ||
6 |
Name | Order | Citations | PageRank |
---|---|---|---|
Chen-Yu Chiang | 1 | 31 | 11.55 |
Xiao-Dong Wang | 2 | 1 | 0.39 |
Yuan-Fu Liao | 3 | 73 | 20.38 |
Yih-Ru Wang | 4 | 237 | 34.68 |
Sin-Horng Chen | 5 | 273 | 39.86 |
Keikichi Hirose | 6 | 714 | 175.38 |