Abstract |
---|
This paper presents a method for speaker-independent automatic phonetic alignment that is distinguished from standard HMM-based "forced alignment" in three respects: (1) specific acoustic-phonetic features are used, in addition to PLP features, by the phonetic classifier; (2) the units of classification consist of distinctive phonetic features instead of phonemes; and (3) observation probabilities depend not only on the current state, but also on the state transition information. This proposed method is compared with a state-of-the-art baseline forced-alignment system on a number of corpora, including telephone speech, microphone speech, and children's speech. The new method has agreement of 92.57% within 20 msec on the TIMIT corpus, which is a 26% reduction in error over the baseline method (with 89.95% agreement on TIMIT). Average reduction in error over all corpora is 28%. |
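
The two figures quoted in the abstract (agreement within 20 msec, and relative reduction in error) can be reproduced with a few lines of arithmetic. The following is a minimal sketch, not the paper's evaluation code: the function names, the one-to-one pairing of automatic and reference boundaries, and the inclusive `<=` tolerance test are all assumptions made for illustration.

```python
def agreement_within(auto_ms, ref_ms, tol_ms=20.0):
    """Percentage of automatic boundaries within tol_ms of the reference.

    Assumes (hypothetically) that boundaries are already paired one-to-one
    and that a deviation of exactly tol_ms counts as agreement.
    """
    if len(auto_ms) != len(ref_ms):
        raise ValueError("boundary lists must have the same length")
    if not auto_ms:
        raise ValueError("boundary lists must not be empty")
    hits = sum(abs(a - r) <= tol_ms for a, r in zip(auto_ms, ref_ms))
    return 100.0 * hits / len(auto_ms)


def relative_error_reduction(baseline_agreement, new_agreement):
    """Relative reduction in error (percent), with error = 100 - agreement."""
    baseline_error = 100.0 - baseline_agreement
    new_error = 100.0 - new_agreement
    return 100.0 * (baseline_error - new_error) / baseline_error


# TIMIT figures from the abstract: 89.95% (baseline) vs. 92.57% (new method).
timit_reduction = relative_error_reduction(89.95, 92.57)
print(f"{timit_reduction:.1f}% relative reduction in error")
```

With the TIMIT numbers, the baseline error is 10.05% and the new error is 7.43%, so the relative reduction is 2.62 / 10.05, which rounds to the 26% stated in the abstract.
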
Year | Venue | Keywords |
---|---|---|
2002 | INTERSPEECH | state transition |
Field | DocType | Citations |
TIMIT, Pattern recognition, Computer science, Speech recognition, Artificial intelligence, Classifier (linguistics), Hidden Markov model, Microphone | Conference | 10 |
PageRank | References | Authors |
0.77 | 9 | 1 |
Name | Order | Citations | PageRank |
---|---|---|---|
John-Paul Hosom | 1 | 231 | 23.43 |