Title
F0 estimation for adult and children's speech
Abstract
While there are numerous methods for estimating the fundamental frequency (F0) of speech, existing methods often suffer from pitch-doubling or pitch-halving errors. Heuristics can be added to constrain the range of allowable F0 values, but it is still difficult to set the algorithm parameters appropriately if one does not know the speaker's age or gender in advance. The proposed method is distinct from most other F0-estimation algorithms in that it does not use autocorrelation, cepstral, or pattern-recognition techniques. Instead, information from 32 band-pass filters is combined at every frame, a Viterbi search provides an initial F0-contour estimate, and this estimate is then refined based on intensity discrimination of the speech signal. Despite the use of a large number of filters (which provide complementary information and hence robustness), the implementation runs faster than real time on a 2.4 GHz processor without optimization for processing speed. Results are presented for two corpora, one of an adult male speaker and one of children of different ages. For the first corpus, the average absolute error is 4.10 Hz (percent error of 4.15%); for the second corpus, the average absolute error is 7.74 Hz (percent error of 3.38%).
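To illustrate how a Viterbi search can produce an F0 contour that resists pitch doubling and halving, here is a minimal Python sketch. It is not the paper's implementation: the abstract does not say how the filter outputs are turned into candidates or scores, so the function name `viterbi_f0`, the per-frame candidate and score lists, and the `jump_penalty` weight on octave distance are all assumptions made for illustration.

```python
import math


def viterbi_f0(candidates, scores, jump_penalty=1.0):
    """Pick one F0 candidate per frame with a Viterbi search.

    candidates[t] -- candidate F0 values in Hz at frame t (assumed input)
    scores[t]     -- local score for each candidate (higher is better)
    jump_penalty  -- hypothetical weight on the jump size in octaves,
                     |log2(f_t / f_{t-1})|, between consecutive frames
    """
    n = len(candidates)
    if n == 0:
        return []

    # best[t][j]: best cumulative score of any path ending at candidate j.
    # back[t][j]: index of the previous-frame candidate on that path.
    best = [list(scores[0])]
    back = [[-1] * len(candidates[0])]

    for t in range(1, n):
        row_best, row_back = [], []
        for j, f in enumerate(candidates[t]):
            best_val, best_i = -math.inf, -1
            for i, g in enumerate(candidates[t - 1]):
                cost = jump_penalty * abs(math.log2(f / g))
                val = best[t - 1][i] + scores[t][j] - cost
                if val > best_val:
                    best_val, best_i = val, i
            row_best.append(best_val)
            row_back.append(best_i)
        best.append(row_best)
        back.append(row_back)

    # Trace back from the best-scoring candidate in the final frame.
    j = max(range(len(best[-1])), key=best[-1].__getitem__)
    path = []
    for t in range(n - 1, -1, -1):
        path.append(candidates[t][j])
        j = back[t][j]
    path.reverse()
    return path
```

With candidates of 100 Hz and 200 Hz in each of three frames and local scores of `[1.0, 0.5]`, `[0.6, 0.7]`, `[1.0, 0.5]`, a penalty of 1.0 keeps the contour at 100 Hz throughout, whereas a penalty of 0 lets the middle frame jump an octave to 200 Hz. The penalty on octave distance is what discourages isolated doubling or halving errors in this sketch.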
Year
2005
Venue
INTERSPEECH
Keywords
pattern recognition, fundamental frequency, real time, band pass filter, numerical method
Field
Intensity discrimination, Viterbi search, Fundamental frequency, Pattern recognition, Computer science, Cepstrum, Speech recognition, Robustness (computer science), Heuristics, Artificial intelligence, Approximation error, Autocorrelation
DocType
Conference
Citations
0
PageRank
0.34
References
5
Authors
1
Name
Order
Citations
PageRank
John-Paul Hosom123123.43