Title
Multiple Fundamental Frequency Estimation by Modeling Spectral Peaks and Non-Peak Regions
Abstract
This paper presents a maximum-likelihood approach to multiple fundamental frequency (F0) estimation for a mixture of harmonic sound sources, where the power spectrum of a time frame is the observation and the F0s are the parameters to be estimated. The likelihood model covers both spectral peaks and non-peak regions (frequencies farther than a musical quarter tone from all observed peaks). The peak likelihood and the non-peak region likelihood are shown to act as a complementary pair: the former helps find F0s whose harmonics explain the observed peaks, while the latter helps avoid F0s that have harmonics in non-peak regions. Parameters of these models are learned from monophonic and polyphonic training data. To avoid the combinatorial problem of concurrent F0 estimation, the paper proposes an iterative greedy search strategy that estimates F0s one by one, together with a polyphony estimation method that terminates the iterative process. It also proposes a postprocessing method that refines polyphony and F0 estimates using neighboring frames. The relative contributions of the different components of the proposed method are analyzed, and the refinement component is shown to eliminate many inconsistent estimation errors. The method is evaluated on ten recorded four-part J. S. Bach chorales. Results show that it outperforms two state-of-the-art algorithms in both F0 estimation and polyphony estimation.
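The abstract's two structural ideas, the quarter-tone definition of a non-peak region and the one-by-one greedy search, can be illustrated with a minimal Python sketch. This is not the paper's method: the learned likelihood models are replaced here by a hypothetical `toy_score` that counts explained peaks minus harmonics landing in non-peak regions, and the stopping rule (`min_gain`), the candidate list, and the parameters `n_harmonics` and `max_polyphony` are all assumptions made for illustration.

```python
import math

QUARTER_TONE = 0.5  # a quarter tone is 50 cents, i.e. half a semitone


def semitone_distance(f1_hz, f2_hz):
    """Absolute distance between two frequencies, in semitones."""
    return abs(12.0 * math.log2(f1_hz / f2_hz))


def in_non_peak_region(freq_hz, peak_freqs_hz):
    """True if freq_hz is farther than a quarter tone from every observed peak."""
    return all(semitone_distance(freq_hz, p) > QUARTER_TONE for p in peak_freqs_hz)


def toy_score(f0s, peak_freqs_hz, n_harmonics=5):
    """Hypothetical stand-in for the learned log-likelihood (an assumption).

    Rewards peaks that lie within a quarter tone of some harmonic and
    penalizes harmonics that fall in a non-peak region.
    """
    harmonics = [h * f0 for f0 in f0s for h in range(1, n_harmonics + 1)]
    explained = sum(
        1
        for p in peak_freqs_hz
        if any(semitone_distance(p, h) <= QUARTER_TONE for h in harmonics)
    )
    stray = sum(1 for h in harmonics if in_non_peak_region(h, peak_freqs_hz))
    return explained - stray


def greedy_f0_search(candidates_hz, peak_freqs_hz, max_polyphony=4, min_gain=1):
    """Add one F0 at a time; stop when the score gain drops below min_gain.

    The min_gain stopping rule is a placeholder for a polyphony estimator.
    """
    chosen = []
    best = toy_score(chosen, peak_freqs_hz)
    while len(chosen) < max_polyphony:
        remaining = [c for c in candidates_hz if c not in chosen]
        if not remaining:
            break
        cand = max(remaining, key=lambda c: toy_score(chosen + [c], peak_freqs_hz))
        new = toy_score(chosen + [cand], peak_freqs_hz)
        if new - best < min_gain:
            break
        chosen.append(cand)
        best = new
    return chosen


if __name__ == "__main__":
    # Synthetic peaks: five harmonics each of 220 Hz and 330 Hz.
    peaks = [220.0, 330.0, 440.0, 660.0, 880.0, 990.0, 1100.0, 1320.0, 1650.0]
    print(greedy_f0_search([110.0, 220.0, 330.0, 440.0], peaks))
```

In this toy setting, the non-peak penalty is what discourages the sub-octave candidate (110 Hz), whose extra harmonics fall where no peak was observed. The greedy loop evaluates each remaining candidate once per round, so the number of score evaluations grows with the candidate count times the polyphony rather than with the number of candidate subsets.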
Year
2010
DOI
10.1109/TASL.2010.2042119
Venue
IEEE Transactions on Audio, Speech & Language Processing
Keywords
fundamental frequency,f0 estimate,modeling spectral peaks,polyphonic training data,multiple fundamental frequency estimation,signal processing,pitch estimation,spectral peak,harmonic sound source,frequency estimation,non-peak region,maximum likelihood estimation,inconsistent estimation error,four-part j. s. bach chorales,non-peak regions,maximum likelihood detection,postprocessing method,proposed method model,nonpeak regions,likelihood model,polyphony estimation,maximum-likelihood approach,monophonic training data,greedy algorithms,maximum likelihood,power spectrum,polyphony estimation method,f0 estimation,iterative greedy search strategy,iterative methods,spectral peaks,data models,estimation,harmonic analysis,computational modeling,frequency domain analysis
Field
Frequency domain,Fundamental frequency,Pattern recognition,Iterative and incremental development,Computer science,Iterative method,Harmonic,Speech recognition,Harmonics,Spectral density,Artificial intelligence,Estimation theory
DocType
Journal
Volume
18
Issue
8
ISSN
1558-7916
Citations
56
PageRank
1.78
References
19
Authors
3
Name             Order  Citations  PageRank
Zhiyao Duan      1      305        26.86
Bryan Pardo      2      830        63.92
Changshui Zhang  3      5506       323.40