Title
Multi-pitch Streaming of Harmonic Sound Mixtures
Abstract
Multi-pitch analysis of concurrent sound sources is an important but challenging problem. It requires estimating the pitch values of all harmonic sources in individual frames and streaming the pitch estimates into trajectories, each of which corresponds to a source. We address the streaming problem for monophonic sound sources. We take as inputs the original audio and frame-level pitch estimates from any multi-pitch estimation algorithm, and output a pitch trajectory for each source. Our approach does not require pre-training of source models from isolated recordings. Instead, it casts the problem as a constrained clustering problem, where each cluster corresponds to a source. The clustering objective is to minimize the timbre inconsistency within each cluster. We explore different timbre features for music and speech. For music, harmonic structure and a newly proposed feature called uniform discrete cepstrum (UDC) are found effective, while for speech, MFCC and UDC work well. We also show that timbre consistency alone is insufficient for effective streaming, so constraints are imposed on pairs of pitch estimates according to their time-frequency relationships. We propose a new constrained clustering algorithm that satisfies as many constraints as possible while optimizing the clustering objective. We compare the proposed approach with other state-of-the-art supervised and unsupervised multi-pitch streaming approaches that are specifically designed for music or speech, and it achieves better or comparable results.
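The constrained-clustering idea in the abstract can be illustrated with a small sketch. This is not the algorithm proposed in the paper: it is a simplified k-means-style loop that assumes the only constraint is a hard cannot-link between pitch estimates in the same frame (a monophonic source produces at most one pitch per frame), whereas the paper satisfies as many constraints as possible. The function name `stream_pitches`, the plain-list timbre features, and the squared-Euclidean inconsistency measure are all assumptions made for illustration.

```python
from itertools import permutations


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def stream_pitches(features, frames, k, n_iter=10):
    """Assign each pitch estimate to one of k sources (hypothetical helper).

    features: one timbre feature vector per pitch estimate.
    frames:   frame index of each pitch estimate.
    Estimates that share a frame are never placed in the same cluster.
    """
    # Group estimate indices by frame.
    by_frame = {}
    for i, t in enumerate(frames):
        by_frame.setdefault(t, []).append(i)
    if any(len(idx) > k for idx in by_frame.values()):
        raise ValueError("a frame has more estimates than sources")

    # Seed the centroids from the earliest frame that contains k estimates.
    seed = next((idx for _, idx in sorted(by_frame.items()) if len(idx) == k), None)
    if seed is None:
        raise ValueError("need at least one frame with k estimates")
    centroids = [list(features[i]) for i in seed]

    labels = [None] * len(features)
    for _ in range(n_iter):
        # Per frame, pick the assignment to distinct clusters with the
        # lowest total distance (brute force; fine for small k).
        for idx in by_frame.values():
            best = min(
                permutations(range(k), len(idx)),
                key=lambda p: sum(
                    sq_dist(features[i], centroids[c]) for i, c in zip(idx, p)
                ),
            )
            for i, c in zip(idx, best):
                labels[i] = c
        # Recompute each centroid; keep the old one if a cluster is empty.
        new_centroids = []
        for c in range(k):
            members = [features[i] for i, lab in enumerate(labels) if lab == c]
            new_centroids.append(mean(members) if members else centroids[c])
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return labels
```

Calling `stream_pitches(features, frames, k=2)` returns one cluster label per pitch estimate; collecting the estimates that share a label, ordered by frame, gives one pitch trajectory per source under the assumptions above.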
Year
2014
DOI
10.1109/TASLP.2013.2285484
Venue
IEEE/ACM Transactions on Audio, Speech and Language Processing (TASLP)
Keywords
speech processing,cochannel speech,pitch value,pattern clustering,clustering algorithm,constrained clustering,pitch streaming,timbre tracking,uniform discrete cepstrum,supervised multipitch streaming approach,multi-pitch analysis,frame-level pitch estimate,constrained clustering algorithm,udc,pitch trajectory,time-frequency relationships,unsupervised multipitch streaming approach,clustering objective,challenging problem,pitch estimate,multi-pitch streaming,multipitch streaming analysis,harmonic structure,timbre-consistency,frame-level pitch estimation,monophonic sound sources,harmonic source,multipitch estimation algorithm,clustering problem,mfcc,harmonic sound mixtures,speech,concurrent sound source,concurrent sound sources,time-frequency analysis,time frequency analysis
Field
Mel-frequency cepstrum,Speech processing,Computer science,Cepstrum,Harmonic,Speech recognition,Constrained clustering,Cluster analysis,Pitch detection algorithm,Timbre
DocType
Journal
Volume
22
Issue
1
ISSN
2329-9290
Citations
13
PageRank
0.68
References
34
Authors
3
Name            Order   Citations   PageRank
Zhiyao Duan     1       305         26.86
Jinyu Han       2       69          7.92
Bryan Pardo     3       830         63.92