Title
Rhythm Based Music Segmentation And Octave Scale Cepstral Features For Sung Language Recognition
Abstract
Sung language recognition relies on both effective feature extraction and acoustic modeling. In this paper, we study rhythm based music segmentation, in which the frame size varies in proportion to the inter-beat interval of the music, in contrast to the fixed length segmentation (FIX) used in spoken language recognition. We show that acoustic features extracted with this rhythm based scheme (BSS) outperform those extracted with FIX. We also compare the effectiveness of musically motivated acoustic features, octave scale cepstral coefficients (OSCCs), with that of log frequency cepstral coefficients. We adopt Gaussian mixture models for sung language classifier design. Experiments conducted on a database of 400 popular songs sung in four languages (English, Chinese, German and Indonesian) show that OSCC features outperform the other features. We achieve 64.9% sung language identification accuracy with Gaussian mixture models trained on shifted-delta-cepstral OSCC acoustic features extracted via BSS.
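The contrast between beat-proportional and fixed length framing described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the beat times are assumed to be already detected, and the function names and the `frames_per_beat` and `frame_len` parameters are hypothetical choices made only for illustration.

```python
def beat_proportional_frames(beat_times, frames_per_beat=4):
    """Split each inter-beat interval into equal frames.

    beat_times: ascending beat onsets in seconds (assumed already detected).
    frames_per_beat: hypothetical subdivision, not a value from the paper.
    Returns a list of (start, end) tuples in seconds.
    """
    frames = []
    for start, end in zip(beat_times, beat_times[1:]):
        # Frame size scales with the inter-beat interval, so it tracks tempo.
        step = (end - start) / frames_per_beat
        for k in range(frames_per_beat):
            frames.append((start + k * step, start + (k + 1) * step))
    return frames


def fixed_length_frames(duration, frame_len=0.02):
    """Fixed length framing (FIX): the same frame size regardless of tempo."""
    count = int(duration / frame_len)
    return [(k * frame_len, (k + 1) * frame_len) for k in range(count)]


if __name__ == "__main__":
    # Beats 0.5 s apart, then 1.0 s apart: frames grow when the tempo slows.
    for frame in beat_proportional_frames([0.0, 0.5, 1.5], frames_per_beat=2):
        print(frame)
```

With beat-proportional framing, a slower passage yields longer frames, whereas the fixed length variant ignores the rhythm entirely.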
Year
2008
Venue
INTERSPEECH 2008: 9TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2008, VOLS 1-5
Keywords
Sung Language, Octave scale, vocal detection
Field
Octave, Segmentation, Computer science, Cepstrum, Speech recognition, Language recognition, Rhythm
DocType
Conference
Citations
0
PageRank
0.34
References
1
Authors
2
Name                        Order  Citations  PageRank
Namunu Chinthaka Maddage    1      108        11.28
Haizhou Li                  2      3678       334.61