Rhythm Based Music Segmentation And Octave Scale Cepstral Features For Sung Language Recognition - Citegraph

Paper Info

Title
Rhythm Based Music Segmentation And Octave Scale Cepstral Features For Sung Language Recognition

Abstract
Sung language recognition relies on both effective feature extraction and acoustic modeling. In this paper, we study rhythm based music segmentation (BSS), in which the frame size varies in proportion to the inter-beat interval of the music, in contrast to the fixed length segmentation (FIX) used in spoken language recognition. We show that acoustic features extracted with the BSS scheme outperform those extracted with FIX. We also compare the effectiveness of musically motivated acoustic features, octave scale cepstral coefficients (OSCCs), with that of log frequency cepstral coefficients. We adopt Gaussian mixture models for sung language classifier design. Experiments conducted on a database of 400 popular songs sung in four languages (English, Chinese, German and Indonesian) show that the OSCC feature outperforms the other features. We achieve a sung language identification accuracy of 64.9% with Gaussian mixture models trained on shifted-delta-cepstral OSCC acoustic features extracted via BSS.

Year	Venue	Keywords
2008	INTERSPEECH 2008: 9TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2008, VOLS 1-5	Sung Language, Octave scale, vocal detection
Field	DocType	Citations
Octave, Segmentation, Computer science, Cepstrum, Speech recognition, Language recognition, Rhythm	Conference	0
PageRank	References	Authors
0.34	1	2

Authors (2 rows)

Name	Order	Citations	PageRank
Namunu Chinthaka Maddage	1	108	11.28
Haizhou Li	2	3678	334.61

Cited by (0 rows)

References (1 row)
