Title
Graph-Based Multimodal Music Mood Classification in Discriminative Latent Space.
Abstract
Automatic music mood classification is an important and challenging problem in music information retrieval (MIR) and has attracted growing attention from various research areas. In this paper, we propose a novel multimodal method for music mood classification that exploits the complementarity of the lyrics and audio information of music to enhance classification accuracy. We first extract descriptive sentence-level lyrics and audio features from the music. We then project the paired low-level features of the two modalities into a learned common discriminative latent space, which not only eliminates between-modality heterogeneity but also increases the discriminability of the resulting representations. On the basis of this latent representation, we employ a graph-learning-based multimodal classification model for music mood that takes into account the cross-modality similarity between local audio and lyrics descriptions, effectively exploiting the correlations between modalities. The per-sentence mood predictions are then aggregated into a song-level label by a simple voting scheme. The effectiveness of the proposed method is demonstrated in experiments on a real dataset comprising more than 3,000 minutes of music and the corresponding lyrics.
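The abstract outlines a three-stage pipeline: project paired sentence-level features into a shared discriminative latent space, classify each sentence with a graph-based multimodal model, and aggregate per-sentence predictions by voting. Below is a minimal Python sketch of that flow. It is an illustrative assumption, not the authors' implementation: a plain PCA projection stands in for the learned discriminative projection (Locality Preserving Projection, per the keywords), a k-nearest-neighbor vote stands in for the graph-learning classifier, and all function names and parameters are hypothetical.

import numpy as np

# Hypothetical sketch of the pipeline described in the abstract.
# PCA replaces the learned discriminative projection and k-NN voting
# replaces the graph-learning classifier; both are simplifications.

def project_to_latent(audio_feats, lyric_feats, dim=8):
    # Map paired audio/lyrics sentence features into one latent space.
    # Concatenating modalities and keeping the top principal directions
    # is a stand-in for the paper's learned common discriminative space.
    X = np.hstack([audio_feats, lyric_feats])
    X = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    return X @ Vt[:dim].T

def knn_vote(train_z, train_y, test_z, k=5):
    # Label each test sentence by majority vote among its k nearest
    # training sentences in the latent space (a crude proxy for
    # similarity-graph label propagation).
    preds = []
    for z in test_z:
        dist = np.linalg.norm(train_z - z, axis=1)
        nearest = np.argsort(dist)[:k]
        preds.append(int(np.argmax(np.bincount(train_y[nearest]))))
    return np.array(preds)

def song_mood(sentence_preds):
    # Aggregate per-sentence mood predictions into one song-level
    # label by simple majority voting, as the abstract describes.
    return int(np.argmax(np.bincount(sentence_preds)))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    audio = rng.normal(size=(40, 20))        # 40 sentences, 20-d audio features
    lyrics = rng.normal(size=(40, 30))       # paired 30-d lyrics features
    z = project_to_latent(audio, lyrics, dim=8)
    labels = rng.integers(0, 4, size=30)     # mood labels for 30 training sentences
    preds = knn_vote(z[:30], labels, z[30:], k=5)
    print("song-level mood:", song_mood(preds))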
Year
2017
DOI
10.1007/978-3-319-51811-4_13
Venue
Lecture Notes in Computer Science
Keywords
Music mood classification, Multimodal, Graph learning, Locality Preserving Projection, Bag of sentences
Field
Modalities, Complementarity, Mood, Music information retrieval, Voting, Pattern recognition, Computer science, Artificial intelligence, Lyrics, Sentence, Discriminative model
DocType
Conference
Volume
10132
ISSN
0302-9743
Citations
0
PageRank
0.34
References
9
Authors
2
Name      Order  Citations  PageRank
Feng Su   1      170        18.63
Hao Xue   2      0          0.34