Abstract | ||
---|---|---|
Deep generative modeling has already become the leading technique for music automation. However, long-term generation remains a challenging task as most methods fall short in preserving a natural structure and the overall musicality when the generation scope exceeds several beats. In this study, we tackle the problem of long-term, phrase-level symbolic melody inpainting by equipping a sequence prediction model with phrase-level representation (as an extra condition) and contrastive loss (as an extra optimization term). The underlying ideas are twofold. First, to predict phrase-level music, we need phrase-level representations as a better context. Second, we should predict notes and their high-level representations simultaneously, while contrastive loss serves as a better target for abstract representations. Experimental results show that our method significantly outperforms the baselines. In particular, contrastive loss plays a critical role in the generation quality, and the phrase-level representation further enhances the structure of long-term generation.
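
The abstract describes contrastive loss as an extra optimization term added to a sequence prediction objective, but it does not give the exact formulation. The following is a minimal sketch, assuming an InfoNCE-style loss over cosine similarities and a weighted sum with the note-level loss. The function names (`info_nce`, `total_loss`), the `temperature` and `weight` parameters, and the use of cosine similarity are illustrative assumptions, not details taken from the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(predicted, candidates, positive_index, temperature=0.1):
    """InfoNCE-style contrastive loss (assumed form, not confirmed by the source).

    predicted:      predicted phrase-level representation
    candidates:     candidate representations (one positive, the rest negatives)
    positive_index: position of the true representation in `candidates`
    """
    logits = [cosine(predicted, c) / temperature for c in candidates]
    # Numerically stable log-sum-exp over all candidates.
    peak = max(logits)
    log_sum_exp = peak + math.log(sum(math.exp(x - peak) for x in logits))
    # Negative log-probability of picking the positive candidate.
    return log_sum_exp - logits[positive_index]


def total_loss(note_nll, contrastive, weight=1.0):
    """Hypothetical combined objective: note prediction loss plus a weighted contrastive term."""
    return note_nll + weight * contrastive


# Toy usage: the prediction matches the first candidate, so treating it as
# the positive yields a much smaller loss than treating the second one as positive.
prediction = [1.0, 0.0]
candidates = [[1.0, 0.0], [0.0, 1.0]]
loss_matched = info_nce(prediction, candidates, positive_index=0)
loss_mismatched = info_nce(prediction, candidates, positive_index=1)
```

Under these assumptions, the contrastive term rewards a predicted representation for being closer to the true phrase representation than to the negatives, which is one common way to give an abstract representation a training target.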
Year | DOI | Venue |
---|---|---|
2022 | 10.1109/ICASSP43922.2022.9747817 | ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) |
Keywords | DocType | ISSN |
Music Inpainting,Contrastive Learning,Representation Learning,Deep Music Generation | Conference | 1520-6149 |
ISBN | Citations | PageRank |
978-1-6654-0541-6 | 0 | 0.34 |
References | Authors | |
1 | 5 |
Name | Order | Citations | PageRank |
---|---|---|---|
Shiqi Wei | 1 | 0 | 0.34 |
Gus Xia | 2 | 0 | 0.34 |
Yixiao Zhang | 3 | 0 | 0.34 |
Liwei Lin | 4 | 122 | 28.76 |
Weiguo Gao | 5 | 0 | 0.34 |