LP-29: Interpretable Modular Representation Learning for Full-Band Accompaniment Arrangement

Zhao, Jingwei*, Xia, Gus, Wang, Ye

Abstract: Style transfer, in the context of music generation, has been enabled primarily through content-style disentanglement. However, a notable limitation of existing disentanglement models is their confinement to short musical clips, typically spanning only a few bars. In this late-breaking demo, we propose and formalize the novel idea of modular style prior modelling to bridge this research gap. We focus on the specific task of accompaniment arrangement, beginning with AccoMontage, a piano arranger that leverages chord-texture disentanglement and a primitive, rule-based style planner to maintain long-term texture structure. Subsequently, we introduce Q&A-XL, a multi-track orchestrator with a more generic latent style prior model that characterizes the global structure of orchestration style. The complete end-to-end system, named AccoMontage-3, generates full-band accompaniment for whole pieces of music with cohesive multi-track arrangement and coherent long-term structure.
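To make the two-stage modular design concrete, the sketch below outlines the pipeline shape the abstract describes: a piano-arrangement stage followed by a multi-track orchestration stage governed by a global style plan. This is a minimal illustration only; all class and function names (LeadSheet, arrange_piano, orchestrate, etc.) are assumptions for exposition and do not reflect the actual AccoMontage-3 implementation, whose texture retrieval and latent style prior are far richer than the placeholders here.

```python
# Hypothetical sketch of a two-stage pipeline in the spirit of
# AccoMontage-3: stage 1 arranges a lead sheet into piano texture
# (cf. AccoMontage), stage 2 orchestrates that texture into multiple
# tracks under a global style plan (cf. Q&A-XL). All names and logic
# are illustrative stand-ins, not the published system.

from dataclasses import dataclass, field
from typing import List


@dataclass
class LeadSheet:
    """Input: melody notes plus one chord symbol per bar."""
    melody: List[List[int]]  # MIDI pitches, grouped by bar
    chords: List[str]        # chord symbols, one per bar


@dataclass
class Arrangement:
    """A sequence of bars, each a list of MIDI pitches."""
    notes: List[List[int]] = field(default_factory=list)


def arrange_piano(lead_sheet: LeadSheet) -> Arrangement:
    """Stage 1: realize a piano texture over the given chords.
    A real arranger would retrieve textures and plan them for
    long-term structure; here each chord is simply voiced."""
    chord_tones = {"C": [48, 52, 55], "F": [53, 57, 60], "G": [55, 59, 62]}
    piano = Arrangement()
    for chord in lead_sheet.chords:
        piano.notes.append(chord_tones.get(chord, [48, 52, 55]))
    return piano


def orchestrate(piano: Arrangement, n_tracks: int = 4) -> List[Arrangement]:
    """Stage 2: distribute the piano texture across tracks.
    A real orchestrator would sample a latent style prior to decide
    each track's role globally; here a fixed round-robin assignment
    stands in for that plan."""
    tracks = [Arrangement() for _ in range(n_tracks)]
    for i, bar in enumerate(piano.notes):
        for t, track in enumerate(tracks):
            # Each track takes one chord tone, rotating by bar index.
            track.notes.append([bar[(t + i) % len(bar)]])
    return tracks


if __name__ == "__main__":
    sheet = LeadSheet(melody=[[60, 62, 64]], chords=["C", "F", "G", "C"])
    piano = arrange_piano(sheet)
    band = orchestrate(piano)
    print(f"{len(band)} tracks, {len(band[0].notes)} bars each")
```

The modularity the abstract emphasizes is visible in the interface: the orchestration stage consumes only the piano arrangement, so either stage can be swapped or improved independently.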