LV-44: Equipping MusicGen with Chord and Rhythm Controls

Lin, Liwei*, Xia, Guangyu, Jiang, Junyan, Zhang, Yixiao

Abstract: We propose \textit{Coco-Mulla}, a plug-and-play module that adds direct content-based controls (chords, drums, and piano-roll) to existing text-to-music large language models such as MusicGen. Coco-Mulla uses a Parameter-Efficient Fine-Tuning (PEFT) method whose trainable parameters amount to fewer than 4\% of the original model's parameters, and it is fine-tuned on a dataset of fewer than 300 songs. Experiments show that our approach achieves effective content-based control while preserving the quality of the generated music. Furthermore, by combining content-based controls with text descriptions, our system supports flexible music variation generation and style transfer.\footnote{Our source code and demos are available at \url{https://kikyo-16.github.io/coco-mulla}.}