P7-03: Exploring Sampling Techniques for Generating Melodies With a Transformer Language Model
Mathias Rose Bjare (Johannes Kepler University Linz)*, Stefan Lattner (Sony CSL), Gerhard Widmer (Johannes Kepler University)
Subjects (starting with primary): MIR tasks -> music synthesis and transformation ; Applications -> music composition, performance, and production ; MIR fundamentals and methodology -> symbolic music processing ; Musical features and properties -> structure, segmentation, and form ; Musical features and properties -> melody and motives ; MIR tasks -> music generation
Presented In Person: 4-minute short-format presentation
Research in natural language processing has demonstrated that the quality of generations from trained autoregressive language models is significantly influenced by the sampling strategy used. In this study, we investigate the impact of different sampling techniques on musical qualities such as diversity and structure. To accomplish this, we train a high-capacity transformer model on a vast collection of highly structured Irish folk melodies and analyze the musical qualities of the samples generated with distribution truncation sampling techniques. Specifically, we compare nucleus sampling and the recently proposed "typical sampling" against conventional ancestral sampling. We evaluate the effect of these sampling strategies in two scenarios: optimal circumstances with a well-calibrated model, and suboptimal circumstances where we systematically degrade the model’s performance. We assess the generated samples using objective and subjective evaluations. We find that probability truncation techniques may restrict diversity and structural patterns in optimal circumstances, but may also produce more musical samples in suboptimal circumstances.
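As a rough illustration of the three sampling strategies named in the abstract, the sketch below applies each one to a single next-token distribution. It is not the authors' implementation: the function names, the toy distribution, and the thresholds `top_p` and `tau` are all assumptions chosen for the example, and a real model would supply the probabilities over its note or token vocabulary.

```python
import math
import random


def _renormalize(probs, kept):
    """Zero out tokens outside `kept` and rescale the rest to sum to 1."""
    kept = set(kept)
    total = sum(probs[i] for i in kept)
    return [probs[i] / total if i in kept else 0.0 for i in range(len(probs))]


def _truncate(probs, order, threshold):
    """Keep tokens in `order` until their cumulative mass reaches `threshold`."""
    kept, mass = [], 0.0
    for i in order:
        kept.append(i)
        mass += probs[i]
        if mass >= threshold:
            break
    return _renormalize(probs, kept)


def ancestral(probs):
    """Ancestral sampling: use the model distribution unchanged."""
    return list(probs)


def nucleus(probs, top_p=0.9):
    """Nucleus (top-p) sampling: keep the most probable tokens first."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return _truncate(probs, order, top_p)


def typical(probs, tau=0.9):
    """Typical sampling: keep tokens whose surprisal is closest to the entropy."""
    support = [i for i in range(len(probs)) if probs[i] > 0]
    entropy = -sum(probs[i] * math.log(probs[i]) for i in support)
    order = sorted(support, key=lambda i: abs(-math.log(probs[i]) - entropy))
    return _truncate(probs, order, tau)


def sample_token(probs, rng=random):
    """Draw one token index from a (possibly truncated) distribution."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


if __name__ == "__main__":
    toy = [0.5, 0.3, 0.15, 0.05]  # hypothetical next-token probabilities
    for name, dist in [
        ("ancestral", ancestral(toy)),
        ("nucleus", nucleus(toy, top_p=0.9)),
        ("typical", typical(toy, tau=0.25)),
    ]:
        print(name, [round(p, 3) for p in dist], "->", sample_token(dist))
```

In this toy case, nucleus sampling removes only the least probable token, while typical sampling with a low threshold keeps only the second token and discards the most probable one, because the second token's surprisal lies closest to the entropy of the distribution. Both truncation methods narrow the set of tokens that can be sampled, which is the mechanism the abstract links to reduced diversity.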