P6-06: Predicting Performance Difficulty From Piano Sheet Music Images
Pedro Ramoneda (Universitat Pompeu Fabra)*, Dasaem Jeong (Sogang University), Jose J. Valero-Mas (Universitat Pompeu Fabra), Xavier Serra (Universitat Pompeu Fabra)
Subjects (starting with primary): Applications -> digital libraries and archives ; Applications -> music training and education ; Applications
Presented In Person: 4-minute short-format presentation
Estimating the performance difficulty of a musical score is crucial in music education for designing an adequate learning curriculum for students. Although the music information retrieval community has recently shown interest in this task, existing approaches mainly use machine-readable scores, leaving the broader case of sheet music images unaddressed. Building on previous work with sheet music images, we use a mid-level representation, the bootleg score, which describes notehead positions relative to staff lines, coupled with a transformer model. We adapt this architecture to our task by introducing a different encoding scheme that reduces the encoded sequence length to one-eighth of its original size. For evaluation, we consider five datasets (more than 7500 scores with up to 9 difficulty levels), two of them compiled mainly for this work. The results obtained when pretraining the scheme on the IMSLP corpus and fine-tuning it on the considered datasets support the validity of the proposal: the best-performing model achieves a balanced accuracy of 40.3% and a mean squared error of 1.3. Finally, we provide access to our code, data, and models for transparency and reproducibility.
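As a reading aid for the two reported metrics, the following is a minimal sketch of how balanced accuracy and mean squared error are commonly computed for ordinal difficulty labels. It assumes balanced accuracy is the unweighted mean of per-class recall and that the squared error is taken over integer difficulty levels; the abstract does not spell out either definition, and the labels below are made up for illustration rather than taken from the paper's datasets.

```python
from collections import defaultdict


def balanced_accuracy(y_true, y_pred):
    """Unweighted mean of per-class recall over the classes present in y_true."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for t, p in zip(y_true, y_pred):
        totals[t] += 1
        if t == p:
            hits[t] += 1
    return sum(hits[c] / totals[c] for c in totals) / len(totals)


def mean_squared_error(y_true, y_pred):
    """Average squared distance between true and predicted difficulty levels."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical difficulty levels (illustrative only, not data from the paper).
y_true = [1, 1, 1, 1, 2, 2, 3]
y_pred = [1, 1, 1, 2, 2, 3, 3]

bal_acc = balanced_accuracy(y_true, y_pred)  # recalls: 3/4, 1/2, 1/1
mse = mean_squared_error(y_true, y_pred)     # two predictions are off by one level

print(f"balanced accuracy = {bal_acc:.3f}, MSE = {mse:.3f}")
```

Balanced accuracy weights every difficulty level equally, so it is not inflated by over-represented levels, while the squared error penalizes predictions more the further they fall from the true level.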