LP-20: FMAK: A DATASET OF KEY AND MODE ANNOTATIONS FOR THE FREE MUSIC ARCHIVE – EXTENDED ABSTRACT

Wong, Stella*, Hernandez, Gandalf

Abstract: Despite the importance of musical key detection to computational music understanding, programmatically identifying the tonal center of a musical composition remains a challenging MIR task for modern deep learning systems, which leads to shortcomings in playlist generation and DJ systems. One bottleneck is the scarcity of reliable ground truth for tonal center. Furthermore, deep learning systems trained on classical and pop songs generalize poorly to other genres. In this paper, we present a new expert-labeled dataset for the evaluation of key detection, containing 260 hours (5489 songs) of song-level key and mode annotations spread across 17 genres. Code for the dataset is freely available for public use under a Creative Commons license. To improve reusability, we also provide a script that lets researchers recreate the dataset in a form tailored to their own MIR tasks.