P3-05: From West to East: Who Can Understand the Music of the Others Better?
Charilaos Papaioannou (School of ECE, National Technical University of Athens)*, Emmanouil Benetos (Queen Mary University of London), Alexandros Potamianos (National Technical University of Athens)
Subjects (starting with primary): MIR fundamentals and methodology -> music signal processing ; Knowledge-driven approaches to MIR -> machine learning/artificial intelligence for music ; Knowledge-driven approaches to MIR -> computational ethnomusicology ; MIR tasks -> automatic classification
Presented In Person: 4-minute short-format presentation
Recent developments in MIR have led to several benchmark deep learning models whose embeddings can be used for a variety of downstream tasks. At the same time, the vast majority of these models have been trained on Western pop/rock music and related styles. This raises the questions of whether these models can be used to learn representations for different music cultures and styles, and whether similar music audio embedding models can be built from data of different cultures or styles. To that end, we leverage transfer learning methods to derive insights about the similarities between the music cultures to which the data belong. We use two Western music datasets, two traditional/folk datasets from eastern Mediterranean cultures, and two datasets of Indian art music. Three deep audio embedding models, two CNN-based architectures and one Transformer-based architecture, are trained and transferred across domains to perform auto-tagging on each target domain dataset. Experimental results show that competitive performance is achieved in all domains via transfer learning, while the best source dataset varies for each music culture. The implementation and the trained models are both provided in a public repository.
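To make the transfer setup in the abstract concrete, the following is a minimal sketch of one common transfer strategy: a source-domain model is kept frozen and only a new multi-label tagging head is trained on target-domain data. It is not the authors' implementation. The abstract does not say whether the models are frozen or fine-tuned, and every name here (`frozen_embed`, `SOURCE_WEIGHTS`, `train_tag_head`, the toy inputs and tags) is a hypothetical stand-in for illustration.

```python
import math

# Hypothetical "pretrained" weights standing in for a source-domain model.
# In a real setting these would come from a CNN or Transformer trained on
# the source dataset; here they are a fixed 3x2 projection.
SOURCE_WEIGHTS = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]


def frozen_embed(x):
    """Map an input feature vector to an embedding; weights are never updated."""
    return [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in SOURCE_WEIGHTS]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def score(unit, emb):
    """Linear score of one tag unit; the last entry of `unit` is the bias."""
    return unit[-1] + sum(w * e for w, e in zip(unit, emb))


def train_tag_head(embeddings, labels, n_tags, epochs=200, lr=0.5):
    """Train one logistic unit per tag (multi-label auto-tagging) with SGD."""
    dim = len(embeddings[0])
    head = [[0.0] * (dim + 1) for _ in range(n_tags)]
    for _ in range(epochs):
        for emb, y in zip(embeddings, labels):
            for t in range(n_tags):
                # Gradient of binary cross-entropy with respect to the score.
                err = sigmoid(score(head[t], emb)) - y[t]
                for i in range(dim):
                    head[t][i] -= lr * err * emb[i]
                head[t][-1] -= lr * err
    return head


def predict_tags(head, emb):
    """Return an independent probability for each tag."""
    return [sigmoid(score(unit, emb)) for unit in head]


# Toy target-domain data: four inputs, two tags that may co-occur.
examples = [[1.0, 1.0], [1.0, -1.0], [-1.0, 1.0], [-1.0, -1.0]]
labels = [[1, 1], [1, 0], [0, 1], [0, 0]]

target_embeddings = [frozen_embed(x) for x in examples]
head = train_tag_head(target_embeddings, labels, n_tags=2)
predictions = [predict_tags(head, e) for e in target_embeddings]
```

Because each tag has its own sigmoid output, a track can carry several tags at once, which is what distinguishes auto-tagging from single-label classification. Swapping `SOURCE_WEIGHTS` for embeddings learned on a different source dataset, while keeping the target data fixed, mirrors the kind of source-versus-target comparison the abstract describes.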