Abstract
Compared to studies with symbolic music data, advances in music description from audio have overwhelmingly focused on ground-truth reconstruction and maximizing prediction accuracy; only a small fraction of studies use audio description to gain insight into musical data. We present a strategy for the corpus analysis of audio data that is optimized for interpretable results. The approach brings two previously unexplored concepts to the audio domain: audio bigram distributions, and the use of corpus-relative or 'second-order' descriptors. To test the real-world applicability of our method, we present an experiment in which we model song recognition data collected in a widely played music game. Using the proposed corpus analysis pipeline, we are able to present a cognitively adequate analysis that allows the model to be interpreted in terms of our participants' listening history and experience. We find that, used alone, our corpus-based audio features explain an amount of variance comparable to symbolic features for this task, and that they can profitably supplement symbolic features when the two types are used in tandem. Finally, we highlight new insights into what makes music recognizable.
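The two concepts named in the abstract lend themselves to a compact illustration. The sketch below is not taken from the paper: the function names, the per-frame argmax discretisation of chroma, and the quantile-rank definition of the second-order descriptor are our own assumptions, chosen only to show the general idea of (1) summarising an audio-derived symbol sequence as a bigram distribution and (2) re-expressing a first-order descriptor relative to a corpus.

```python
import numpy as np

def bigram_distribution(symbols, alphabet_size):
    """Normalised count matrix over consecutive symbol pairs (audio bigrams)."""
    counts = np.zeros((alphabet_size, alphabet_size))
    for a, b in zip(symbols[:-1], symbols[1:]):
        counts[a, b] += 1
    total = counts.sum()
    return counts / total if total > 0 else counts

def second_order(track_value, corpus_values):
    """Corpus-relative ('second-order') descriptor: the quantile rank of a
    track's first-order descriptor value within the corpus distribution."""
    corpus = np.asarray(corpus_values, dtype=float)
    return float((corpus < track_value).mean())

# Hypothetical usage: discretise chroma to its per-frame peak pitch class
# (12 symbols), then summarise the sequence as a 12 x 12 bigram distribution.
rng = np.random.default_rng(0)
chroma = rng.random((12, 500))          # stand-in for real chroma features
symbols = chroma.argmax(axis=0)
B = bigram_distribution(symbols, 12)

# A first-order descriptor (here: the self-transition rate, i.e. how often
# the peak pitch class repeats from frame to frame) becomes second-order by
# ranking it against the same descriptor computed for every corpus track.
self_transition_rate = np.trace(B)
corpus_rates = rng.random(100)          # stand-in for real corpus values
typicality = second_order(self_transition_rate, corpus_rates)
print(f"self-transition rate {self_transition_rate:.3f}, "
      f"corpus quantile {typicality:.2f}")
```

In the paper's setting the symbol sequence would come from actual audio descriptors rather than random stand-ins, and the second-order transform can in principle be applied to any scalar feature.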
Original language | English |
---|---|
Number of pages | 7 |
Publication status | Published - 2015 |
Event | International Society for Music Information Retrieval Conference - Taipei, Taiwan, Province of China |
Duration | 27 Oct 2014 → 31 Oct 2014 |
Conference
Conference | International Society for Music Information Retrieval Conference |
---|---|
Country/Territory | Taiwan, Province of China |
City | Taipei |
Period | 27/10/14 → 31/10/14 |