Abstract
We predict drops in electronic dance music (EDM), employing
different multimodal approaches. We combine three
sources of data: noisy labels collected through crowdsourcing,
timed comments from SoundCloud and audio content
analysis. We predict the correct labels from the noisy labels
using the majority vote and Dawid-Skene methods. We also
employ timed comments from SoundCloud users to count
the occurrence of specific terms near the potential drop
event, and, finally, we conduct an acoustic analysis of the
audio excerpts. The best results are obtained when annotations,
metadata, and audio are combined, though the differences
between the approaches are not significant.
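
As a rough illustration of two of the steps named in the abstract, the sketch below shows majority voting over noisy crowd labels and a count of timed comments that mention a given term near a candidate drop time. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the label strings, the term list, and the 5-second window are all hypothetical, and the Dawid-Skene method and the acoustic analysis are not covered.

```python
from collections import Counter


def majority_vote(labels):
    """Return the most frequent label among one item's noisy labels.

    Ties fall back to whichever tied label appeared first in the list.
    """
    return Counter(labels).most_common(1)[0][0]


def count_terms_near(comments, event_time, terms, window=5.0):
    """Count timed comments near event_time that contain any of the terms.

    comments: list of (timestamp_seconds, text) pairs (hypothetical format).
    window:   assumed half-width in seconds around the candidate event.
    """
    hits = 0
    for timestamp, text in comments:
        if abs(timestamp - event_time) <= window:
            lowered = text.lower()
            if any(term in lowered for term in terms):
                hits += 1
    return hits


# Hypothetical usage: three workers label one excerpt, and three timed
# comments are checked against a candidate drop at 11 seconds.
label = majority_vote(["drop", "no_drop", "drop"])
mentions = count_terms_near(
    [(10.0, "that DROP!"), (30.0, "drop"), (12.0, "nice")],
    event_time=11.0,
    terms=["drop"],
)
```

Matching here is a case-insensitive substring test, so a term such as "drop" also matches "dropped"; a stricter tokenizer may be preferable in practice.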
| Original language | English |
|---|---|
| Title of host publication | MediaEval 2014 Multimedia Benchmark Workshop |
| Subtitle of host publication | Working Notes Proceedings of the MediaEval 2014 Workshop Barcelona, Catalunya, Spain, October 16-17, 2014. |
| Editors | Martha Larson, Bogdan Ionescu, Xavier Anguera |
| Number of pages | 2 |
| Publication status | Published - 2014 |
Publication series
| Name | CEUR workshop proceedings |
|---|---|
| Volume | 1263 |
| ISSN (Electronic) | 1613-0073 |