Comparing Audio Boundary Annotation of Vocal Polyphony: Experts, Non-experts, and Algorithms

Research output: Contribution to conference > Paper > Academic

Abstract

Segmenting vocal polyphony from the Renaissance and early Baroque is a challenging computational problem. In this genre, boundaries between segments are often hidden by overlapping voices. To test segmentation algorithms, we need human boundary annotations as ground truth, but experts in this field are rare and short on time.
Our study aims to evaluate the effectiveness of segmentation algorithms on vocal polyphony using both expert and non-expert annotations. To this end, we collect boundary annotations of vocal polyphony from human experts and non-experts. We then compare the annotations of the two groups to see whether segmentations by non-experts can be used instead of those by experts. Finally, we use the expert annotations to evaluate different segmentation algorithms from the MSAF library by Nieto and Bello.
The results show that the performance of non-experts comes quite close to that of experts, whereas the tested algorithms are not yet able to perform the task at a similar level. We conclude that non-expert annotations are adequate to act as ground truth for evaluating boundary detectors on vocal polyphony and we present next steps to create a larger dataset for such evaluations.
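The abstract does not say which agreement measure was used. As an illustration only, the sketch below computes a boundary hit rate (precision, recall, F-measure) with a tolerance window, a measure widely used in boundary-detection evaluation. The function name, the 0.5-second window, and the timestamps are hypothetical and are not taken from the paper or from MSAF.

```python
def boundary_hit_rate(reference, estimated, window=0.5):
    """Return (precision, recall, f_measure) for boundary times in seconds.

    An estimated boundary is a hit when it lies within `window` seconds
    of a reference boundary. Each boundary is matched at most once.
    """
    ref = sorted(reference)
    est = sorted(estimated)
    if not ref or not est:
        return 0.0, 0.0, 0.0

    # Two-pointer sweep over both sorted lists.
    hits = 0
    i = j = 0
    while i < len(ref) and j < len(est):
        if abs(ref[i] - est[j]) <= window:
            hits += 1
            i += 1
            j += 1
        elif est[j] < ref[i]:
            j += 1  # estimate is too early to match any remaining reference
        else:
            i += 1  # reference is too early to match any remaining estimate

    precision = hits / len(est)
    recall = hits / len(ref)
    if hits == 0:
        return precision, recall, 0.0
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


# Hypothetical boundary times (seconds), made up for illustration.
expert = [10.0, 25.0, 40.0]
non_expert = [10.3, 26.0, 39.8, 55.0]
precision, recall, f_measure = boundary_hit_rate(expert, non_expert)
print(f"P={precision:.2f} R={recall:.2f} F={f_measure:.2f}")
```

The same function could score an algorithm's output against expert boundaries by passing the algorithm's boundary times as `estimated`. A wider window makes the comparison more forgiving of small timing differences.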
Original language: English
Pages: 1-8
Number of pages: 8
Publication status: Published - 6 Jul 2024
Event: Sound and Music Computing Conference - School of Music and Performing Arts, Porto, Portugal
Duration: 4 Jul 2024 to 6 Jul 2024
https://smcnetwork.org/smc2024

Conference

Conference: Sound and Music Computing Conference
Abbreviated title: SMC
Country/Territory: Portugal
City: Porto
Period: 4/07/24 to 6/07/24

Keywords

  • Renaissance Music
  • boundaries
  • Music perception
  • cadences
  • sound and music computing
