Abstract
This study assesses the performance of a state-of-the-art automatic speech recognition (ASR) system at extracting target words in two different speech registers: infant-directed speech (IDS) and adult-directed speech (ADS). We used the Kaldi-NL ASR service, developed by the Dutch Foundation of Open Speech Technology. The results indicate that the tool is much less accurate on IDS than on ADS: differences between the two registers negatively affect the performance of the existing ASR system. New tools therefore need to be developed for the automatic annotation of IDS. Nevertheless, the ASR system already finds more than half of the target words, which is promising.
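As an illustration of the kind of measure described in the abstract, the following minimal sketch computes the share of target words that appear in an ASR transcript, separately per register. It is not the authors' evaluation procedure: the function name `target_word_hit_rate`, the simple whole-word matching rule, and the example utterances are all assumptions made here for illustration, and the toy data are invented rather than taken from the study.

```python
import re


def target_word_hit_rate(pairs):
    """Return the fraction of (target_word, asr_transcript) pairs in which
    the target word occurs as a whole token in the transcript.

    Hypothetical helper for illustration; matching is case-insensitive
    and based on simple word tokenisation.
    """
    pairs = list(pairs)
    if not pairs:
        raise ValueError("at least one (target, transcript) pair is required")

    hits = 0
    for target, transcript in pairs:
        # Split the transcript into lowercase word tokens.
        tokens = re.findall(r"\w+", transcript.lower())
        if target.lower() in tokens:
            hits += 1
    return hits / len(pairs)


# Invented toy data (NOT from the study): target word and ASR output.
ids_pairs = [
    ("bal", "kijk de bal"),
    ("auto", "waar is de outo"),  # target misrecognised by the ASR
]
ads_pairs = [
    ("bal", "ik heb de bal gepakt"),
    ("auto", "de auto staat buiten"),
]

ids_rate = target_word_hit_rate(ids_pairs)
ads_rate = target_word_hit_rate(ads_pairs)
print(f"IDS hit rate: {ids_rate:.2f}")
print(f"ADS hit rate: {ads_rate:.2f}")
```

Whole-token matching is used so that a target such as "bal" is not counted when it only occurs inside a longer word; a real evaluation might also need to handle inflected forms or time alignment, which this sketch ignores.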
Original language | English |
---|---|
Title of host publication | ICMI '20 Companion |
Subtitle of host publication | Companion Publication of the 2020 International Conference on Multimodal Interaction |
Place of Publication | New York |
Publisher | Association for Computing Machinery |
Pages | 522 |
Number of pages | 1 |
ISBN (Print) | 978-1-4503-8002-7 |
Publication status | Published - 2020 |
Event | ACM International Conference on Multimodal Interaction - ICMI2020 is going virtual, Utrecht, Netherlands. Duration: 25 Oct 2020 → 29 Oct 2020. Conference number: 22 |
Conference
Conference | ACM International Conference on Multimodal Interaction |
---|---|
Country/Territory | Netherlands |
City | Utrecht |
Period | 25/10/20 → 29/10/20 |
Keywords
- infant-directed speech
- automatic speech recognition
- keyword extraction
- speech registers