LSTM based cross-corpus and cross-task acoustic emotion recognition

Heysem Kaya, Dmitrii Fedotov, Ali Yeşilkanat, Oxana Verkholyak, Yang Zhang, Alexey Karpov

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Academic › peer-review

Abstract

Acoustic emotion recognition is a popular and central research direction in paralinguistic analysis, due to its relation to a wide range of affective states/traits and its manifold applications. Developing highly generalizable models remains a challenge for researchers and engineers because of the multitude of nuisance factors. To ensure generalization, deployed models need to handle spontaneous speech recorded under acoustic conditions that differ from those of the training set. This requires that the models be tested for cross-corpus robustness. In this work, we first investigate the suitability of Long Short-Term Memory (LSTM) models trained with time- and space-continuously annotated affective primitives for cross-corpus acoustic emotion recognition. We then employ an effective approach that uses the frame-level valence and arousal predictions of LSTM models for utterance-level affect classification, and apply this approach to the ComParE 2018 challenge corpora. The proposed method alone gives promising results on both the development and test sets of the Self-Assessed Affect Sub-Challenge. On the development set, the cross-corpus prediction based method boosts performance when fused with the top components of the baseline system. The results indicate the suitability of the proposed method for both time-continuous and utterance-level cross-corpus acoustic emotion recognition tasks.
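
The two-stage approach described above (an LSTM regressing frame-level valence and arousal, whose predictions are then pooled for utterance-level affect classification) can be illustrated with a minimal sketch. This is not the authors' implementation: the feature dimensionality, hidden size, pooling functionals, and frame rate below are assumptions, and the final utterance-level classifier (e.g., an SVM over the pooled descriptors) is omitted.

```python
# Minimal sketch (assumed design, not the paper's code): an LSTM maps
# acoustic low-level descriptors to per-frame (arousal, valence) values,
# and the frame trajectories are pooled into an utterance-level vector.
import torch
import torch.nn as nn

class FrameAffectLSTM(nn.Module):
    """Sequence of acoustic frames -> per-frame (arousal, valence)."""
    def __init__(self, n_features=40, hidden=64):  # dimensions are assumptions
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 2)           # arousal, valence

    def forward(self, x):        # x: (batch, time, n_features)
        h, _ = self.lstm(x)
        return self.head(h)      # (batch, time, 2)

def utterance_descriptor(frame_preds):
    """Pool frame-level predictions into utterance-level functionals
    (mean and standard deviation here; an assumed choice)."""
    return torch.cat([frame_preds.mean(dim=1), frame_preds.std(dim=1)], dim=-1)

# Usage: pooled descriptors would feed an utterance-level affect classifier.
model = FrameAffectLSTM()
utt = torch.randn(1, 500, 40)    # ~5 s of 10 ms frames (assumed frame rate)
with torch.no_grad():
    feats = utterance_descriptor(model(utt))
print(feats.shape)               # torch.Size([1, 4])
```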

Original language: English
Title of host publication: INTERSPEECH-2018
Pages: 521-525
Number of pages: 5
Volume: 2018-September
DOIs
Publication status: Published - 1 Sept 2018
Event: 19th Annual Conference of the International Speech Communication Association, INTERSPEECH 2018 - Hyderabad, India
Duration: 2 Sept 2018 - 6 Sept 2018

Conference

Conference: 19th Annual Conference of the International Speech Communication Association, INTERSPEECH 2018
Country/Territory: India
City: Hyderabad
Period: 2/09/18 - 6/09/18

Funding

The participation in the ComParE 2018 challenge with experiments on the USoMS corpus (Section 4) was supported exclusively by the Russian Science Foundation (Project No. 18-11-00145). The rest of the research was supported by the Huawei Innovation Research Program (Agreement No. HO2017050001BM).

Keywords

  • Computational paralinguistics
  • Context modeling
  • Cross-corpus emotion recognition
  • LSTM
  • Speech emotion recognition
