Efficient and effective strategies for cross-corpus acoustic emotion recognition

Heysem Kaya*, Alexey A. Karpov

*Corresponding author for this work

Research output: Contribution to journal › Article › Academic › peer-review

Abstract

An important research direction in speech technology is robust cross-corpus and cross-language emotion recognition. In this paper, we propose computationally efficient and performance-effective feature normalization strategies for the challenging task of cross-corpus acoustic emotion recognition. In particular, we deploy a cascaded normalization approach that combines linear speaker-level, nonlinear value-level, and feature-vector-level normalization to minimize speaker- and corpus-related effects and to maximize class separability with linear kernel classifiers. We use extreme learning machine classifiers on five corpora representing five languages from different families, namely Danish, English, German, Russian and Turkish. Using a standard set of suprasegmental features, the proposed normalization strategies show superior performance compared to benchmark normalization approaches commonly used in the literature.
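
The abstract names three cascaded stages but does not specify the operation used at each one. The following is only a minimal sketch of how such a cascade could look, assuming per-speaker z-normalization for the linear speaker-level step, a signed power transform for the nonlinear value-level step, and L2 scaling for the feature-vector-level step. The function names, the exponent `alpha=0.5`, and these three choices are illustrative assumptions, not details taken from the paper.

```python
import numpy as np


def speaker_znorm(X, speakers):
    """Linear, speaker-level step (assumed): z-score each feature per speaker."""
    X = np.asarray(X, dtype=float)
    speakers = np.asarray(speakers)
    out = np.empty_like(X)
    for s in np.unique(speakers):
        idx = speakers == s
        mu = X[idx].mean(axis=0)
        sd = X[idx].std(axis=0)
        sd[sd == 0] = 1.0  # avoid division by zero for constant features
        out[idx] = (X[idx] - mu) / sd
    return out


def power_norm(X, alpha=0.5):
    """Nonlinear, value-level step (assumed): signed power transform."""
    return np.sign(X) * np.abs(X) ** alpha


def l2_norm(X, eps=1e-12):
    """Feature-vector-level step (assumed): scale each row to unit L2 norm."""
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    return X / np.maximum(norms, eps)


def cascaded_normalize(X, speakers, alpha=0.5):
    """Apply the three assumed stages in sequence."""
    return l2_norm(power_norm(speaker_znorm(X, speakers), alpha))


# Toy usage: six utterances, four features, two hypothetical speakers.
rng = np.random.default_rng(0)
X = rng.normal(size=(6, 4)) * 5 + 3
speakers = [0, 0, 0, 1, 1, 1]
Z = cascaded_normalize(X, speakers)
```

In this sketch every stage is a closed-form, per-utterance or per-speaker operation with no iterative fitting, which is in the spirit of the computational efficiency the abstract emphasizes. The normalized rows of `Z` could then be passed to any linear-kernel classifier.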

Original language: English
Pages (from-to): 1028-1034
Number of pages: 7
Journal: Neurocomputing
Volume: 275
Publication status: Published - 31 Jan 2018

Funding

This research is partially supported by the Russian Foundation for Basic Research (project № 16-37-60100) and by the Council for Grants of the President of Russia (project № MD-254.2017.8).

Keywords

  • Acoustic emotion recognition
  • Cross-corpus adaptation
  • Extreme learning machines
