Speaker- And corpus-independent methods for affect classification in computational paralinguistics

Heysem Kaya*

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Academic › peer-review

Abstract

The analysis of spoken emotions is of increasing interest in human-computer interaction, as a way of making machine communication more humane. It has manifold applications, ranging from intelligent tutoring systems to affect-sensitive robots and from smart call centers to patient telemonitoring. In general, the study of computational paralinguistics, which covers the analysis of speaker states and traits, faces the real-life challenges of inter-speaker and inter-corpus variability. This paper gives a brief summary of the progress and future directions of my PhD study, titled Adaptive Mixture Models for Speech Emotion Recognition, which targets these challenges. An automatic mixture model selection method for Mixture of Factor Analyzers is proposed for modeling high-dimensional data. To provide this statistical method with a compact set of potent features, novel feature selection methods based on Canonical Correlation Analysis are introduced.
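As a rough illustration of what feature selection based on Canonical Correlation Analysis can look like, the sketch below ranks features by the absolute value of their weight in the first canonical direction between the feature matrix and the target. It is a minimal sketch under stated assumptions, not the method proposed in the paper: the function name `cca_feature_ranking`, the ridge term `reg`, and the synthetic data are all hypothetical and chosen only for illustration.

```python
import numpy as np


def cca_feature_ranking(X, y, reg=1e-6):
    """Rank features by |weight| in the first canonical direction of X vs. y.

    Hypothetical helper for illustration; `reg` is an assumed ridge term
    that keeps the covariance matrices invertible.
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(y, dtype=float).reshape(len(X), -1)

    # Standardize features so that weights are comparable across features.
    Xs = (X - X.mean(axis=0)) / (X.std(axis=0) + 1e-12)
    Yc = Y - Y.mean(axis=0)

    n = len(Xs)
    Sxx = Xs.T @ Xs / (n - 1) + reg * np.eye(Xs.shape[1])
    Syy = Yc.T @ Yc / (n - 1) + reg * np.eye(Yc.shape[1])
    Sxy = Xs.T @ Yc / (n - 1)

    # Canonical directions for X are eigenvectors of
    # Sxx^{-1} Sxy Syy^{-1} Syx; eigenvalues are squared correlations.
    M = np.linalg.solve(Sxx, Sxy) @ np.linalg.solve(Syy, Sxy.T)
    eigvals, eigvecs = np.linalg.eig(M)
    top = np.argmax(eigvals.real)
    weights = eigvecs[:, top].real

    scores = np.abs(weights)
    ranking = np.argsort(scores)[::-1]  # most influential feature first
    return ranking, scores


# Synthetic example: only feature 2 carries information about the target.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 5))
y_demo = 3.0 * X_demo[:, 2] + 0.1 * rng.normal(size=200)
ranking, scores = cca_feature_ranking(X_demo, y_demo)
```

On this synthetic data the informative feature is expected to come first in `ranking`; a compact feature set could then be formed by keeping the top-ranked entries before fitting a mixture model.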

Original language: English
Title of host publication: ICMI 2014 - Proceedings of the 2014 International Conference on Multimodal Interaction
Publisher: Association for Computing Machinery
Pages: 359-363
Number of pages: 5
ISBN (Electronic): 9781450328852
Publication status: Published - 12 Nov 2014
Event: 16th ACM International Conference on Multimodal Interaction, ICMI 2014 - Istanbul, Turkey
Duration: 12 Nov 2014 to 16 Nov 2014

Conference

Conference: 16th ACM International Conference on Multimodal Interaction, ICMI 2014
Country/Territory: Turkey
City: Istanbul
Period: 12/11/14 to 16/11/14

Keywords

  • Canonical correlation analysis
  • Depression recognition
  • Factor analysis
  • Feature extraction
  • Local fisher discriminant analysis
  • Mixture modeling
  • Mixture of factor analyzers
  • Speech emotion recognition
  • Audio-visual fusion
