Abstract
This paper presents our contribution to the ACM ICMI 2015 Emotion Recognition in the Wild Challenge (EmotiW 2015). We participate in both the static facial expression (SFEW) and the audio-visual emotion recognition (AFEW) challenges. In both challenges, we use a set of visual descriptors together with their early and late fusion schemes. For AFEW, we also exploit a set of widely used spatio-temporal modeling alternatives and carry out multi-modal fusion. For classification, we employ two least squares regression based learners that have been shown to be fast and accurate on former EmotiW Challenge corpora: Partial Least Squares (PLS) regression and Kernel Extreme Learning Machines (ELM), the latter being closely related to Kernel Regularized Least Squares. We use a General Procrustes Analysis (GPA) based alignment for face registration. By employing different alignments, descriptor types, video modeling strategies and classifiers, we diversify the learners to improve the final fusion performance. The test set accuracies reached in both challenges exceed the respective baselines by 25% in relative terms.
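The abstract does not give implementation details, so the following is only a rough sketch of how a kernel ELM classifier and a weighted late fusion of classifier scores could look. It assumes an RBF kernel, one-vs-all targets of +1/-1, and the commonly cited closed-form solution in which the output weights solve (I/C + K) β = T, which is the same linear system as in kernel regularized least squares. The function names, the regularization value `C`, the kernel width `gamma`, and the fusion weights are illustrative assumptions, not values from the paper.

```python
import numpy as np

def rbf_kernel(A, B, gamma=0.5):
    # Squared Euclidean distance between every row of A and every row of B.
    sq = (A ** 2).sum(1)[:, None] + (B ** 2).sum(1)[None, :] - 2.0 * A @ B.T
    # Clip tiny negative values caused by floating point rounding.
    return np.exp(-gamma * np.maximum(sq, 0.0))

def kernel_elm_fit(K, labels, n_classes, C=10.0):
    # One-vs-all target matrix: +1 for the true class, -1 elsewhere.
    T = -np.ones((K.shape[0], n_classes))
    T[np.arange(K.shape[0]), labels] = 1.0
    # Regularized least squares in closed form: (I/C + K) beta = T.
    return np.linalg.solve(np.eye(K.shape[0]) / C + K, T)

def kernel_elm_scores(K_test_train, beta):
    # One score per class; the predicted label is the argmax over columns.
    return K_test_train @ beta

def late_fusion(score_list, weights):
    # Weighted average of per-model score matrices (hypothetical weights).
    w = np.asarray(weights, dtype=float)
    stacked = np.stack(score_list)  # shape: (models, samples, classes)
    return np.tensordot(w, stacked, axes=1) / w.sum()
```

In such a setup, each descriptor or modality would get its own kernel and its own `beta`, and the resulting score matrices would be passed to `late_fusion` before taking the argmax. How the fusion weights were actually chosen is not stated in the abstract.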
Original language | English |
---|---|
Title of host publication | ICMI 2015 - Proceedings of the 2015 ACM International Conference on Multimodal Interaction |
Publisher | Association for Computing Machinery |
Pages | 459-466 |
Number of pages | 8 |
ISBN (Electronic) | 9781450339124 |
Publication status | Published - 9 Nov 2015 |
Event | ACM International Conference on Multimodal Interaction, ICMI 2015 - Seattle, United States. Duration: 9 Nov 2015 → 13 Nov 2015 |
Conference
Conference | ACM International Conference on Multimodal Interaction, ICMI 2015 |
---|---|
Country/Territory | United States |
City | Seattle |
Period | 9 Nov 2015 → 13 Nov 2015 |
Keywords
- AFEW
- Audio-visual emotion corpus
- Audio-visual fusion
- Emotion Recognition in the wild
- Feature extraction
- SFEW