Abstract
Automatic personality traits assessment (PTA) provides high-level, intelligible predictive inputs for critical downstream tasks, such as job interview recommendations and mental healthcare monitoring. In this work, we introduce a novel Multimodal Personality Traits Assessment (MuPTA) corpus. Our MuPTA corpus is unique in that it contains both spontaneous and read speech collected in the mid-resourced Russian language. We present a novel audio-visual approach for PTA that is used to establish baseline results on this corpus. We further analyze the impact of spontaneous and read speech types on PTA predictive performance. We find that for the audio modality, PTA predictive performance on short signals is nearly equal regardless of the speech type, while PTA using the video modality is more accurate with spontaneous speech than with read speech regardless of the signal length.
| Original language | English |
|---|---|
| Pages | 4049-4053 |
| Number of pages | 5 |
| DOIs | |
| Publication status | Published - 20 Aug 2023 |
| Event | INTERSPEECH 2023 - Dublin, Ireland; Duration: 20 Aug 2023 → 24 Aug 2023; https://interspeech2023.org/ |
Conference
| Conference | INTERSPEECH 2023 |
|---|---|
| Abbreviated title | INTERSPEECH 2023 |
| Country/Territory | Ireland |
| City | Dublin |
| Period | 20/08/23 → 24/08/23 |
| Internet address | https://interspeech2023.org/ |
Bibliographical note
Funding Information: This work was supported by the Analytical Center for the Government of the Russian Federation (IGK 000000D730321P5Q0002), agreement No. 70-2021-00141.
Publisher Copyright:
© 2023 International Speech Communication Association. All rights reserved.
Keywords
- audio-visual resources
- big five traits
- data annotation
- multimodal paralinguistics
- personality computing