Abstract
Word embeddings, widely used as semantic feature representations in diverse NLP tasks, can exhibit undesired bias towards stereotypical categories. This bias arises from statistical and societal biases in the datasets used for training. In this study, we analyze gender bias in four different pre-trained word embeddings across a range of affective computing tasks in the mental health domain, including the detection of psychiatric disorders such as depression and alcohol/substance abuse. We consider both contextual and non-contextual embeddings, trained not only on general-domain data but also on data specific to the clinical domain. Our findings indicate that embeddings are biased towards different gender groups, depending on the type of embedding and the training dataset. Furthermore, we highlight how these existing associations transfer to downstream tasks and may even be amplified during supervised training for patient phenotyping. We also show that a simple data augmentation method, swapping gender words, noticeably reduces bias in these downstream tasks. The scripts to reproduce the results are available at: https://github.com/gizemsogancioglu/gender-bias-mental-health.
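The abstract's bias-mitigation strategy is gender-swap data augmentation: every training text is duplicated with gendered words replaced by their counterparts. The following is a minimal sketch of that idea, assuming a hand-written (and deliberately incomplete) word-pair list; the function names and pair list are illustrative assumptions, not the paper's released implementation (see the GitHub link above for that).

```python
# Minimal sketch of gender-swap data augmentation (illustrative, not the paper's code).
import re

# Hypothetical, non-exhaustive list of gendered word pairs.
GENDER_PAIRS = [
    ("he", "she"), ("him", "her"), ("his", "her"),
    ("man", "woman"), ("men", "women"),
    ("male", "female"), ("father", "mother"),
    ("son", "daughter"), ("brother", "sister"),
    ("husband", "wife"),
]

# Build a bidirectional swap table. Note: "her" is ambiguous (him/his);
# a real implementation would need part-of-speech disambiguation.
SWAP = {}
for m, f in GENDER_PAIRS:
    SWAP[m] = f
    SWAP[f] = m


def swap_gender_words(text: str) -> str:
    """Return a copy of `text` with gendered words replaced by their counterparts."""
    def _swap(match: re.Match) -> str:
        word = match.group(0)
        swapped = SWAP.get(word.lower(), word)
        # Preserve simple sentence-initial capitalization (e.g., "He" -> "She").
        return swapped.capitalize() if word[0].isupper() else swapped

    return re.sub(r"\b\w+\b", _swap, text)


def augment(corpus: list[str]) -> list[str]:
    """Append a gender-swapped copy of every note to the training corpus."""
    return corpus + [swap_gender_words(note) for note in corpus]


if __name__ == "__main__":
    note = "He reports that his brother has a history of alcohol abuse."
    print(swap_gender_words(note))
    # -> "She reports that her sister has a history of alcohol abuse."
```

Training a phenotyping classifier on the augmented corpus exposes it to both gendered variants of each note, which is the mechanism by which this augmentation can reduce the gender bias picked up from the embeddings and the original data.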
Original language | English |
---|---|
Title of host publication | 2023 11th International Conference on Affective Computing and Intelligent Interaction (ACII) |
Publisher | IEEE |
Pages | 1-8 |
Number of pages | 8 |
ISBN (Print) | 979-8-3503-2744-1 |
DOIs | |
Publication status | Published - 13 Sept 2023 |
Event | 2023 11th International Conference on Affective Computing and Intelligent Interaction (ACII), 10 Sept 2023 → 13 Sept 2023 |
Conference
Conference | 2023 11th International Conference on Affective Computing and Intelligent Interaction (ACII) |
---|---|
Period | 10/09/23 → 13/09/23 |
Keywords
- fairness
- bias mitigation
- gender bias
- fairness in machine learning
- bias in mental health
- depression