Words Matter: A Computational Toolkit For Charged Terms

Research output: Thesis › Doctoral thesis 1 (Research UU / Graduation UU)

Abstract

This thesis investigates word-level biases, employing computational linguistics methods to support decolonisation efforts within cultural heritage institutions. Museum catalogues often contain contested terminology shaped by colonial legacies. Identifying and retrospectively handling such terms, and the negative biases they potentially propagate, is a key activity in current decolonisation initiatives of museums in the Western world. The research develops and demonstrates the utility of computational methods for detecting and analysing the biases of contested and potentially contested terms, with the goal of providing interpretable insights to heritage professionals. Through a series of studies spanning historical newspapers, literary fiction, and social media, the thesis proposes methodologies and supporting pipelines that identify the key behaviours, attributes, received behaviours, and linguistic markers associated with known problematic terms, presenting these as core vectors of social biasing for interpretation. Outcomes are shown to align well with known biases of well-recognised problematic terminology. In addition to surface-level context features, the research explores proxy signals for prejudicial narratives, specifically offering empirical support for the phenomenon of aporophobia (disdain for poverty) by revealing the disproportionate association of low socio-economic contexts with negatively connoted topics. Additionally, the thesis introduces the ConConCor dataset, a collection of multi-sentence contexts annotated for offensiveness, which offers a foundation for future studies into subjective judgments of harm in contested language. Overall, the research provides a methodological and conceptual framework for uncovering latent biases in cultural data, equipping institutions with tools to help facilitate decolonisation efforts.
Original language: English
Qualification: Doctor of Philosophy
Awarding Institution
  • Utrecht University
Supervisors/Advisors
  • van den Bosch, Antal, Supervisor
  • Hollink, L., Co-supervisor, External person
  • van Erp, M.G.J. (Marieke), Co-supervisor, External person
Award date: 10 Jul 2025
Publication status: Published - 10 Jul 2025

Keywords

  • NLP
  • language models
  • linguistic variations
  • sociolinguistics
  • corpus linguistics
  • structural causal modelling
  • decolonisation
  • context analysis
