Abstract
We extract nouns and their co-occurring targeted context features from a large corpus of Dutch-language newspaper articles spanning the 1950s through the 1990s. Applying a well-established approach for scoring the association between a context feature and its centre word, we explore using these scores to identify key characteristics of known charged terminology. We then use these features to draw parallels between known charged terms and other terms. In the context of current decolonisation efforts among museum institutions, such approaches offer an opportunity to condense large quantities of data into the most salient information for digestion by heritage professionals. The methods yielded insights into both known and candidate charged terms.
Original language | English |
---|---|
Title of host publication | Language, Data and Knowledge 2023 (LDK 2023) |
Subtitle of host publication | Proceedings of the 4th Conference on Language, Data and Knowledge |
Pages | 97-108 |
ISBN (Electronic) | 978-989-54081-5-3 |
DOIs | |
Publication status | Published - 23 Sept 2023 |
Keywords
- language
- data
- knowledge