Abstract
Obtaining timely information about an event such as a protest is becoming increasingly relevant with the rise of affective polarisation and social unrest around the world. Nowadays, large-scale protests tend to be organised and broadcast through social media, and analysing social media platforms like X has proven to be an effective way to follow events during a protest. We therefore trained several language models on Dutch tweets to analyse their ability to classify whether a tweet expresses discontent, since such tweets may contain practical information about a protest. Our results show that models pre-trained on Twitter data, including Bernice and TwHIN-BERT, outperform models that are not. Additionally, the results show that Sentence Transformers is a promising model. The added value of oversampling is greater for models that were not trained on Twitter data. In line with previous work, pre-processing the data did not help a transformer language model make better predictions.
Original language | English |
---|---|
Pages | 6-23 |
Number of pages | 18 |
Publication status | Published - Mar 2024 |
Event | 7th Workshop on Challenges and Applications of Automated Extraction of Socio-political Events from Text, Radisson Blu, St Julians, Malta. Duration: 22 Mar 2024 → 22 Mar 2024. https://aclanthology.org/2024.case-1.0 |
Workshop
Workshop | 7th Workshop on Challenges and Applications of Automated Extraction of Socio-political Events from Text |
---|---|
Abbreviated title | CASE 2024 |
Country/Territory | Malta |
City | St Julians |
Period | 22/03/24 → 22/03/24 |
Bibliographical note
Publisher Copyright: © 2024 Association for Computational Linguistics.
Funding
This work was supported by the Swedish Research Council, award number 2022-03196.
Funders | Funder number |
---|---|
Vetenskapsrådet | 2022-03196 |