JST and rJST: joint estimation of sentiment and topics in textual data using a semi-supervised approach

Christian Pipal*, Martijn Schoonvelde, Gijs Schumacher, Max Boiten

*Corresponding author for this work

Research output: Contribution to journal, Article, Academic, peer-review

Abstract

This paper demonstrates the performance of the Joint Sentiment Topic model (JST) and the reversed Joint Sentiment Topic model (rJST) in measuring sentiment in political speeches, comparing them against a set of popular methods for sentiment analysis: widely used off-the-shelf sentiment dictionaries; an embeddings-enhanced dictionary approach; Latent Semantic Scaling, a semi-supervised approach; and a zero-shot transformer-based approach using a large language model (GPT-4). The findings reveal JST’s superiority over all non-transformer-based approaches in predicting human-coded sentiment in multiple languages and its ability to replicate known sentiment trends in legislative speech. rJST, meanwhile, provides valuable topic-specific sentiment estimates, responsive to political dynamics and significant events. Both models are, however, outperformed by transformer-based models like GPT-4. Additionally, the paper introduces the ‘sentitopics’ R package, designed to facilitate the use of JST and rJST in computational text analysis workflows. This package is compatible with popular text analysis tools, making the models accessible for applied researchers in communication science.

Original language: English
Journal: Communication Methods and Measures
Publication status: E-pub ahead of print - 18 Oct 2024

Bibliographical note

Publisher Copyright:
© 2024 The Author(s). Published with license by Taylor & Francis Group, LLC.

Funding

Pipal and Schumacher received funding from the European Research Council (ERC) under the European Union's Horizon 2020 research and innovation programme under grant agreement No 759079, POLEMIC.

Funder: European Research Council (ERC)
Funder number: 759079
