Abstract
The move towards preserving judgement disagreements in NLP requires the identification of adequate evaluation metrics. We identify a set of key properties that such metrics should have, and assess the extent to which natural candidates for soft evaluation, such as Cross Entropy, satisfy these properties. We employ a theoretical framework, supported by a visual approach, by practical examples, and by the analysis of a real case scenario. Our results indicate that Cross Entropy can yield fairly paradoxical results in some cases, whereas other measures, such as Manhattan distance and Euclidean distance, exhibit more intuitive behavior, at least for the case of binary classification.
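As a rough illustration of the three measures named in the abstract, the sketch below compares a soft (disagreement-preserving) target distribution with a predicted distribution for a binary item. The abstract does not give the paper's exact definitions, so the function names, the `eps` clipping constant, and the example distributions are all assumptions made for illustration. It shows one well-known property of Cross Entropy, which may or may not be the paradox the authors analyse: its minimum equals the entropy of the target, so a perfect prediction on a contested item can score worse than an imperfect prediction on a unanimous one.

```python
import math

def cross_entropy(target, pred, eps=1e-12):
    """Cross entropy H(target, pred) in nats; pred is clipped at eps to avoid log(0)."""
    return -sum(t * math.log(max(p, eps)) for t, p in zip(target, pred))

def manhattan(target, pred):
    """L1 distance between two label distributions."""
    return sum(abs(t - p) for t, p in zip(target, pred))

def euclidean(target, pred):
    """L2 distance between two label distributions."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(target, pred)))

# Hypothetical binary items (not taken from the paper).
contested_target = [0.5, 0.5]        # annotators split evenly
unanimous_target = [1.0, 0.0]        # annotators all agree

perfect_prediction = [0.5, 0.5]      # matches the contested target exactly
imperfect_prediction = [0.6, 0.4]    # misses the unanimous target

for name, metric in [("cross entropy", cross_entropy),
                     ("manhattan", manhattan),
                     ("euclidean", euclidean)]:
    on_contested = metric(contested_target, perfect_prediction)
    on_unanimous = metric(unanimous_target, imperfect_prediction)
    print(f"{name}: perfect/contested={on_contested:.3f}, "
          f"imperfect/unanimous={on_unanimous:.3f}")
```

Under these assumptions, Cross Entropy assigns the perfect prediction on the contested item a higher (worse) value, roughly 0.69, than the imperfect prediction on the unanimous item, roughly 0.51, while both distances are zero for the perfect prediction and positive for the imperfect one.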
Original language | English |
---|---|
Title of host publication | 3rd Workshop on Perspectivist Approaches to NLP, NLPerspectives 2024 at LREC-COLING 2024 - Workshop Proceedings |
Editors | Gavin Abercrombie, Valerio Basile, Davide Bernardi, Shiran Dudy, Simona Frenda, Lucy Havens, Sara Tonelli |
Publisher | European Language Resources Association (ELRA) |
Pages | 84-94 |
Number of pages | 11 |
ISBN (Electronic) | 9782493814234 |
ISBN (Print) | 9782493814234 |
Publication status | Published - 21 May 2024 |
Event | 3rd Workshop on Perspectivist Approaches to NLP, NLPerspectives 2024 - Torino, Italy. Duration: 21 May 2024 → … |
Publication series
Name | 3rd Workshop on Perspectivist Approaches to NLP, NLPerspectives 2024 at LREC-COLING 2024 - Workshop Proceedings |
---|
Conference
Conference | 3rd Workshop on Perspectivist Approaches to NLP, NLPerspectives 2024 |
---|---|
Country/Territory | Italy |
City | Torino |
Period | 21/05/24 → … |
Bibliographical note
Publisher Copyright: © 2024 ELRA Language Resource Association.