Abstract
In this paper we showcase and evaluate MWE-Finder, a system that allows users to search for occurrences of a multiword expression (MWE) in a large Dutch text corpus. To this end, we conduct three small case studies and discuss the results in detail. The case studies use the MWEs 0geen *+haan zal naar iets kraaien ‘no one will say anything about something’, iemand zal 0dat *+varken wassen ‘someone will deal with that problem’ and iemand zal iemand het hemd van het lijf vragen ‘someone will want to know all the ins and outs of something from someone’, all given in canonical form following Odijk (2023) and Odijk and Kroon (2024).
The results show that MWE-Finder is very accurate in retrieving the target MWEs, reaching an accuracy of 93.7% and an F1-score of 95.2%. The case studies also reveal points for improvement in MWE-Finder, specifically the enrichment of syntactic parses by making the object relation explicit in certain constructions.
Original language | English |
---|---|
Title of host publication | Selected papers from the CLARIN Annual Conference 2023 |
Place of Publication | Linköping |
Publisher | Linköping University Electronic Press, Linköpings universitet |
Chapter | 9 |
Pages | 126-142 |
ISBN (Electronic) | 978-91-8075-740-9 |
DOIs | |
Publication status | Published - 11 Jul 2024 |
Publication series
Name | Linköping Electronic Conference Proceedings |
---|---|
Publisher | Linköping University Press |
Volume | 210 |
ISSN (Print) | 1650-3686 |
ISSN (Electronic) | 1650-3740 |
Keywords
- Multiword Expressions
- Dutch
- GrETEL
- treebanks