A Canonical Form for Flexible Multiword Expressions

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; Academic; peer-review

Abstract

This paper proposes a canonical form for Multiword Expressions (MWEs), in particular for the Dutch language. The canonical form can be enriched with annotations that describe the properties of the MWE and its components. The paper also introduces the DUCAME (DUtch CAnonical Multiword Expressions) lexical resource, which contains more than 11k MWEs in canonical form. DUCAME is used in MWE-Finder to automatically generate queries for searching for flexible MWEs in large text corpora.
Original language: English
Title of host publication: 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation, LREC-COLING 2024 - Main Conference Proceedings
Editors: Nicoletta Calzolari, Min-Yen Kan, Veronique Hoste, Alessandro Lenci, Sakriani Sakti, Nianwen Xue
Place of Publication: Torino, Italia
Publisher: ELRA and ICCL
Pages: 91-101
Number of pages: 11
ISBN (Electronic): 9782493814104
Publication status: Published - 1 May 2024

Bibliographical note

Publisher Copyright:
© 2024 ELRA Language Resource Association: CC BY-NC 4.0.

Keywords

  • Dutch
  • automatic query generation
  • canonical form
  • design principles
  • multiword expressions
  • searching for multiword expressions
