Abstract
Developments in the Dutch language during the 17th century, part of the Early Modern period, form an active research topic in historical linguistics and literature. To enable automatic quantitative analysis, a corpus of letters by the 17th-century Dutch author and politician P.C. Hooft is manually annotated with parts of speech, document segmentation and sociolinguistic metadata. The corpus is developed as part of the Nederlab online research portal, which is available through the CLARIN ERIC European research infrastructure. This paper discusses the design and evaluation of the annotation effort, as well as the process of adding new annotations to an existing annotated corpus.
Original language | English |
---|---|
Title of host publication | Proceedings of the Eleventh International Conference on Language Resources and Evaluation |
Editors | Nicoletta Calzolari |
Place of Publication | Miyazaki, Japan |
Publisher | European Language Resources Association (ELRA) |
Pages | 1146-1152 |
ISBN (Electronic) | 979-10-95546-00-9 |
Publication status | Published - 7 May 2018 |
Event | Language Resources and Evaluation Conference (LREC 2018): LREC - Duration: 7 May 2018 → … |
Conference
Conference | Language Resources and Evaluation Conference (LREC 2018) |
---|---|
Period | 7/05/18 → … |
Keywords
- Early Modern Dutch
- POS tagging
- sociolinguistic annotation
- data integration