A probabilistic annotation model for crowdsourcing coreference

Silviu Paun, Jon Chamberlain, Udo Kruschwitz, Juntao Yu, Massimo Poesio

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Academic › peer-review

    Abstract

    The availability of large-scale annotated corpora for coreference is essential to the development of the field. However, creating resources at the required scale via expert annotation would be too expensive. Crowdsourcing has been proposed as an alternative, but this approach has not been widely used for coreference. This paper addresses one crucial hurdle on the way to making this possible by introducing a new model of annotation for aggregating crowdsourced anaphoric annotations. The model is evaluated along three dimensions: the accuracy of the inferred mention pairs, the quality of the post-hoc constructed silver chains, and the viability of using the silver chains as an alternative to the expert-annotated chains in training a state-of-the-art coreference system. The results suggest that our model can extract from crowdsourced annotations coreference chains of comparable quality to those obtained with expert annotation.

    Original language: English
    Title of host publication: Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, EMNLP 2018
    Editors: Ellen Riloff, David Chiang, Julia Hockenmaier, Jun'ichi Tsujii
    Publisher: Association for Computational Linguistics
    Pages: 1926-1937
    Number of pages: 12
    ISBN (Electronic): 9781948087841
    Publication status: Published - 2018
    Event: 2018 Conference on Empirical Methods in Natural Language Processing, EMNLP 2018 - Brussels, Belgium
    Duration: 31 Oct 2018 to 4 Nov 2018

    Conference

    Conference: 2018 Conference on Empirical Methods in Natural Language Processing, EMNLP 2018
    Country/Territory: Belgium
    City: Brussels
    Period: 31/10/18 to 4/11/18
