Extraction of semantic relations in noisy user-generated law enforcement data

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Academic › peer-review

    Abstract

    Relation extraction from text is a well-known and extensively studied topic in Natural Language Processing research. However, the implementation of relation extraction approaches in real-world application scenarios raises various methodological considerations which are often left implicit in existing research. This paper explores these considerations using a real-world dataset of user-generated police reports in Dutch. The use of linguistic features based on dependency trees is investigated, including an ablation analysis of the importance of individual features. The construction of negative examples for machine learning models is discussed, as well as the construction of a baseline model. The methodological implications of using a small dataset are discussed in terms of the design and performance of a Long Short-Term Memory network as well as a Support Vector Machine. In general, the models perform well; however, the definition of the classification task, and in particular the construction of negative examples, is shown to have a large impact on classification accuracy and subsequently on the interpretation of the evaluation results.
    Original language: English
    Title of host publication: Proceedings of the 13th IEEE International Conference on Semantic Computing
    Publisher: IEEE
    Pages: 79-86
    Number of pages: 8
    ISBN (Print): 978-1-5386-6783-5
    Publication status: Published - 30 Jan 2019
    Event: 13th International Conference on Semantic Computing - Newport Beach, United States
    Duration: 30 Jan 2019 to 1 Feb 2019
    https://www.ieee-icsc.org/

    Conference

    Conference: 13th International Conference on Semantic Computing
    Abbreviated title: ICSC
    Country/Territory: United States
    City: Newport Beach
    Period: 30/01/19 to 1/02/19
