Addressing the challenges of reconstructing systematic reviews datasets: a case study and a noisy label filter procedure

R Neeleman, CHC Leenaars, M Oud, F Weijdema, R van de Schoot*

*Corresponding author for this work

Research output: Contribution to journal > Article > Academic > peer-review

Abstract

Systematic reviews and meta-analyses typically require significant time and effort. Machine learning models have the potential to make screening more efficient. Evaluating such models effectively requires fully labeled datasets that detail all records screened by humans and their labeling decisions. This paper presents the creation, for use in a simulation study, of a comprehensive dataset for the systematic review of treatments for Borderline Personality Disorder reported by Oud et al. (2018). The authors of that review adhered to the PRISMA guidelines and published both the search query and the list of included records, but did not disclose the complete dataset with all labels. We replicated their search and, lacking the initial screening data, introduced a Noisy Label Filter (NLF) procedure that uses active learning to validate noisy labels. After the NLF was applied, no further relevant records were found. A simulation study using the reconstructed dataset showed that active learning could reduce screening time by 82.30% compared to random reading. The paper discusses potential causes of discrepancies, provides recommendations, and introduces a decision tree to assist in reconstructing datasets for simulation studies.
Original language: English
Article number: 69
Pages (from-to): 1-12
Number of pages: 12
Journal: Systematic Reviews
Volume: 13
Issue number: 1
DOIs
Publication status: Published - 17 Feb 2024

Bibliographical note

Publisher Copyright:
© The Author(s) 2024.

Funding

The first and last authors were supported by a grant from the Dutch Research Council (grant no. 406.22.GO.048).

Funder | Funder number
Nederlandse Organisatie voor Wetenschappelijk Onderzoek | GO.048, 406.22
Aard- en Levenswetenschappen, Nederlandse Organisatie voor Wetenschappelijk Onderzoek | 406.22.GO.048

    Keywords

    • ASReview
    • Active learning
    • Noisily labeled records
    • Replication
    • Systematic reviews
