Addressing the Challenges of Reconstructing Systematic Reviews Datasets V2

Rutger Chris Neeleman, Cathalijn Leenaars, Matthijs Oud, Felix Weijdema, Rens van de Schoot

Research output: Working paper / Preprint / Academic

Abstract


Systematic reviews are crucial but time-consuming and labor-intensive because reviewers must screen large numbers of studies. Active learning techniques can make screening more efficient by prioritizing the studies most likely to be relevant. The performance of these techniques can only be evaluated on fully labeled datasets, which are not always available. The main goal of the current paper was to create such a dataset by reconstructing it from the search queries, the number of results for each query, the list of included papers, and the number of initially screened articles. Our case study was a systematic review of the treatment of Borderline Personality Disorder that correctly followed the PRISMA guidelines for reporting systematic reviews. The reconstructed dataset (k=1053) did not exactly match the initial dataset (k=1013) because of mismatches in the closed-source search tools, retracted papers, and other reasons outside the influence of the original authors. Consequently, although the reconstructed dataset contained all initially relevant records, we could not simply label all other records as irrelevant: the label noise could contain additional relevant records. Therefore, we developed a noisy label filter (NLF) procedure to deal with unknown labels. After applying the NLF procedure, we used the reconstructed dataset for a simulation study with the open-source software ASReview. On average, 77.36% of screening time could have been saved, and Naïve Bayes had the best model fit (work saved over sampling = 82.30%). In the discussion section, we provide recommendations and a decision tree for reconstructing datasets.
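
The abstract reports work saved over sampling (WSS) but does not state the recall level or the exact computation used. As a rough illustration only, the sketch below assumes the commonly used definition: the fraction of records left unscreened when a target recall is reached, minus the fraction that random screening would leave unscreened at that recall. The function name `wss_at_recall` and the toy labels are hypothetical; this is not ASReview's implementation and does not reproduce the paper's figures.

```python
import math


def wss_at_recall(ranked_labels, recall=0.95):
    """Work saved over sampling at a target recall (assumed definition).

    ranked_labels: 1 (relevant) or 0 (irrelevant), in the order the
    records were presented for screening.
    """
    if not 0.0 < recall <= 1.0:
        raise ValueError("recall must be in (0, 1]")
    n_total = len(ranked_labels)
    n_relevant = sum(ranked_labels)
    if n_total == 0 or n_relevant == 0:
        raise ValueError("need at least one record and one relevant record")

    # Number of relevant records that must be found to reach the target recall.
    target = math.ceil(recall * n_relevant)

    found = 0
    for n_screened, label in enumerate(ranked_labels, start=1):
        found += label
        if found >= target:
            # Share of records not screened, minus the share that random
            # screening would leave unscreened at the same recall.
            return (n_total - n_screened) / n_total - (1.0 - recall)
    raise AssertionError("target recall was never reached")


# Toy example (made-up labels): 10 records, 2 relevant, both found
# within the first 3 records screened.
toy_ranking = [1, 0, 1, 0, 0, 0, 0, 0, 0, 0]
toy_wss = wss_at_recall(toy_ranking, recall=1.0)
```

In this toy ranking, full recall is reached after 3 of 10 records, so 7 of 10 records would not need to be screened. A simulation on a fully labeled dataset applies the same idea to the ranking produced by the active learning model.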
Original language: English
Publisher: PsyArXiv
Publication status: Published - 3 Aug 2023
