Large-scale simulation study of active learning models for systematic reviews

Jelle Jasper Teijema*, Jonathan de Bruin, Ayoub Bagheri, Rens van de Schoot

*Corresponding author for this work

Research output: Contribution to journal › Article › Academic › peer-review

Abstract

Despite progress in active learning, evaluation remains limited by constraints in simulation size, infrastructure, and dataset availability. This study advocates for large-scale simulations as the gold standard for evaluating active learning models in systematic review screening. Two large-scale simulations, totaling over 29 thousand runs, assessed active learning solutions. The first study evaluated 13 combinations of classification models and feature extraction techniques using high-quality datasets from the SYNERGY dataset. The second expanded this to 92 model combinations with additional classifiers and feature extractors. In every scenario tested, active learning outperformed random screening. The performance gained varied across datasets, models, and screening progression, ranging from considerable to near-flawless results. The findings demonstrate that active learning consistently outperforms random screening in systematic review tasks, offering significant efficiency gains. While the extent of improvement varies depending on the dataset, model choice, and screening stage, the overall advantage is clear. Since model performance differs, active learning systems should remain adaptable to accommodate new classifiers and feature extraction techniques. The publicly available results underscore the importance of open benchmarking to ensure reproducibility and the development of robust, generalizable active learning strategies.
Original language: English
Article number: e33219
Number of pages: 22
Journal: International Journal of Data Science and Analytics
Publication status: E-pub ahead of print - 2 May 2025

Bibliographical note

Publisher Copyright:
© The Author(s) 2025.

Keywords

  • Active learning
  • Large-scale simulation
  • Screening phase
  • Systematic review
