Abstract
Technology-assisted review (TAR) refers to software that supports high-recall information retrieval tasks, such as abstract screening for systematic literature reviews. TAR systems often use a form of Active Learning (AL): human reviewers label documents as relevant or irrelevant according to a screening protocol, while the system incrementally updates a classifier based on the reviewers’ previous decisions. After each model update, the system uses the classifier to rerank the remaining documents, prioritizing those predicted relevant over those predicted irrelevant, which reduces the reviewers’ workload. Recent studies have examined whether Large Language Models (LLMs) alone can perform this task, by supplying the LLM with prompts that contain the task, the screening protocol, and a document from the corpus; the LLM then returns a classification of that document. While the results of these studies are promising, the LLM’s predictions are not error-free, resulting in recall or precision that is lower than desired. In this work, we propose a new Active Learning method for TAR that integrates the LLM’s results into the review process. The method may correct some of the shortcomings of the LLM results and reduce the workload relative to current TAR systems.
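The generic review loop described in the abstract (label, update the classifier, rerank) can be illustrated with a minimal Python sketch. It is not the method proposed in the paper: the names `train`, `review`, and `seed_scores`, the toy corpus, and the smoothed log-odds scorer are all assumptions made for illustration. In particular, using hypothetical LLM scores only to order the first batch is one simple way such scores could enter the loop, chosen here for brevity.

```python
import math
from collections import Counter


def tokenize(text):
    return text.lower().split()


def train(labeled):
    """Fit a toy scorer: summed per-token log-odds with add-one smoothing.

    `labeled` is a list of (text, is_relevant) pairs from the reviewer.
    This stands in for whatever classifier a real TAR system would use.
    """
    pos, neg = Counter(), Counter()
    for text, relevant in labeled:
        (pos if relevant else neg).update(set(tokenize(text)))
    n_pos = sum(1 for _, relevant in labeled if relevant)
    n_neg = len(labeled) - n_pos

    def score(text):
        total = 0.0
        for tok in set(tokenize(text)):
            p = (pos[tok] + 1) / (n_pos + 2)
            q = (neg[tok] + 1) / (n_neg + 2)
            total += math.log(p / q)
        return total

    return score


def review(corpus, oracle, seed_scores, batch_size=1):
    """Run the label / retrain / rerank loop until every document is read.

    corpus: dict mapping doc id -> text
    oracle: callable doc id -> bool, standing in for the human reviewer
    seed_scores: dict doc id -> float, hypothetical LLM relevance scores
        (an assumption of this sketch) used only to rank the first batch
    Returns the review order as a list of (doc id, is_relevant) pairs.
    """
    unlabeled = set(corpus)
    labeled, order = [], []
    score = lambda doc_id: seed_scores[doc_id]
    while unlabeled:
        # Highest-scoring documents first; ties are broken by doc id.
        batch = sorted(unlabeled, key=lambda d: (-score(d), d))[:batch_size]
        for doc_id in batch:
            relevant = oracle(doc_id)
            labeled.append((corpus[doc_id], relevant))
            order.append((doc_id, relevant))
            unlabeled.discard(doc_id)
        # Update the classifier and use it to rerank what remains.
        model = train(labeled)
        score = lambda doc_id, m=model: m(corpus[doc_id])
    return order


# Toy data (invented for this example).
corpus = {
    "a": "randomized trial of drug therapy",
    "b": "drug therapy trial outcomes",
    "c": "football match report",
    "d": "weather forecast report",
    "e": "trial of new drug",
}
truth = {"a": True, "b": True, "c": False, "d": False, "e": True}
# Hypothetical LLM scores; "c" is deliberately scored too high.
seed_scores = {"a": 0.9, "b": 0.2, "c": 0.8, "d": 0.3, "e": 0.1}

order = review(corpus, truth.__getitem__, seed_scores)
print(order)
```

In this toy run the reviewer reads document `a` first because it has the highest seed score. After the first model update, the reranking moves the two other relevant documents ahead of the irrelevant document `c` that the seed scores had ranked second, so all relevant documents are found within the first three reviews. A real system would also need a stopping criterion, which this sketch omits by reviewing everything.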
Original language | English |
---|---|
Pages (from-to) | 77-95 |
Number of pages | 19 |
Journal | CEUR Workshop Proceedings |
Volume | 3770 |
Publication status | Published - 30 Sept 2024 |
Event | 8th International Workshop and Tutorial on Interactive Adaptive Learning, IAL@ECML-PKDD 2024 - Vilnius, Lithuania. Duration: 9 Sept 2024 → … |
Keywords
- active learning
- information retrieval
- large language model
- technology-assisted review
- weak supervision