A comparison of duplicate detection automation tools: a head-to-head comparison study
2Bond University Library, Bond University, Australia.
3Centre for Reviews and Dissemination, University of York, United Kingdom.
4PICO Portal, United States
5CAMARADES, University of Edinburgh, United Kingdom
6Anschutz Medical Campus, University of Colorado, United States
7University College London, United Kingdom
8Haskayne School of Business, University of Calgary, Canada
Background: A key task when conducting a systematic review is to identify and remove duplicate records retrieved by a literature search across multiple databases, a process referred to as deduplication. Deduplication is time-consuming and error-prone, particularly when processing thousands of references from multiple sources. Some approaches combine automation with manual checking by humans, and may rely on reference management software or on bespoke deduplication tools, available either standalone or within systematic review software. Some tools are accessible only through expensive, proprietary software or operate in a “black box” environment. It is not known how these tools compare with each other, or which performs best at minimising errors and reducing the time spent deduplicating.
Objectives: We are evaluating how eight bespoke deduplication tools perform, to inform choices about which to use. The tools are: 1) Covidence; 2) EPPI-Reviewer; 3) the Deduplicator; 4) Rayyan; 5) PICO Portal; 6) Deduclick; 7) ASySD; and 8) HubMeta Deduplicator.
Methods: Our sample set comprises re-run searches from a random selection of 10 Cochrane reviews published between 2020 and 2022. Two experienced information specialists will independently deduplicate the search results to create 10 deduplicated gold standard sets. Each set will then be deduplicated using each tool under investigation, and the results compared with the gold standard sets. The following outcomes will be measured:
1. Unique references removed
2. Duplicates missed
3. Additional duplicates identified
4. Time required to deduplicate
5. Qualitative analysis of unexpected findings of interest
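As an illustration only, the first two outcomes could be counted along the following lines. This is a minimal sketch, not the study's actual procedure: it assumes each record has an identifier and that the gold standard is expressed as a group label per record (records sharing a label are duplicates of one another). The function name `compare_to_gold` and the data layout are hypothetical. Additional duplicates identified by a tool but absent from the gold standard would still need manual verification and are not covered here.

```python
from collections import Counter


def compare_to_gold(gold_groups, tool_kept):
    """Count two error types for one tool on one record set.

    gold_groups: dict mapping record id -> gold standard group label
                 (hypothetical layout; same label means duplicates).
    tool_kept:   set of record ids the tool retained after deduplication.
                 Every id is assumed to appear in gold_groups.
    """
    # How many records from each gold standard group survived the tool.
    kept_per_group = Counter(gold_groups[record] for record in tool_kept)
    all_groups = set(gold_groups.values())

    # A group with no surviving record means a unique reference was lost.
    unique_removed = sum(1 for g in all_groups if kept_per_group[g] == 0)

    # Each surviving record beyond the first in a group is a missed duplicate.
    duplicates_missed = sum(
        kept_per_group[g] - 1 for g in all_groups if kept_per_group[g] > 1
    )
    return {
        "unique_removed": unique_removed,
        "duplicates_missed": duplicates_missed,
    }


# Toy data: groups A and C each contain two duplicate records.
gold = {"r1": "A", "r2": "A", "r3": "B", "r4": "C", "r5": "C", "r6": "D"}
kept = {"r1", "r2", "r3", "r4"}  # the tool kept both A records and dropped r6
result = compare_to_gold(gold, kept)
print(result)
```

Counting by group rather than by individual record avoids penalising a tool for keeping a different copy of a duplicate than the information specialists kept.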
Results: Early testing suggests that automation tools that include a human checking component produce fewer errors than fully automated tools. Most of these errors are missed duplicates, with few unique references removed. However, tools with a human checking component require more time to deduplicate record sets. We expect to present the error rates and processing time of each tool.
Conclusions: Our conclusions will be presented at the conference.
Patient, public and/or healthcare consumer involvement: Although not directly relevant to patients, this study will help patients by contributing to methods that will result in more robust and efficient evidence production.