Machine-learning assisted screening increases efficiency of systematic review

Date & Time
Monday, September 4, 2023, 2:55 PM - 3:05 PM
Location Name
St James
Session Type
Oral presentation
Rapid reviews and other rapid evidence products
Oral session
Rapid reviews and other rapid evidence products 1
Qureshi R1, Robinson K2, Butler M3, Agai E4
1University of Colorado Anschutz Medical Campus, United States
2Johns Hopkins School of Medicine, United States
3University of Minnesota School of Public Health, United States
4PICO Portal, United States

Background: Conventional systematic review (SR) methods are time-consuming and highly resource intensive. Artificial intelligence (AI) algorithms such as machine learning and deep learning can help reviewers complete these tasks in less time and with fewer resources. PICO Portal (PP) is an AI-assisted SR platform that prioritizes articles for screening using several algorithms including both decision tree and deep learning models.
Objectives: To assess the efficiency and accuracy of AI-assisted title/abstract screening in PICO Portal.
Methods: Our data set comprised eight completed SRs, each screened by two independent reviewers, with a total of 56,728 records (range: 4,204 to 14,185 per SR) on topics ranging from social to biomedical sciences. For each SR, we simulated screening in batches of 100 articles: each batch was used to train the model and predict eligibility, the remaining articles were re-ranked, and the predicted eligibility was compared with the actual results from the SR. We plotted the proportions of title/abstract and full-text included records that were captured by the AI screening at the title and abstract level for each project and calculated an average of this efficiency weighted by project size. We meta-analyzed the sensitivity and specificity of the predictions versus the reviewers’ final decisions using Stata ‘metadta’.
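The batch-wise simulation described in the Methods can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not PICO Portal's implementation: the function names, the random first batch, and the one-feature `toy_scorer` are all hypothetical stand-ins for the platform's decision tree and deep learning models, and the toy data are perfectly separable, so the printed figure says nothing about real-world efficiency.

```python
import math
import random


def simulate_prioritized_screening(features, labels, fit_and_score, batch_size=100, seed=0):
    """Return record indices in the order a simulated reviewer would screen them.

    fit_and_score(train_x, train_y, candidates) is a hypothetical callback that
    returns one eligibility score per candidate (higher = more likely eligible).
    """
    rng = random.Random(seed)
    remaining = list(range(len(features)))
    rng.shuffle(remaining)  # assumption: the first batch is screened in random order
    screened = []
    while remaining:
        batch, remaining = remaining[:batch_size], remaining[batch_size:]
        screened.extend(batch)  # "screening" a batch reveals its true labels
        if remaining:
            scores = fit_and_score(
                [features[i] for i in screened],
                [labels[i] for i in screened],
                [features[i] for i in remaining],
            )
            # Re-rank the unscreened records, most likely eligible first.
            ranked = sorted(zip(scores, remaining), key=lambda pair: pair[0], reverse=True)
            remaining = [idx for _, idx in ranked]
    return screened


def toy_scorer(train_x, train_y, candidates):
    """Stand-in model: score candidates by closeness to the mean included feature."""
    included = [x for x, y in zip(train_x, train_y) if y == 1]
    if not included:
        return [0.0] * len(candidates)  # nothing learned yet, keep the current order
    centre = sum(included) / len(included)
    return [-abs(x - centre) for x in candidates]


def fraction_screened_for_recall(order, labels, target=0.95):
    """Fraction of records screened before `target` of eligible records are found."""
    needed = math.ceil(target * sum(labels))
    found = 0
    for n_screened, idx in enumerate(order, start=1):
        found += labels[idx]
        if found >= needed:
            return n_screened / len(order)
    return 1.0


# Toy data: 1,000 records, 5% eligible, one perfectly informative feature.
labels = [1 if i % 20 == 0 else 0 for i in range(1000)]
features = [float(y) for y in labels]

order = simulate_prioritized_screening(features, labels, toy_scorer)
fraction_needed = fraction_screened_for_recall(order, labels, target=0.95)
print(f"Screened {fraction_needed:.0%} of records to reach 95% recall")
```

Passing the model in as a callback keeps the simulation loop independent of any particular algorithm, so the same loop could be rerun with a different scorer for comparison.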
Results: We estimate that if the active learning AI predictions had been used, reviewers would have needed to screen only 20-50% of title/abstract records to capture 95% of eligible records (Figure 1). After screening 10%, 25%, 50%, and 70% of title/abstract records, the average project would have captured approximately 60%, 85%, 95%, and 99%, respectively, of the records included at the title/abstract stage (Figure 2). Sensitivity was higher than specificity (95% vs. 68%) (Figures 3 and 4).
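The capture percentages at fixed screening fractions, and their average weighted by project size, could be computed along these lines. This is a sketch only: the two toy projects and both function names are invented for illustration and do not reproduce the study data or the authors' analysis code.

```python
def recall_at_fractions(ranked_labels, fractions=(0.10, 0.25, 0.50, 0.70)):
    """Share of included records captured after screening each fraction.

    ranked_labels: 0/1 inclusion decisions in the order records were screened.
    """
    total_included = sum(ranked_labels)
    captured = {}
    for fraction in fractions:
        cutoff = round(fraction * len(ranked_labels))  # records screened so far
        captured[fraction] = sum(ranked_labels[:cutoff]) / total_included
    return captured


def size_weighted_average(per_project, sizes):
    """Average the per-project values, weighting each project by its record count."""
    total_size = sum(sizes)
    return {
        fraction: sum(result[fraction] * n for result, n in zip(per_project, sizes)) / total_size
        for fraction in per_project[0]
    }


# Two hypothetical projects (20 and 40 records), already in screening order.
project_a = [1, 1, 1, 0, 1] + [0] * 9 + [1] + [0] * 5
project_b = [1] * 8 + [0] * 7 + [1] + [0] * 7 + [1] + [0] * 16

results = [recall_at_fractions(project_a), recall_at_fractions(project_b)]
average = size_weighted_average(results, [len(project_a), len(project_b)])
for fraction, value in average.items():
    print(f"after screening {fraction:.0%}: {value:.1%} of included records captured")
```

Weighting by record count means larger reviews contribute proportionally more to the average, which matches the "by project size" weighting described in the Methods.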
Conclusions: Based on our analysis, we estimate that 40-60% of screening effort can be saved using PICO Portal, an AI-assisted, web-based SR platform. Future research should examine the impact of missing the final 5% of records on review conclusions and assess the resource-benefit ratio.
Patient relevance and involvement: Our findings and future recommendations are from the researcher and funder perspective. Our conclusions directly affect the amount of time reviewers need to complete an SR. This work did not involve any stakeholders, patients, or consumers.

Figure 1a & 1b.png
Figure 2.png
Figure 3.png
Figure 4.png