Is Quality Appraisal Enough? Confidence in Heterogeneous Interventions’ Effectiveness
Quality appraisal (QA) tools, e.g. RoB2, are used to assess the likely risk of bias caused by methodological limitations of studies included in systematic reviews.
Objectives: To consider the utility of standard QA tools, and to examine other methodological issues that, whilst ostensibly not reflecting an individual study’s quality, undermine confidence in a body of primary evidence.
Methods: We conducted a systematic review of controlled studies evaluating structural adolescent contraceptive interventions in low- and middle-income countries. We planned a Qualitative Comparative Analysis (QCA) to explore differences between studies we categorised as ‘likely effective’ and ‘likely ineffective’. However, methodological variability undermined our confidence in this categorisation. Furthermore, these methodological issues were not captured by standard QA tools.
Results: We included 17 studies with heterogeneous study designs. Some methodological characteristics, e.g. whether outcomes were self-reported, were captured by standard QA tools. However, others, such as variable lengths of follow-up and different baseline levels of the outcome measure, were not. Similarly, heterogeneity in the exact outcome measure used, and in who was included in the sample for that particular outcome, affected the comparability of effectiveness without being a methodological limitation per se. Variation in these characteristics influenced our confidence in comparing and categorising the interventions’ effectiveness. These methodological concerns were spread throughout the set of studies, i.e. across both the ‘likely effective’ and ‘likely ineffective’ sets.
Conclusions: QA tools do not capture all of the factors that can affect reviewers’ confidence in studies’ findings, or the comparability of those findings. This is particularly likely for topics that lack standardised measurements or that use heterogeneous methods, which is not unusual in the fields of public health and development. Reviewers should be aware of the risks of over-relying on standard tools in determining their confidence in studies’ findings. Although we could have included only a narrow set of studies with comparable methodological characteristics, this would have prevented synthesis and limited our findings. Reviews have a role in highlighting methodological heterogeneity and calling for greater consensus on methods and outcomes among primary researchers.
PPI: Two advisory groups were convened. One, comprising global decision-makers and evaluators, met twice. Another, involving Mozambican adolescents, met once.