The Fragility of Statistically Significant Findings from Depression Randomized Controlled Trials

Session Type
Evidence synthesis innovations and technology
Luo M1, Li Y2, Wang Y2, Huang J2, Liu Z1, Gao Y1, Chai Q1, Liu J1, Fei Y1
1Centre for Evidence-Based Chinese Medicine, Beijing University of Chinese Medicine, China
2Beijing University of Chinese Medicine, China

Background: The efficacy of an intervention is commonly judged by the P-value; however, recent literature has drawn attention to the limited robustness of a threshold P-value as a tool for reporting discontinuous outcomes in clinical trials. The fragility index (FI), the minimum number of participants whose status must change from event to non-event for a result to lose statistical significance, has been suggested as an aid to interpreting trial results.
Objectives: In this systematic survey, we calculated the FI of clinical trials in depression that reported statistically significant positive results for eligible outcomes.
Methods: This is a retrospective analysis of randomized controlled trials in depression published from 2012 to 2022 in NEJM, The Lancet, JAMA, The BMJ, and 35 top journals listed in the Psychiatry-SSCI category in the field of psychiatric medicine, focusing primarily on depression. Two-arm studies with 1:1 randomization and significant positive results for discontinuous outcomes were eligible for FI calculation. The calculation iteratively subtracts an event from the experimental group (defined as the group with the larger number of events in positive trials) and concomitantly adds a non-event to that group, so that the group size is unchanged, until statistical significance (defined as p<0·05 by Fisher’s exact test) is lost.
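The iterative procedure described in the Methods can be illustrated with a minimal sketch. This is not the authors' code: the function names `fisher_exact_two_sided` and `fragility_index` are hypothetical, and the two-sided P-value is assumed to follow the common convention of summing the probabilities of all tables no more likely than the observed one.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact P-value for the 2x2 table [[a, b], [c, d]].

    Assumed convention: sum the hypergeometric probabilities of all tables
    (with the same margins) that are no more likely than the observed one.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Probability of a table with x events in the first row.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Small relative tolerance so ties with the observed table are counted.
    total = sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-7))
    return min(1.0, total)


def fragility_index(events_exp, n_exp, events_ctl, n_ctl, alpha=0.05):
    """Illustrative fragility index for a positive two-arm trial.

    The experimental arm is taken to be the arm with the higher event rate.
    Returns None if the original result is not significant at `alpha`.
    """
    e = events_exp
    p = fisher_exact_two_sided(e, n_exp - e, events_ctl, n_ctl - events_ctl)
    if p >= alpha:
        return None  # nothing to make fragile
    if events_exp * n_ctl <= events_ctl * n_exp:
        raise ValueError("experimental arm must have the higher event rate")

    fi = 0
    while p < alpha and e > 0:
        # Convert one event to a non-event; the arm size n_exp stays fixed.
        e -= 1
        fi += 1
        p = fisher_exact_two_sided(e, n_exp - e, events_ctl, n_ctl - events_ctl)
    return fi
```

For a hypothetical trial with 5/5 events in the experimental arm and 0/5 in the control arm, `fragility_index(5, 5, 0, 5)` returns 2: the P-value is still below 0·05 after one change (4/5 versus 0/5) and rises above it after the second (3/5 versus 0/5).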
Results: We identified 1120 trials, of which 130 randomized controlled trials were included; 33 of these reported both eligible outcomes (remission rate and response rate). The median FI across all included trials was 4 (25th-75th percentile, 2-8; range, 1-40), and more than one third of trials (33.85%) had an FI of 2 or less. In 68.46% of trials, the reported loss to follow-up exceeded the FI. Trial sample size, the total number of events, the impact factor of the publishing journal, and the ratio of participants enrolled to participants screened were associated with FI. In trials with two eligible outcomes, the FI distributions for the two outcomes differed but were positively correlated.
Conclusions: In depression trials reporting positive discontinuous outcomes, the findings often hinge on small numbers of events. Clinicians should be wary of basing decisions on trials with a low FI.
Patient, public and/or healthcare consumer involvement: 24 345 patients.