Analysis of continuous outcomes when a large proportion of the data are zeroes

Date & Time
Monday, September 4, 2023, 12:30 PM - 2:00 PM
Session Type
Statistical methods
Sandercock J¹, Geneen LJ¹
¹Systematic Review Initiative, NHS Blood and Transplant, Oxford, UK

Background: Continuous outcomes are not straightforward to analyse when a large proportion of the values are zeroes, as frequently occurs with outcomes such as volume of blood transfused in conditions for which transfusion is not always necessary. Should we analyse volume per person randomised (PPR) or per person transfused (PPT)? If the numbers randomised and transfused are both known, we can convert between PPR and PPT using the relationship between sums of squares familiar from analysis of variance, but several statistical and practical issues arise.
Statistical Issues: Standard methods of analysis for continuous outcomes rely on the Central Limit Theorem. When the underlying distribution is extreme owing to a large proportion of zeroes in the data, an intention-to-treat analysis (PPR) may require sample sizes of many hundreds or even thousands before the sampling distribution of the mean is closely approximated by the normal distribution. A PPT analysis will converge with smaller sample sizes, but if only a small proportion of participants required transfusion, this still requires a large number randomised.
Practical Issues: Some trials report the mean volume transfused without reporting the number receiving a transfusion, and so would be excluded from a PPT analysis. It is sometimes unclear whether a trial has reported PPR or PPT. Converting between PPR and PPT may produce impossible values (negative sums of squares), which helps to identify which denominator was actually used. The same check may also highlight cases where a standard error has been mislabelled as a standard deviation.
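The conversion between PPR and PPT described in the Background can be sketched in a few lines. This is an illustrative sketch only, not the authors' spreadsheet: the function names are hypothetical, and it assumes that reported standard deviations are sample SDs (denominator n - 1) and that every participant who was not transfused contributes a volume of exactly zero. Under those assumptions the raw sum of squared volumes is the same for both denominators, which is what allows the conversion and also what exposes impossible values.

```python
import math


def sum_of_squares(n, mean, sd):
    """Raw sum of squared values, recovered from n, the mean and the sample SD."""
    return (n - 1) * sd ** 2 + n * mean ** 2


def ppt_to_ppr(n_rand, n_trans, mean_ppt, sd_ppt):
    """Convert per-person-transfused (PPT) summaries to per-person-randomised (PPR)."""
    if not 1 < n_trans <= n_rand:
        raise ValueError("need 1 < n_trans <= n_rand")
    # Non-transfused participants are assumed to add zero to the sum of squares.
    ss = sum_of_squares(n_trans, mean_ppt, sd_ppt)
    mean_ppr = n_trans * mean_ppt / n_rand
    var_ppr = (ss - n_rand * mean_ppr ** 2) / (n_rand - 1)
    return mean_ppr, math.sqrt(var_ppr)


def ppr_to_ppt(n_rand, n_trans, mean_ppr, sd_ppr):
    """Convert PPR summaries to PPT, flagging impossible inputs."""
    if not 1 < n_trans <= n_rand:
        raise ValueError("need 1 < n_trans <= n_rand")
    ss = sum_of_squares(n_rand, mean_ppr, sd_ppr)
    mean_ppt = n_rand * mean_ppr / n_trans
    centred_ss = ss - n_trans * mean_ppt ** 2
    if centred_ss < 0:
        # A negative sum of squares cannot arise from real data: the inputs
        # were probably PPT already, or the "SD" was really a standard error.
        raise ValueError("negative sum of squares: check denominator and SD/SE labelling")
    return mean_ppt, math.sqrt(centred_ss / (n_trans - 1))
```

For example, with 100 randomised and 20 transfused, a PPT mean of 500 and SD of 200 convert to a PPR mean of 100, and converting back recovers the original PPT values. Feeding the same mean and SD into `ppr_to_ppt` as though they were PPR values raises the error, which is the kind of flag the abstract describes.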
Conclusions: When extracting these sorts of outcomes, it is important to be sure which value (PPR or PPT) was reported and to ensure that pooled results are all based on the same denominator. Extracted data should be carefully assessed for unusual or impossible values. There is no clear answer as to which analysis is best under any given circumstance; it may be useful to present both and note where the sample sizes available may not be sufficient to meet the underlying assumptions. We have produced a spreadsheet (available on request) to convert between PPR and PPT, including error flags and plots to help diagnose issues with the data extracted.
Patient, public and/or healthcare consumer involvement: None.