To the Editor:
The same patient outcome data from clinical trial results, when presented as absolute or relative changes, may appear different in magnitude. Recommendations are to report both absolute and relative, or at least baseline, data from which to calculate absolute values [1, 2]. A systematic review of efficacy trials demonstrated that only relative values were reported in most study abstracts (88%) and the main text (75%) [3].
To inform clinical practice, outcome improvements, whether relative or absolute, must be statistically significant and clinically meaningful. A minimal clinically important difference (MCID) should inform sample size calculations for clinical trials.
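As a rough illustration of how an MCID feeds into a sample size calculation, the sketch below uses the standard normal-approximation formula for comparing two means in a parallel-group design. The function name, the 10 mm MCID and the 20 mm standard deviation are hypothetical values chosen for illustration, not figures from the pooled trials. A crossover design would need a different formula based on within-participant variability.

```python
import math
from statistics import NormalDist


def n_per_group(mcid, sd, alpha=0.05, power=0.80):
    """Approximate participants per arm needed to detect a difference of `mcid`.

    Normal approximation for a two-arm parallel comparison of means:
    n = 2 * (z_(1-alpha/2) + z_(power))^2 * (sd / mcid)^2
    """
    if mcid <= 0 or sd <= 0:
        raise ValueError("mcid and sd must be positive")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance
    z_power = NormalDist().inv_cdf(power)
    return math.ceil(2 * (z_alpha + z_power) ** 2 * (sd / mcid) ** 2)


# Hypothetical inputs: absolute MCID of 10 mm VAS, between-patient SD of 20 mm.
print(n_per_group(mcid=10, sd=20))
```

Halving the MCID roughly quadruples the required sample, which is why the choice of MCID matters so much at the planning stage.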
Two main approaches are used to identify an MCID: distribution-based and anchor-based methods. Ideally, they are used together so that each can be interpreted in the context of the other [4]. The distribution method is a statistical calculation based on the baseline variability of the measure in the population studied. This gives an effect size (change after intervention divided by the standard deviation of baseline scores), the magnitude of which relates to a small, moderate or large clinical effect [5]. Thus, the distribution method can only be used to calculate an absolute MCID, as there is no standard deviation of baseline scores for a relative measure.
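The effect size described above can be sketched as follows. The scores are invented for illustration, and the target effect size of 0.5 (commonly read as a "moderate" effect under Cohen's conventions) is an assumption, not a threshold taken from this analysis.

```python
from statistics import mean, stdev


def effect_size(baseline, follow_up):
    """Mean change after intervention divided by the SD of baseline scores."""
    changes = [after - before for before, after in zip(baseline, follow_up)]
    return mean(changes) / stdev(baseline)


def absolute_change_for_effect_size(baseline, target_effect_size):
    """Absolute change (in scale units) that corresponds to a chosen effect size."""
    return target_effect_size * stdev(baseline)


# Hypothetical VAS scores in mm (not trial data).
baseline = [40, 50, 60, 70, 80]
follow_up = [35, 40, 52, 60, 73]

print(round(effect_size(baseline, follow_up), 3))
print(round(absolute_change_for_effect_size(baseline, 0.5), 1))
```

Because the denominator is a standard deviation in scale units, the result of the second function is necessarily an absolute change.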
The anchor-based method relates the change in score to another patient-rated effect (e.g. relief score, function, or global impression of change). The anchor-based method can be used to calculate the relative MCID.
Debate surrounds whether the MCID for symptoms (e.g. pain or breathlessness) should be based on absolute or relative measures. Measures may include a 0–100 mm visual analogue scale (VAS) or a 0–10 numerical rating scale (NRS) (0 is no symptom and 10 NRS or 100 mm VAS is the worst imaginable symptom) for each aspect of a symptom. An absolute difference of 10 mm VAS may be perceived as a larger effect if baseline intensity was 30 mm (33% relative reduction) than if baseline intensity was 90 mm (11% relative reduction) [6].
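The arithmetic behind this example is simple enough to check directly. The helper name below is hypothetical.

```python
def relative_reduction(baseline_mm, absolute_reduction_mm):
    """Absolute reduction expressed as a fraction of baseline intensity."""
    if baseline_mm <= 0:
        raise ValueError("relative change is undefined for a baseline of zero")
    return absolute_reduction_mm / baseline_mm


# The same 10 mm VAS reduction at two different baseline intensities.
for baseline_mm in (30, 90):
    percent = 100 * relative_reduction(baseline_mm, 10)
    print(f"baseline {baseline_mm} mm: {percent:.0f}% relative reduction")
```

The guard against a zero baseline reflects a real limitation of relative measures: they cannot be calculated when the starting score is zero.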
In studies of chronic breathlessness, absolute or relative differences are used in sample size calculations. Clinically meaningful relative differences are still consensus based, varying between 10% and 25% [7, 8], although the absolute MCID for chronic breathlessness has been calculated from pooled patient data using the distribution and patient anchor-based methods [9]. Using the same dataset, and with respect to assessments of breathlessness intensity, this current analysis investigates: 1) whether the variability of the difference from baseline is more stable for absolute or relative measures; and 2) the relative MCID calculated using the patient anchor-based method.
This study analysed anonymised individual patient data pooled from four clinical trials of oral opioids for the management of breathlessness (three randomised controlled trials (RCTs) and one observational study) as previously described [7, 10–12]. A total of 213 sets of data from 178 participants allowed for the calculation of effect size.
National Health Service ethical permission was not required for pooling anonymised data for secondary analyses. Appropriate ethics approval and written informed consent by participants had been obtained for contributing studies.
The relationship between end-of-intervention and baseline breathlessness intensity was plotted. To check whether variability was related to magnitude, the relationship of end-of-intervention minus baseline intensity with baseline intensity was examined, first for absolute and then for relative values. The relationships between these measures and baseline, and the pattern of variability of responses according to baseline intensity, were displayed graphically.
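A numerical counterpart to this graphical check might look like the sketch below, which compares the spread of absolute changes and of end/baseline ratios in lower and higher baseline bands. The data are invented, and the 30 mm cut-off and function names are assumptions made for illustration rather than part of the reported analysis.

```python
from statistics import stdev


def change_measures(baseline, end):
    """Return (absolute change, end/baseline ratio); ratio is None at baseline 0."""
    absolute = end - baseline
    ratio = end / baseline if baseline > 0 else None
    return absolute, ratio


def spread_by_band(pairs, cut=30):
    """SD of absolute changes and of ratios for baselines below / at or above `cut`."""
    bands = {"low": ([], []), "high": ([], [])}
    for baseline, end in pairs:
        absolute, ratio = change_measures(baseline, end)
        absolutes, ratios = bands["low" if baseline < cut else "high"]
        absolutes.append(absolute)
        if ratio is not None:
            ratios.append(ratio)
    return {band: (stdev(a), stdev(r)) for band, (a, r) in bands.items()}


# Hypothetical (baseline, end) VAS pairs in mm: every score moves by 10 mm.
pairs = [(10, 0), (10, 20), (20, 10), (20, 30),
         (60, 50), (60, 70), (80, 70), (80, 90)]

for band, (sd_abs, sd_ratio) in spread_by_band(pairs).items():
    print(f"{band}: SD absolute = {sd_abs:.2f}, SD ratio = {sd_ratio:.2f}")
```

In this toy dataset the absolute changes have the same spread in both bands, while the ratios are far more variable at low baseline intensity.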
The patient anchor-based method for calculating MCID, using absolute measures, was previously reported [9]. In the current communication, we use the same methods to calculate the MCID expressed as a relative value. At the end of the three placebo-controlled crossover RCTs, participants provided a blinded choice for their preferred arm. Participants' perceptions of change in breathlessness intensity, expressed as mean ratios, were examined in relation to the preferred study arm (opioid, placebo or neither). We found the ratio in the preferred arm and compared preferences for drug or placebo, repeating this for the other arm. This allowed comparison of ratios between arms using a paired t-test. As the distribution of ratios was highly skewed, the calculation was repeated using log ratios.
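The paired comparison of ratios, and its repetition on the log scale, can be sketched with the standard library alone. All values below are invented. Dropping any pair that contains a zero ratio before taking logs is an assumption about how zero ratios might be handled, and back-transforming the mean log difference with the exponential is one common way to read it as a ratio of geometric means.

```python
import math
from statistics import mean, stdev


def paired_t(first, second):
    """Paired t-test: return (mean difference, t statistic, degrees of freedom)."""
    diffs = [a - b for a, b in zip(first, second)]
    n = len(diffs)
    mean_diff = mean(diffs)
    t_stat = mean_diff / (stdev(diffs) / math.sqrt(n))
    return mean_diff, t_stat, n - 1


def drop_zero_pairs(first, second):
    """Keep only pairs where both ratios are positive, so logs are defined."""
    kept = [(a, b) for a, b in zip(first, second) if a > 0 and b > 0]
    return [a for a, _ in kept], [b for _, b in kept]


# Hypothetical end/baseline ratios per participant (not trial data).
preferred = [0.8, 0.5, 1.0, 0.0, 0.9]
other = [1.0, 1.0, 1.25, 1.1, 0.75]

# Comparison on the raw ratio scale.
raw_diff, raw_t, raw_df = paired_t(preferred, other)

# Comparison on the log scale, excluding pairs that contain a zero ratio.
pref_pos, other_pos = drop_zero_pairs(preferred, other)
log_diff, log_t, log_df = paired_t([math.log(r) for r in pref_pos],
                                   [math.log(r) for r in other_pos])

print(f"raw ratios: mean difference {raw_diff:.2f} (t = {raw_t:.2f}, df = {raw_df})")
print(f"log ratios: geometric mean ratio {math.exp(log_diff):.2f} (df = {log_df})")
```

Working on the log scale reduces the influence of the very large ratios that arise when baseline scores are small.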
End-of-intervention and baseline breathlessness were related. As expected, a negative correlation between end-of-intervention minus baseline breathlessness and baseline breathlessness was seen (fig. 1). Importantly, variability was uniform across all baseline intensities with these absolute measures. This was not the case with ratio measures, where ratios became very large at lower baseline intensities of breathlessness, increasing the variability. Using log ratios decreased the variability.
Figure 1. Variability against baseline intensity. The difference (end-of-treatment breathlessness intensity minus baseline measures) against baseline shown as absolute values has uniform variability when plotted against baseline intensity (a), whereas variability becomes very wide at lower intensities when values are plotted using ratios (b).
Using data from the three placebo-controlled crossover trials (N=95) [10–12], there were 113 evaluable preference responses from 93 participants to the question of which treatment arm gave the best benefit for their breathlessness. A preference was given in 93 of the 113 responses (drug n=62, placebo n=33, no preference n=18). The mean±sd ratio of breathlessness scores (end/baseline) in the preferred arm, combining all preferences (drug or placebo), was 0.87±0.87. The mean±sd ratio in the arm not chosen was 1.00±0.92. There were 90 preferences where both a positive and a negative preference were stated. For these, a comparison of the ratios in the preferred and other arms showed a difference of −0.14. Using log ratios, and excluding zero ratios, the comparison of the preferred and other arm ratios showed a difference of −0.21 (after anti-log).
We believe that this is the first data-based demonstration that absolute measures of change in dyspnoea are preferable for study planning and evaluation of research results. Variation in response was uniform across baseline breathlessness intensities when change was expressed as an absolute measure, despite the relationship between change and baseline intensity. This was not seen when breathlessness response was expressed as a relative change.
When relative reductions are reported, using blinded patient preference as a patient anchor-based method, this study demonstrates that the MCID for the relative reduction should be 14% (using ratios) or 21% (using log ratios).
Eligibility criteria for entry to the studies contributing data to the pooled analysis resulted in fewer measures of “mild” breathlessness than for measures of ≥30 mm. Thus results for mild baseline dyspnoea are less easily interpreted. However, our data show that variability of ratio measures becomes very large for low baseline breathlessness. This finding does not appear to support previous recommendations in the pain literature that in studies with no minimum baseline symptom intensity requirement clinical relevance should be defined in terms of relative change [13].
Results of clinical trials for chronic breathlessness should be presented as both absolute and relative measures since each differentially informs the interpretation of study results. The MCID expressed as a relative reduction is between 14% and 21%.
However, variation in response is uniform when presented as absolute measures, despite the relationship with baseline intensity, but not when presented as relative measures. In view of this, and because both the distribution and patient anchor-based methods can be used to calculate the MCID for absolute measures, we suggest that absolute measures should be used for the MCID in the calculation of sample sizes.
Acknowledgments
Thanks go to Tracey Hawkes (Scarborough Hospital, Scarborough, UK) for able assistance in combining the datasets.
Footnotes
Conflict of interest: Disclosures can be found alongside the online version of this article at erj.ersjournals.com
- Received March 12, 2014.
- Accepted July 23, 2014.
- ©ERS 2014