Abstract
There is uncertainty about the interpretation of changes in the 6-min walk distance (6MWD) in chronic obstructive pulmonary disease (COPD) patients and whether the minimal important difference (MID) for this useful outcome measure exists.
Data from nine trials enrolling a wide spectrum of COPD patients with 6MWD at baseline and follow-up were used to determine threshold values for important changes in 6MWD using three distribution-based methods. Anchor-based methods to determine a MID were also evaluated.
Data of 460 COPD patients were included, with a mean±sd forced expiratory volume in one second (FEV1) of 39.2±14.1% predicted and 6MWD of 361±112 m at baseline. Threshold values for important effects in 6MWD were 29 m using the moderate effect size (0.5×sd of change scores) and 42 m using the empirical rule effect size. The threshold value was 35 m (95% confidence interval 30–42 m) based on the standard error of measurement. Correlations of 6MWD with patient-reported anchors were too low to provide meaningful MID estimates.
6-min walk distance should change by ∼35 m for patients with moderate to severe chronic obstructive pulmonary disease in order to represent an important effect. This corresponds to a 10% change of baseline 6-min walk distance. The low correlations of 6-min walk distance with patient-reported anchors question whether a minimal important difference exists for the 6-min walk distance.
- Chronic obstructive pulmonary disease
- exercise test
- interpretation
- randomised trials
- 6-min walk distance
The two most widely used outcomes in respiratory rehabilitation of patients with chronic obstructive pulmonary disease (COPD) are exercise capacity, e.g. as measured by the 6-min walk distance (6MWD), and health-related quality of life (HRQL) 1. HRQL expresses the patient's perception of impairment and is, therefore, critical for decisions regarding healthcare interventions 2, 3. 6MWD is important for documenting changes during a physical exercise programme 4, but it also has become an important measure in COPD because it is associated with outcomes important to patients, such as activities of daily living, exacerbations and death 5, 6.
The minimal important difference (MID) has become the standard approach for the interpretation of the clinical relevance of changes in these outcomes induced by respiratory rehabilitation or other treatments 7, 8. The MID is “the smallest difference in the outcome of interest that informed patients perceive as important and which would lead the patient or informed proxies including physicians to consider a change in management” 9. While HRQL and interpretation of its changes are arguably more important for COPD patients, many investigators use 6MWD as the primary outcome 1. However, trial planning, in particular sample size calculations, and interpretation of trial results require knowledge of what constitutes an important change in 6MWD.
A decade ago, Redelmeier et al. 10 determined that ∼54 m represents an important change in 6MWD using a single methodological approach. This approach relied on between-patient comparison and was based on cross-sectional correlations (r = 0.59) and longitudinal correlations (r = 0.20) of the 6MWD with self-reported categorical scale anchors. Since then, numerous studies have used this estimate for sample size calculations and interpretation of their trials 1, 11, 12.
Despite agreement that a single approach is not sufficient to determine what constitutes an important effect, and despite some scepticism that 54 m might be too great a change in the 6MWD, investigators have not yet applied other acceptable methods, such as distribution-based methods and within-patient anchor-based approaches 7, 13–15. Given the importance of an interpretation aid for the design and interpretation of studies in COPD 16, the current authors’ aim was to provide more evidence regarding the MID as interpretation support for the 6MWD using various suggested methods in a large sample of COPD patients with varying degrees of disease severity.
METHODS
Studies and patients
All completed studies on which the present authors were principal or co-investigators were included if they fulfilled the following criteria: prospectively planned longitudinal studies with approval from ethical committees; inclusion of COPD patients with any disease severity; at least one arm using effective treatment; at least one measurement of 6MWD at baseline and follow-up; and inclusion of patient-important outcomes for which the MID had been established, such as the Chronic Respiratory Questionnaire (MID 0.5 point), St George’s Respiratory Questionnaire (MID 4 points), Feeling Thermometer (MID 6 points), or other COPD-specific instruments.
Measurement of 6MWD and other outcomes used for anchor-based methods
In all included trials, patients who completed the 6MWD followed standard protocols for this test under supervision of qualified staff 17. Details about these tests have been reported in previous papers 11, 18–22. Briefly, in five trials, patients also completed the 6MWD at least twice at follow-up. Four of these trials have been published 18, 19, 22, 23 and one is unpublished (M.J. Mador, unpublished data). For these analyses, the best 6MWD at baseline and the best 6MWD at follow-up were used.
Based on the methodological framework for the MID, few outcomes fulfil the requirements as anchors to determine the MID of 6MWD. The Chronic Respiratory Questionnaire 2, the St. George’s Respiratory Questionnaire 3, the Feeling Thermometer 24, 25 and transition ratings were considered as potential anchors, because the MID has been established for these outcomes 26.
Statistical analysis
The analyses were based on one combined data set generated from all included studies. The primary aim was to use several approaches to determine the MID of 6MWD, including anchor- and distribution-based methods. The anchor-based method was considered 24 using the Chronic Respiratory Questionnaire or other COPD-specific instruments. However, correlations between the anchors and 6MWD were low (r<0.30) and did not fulfil the methodological criteria (r≥0.5) for providing meaningful estimates for the MID 24.
As a consequence of the low correlations between 6MWD and patient-reported outcomes, the current authors do not refer to the threshold values that are derived in the present study as MID. However, the analyses were designed to help in interpreting changes in 6MWD using three distribution-based methods. The most established method is the standard error of measurement (sem) proposed by Wyrwich and co-workers 27, 28. The sem is equal to the standard deviation (sd)×√(1−r), where r is the reliability coefficient. The intraclass correlation coefficient from the two baseline 6MWDs was used as a measure of test–retest reliability where between-person differences served as the signal (numerator) and within-person differences as the noise (denominator). In order to assess the variability of the sem estimates, the nonparametric bootstrap for the generation of 95% confidence intervals (CIs) was used 29. These CIs are not necessarily symmetric around the sem estimates. Finally, the analysis was stratified for age, forced expiratory volume in one second (FEV1) and sex, in order to evaluate whether the threshold values for a relevant effect in 6MWD differed between these subgroups.
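The sem calculation and its bootstrap CI described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the analysis code of the present study: the one-way random-effects form of the intraclass correlation coefficient, the use of the sd of the best baseline walk, the percentile bootstrap and the simulated walk distances are all assumptions, and the function names are hypothetical.

```python
import math
import random
import statistics


def icc_oneway(pairs):
    """One-way random-effects ICC for two walks per patient (assumed ICC form)."""
    n, k = len(pairs), 2
    means = [(a + b) / 2 for a, b in pairs]
    grand = sum(means) / n
    # Between-patient mean square (signal).
    msb = k * sum((m - grand) ** 2 for m in means) / (n - 1)
    # Within-patient mean square (noise).
    msw = sum((a - m) ** 2 + (b - m) ** 2
              for (a, b), m in zip(pairs, means)) / (n * (k - 1))
    return (msb - msw) / (msb + (k - 1) * msw)


def sem(pairs):
    """sem = sd x sqrt(1 - r); the sd of the best baseline walk is an assumption."""
    sd = statistics.stdev([max(a, b) for a, b in pairs])
    return sd * math.sqrt(1 - icc_oneway(pairs))


def bootstrap_ci(pairs, n_boot=1000, alpha=0.05, seed=1):
    """Percentile bootstrap CI: resample patients with replacement."""
    rng = random.Random(seed)
    estimates = sorted(
        sem(rng.choices(pairs, k=len(pairs))) for _ in range(n_boot)
    )
    lower = estimates[int(n_boot * alpha / 2)]
    upper = estimates[int(n_boot * (1 - alpha / 2)) - 1]
    return lower, upper


# Simulated data (NOT study data): 200 patients, two baseline walks each.
rng = random.Random(42)
patients = []
for _ in range(200):
    true_distance = rng.gauss(361, 105)
    patients.append((true_distance + rng.gauss(0, 35),
                     true_distance + rng.gauss(0, 35)))

point = sem(patients)
ci_low, ci_high = bootstrap_ci(patients)
print(f"sem = {point:.1f} m (95% CI {ci_low:.1f} to {ci_high:.1f} m)")
```

Because the CI bounds are read from the sorted bootstrap estimates, they need not be symmetric around the point estimate, which matches the remark above.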
Another distribution-based method was used based on effect sizes 30. The sd of 6MWD change scores (difference between baseline and follow-up 6MWD) was calculated. The sd of 6MWD change scores was used because respiratory rehabilitation has an established and patient-important effect on exercise capacity 1. According to Cohen, 0.5×sd units represent a moderate effect size and investigators usually consider this estimate to correspond to an important effect 30. Finally, an empirical rule effect size proposed by Sloan et al. 31, which combines the empirical theorem and Cohen's definition of small, moderate and large changes, was determined. According to the empirical rule, 99% of all observations fall within 6×sd. A change of 0.5×sd (moderate effect according to Cohen) therefore corresponds to ∼8% of this range (0.5/6) and represents an important effect. Accordingly, 8% of the empirically observed range (from the 0.5th to the 99.5th percentile) was taken to correspond to a moderate, i.e. relevant, effect.
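The two effect-size thresholds described above can be illustrated with a short Python sketch. It is hedged in several ways: the function names are hypothetical, the linear-interpolation percentile is one of several possible definitions, the input data are made up, and applying the empirical rule to the observed 6MWD values (rather than to change scores) is an assumption based on the wording of the text.

```python
import math
import statistics


def moderate_effect_size(change_scores):
    """Cohen's moderate effect: 0.5 x sd of the change scores."""
    return 0.5 * statistics.stdev(change_scores)


def percentile(sorted_values, p):
    """Linear-interpolation percentile (p in 0..100) of a sorted list."""
    position = (len(sorted_values) - 1) * p / 100
    lower = math.floor(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = position - lower
    return sorted_values[lower] * (1 - fraction) + sorted_values[upper] * fraction


def empirical_rule_effect_size(values):
    """0.5/6 (about 8%) of the range between the 0.5th and 99.5th percentiles."""
    ordered = sorted(values)
    observed_range = percentile(ordered, 99.5) - percentile(ordered, 0.5)
    return observed_range * 0.5 / 6


# Made-up example data in metres (NOT study data).
example_changes = [-30, 5, 20, 35, 50, 65, 90, 120]
example_distances = [120, 210, 260, 310, 350, 380, 420, 470, 520, 610]

print(f"0.5 x sd of change scores: {moderate_effect_size(example_changes):.1f} m")
print(f"empirical rule effect size: {empirical_rule_effect_size(example_distances):.1f} m")
```

Both thresholds scale with the spread of the sample, which is why a narrower patient group yields a smaller threshold.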
Finally, the proportion of patients with change scores (between baseline and follow-up) exceeding the MID of patient-reported outcomes (Chronic Respiratory Questionnaire, St. George’s Respiratory Questionnaire and Feeling Thermometer) was compared with the proportion of patients with relevant effects in 6MWD. Although this approach is also limited by low correlations between patient-reported outcomes and 6MWD and is influenced by the responsiveness of the measurements 11, 32, this analysis would provide some reassurance that the distribution-based methods provided valid estimates of important changes.
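The comparison of proportions described above amounts to counting patients whose change score reaches a threshold. A minimal Python sketch follows; the function name and the example change scores are hypothetical, and treating a change equal to the threshold as reaching it is an assumption.

```python
def proportion_exceeding(change_scores, threshold):
    """Share of patients whose change score is at least the threshold."""
    return sum(score >= threshold for score in change_scores) / len(change_scores)


# Made-up change scores for six patients (NOT study data).
crq_changes = [0.2, 0.6, 0.9, 0.4, 1.1, 0.5]   # Chronic Respiratory Questionnaire, points
walk_changes = [10, 40, 55, 20, 35, 60]        # 6MWD, metres

crq_share = proportion_exceeding(crq_changes, 0.5)   # MID of 0.5 point
walk_share = proportion_exceeding(walk_changes, 35)  # candidate 6MWD threshold
print(f"CRQ: {crq_share:.1%}, 6MWD: {walk_share:.1%}")
```

Similar proportions for the two outcomes would be read as reassurance only, since the two measures differ in responsiveness.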
RESULTS
In total, nine trials provided data for the present analysis (table 1). In all trials, patients followed a respiratory rehabilitation programme that included physical exercise as the main component, along with patient education, breathing exercises or relaxation sessions. The 460 patients had a mean age of 68.9±8.3 yrs, 71% were male, mean FEV1 was 39.2±14.1% predicted and mean 6MWD at baseline was 361±112 m.
Seven studies with 305 patients provided sufficient data to calculate an intraclass correlation coefficient. Table 2 shows the intraclass correlation coefficients and sds to derive the sem. The overall sem was 35 m (95% CI 30–42 m). Figure 1 shows that the sems were similar across studies, with the exception of one trial in which it was only 20 m (95% CI 16–26 m). Figure 2 demonstrates that in the stratified analysis for FEV1, sex and age there were no significant differences between subgroups.
The other distribution-based methods yielded similar estimates of what constitutes an important change in 6MWD. These analyses were based on all 460 patients. The moderate effect size (0.5×sd of change scores) was 29 m for all patients. Results for single studies were smaller (18–32 m) because of their inclusion and exclusion criteria. The empirical rule effect size for all patients was 42 m. Within studies, single estimates of the empirical rule effect size were smaller (20–41 m) because the range of 6MWD was smaller. This was a consequence of more restricted patient groups as a result of restricted inclusion and exclusion criteria. Again, stratified analyses by FEV1, sex and age did not show any significant differences between subgroups for these two distribution-based methods.
In total, 60.4 and 57.3% of patients exceeded the MID of the Chronic Respiratory Questionnaire and the Feeling Thermometer, respectively. This was comparable to the proportion of patients with 6MWD changes ≥35 m (50.7%), the average threshold for an important effect in 6MWD in the present analysis. The change scores of 24.4% of patients exceeded the MID of the St George's Respiratory Questionnaire.
DISCUSSION
Three different distribution-based methods showed that 6MWD should change by ∼35 m for patients with moderate to severe COPD to represent an important effect. Since correlations of 6MWD with patient-reported anchors were low, anchor-based methods were inappropriate and the interpretation aid for important effects derived from the present study does not reflect the MID of a patient-reported outcome. The pooled analyses across several studies yielded greater estimates compared with those based on single studies, as a result of widening the overall inclusion criteria for the study population. For example, one study included only Global Initiative for Chronic Obstructive Lung Disease stage III to IV patients 11. Therefore, the results based on the pooled data set were considered to be more informative because they generalise better to COPD patients.
The present study has strengths and limitations. An advantage is that nine trials with patients from five countries were included. Thus, the present population represented a broad COPD patient spectrum. This is particularly important when using distribution-based methods to avoid underestimation of what constitutes an important effect. In more homogeneous study populations, as is generally the case in single studies, threshold values for important effects can be underestimated because distributions are narrower (or sd smaller), as a consequence of stricter eligibility criteria. Another advantage of including a broad patient spectrum to determine what constitutes an important effect is that it can be used for any COPD population, including those enrolled in pharmacological intervention trials. The distribution of 6MWD of COPD patients included in these trials is likely to be covered by the distribution observed in the present analysis. However, the threshold estimates might not apply to COPD patients who are minimally limited in their exercise capacity, and for whom the 6MWD may not be a sensible test. Another strength is that all studies were methodologically sound and followed strict study protocols. A limitation is that the anchor-based approach could not be used because correlations were too low (correlation coefficients <0.5) and thus a solid MID for the 6MWD could not be detected 24. Finally, the analyses could not be stratified for baseline 6MWD because, by building subgroups, the sd would be unduly influenced. For example, if patients are stratified based on quartiles, patients in the two middle quartiles have much lower sds than those in the lowest and highest quartiles. Thus, for distribution-based methods, a valid and “unrestricted” sd was required.
The present estimates to interpret effects in 6MWD are lower than those reported by Redelmeier et al. 10 (54 m). The present authors do not believe that the difference between the results of the study by Redelmeier et al. 10 and the present study is due to differences in patient characteristics. In the study by Redelmeier et al. 10, patients also participated in a respiratory rehabilitation programme and they appear to be similar to patients included in the present study. In addition, neither the present stratified analyses nor those of Redelmeier et al. 10 indicated that the interpretation of effects differs between subgroups. Differences in the study design and statistical reasons could account for this difference. The sample size of the study by Redelmeier et al. 10 was smaller and 95% CIs around the 54 m were wide (37–71 m), with the lower boundary within the present estimates. Thus, the estimate of 54 m might differ from the present results only by chance, which the current authors consider to be a likely reason. Another possibility for lower estimates in the present study is that distribution-based approaches were used whereas Redelmeier et al. 10 used an anchor-based approach, in which patients judged their own walking ability relative to that of other patients. It is possible that this approach leads to larger estimates of what constitutes an important effect in general but there are limited data supporting or refuting this hypothesis. However, it is likely that the stringent criteria for interpreting change implemented by the present study would not have allowed Redelmeier et al. 10 to develop an MID estimate.
What evidence should future studies provide in order to further support the interpretation of effects in 6MWD? To determine the MID of patient-reported outcomes, anchor-based methods are recommended 14, 15. However, 6MWD is not a patient-reported outcome and, thus, these recommendations do not fully apply. They would only apply if the correlations with patient-reported outcomes (such as the HRQL instruments used in the present study) were sufficiently high for the change scores. The reason is that change score correlations are required in order to be certain that the new measure for which one intends to determine an MID has indeed measured a change related to a patient-important aspect. Redelmeier et al. 10 also considered a within-patient anchor-based approach, but found that correlations with 6MWD were too low for these anchors to provide meaningful estimates. Only the cross-sectional, but not the longitudinal, between-patient anchor-based approach was based on strong correlations that justify anchor-based methods 13, 14.
In agreement with these results, correlations of 6MWD with patient-reported outcomes were also too low in the present study. In the current authors’ view, it is unlikely that appropriate anchors reflecting the patients' perspective exist for 6MWD. However, investigators should not refrain from using anchor-based methods with patient-reported outcomes to explore if other anchors might fulfil these criteria. In particular, future studies should include a broad spectrum of COPD patients and attempt to use distribution- and anchor-based methods, if methodologically appropriate. Finally, only systematic reviews of these methodological studies may definitively inform clinicians and investigators about the interpretation of changes in 6MWD and ensure that the limitations of single studies can be detected.
If threshold values for important effects in 6MWD were in fact lower than previously assumed, this finding would have important implications for the design of studies. Randomised trials would need larger sample sizes to detect an effect of 35 m instead of 54 m, but they would be more likely to detect important changes if they were indeed sufficiently powered. Given that the 6MWD is a continuous outcome, the implications for sample size are not severe. Also, an increasing number of studies compare active treatments, such as drugs or physical exercise, in order to explore whether they are similarly effective in equivalence studies 11. For the design of these studies it is essential to establish a priori a threshold for what constitutes an important effect. Taking equivalence boundaries of 35 m (these two interventions would be deemed equally effective if the difference and its 95% CI were within ±35 m) is more conservative than equivalence boundaries of 54 m and also has important implications for study design and patients.
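To give a sense of scale for the sample size implications discussed above, the following Python sketch applies the standard normal-approximation formula for comparing two group means. It is illustrative only and not a calculation from the present study: the two-sided α of 0.05, the 80% power and the equal group sizes are assumptions, and the sd of 58 m is inferred from the reported 0.5×sd of 29 m for change scores. The function name is hypothetical.

```python
import math
from statistics import NormalDist


def n_per_group(delta, sd, alpha=0.05, power=0.80):
    """Patients per arm to detect a mean difference `delta` (two-sided test)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    return math.ceil(2 * (z_alpha + z_beta) ** 2 * sd ** 2 / delta ** 2)


sd_change = 58  # metres; assumed, inferred from 0.5 x sd = 29 m
n_54 = n_per_group(54, sd_change)
n_35 = n_per_group(35, sd_change)
print(f"per arm: {n_54} patients for 54 m, {n_35} patients for 35 m")
```

Under these assumptions the required sample size grows by a factor of about (54/35)², i.e. roughly 2.4, when the threshold is lowered from 54 m to 35 m.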
Conversely, knowledge of what constitutes an important effect informs the interpretation of clinical trials 16. Consider randomised trials comparing respiratory rehabilitation and usual care. In nine (81.8%) out of 11 trials, effect estimates exceeded the MID of the Chronic Respiratory Questionnaire, establishing large and patient-important effects of this intervention 21, 34–43. In contrast, assuming 54 m for a relevant change in 6MWD, only three (15.8%) out of 19 trials showed effects above this threshold 21, 34–42, 44–52. This inconsistency between the interpretation of effects on HRQL and 6MWD may raise the suspicion that 54 m may represent an exceedingly high estimate for an important change. If the estimate of ∼35 m is considered for the 6MWD, 12 (63.2%) out of the 19 trials showed patient-important effects, showing greater agreement with the interpretation of the effects of rehabilitation on HRQL. However, a note of caution is in order. Despite the validity of the present results for the statistical approaches, the findings by Redelmeier et al. 10 and the present observation of low correlations between patient-reported outcomes and the 6MWD cast doubt on the importance of the 6MWD as a primary patient-important outcome.
In conclusion, the present analysis of a large set of data across a broad spectrum of chronic obstructive pulmonary disease patients suggests that an important effect in 6-min walk distance may be lower than previously assumed. Three distribution-based methods showed that 6-min walk distance should change by ∼35 m for patients with moderate to severe chronic obstructive pulmonary disease to represent a relevant effect. This corresponded to a ∼10% change of the baseline 6-min walk distance (350 m) in these patients.
Support statement
M.A. Puhan is supported by a career award of the Swiss National Science Foundation (# 3233B0/115216/1). H.J. Schünemann is funded by a European Commission: The Human Factor, Mobility and Marie Curie Actions. Scientist Reintegration Grant (IGR 42192).
Statement of interest
Statements of interest for M.A. Puhan and H.J. Schünemann can be found at www.erj.ersjournals.com/misc/statements.shtml
Acknowledgments
The authors would like to thank all physiotherapists and physicians for assistance in the conduct of the study.
- Received October 24, 2007.
- Accepted May 19, 2008.
- © ERS Journals Ltd