The reproducibility of the 6-min walking test (6MWT) needs to be more solidly studied. This study aimed to investigate the reproducibility of two 6MWTs performed on subsequent days in a large and representative sample of patients with chronic obstructive pulmonary disease (COPD), and to quantify the learning effect between the two tests, as well as its determinants.
In a retrospective observational study, 1,514 patients with COPD performed two 6MWTs on subsequent days. Other measurements included body composition (dual X-ray absorptiometry), dyspnoea (Medical Research Council scale) and comorbidity (Charlson index).
Although the 6MWT was reproducible (intraclass correlation coefficient = 0.93), patients walked farther in the second test (391 m, 95% CI 155–585 m versus 418 m, 95% CI 185–605 m; p<0.0001). On average, the second 6MWT increased by 27 m (or 7%), and 82% of patients improved in the second test. Determinants of improvement ≥42 m in the second test (upper limit of the clinically important change) were as follows: first 6MWT <350 m, Charlson index <2 and body mass index <30 kg·m−2 (OR 2.49, 0.76 and 0.60, respectively).
The 6MWT was statistically reproducible in a representative sample of patients with COPD. However, the vast majority of patients improved significantly in the second test by an average learning effect of 27 m.
Chronic obstructive pulmonary disease (COPD) is a systemic disease characterised by progressive airflow limitation, exercise intolerance and physical inactivity [1, 2]. Although the degree of airflow obstruction is frequently used as a marker of disease severity, it does not adequately reflect extrapulmonary manifestations of COPD [3, 4]. Several field tests are available to assess exercise capacity in these patients, and they better reflect the extrapulmonary features of the disease.
The 6-min walking test (6MWT) is a simple and inexpensive test that provides a global and integrated response of both physical (pulmonary and nonpulmonary factors) and psychological factors [5, 6]. The 6MWT is used to assess functional exercise capacity before and after interventions [7, 8] and as a predictor of morbidity and mortality in COPD.
In general, the 6MWT is a reliable test in COPD patients, but a learning effect has been suggested [10–15], i.e. patients achieve a considerably greater walked distance when a second test is performed. However, the size of the learning effect is controversial, with reported values ranging from 2.6% to 22% [10, 11, 16–20]. Moreover, the external validity of previous studies is limited by their pre-specified inclusion criteria [10, 11, 13, 18]. Furthermore, researchers usually used statistical analyses that did not demonstrate trends and agreement between the two 6MWTs, thereby compromising the internal validity of the results. Additionally, the determinants of improvement in walking distance remain unknown. Considering the importance of the 6MWT in clinical practice, both when assessing patients with COPD and when designing exercise training programmes, its reproducibility and determining factors need to be studied more rigorously.
Therefore, the aim of this study was, first, to investigate the reproducibility of the 6MWT in a large and representative sample of patients with COPD and to quantify the learning effect between two 6MWTs performed on subsequent days. Secondly, we intended to study the determining factors of changes in the second 6MWT.
In a retrospective observational study, 1,683 patients with COPD were included. Data were collected from patients who were evaluated during the baseline assessment before entering a pulmonary rehabilitation programme at CIRO+ (Centre of Expertise for Chronic Organ Failure; Horn, the Netherlands) from January 2005 to August 2009. These retrospective analyses are institutional review board exempt due to the use of de-identified, pre-existing data.
Inclusion criteria were as follows: diagnosis of COPD according to criteria determined by the Global Initiative for Chronic Obstructive Lung Disease (GOLD); clinical stability (absence of exacerbations in the last 3 months); nonparticipation in rehabilitation programmes during the last 2 yrs; absence of unstable cardiac disease; and absence of neurological comorbidities that may limit 6MWT performance. 169 patients were excluded because they only performed one 6MWT. Therefore, 1,514 patients with stable COPD (59% males) completed two 6MWTs and were included in the analyses (table 1).
Two 6MWTs were performed according to the guidelines of the American Thoracic Society on subsequent days using a triangular walking course of 125 m. The walking tests were executed by a physiotherapist or a biometrist who walked behind the patient. Patients were instructed to walk as fast as they could and the distance walked was registered after 6 min. During the test, standardised encouraging phrases were given to patients each minute. Patients who used walking aids in daily life were allowed to use the devices during the 6MWT. Cardiac frequency (fC), arterial oxygen saturation as measured by pulse oximetry (Sp,O2), perceived dyspnoea and leg fatigue (modified Borg scale) were assessed before and after the 6MWTs. Oxygen supplementation was used if required, and oxygen desaturation during the 6MWT was defined as the difference between end and beginning Sp,O2 of ≥ -4% and/or an end Sp,O2 of <88%. Reference values for delta of Borg score and fC were those from van Stel et al. and for 6MWT were those from Troosters et al.
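The desaturation criterion above can be illustrated with a minimal sketch. It assumes that "a difference of ≥ -4%" means a fall in Sp,O2 of at least 4 percentage points from the start to the end of the test; the function name and parameter names are hypothetical and are not taken from the study.

```python
def is_desaturation(spo2_start: float, spo2_end: float,
                    drop_threshold: float = 4.0,
                    floor: float = 88.0) -> bool:
    """Classify one 6MWT as showing oxygen desaturation.

    Assumed reading of the criterion: a fall of at least `drop_threshold`
    percentage points between start and end, and/or an end value
    below `floor` (both in % saturation).
    """
    drop = spo2_start - spo2_end  # positive when saturation falls
    return drop >= drop_threshold or spo2_end < floor


# Illustrative values only, not patient data
print(is_desaturation(95, 91))  # fall of 4 points
print(is_desaturation(90, 87))  # fall of 3 points, but end value below 88%
```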
Spirometry (Masterlab®; Jaeger AG, Würzburg, Germany) was performed according to the European Respiratory Society recommendations and reference values were those from Knudson et al. Lung function parameters used for analysis were forced expiratory volume in 1 s (FEV1) and forced vital capacity (FVC).
A total body scan was performed by whole-body dual energy X-ray absorptiometry using a Lunar Prodigy system (GE Healthcare, Piscataway, NJ, USA) as described previously. Fat-free mass (FFM) was obtained as the sum of lean mass and bone mineral mass. The FFM index (FFMI) was calculated as FFM (kg) divided by height squared (m2). The body mass index (BMI) was calculated as body weight (kg) divided by height squared (m2).
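The two indices defined above reduce to simple ratios. The following sketch shows the arithmetic; the function names and the example values are hypothetical and serve only as an illustration.

```python
def body_mass_index(weight_kg: float, height_m: float) -> float:
    """BMI in kg per m2: body weight divided by height squared."""
    return weight_kg / height_m ** 2


def fat_free_mass_index(lean_mass_kg: float, bone_mineral_kg: float,
                        height_m: float) -> float:
    """FFMI in kg per m2: (lean mass + bone mineral mass) / height squared."""
    fat_free_mass = lean_mass_kg + bone_mineral_kg
    return fat_free_mass / height_m ** 2


# Illustrative values only, not patient data
print(round(body_mass_index(70.0, 1.75), 1))
print(round(fat_free_mass_index(48.0, 2.5, 1.75), 1))
```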
Functional limitation due to dyspnoea
The Medical Research Council (MRC) scale was used to evaluate the level of functional limitation due to breathlessness in activities of daily living.
The body mass index, airflow obstruction, dyspnoea and exercise capacity (BODE) index is a multidimensional grading system used as a predictor of risk of death in COPD patients and as an outcome reflecting disease severity. The index was calculated according to Celli et al.
The presence of comorbidities was evaluated using the Charlson index. It is composed of 19 categories of comorbidities and the total score reflects the cumulative increased likelihood of 1-yr mortality. A higher score indicates a more severe burden of comorbidities.
Statistical analysis was performed using the statistical package SPSS 17.0 (SPSS Inc., Chicago, IL, USA) and GraphPad Prism 5 (GraphPad Software Inc., La Jolla, CA, USA). Data were described as mean±sd. The intraclass correlation coefficient was used to verify the reproducibility of the 6MWT. The Bland and Altman plot was used to evaluate trend and agreement between the first and second tests and, additionally, the paired t-test was used to compare outcome parameters between the two tests. Unpaired t-test and one-way ANOVA (post hoc Tukey) were used to compare patient characteristics between different groups. Ordinal data were analysed using nonparametric tests. Logistic regression assessed determinant factors of a clinically important change in walked distance between the first and second tests (≥42 m). A p-value of ≤0.05 was considered to be statistically significant for all analyses.
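As an illustration of the agreement analysis described above, the following minimal sketch computes the Bland and Altman bias and limits of agreement for paired walking distances. It assumes the conventional definition (mean difference ± 1.96 sd of the differences); the study used SPSS and GraphPad Prism, so this is not the authors' code, and the distances shown are made up.

```python
from statistics import mean, stdev


def bland_altman(first: list[float], second: list[float],
                 z: float = 1.96) -> tuple[float, float, float]:
    """Return (bias, lower limit, upper limit) for paired measurements.

    The bias is the mean of (second - first); the limits of agreement
    are bias -/+ z times the sample sd of the differences.
    """
    if len(first) != len(second):
        raise ValueError("paired measurements must have equal length")
    if len(first) < 2:
        raise ValueError("at least two pairs are needed")
    diffs = [b - a for a, b in zip(first, second)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation (n - 1 denominator)
    return bias, bias - z * sd, bias + z * sd


# Hypothetical distances in metres, for illustration only
test_1 = [300, 350, 400, 450, 500]
test_2 = [330, 370, 420, 490, 520]
bias, lower, upper = bland_altman(test_1, test_2)
print(f"bias = {bias:.1f} m, limits of agreement = {lower:.1f} to {upper:.1f} m")
```

A positive bias indicates that the second test was, on average, longer than the first, which is the pattern interpreted here as a learning effect.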
In general, patients presented with moderate-to-very severe COPD. 24% of patients used ambulatory oxygen therapy, 32% used rollators and 1.3% used canes. The majority of patients (60%) had an abnormal BMI (low, overweight or obese), whereas 30% had an abnormally low FFMI. Moreover, 18% of patients had Charlson index >2 points, and 97% scored grade 2 or higher on the MRC dyspnoea scale. Males were older than females, presenting with lower FEV1 and FVC, higher inspiratory capacity, lower MRC dyspnoea grade and higher FFMI (table 1).
Reproducibility of the 6MWT
On average, patients walked 391 m (95% CI 155–585 m) in the first 6MWT and 418 m (95% CI 185–605 m) in the second 6MWT. The distance walked in the second test increased on average by 27 m (95% CI -37–107 m). 35% of the patients had a poor first 6MWT (<350 m). This proportion of poor walkers decreased in the second 6MWT to 28%.
82% of patients with COPD walked farther during the second 6MWT. In fact, 28% of the subjects increased their walked distance by >42 m, which is currently the upper limit of change considered to be an important treatment effect. Conversely, 6% of COPD patients decreased their walked distance by >42 m in the second test.
Rollator users tended to have a higher change than nonrollator users (mean±sd 30±51 versus 25±46 m, respectively; p = 0.058). No difference was observed when comparing the changes in walked distance between male and female subjects (26±51 versus 28±42 m, respectively; p>0.05). A similar proportion of males and females increased their walked distance in the second test (81% versus 83%, respectively). Furthermore, no difference was observed when comparing the changes in walked distance among GOLD stages I, II, III and IV (26±38, 24±48, 27±45 and 26±49 m, respectively; p>0.05).
The Bland and Altman plot confirms that patients increased the distance walked in the second test (fig. 1). Statistically, the 6MWT was reproducible (intraclass correlation coefficient (ICC) = 0.93, p<0.0001). Nevertheless, the limits of agreement between the two 6MWTs ranged from -67 to 120 m. There was no correlation between the mean distance walked in the two 6MWTs and the improvement in the second test (r = -0.01; p = 0.61).
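As a rough consistency check, and assuming the limits of agreement were computed in the conventional way (mean difference ± 1.96 sd of the differences, which the text does not state explicitly), the reported limits imply the following bias $\bar{d}$ and standard deviation of the differences $s_d$:

```latex
\begin{align*}
\bar{d} &\approx \frac{120 + (-67)}{2} = 26.5~\text{m},\\
s_d &\approx \frac{120 - (-67)}{2 \times 1.96} = \frac{187}{3.92} \approx 47.7~\text{m}.
\end{align*}
```

Both values are in line with the reported mean improvement of 27 m and with the subgroup standard deviations of the change (42 to 51 m).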
A model of logistic regression showed that a higher odds ratio (OR) for a clinically important improvement of ≥42 m in the second 6MWT in comparison with the first 6MWT occurred in patients who had a poor first 6MWT (<350 m) (OR 2.49; 95% CI 1.80–3.46; p<0.0001), patients without self-reported comorbidities other than COPD itself (OR 0.76; 95% CI 0.58–0.99; p = 0.043) and patients with BMI <30 kg·m−2 (OR 0.60; 95% CI 0.43–0.85; p = 0.004) (table 2). Conversely, none of the studied patient characteristics was a determinant for decreasing the second 6MWT.
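For readers less familiar with odds ratios, the sketch below computes an unadjusted OR and its 95% CI from a 2×2 table using the log (Woolf) method. This is a simpler calculation than the logistic regression model reported above and will not reproduce the values in table 2; the function name and the counts are hypothetical.

```python
import math


def odds_ratio_ci(a: int, b: int, c: int, d: int,
                  z: float = 1.96) -> tuple[float, float, float]:
    """Unadjusted odds ratio with a Woolf (log-based) confidence interval.

    a: exposed, outcome present      b: exposed, outcome absent
    c: unexposed, outcome present    d: unexposed, outcome absent
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive")
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts: improvement of >=42 m by first 6MWT below or above 350 m
or_, lo, hi = odds_ratio_ci(a=40, b=60, c=20, d=80)
print(f"OR = {or_:.2f} (95% CI {lo:.2f} to {hi:.2f})")
```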
Reproducibility of oxygen desaturation, fC and Borg symptom scores
On average, the change in oxygen saturation during the first and second 6MWTs was -5.7% (95% CI -15–0%) and -5.5% (95% CI -16–0%), respectively. The change in oxygen saturation was reproducible (ICC = 0.81, p<0.0001). The Bland and Altman plot shows agreement between change in oxygen saturation during both 6MWTs, with limits of agreement ranging from -7% to 8% (fig. 2). Moreover, the sensitivity and specificity to detect oxygen desaturation during the second 6MWT based on oxygen desaturation during the first 6MWT were 80% and 77%, respectively.
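The sensitivity and specificity reported above treat the second 6MWT as the reference and the first 6MWT as the "test". A minimal sketch of that calculation follows; the function name and the example classifications are hypothetical and do not come from the study data.

```python
def sensitivity_specificity(first: list[bool],
                            second: list[bool]) -> tuple[float, float]:
    """Sensitivity and specificity of the first test against the second.

    Each list holds one desaturation flag per patient (True = desaturation).
    The second test is taken as the reference classification.
    """
    if len(first) != len(second):
        raise ValueError("both lists must describe the same patients")
    tp = sum(1 for f, s in zip(first, second) if f and s)
    fn = sum(1 for f, s in zip(first, second) if not f and s)
    tn = sum(1 for f, s in zip(first, second) if not f and not s)
    fp = sum(1 for f, s in zip(first, second) if f and not s)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("reference must contain both positives and negatives")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical flags for eight patients, for illustration only
first_test = [True, True, True, False, False, False, False, True]
second_test = [True, True, False, False, False, False, True, True]
sens, spec = sensitivity_specificity(first_test, second_test)
print(f"sensitivity = {sens:.2f}, specificity = {spec:.2f}")
```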
When comparing the change in fC, patients in GOLD stage IV had a lower change (16±14 beats·min−1) compared with GOLD I, II and III during the first test (24±15, 24±15 and 22±14 beats·min−1, respectively; p>0.05 for all) and during the second test (18±13 beats·min−1 versus 25±13, 24±14 and 22±14 beats·min−1, respectively; p>0.05 for all). However, when taking into account the whole group, change in fC during the first 6MWT was on average 21 beats·min−1 (95% CI 0–46 beats·min−1) and 21 beats·min−1 (1–46 beats·min−1) during the second 6MWT. The reproducibility of change in fC when two 6MWTs were performed was only modest (ICC = 0.62; p<0.0001).
Change in Borg dyspnoea score was 2.60 points (0–6 points) and 2.52 points (0–6 points) during the first and second 6MWT, respectively. In addition, the change in Borg leg fatigue score was 1.98 points (0–5 points) during the first test and 2.01 points (0–6 points) during the second test. Once again, the reproducibility of change in Borg dyspnoea and leg fatigue score when two 6MWTs were performed was only modest (ICC = 0.59 and p<0.0001 for both).
This study examined the reproducibility of the 6MWT when two tests were performed on subsequent days in daily clinical practice in a large sample of patients with COPD entering pulmonary rehabilitation. Statistically, the 6MWT was shown to be reproducible. However, a majority of patients increased the distance walked in the second test (mean change: 27 m or 7% of the initial 6MWT), suggesting that there is a considerable learning effect. The Bland and Altman analysis confirmed that the second 6MWT was better than the first, with limits of agreement largely exceeding 42 m, which is considered the upper limit of a clinically important change in the 6MWT. Furthermore, we determined that a poor first 6MWT (<350 m), Charlson index <2 points or a BMI <30 kg·m−2 are significant determinants of a clinically important change in the second 6MWT (≥42 m) in comparison with the first 6MWT.
Various studies have reported a learning effect when repeated walking tests are performed in COPD. However, the results are controversial regarding the size of this learning effect. For example, Troosters et al. found a learning effect of 2.6% in 20 patients with COPD who performed two 6MWTs on the same day. Leach et al. found that patients with COPD and interstitial lung disease increased the distance walked by 8.6% when two 6MWTs were performed on the same day. Stevens et al. demonstrated a learning effect of ∼10% in patients with advanced lung disease who were participants in a pulmonary rehabilitation programme. Spencer et al. found a 7% improvement in the second 6MWT in patients with COPD, which is in accordance with our findings. Sciurba et al. also found an increase of 7% when two 6MWTs were performed. However, they included only patients with severe and very severe COPD. Besides the heterogeneity in the learning effect size, those studies often included patients with COPD and patients with other lung diseases [11, 13, 18] or only patients with severe disease, and this could compromise the external validity of their findings. Moreover, the majority of studies did not show the limits of agreement between the walking distance assessments, compromising the interpretation of the results and their internal validity [10, 18–20].
The clinical implication of our findings is evident, as the disagreement between two 6MWTs can influence interpretation of the performance in the test. Our results demonstrated that 82% of patients walked farther in the second 6MWT, meaning that an incorrect conclusion about functional exercise capacity could be drawn for the vast majority of patients entering a pulmonary rehabilitation programme if only one test was performed. Indeed, we have found that a poor first 6MWT, a score of <2 points on the Charlson index and non-obesity are significant determinants of a change in 6MWT ≥42 m. Therefore, incorrect workloads may be applied during exercise training if only one baseline 6MWT is performed. Furthermore, this disagreement can be clinically relevant as the 6MWT has been widely used as a predictor of morbidity and mortality in patients with COPD. For example, the proportion of patients who had a poor 6MWT (<350 m) decreased when the best of two walked distances was considered for analysis in our sample (35% versus 28%). This shows that, at the first test, 7% of patients were falsely classified as having a poor walked distance. Therefore, it is recommended that at least two 6MWTs are performed in clinical settings.
In our study, we also found that the change in oxygen saturation during the 6MWT was reproducible. Moreover, its sensitivity and specificity were 0.80 and 0.77, respectively. These findings are interesting for clinical practice as the 6MWT has been used to determine the need for ambulatory oxygen prescription in patients with COPD and results about its reproducibility are controversial. Poulain et al. found that oxygen desaturation, defined as a fall of ≥4% of resting Sp,O2 value during at least 3 min, was reproducible when three 6MWTs were performed in 10 patients with COPD. In contrast, Chatterjee et al. demonstrated that the 6MWT oxygen saturation has only modest reproducibility in determining the need for ambulatory oxygen (Sp,O2 ≤88% during at least 5 s) in stable COPD patients actively participating in a pulmonary rehabilitation programme when three 6MWTs were performed (κ statistic = 0.62, 72% of agreement between measurements).
Strengths, limitations and future perspectives
Although some authors have already studied the reproducibility of the 6MWT, our study has several features that strengthen the trustworthiness of our findings. First, the present sample (n = 1,514) was representative of a COPD population and comprised patients with different levels of disease severity. Secondly, the statistical analysis produced results of clinical importance as not only average data were described, but also limits of agreement between assessments. Furthermore, several methodological strategies were applied to ensure the reliability of the results: the same walking course was always used, avoiding the influence of track length and layout on the walked distance; tests were performed at the same period of the day; encouragement and instructions were standardised and supervisors were familiarised with the 6MWT; and patients who had previously participated in rehabilitation programmes were not included in the study. However, the study has some limitations. First, some variables that can influence the improvement in the second 6MWT, such as anxiety, depression and balance, were not studied. Secondly, the tests were not always executed by the same supervisor; however, as previously mentioned, all supervisors were thoroughly familiarised with the 6MWT standardisation. In addition, the present results are only applicable to 6MWTs performed before a pulmonary rehabilitation programme and on subsequent days.
In summary, the 6MWT was reproducible in a large and representative sample of patients with COPD. However, the vast majority of patients walked farther in the second 6MWT, with a learning effect of 27 m (7%) and limits of agreement that largely exceed the minimal clinically important difference. A poor performance in the first 6MWT, few comorbidities and non-obesity were the most relevant determinants of a change between the first and second tests that exceeds the threshold of clinical importance. Standardisation of the 6MWT and at least two tests are necessary to avoid incorrect assessment of functional exercise capacity.
We acknowledge the support of the European Respiratory Society, Fellowship number 634.
For editorial comments see page 244.
N.A. Hernandes was a recipient of a European Respiratory Society short-term fellowship (grant number: 634).
Statement of Interest
- Received September 4, 2010.
- Accepted November 26, 2010.
- ©ERS 2011