Abstract
We studied the distribution profiles and repeatability of key exercise performance parameters in the first large multicentre trials to include these measurements in chronic obstructive pulmonary disease (COPD).
After a screening visit, 463 subjects with COPD (mean±sd forced expiratory volume in 1 s 43±13% predicted) completed two run-in visits before treatment randomisation. At the run-in visits, measurements were conducted at rest, at a standardised time near end-exercise (isotime) and at peak exercise during constant work rate (CWR) cycle tests at 75% of each individual’s maximum work capacity. The intraclass correlation coefficient was used to evaluate the test-retest repeatability of measurements of endurance time (ET), inspiratory capacity (IC), ventilation and dyspnoea intensity (Borg scale) during exercise.
IC, ventilation and dyspnoea ratings were normally distributed; ET showed rightward skew (median<mean; skewness coefficient 10.9), with 16% of the sample having an ET >1 sd above the mean. ET was highly repeatable across run-in visits: 7.9±4.8 and 8.4±5.1 min (R = 0.84). IC values at rest, isotime and peak exercise were all highly repeatable (R≥0.87). Ventilation was repeatable over the same time-points (R≥0.92), as was dyspnoea intensity at isotime (R = 0.79) and at peak exercise (R = 0.81).
In conclusion, key perceptual and ventilatory parameters can be reliably measured during CWR cycle exercise in multicentre clinical trials in moderate to very severe COPD.
Measurements of exercise endurance time (ET), dynamic pulmonary hyperinflation and dyspnoea intensity during cardiopulmonary exercise testing are increasingly employed in clinical exercise testing laboratories as important outcome measures in chronic obstructive pulmonary disease (COPD) 1–8. Constant work rate (CWR) cycle exercise tests conducted at a fixed fraction of each individual’s pre-established maximal work rate have been shown to be more responsive than incremental cycle exercise tests or the 6-min walking distance tests for the purpose of bronchodilator evaluation 7. Indeed, CWR cycle endurance tests appear to be highly responsive to interventions that therapeutically alter dynamic ventilatory mechanics or ventilatory requirements, or both in tandem, in COPD 2–5, 7, 9. However, formal evaluation of the test-retest repeatability of this experimental technique has never previously been undertaken in the context of multicentre clinical trials. This information becomes critical for refining future study designs and exercise testing protocols and for interpreting the results of such studies. Moreover, validation of these measurements is a prerequisite for their future incorporation into testing protocols in clinical exercise testing laboratories.
Evidence from small single-centre studies of COPD patients has suggested satisfactory test-retest repeatability of measurements of perceptual and ventilatory parameters during CWR cycle exercise 10–19. In this study, a pooled analysis of data from two large multicentre, multinational clinical trials allowed us the unique opportunity to study the reliability of these perceptual and ventilatory responses to CWR cycle exercise conducted at 75% of peak work capacity (Wmax) in a large group of COPD patients. Our first objective, therefore, was to evaluate test-retest repeatability of measurements of ET, inspiratory capacity (IC), minute ventilation (V′E) and dyspnoea intensity during CWR exercise in a large patient population with moderate to very severe COPD.
Our second objective was to evaluate frequency distribution profiles of ET, dynamic pulmonary hyperinflation and dyspnoea intensity in this large population of patients. ET during CWR protocols in COPD is determined by the proximity of the targeted work rate during testing to the individuals’ critical power asymptote on the power–duration relationship 20. This could result in a high degree of variation in ET across individuals so as to make the test unsuitable for the purpose of physiological and subjective measurements in some patients. Frequency distribution analysis of ET in our population cohort allowed us to evaluate the general utility of this exercise testing protocol in COPD. Although dynamic pulmonary hyperinflation is an inevitable consequence of increased ventilatory demands in patients with expiratory flow limitation, very little information is available about the behaviour and distribution profile of operating lung volumes in COPD populations. This is the first large study to chart the variability of operating lung volumes during CWR cycle exercise in COPD and provides new insights into the heterogeneous pathophysiology of this condition. Finally, a number of studies have indicated that, during cycle exercise, leg discomfort and not dyspnoea is likely to be the proximate exercise-limiting symptom in patients with moderate-to-severe COPD 21, 22. We therefore evaluated the suitability of this exercise modality for the purpose of dyspnoea assessment.
METHODS
Subjects
Subjects were participants in one of two multicentre, randomised clinical trials examining the effects of tiotropium on exercise performance, dynamic lung volumes and exertional dyspnoea in COPD 2, 3. Subject eligibility criteria were identical in each trial and have been described previously 2, 3. Subjects were clinically stable patients with COPD, aged 40–70 yrs, with a cigarette smoking history >10 pack-yrs, a forced expiratory volume in 1 s (FEV1) ≤65% predicted and a plethysmographic functional residual capacity (FRC) ≥120% pred. Subjects with asthma or who had a contraindication to participation in pulmonary function or exercise testing were excluded. Subjects who had participated in the first trial were excluded from the second trial.
Study design
The two trials incorporated a similar study design. Studies were approved by the medical ethics committees of all sites (36 sites across five countries) and all subjects gave written informed consent before undertaking any study procedures. Eligibility criteria were assessed during an initial screening visit (visit 1) conducted 15 days prior to randomisation. At this visit, patients performed pulmonary function tests followed by a symptom-limited incremental cycle exercise test to determine a Wmax. Eligible patients completed another two visits (visit 2 on day -10, visit 3 on day -5) during a “run-in” phase, in order to familiarise them with all testing procedures and establish a standardised series of pre-randomisation familiarisation tests (i.e. training history). On these days, pulmonary function tests were followed by a symptom-limited CWR exercise test at 75% Wmax. On day 0 (visit 4), patients were subsequently randomised to the treatment portion of the studies. Visit 3 values were considered the baseline prior to the post-dose testing on visit 4. All analyses in the present study were performed on data obtained at the pre-randomisation visits. Subjects were instructed not to use their inhaled bronchodilators prior to testing based on a predefined schedule.
Procedures
Spirometry, body plethysmography and single-breath diffusing capacity of the lung for carbon monoxide (DL,CO) were performed in accordance with recommended techniques 23–25. The symptom-limited cycle exercise testing protocols have been described previously 2, 3. Across centres, various cardiopulmonary exercise testing systems (SensorMedics, Yorba Linda, CA, USA; MedGraphics, St Paul, MN, USA; Viasys Healthcare GmbH, Hoechberg, Germany) were used; all systems underwent stringent quality control and physiological validation prior to the study and were calibrated prior to each test. Incremental exercise tests included a steady-state resting period of ≥3 min followed by 3 min of loadless pedalling followed by stepwise increases of 10 W·min−1; Wmax was defined as the greatest work rate that the subject was able to sustain for ≥30 s. CWR exercise tests included a steady-state resting period followed by 1 min of loadless pedalling and then an immediate increase in work rate to 75% of Wmax; ET was recorded from the start of loaded pedalling to the point of symptom-limitation. IC was measured at rest, at 2-min intervals during exercise and at end-exercise (see online supplementary material) 2. The 10-point Borg scale 26 was used to assess dyspnoea intensity at similar time-points. “Rest” was defined as the steady-state period after ≥3 min of breathing on the mouthpiece while seated at rest on the cycle ergometer before exercise started, “peak” was defined as the last 30 s of loaded pedalling and “isotime” was defined as the duration of the shortest CWR exercise test on all testing days rounded down to the nearest full minute within each individual 2, 3.
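The isotime definition above can be illustrated with a minimal Python sketch. The function name and the example durations are hypothetical and are not taken from the trial data; the sketch assumes that endurance times are recorded in seconds for each of a subject's CWR tests.

```python
import math


def isotime_minutes(endurance_times_s):
    """Return isotime (whole minutes) for one subject.

    endurance_times_s: endurance times, in seconds, of all CWR tests
    performed by that subject (hypothetical input format).
    """
    if not endurance_times_s:
        raise ValueError("at least one CWR test is required")
    shortest = min(endurance_times_s)   # shortest test across testing days
    return math.floor(shortest / 60)    # round down to the nearest full minute


# Hypothetical subject whose two tests lasted 7 min 50 s and 8 min 24 s
print(isotime_minutes([470, 504]))  # 7
```

Measurements taken at this common time-point can then be compared within the same subject across visits, regardless of how long each individual test lasted.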
Statistical analysis
Values are presented as mean±sd. A p-value of 0.05 was used as the threshold of statistical significance. Frequency statistics examining the primary reasons for stopping exercise were analysed using a Pearson Chi-squared test. Subsequent analyses were conducted for ET, dyspnoea intensity ratings and physiological measurements that included oxygen consumption (V′O2), V′E, tidal volume (VT), respiratory frequency (fR) and IC. Frequency distributions were analysed for normality using the one-sample Kolmogorov–Smirnov test. Non-normal distributions were further examined for skewness (symmetry) or kurtosis (peakedness), which were considered significant if the skewness or kurtosis coefficients were >2. Test-retest reproducibility was assessed using the intraclass correlation (ICC) 27. A one-way random-effect ANOVA model was used unless there was a systematic difference between variables measured in the first and second CWR exercise test, in which case a two-way random-effect ANOVA model was used. Systematic differences were assessed by two-tailed, paired t-tests. A lower limit of the 95% confidence interval for the ICC ≥0.75 indicates high reproducibility 28. Reproducibility was also evaluated by calculating the within-subject coefficient of variation (CV). The subject sd should be independent of the subject mean for the CV to be reliable; this assumption was tested using a rank correlation coefficient.
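As an illustration of the repeatability statistic, the following minimal Python sketch computes a one-way random-effects ICC for single measurements from the usual between-subject and within-subject mean squares. The function name and the example values are hypothetical; the sketch does not implement the two-way model or the confidence interval described above, and it is not the analysis code used in the trials.

```python
def icc_one_way(scores):
    """One-way random-effects ICC for single measurements.

    scores: list of per-subject lists, each holding k repeated
    measurements (e.g. [visit 2 value, visit 3 value]).
    """
    n = len(scores)
    k = len(scores[0]) if n else 0
    if n < 2 or k < 2 or any(len(row) != k for row in scores):
        raise ValueError("need >=2 subjects, each with the same k >= 2 measurements")

    grand_mean = sum(sum(row) for row in scores) / (n * k)
    subject_means = [sum(row) / k for row in scores]

    # Between-subject mean square
    msb = k * sum((m - grand_mean) ** 2 for m in subject_means) / (n - 1)
    # Within-subject mean square
    msw = sum(
        (x - m) ** 2 for row, m in zip(scores, subject_means) for x in row
    ) / (n * (k - 1))

    return (msb - msw) / (msb + (k - 1) * msw)


# Hypothetical endurance times (min) for five subjects at two run-in visits
et = [[6.0, 6.5], [10.2, 11.0], [3.5, 3.9], [8.1, 7.8], [15.0, 16.2]]
print(round(icc_one_way(et), 3))
```

When the two tests agree perfectly within every subject, the within-subject mean square is zero and the function returns 1.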
RESULTS
A total of 463 subjects completed the run-in phase of the two clinical trials and were included in this analysis. Subject characteristics are shown in table 1. There was no significant heterogeneity between the two trials in these subject characteristics. Means and distributions for exercise outcomes (i.e. ET, V′O2, V′E, IC and dyspnoea ratings) were also similar across the two trials; therefore, the data were pooled for further analysis.
Subject characteristics
Peak incremental cycle work rate and V′O2 indicated significant exercise limitation (table 1). The primary reasons for stopping incremental exercise were available only for the 248 subjects completing the second validation study 3: breathing discomfort (51% of subjects), a combination of breathing and leg discomfort (33%), and leg discomfort (16%).
For both run-in CWR exercise tests, ET and dyspnoea intensity ratings were analysed for all 463 subjects; one subject was missing peak values for V′E and V′O2 and exercise IC values were available for 454 subjects. Peak V′O2 for both CWR tests was similar to that in the incremental test (tables 1 and 2). As in the incremental test, the majority of subjects stopped CWR exercise primarily due to dyspnoea or a combination of dyspnoea and leg discomfort (fig. 1). The distribution of reasons for stopping was also consistent across both CWR tests for the 435 subjects with available data in the intent-to-treat group.
The patient-reported main reasons for stopping exercise are shown for the initial incremental cycle exercise test (▪; n = 248 in the second validation study only) and for both run-in constant work rate cycle exercise tests (▓: first test; □: second test; n = 435 in both studies).
Reproducibility of exercise test variables
Measurement distribution
By visual inspection and by the Kolmogorov–Smirnov test (p>0.05), physiological variables (V′O2, V′E, fR, VT and IC) measured at isotime during exercise and at peak exercise all followed a normal distribution. Frequency histograms for resting IC and the magnitude of change in IC during exercise are provided in figure 2.
Frequency distributions for a and c) resting inspiratory capacity (IC) and b and d) the change in IC from rest to peak exercise (ΔIC peak-rest) are shown for the two pre-randomisation run-in visits. Results show a normal distribution and high reproducibility across visits. DH: dynamic pulmonary hyperinflation as indicated by a reduction in IC during exercise.
At both run-in visits, median ET values (6.4 and 6.8 min) were less than mean values (7.9 and 8.4 min), indicating a non-normal distribution. The ET distribution was asymmetric, with significant skewness and a long right tail (skewness coefficient 10.9; fig. 3). Log transformation of the data normalised the ET distribution (Kolmogorov–Smirnov test, p = 0.46; fig. 3). The distribution of the difference between ET at each run-in visit was also normal.
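The effect of a log transformation on a right-skewed variable can be sketched in Python as follows. The data are hypothetical, and the skewness statistic shown (the moment coefficient) is an assumption: the text does not state which skewness coefficient was calculated, so the values printed here are not comparable with the reported coefficient.

```python
import math
from statistics import mean, median


def skewness(values):
    """Moment coefficient of skewness, m3 / m2**1.5 (assumed definition)."""
    n = len(values)
    centre = sum(values) / n
    m2 = sum((v - centre) ** 2 for v in values) / n
    m3 = sum((v - centre) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# Hypothetical endurance times (min) with a long right tail
et_min = [3.1, 4.0, 4.8, 5.5, 6.2, 6.9, 8.0, 9.7, 13.5, 24.0]
log_et = [math.log(v) for v in et_min]

print(median(et_min) < mean(et_min))      # median below mean, as with right skew
print(skewness(et_min), skewness(log_et))  # skewness before and after log transform
```

In this hypothetical sample the log-transformed values are markedly less skewed than the raw values, which is the behaviour described above for ET.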
Distribution of constant work rate cycle exercise endurance time, a and c) showing significant rightward skew but good reproducibility across run-in visits in 463 subjects with moderate to very severe chronic obstructive pulmonary disease. b and d) A log transformation normalised the distribution of endurance time.
At both run-in visits, mean and median values were similar for dyspnoea intensity ratings measured at both isotime and peak exercise. The distribution of dyspnoea ratings measured at isotime and at peak exercise was symmetrical but not normal: measurements at both time-points showed significant kurtosis with flattening of the distribution curve (kurtosis coefficient >2).
Reliability of measurements
Repeatability of resting spirometric and plethysmographic lung volume measurements across visits 1–3 was excellent (ICC values: IC 0.89, slow vital capacity 0.92, FRC 0.90). Repeatability of exercise variables is shown in table 2. The assumption that the subject sd was independent of the subject mean was not violated for any variable except ET, where there was a clear relationship, i.e. variability increased as ET increased (r = 0.122, p = 0.011). There was a small but significant increase in ET in the second constant-load test, suggesting a learning effect (mean difference 0.56 min; p<0.001); therefore, the two-way ANOVA model was used to calculate the ICC for this variable. The CV is not valid as a measure of the reproducibility of ET measurement because of this effect. There was also a very small decrease in isotime measurements of dyspnoea intensity (mean difference -0.19 Borg units; p = 0.007). The increase in ET from the first to second run-in visit correlated significantly with the corresponding decrease in dyspnoea intensity at isotime during exercise (r = 0.46, p<0.0005).
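A systematic difference between two run-in tests, such as the learning effect described above, is assessed with a paired t-test. The following Python sketch computes only the mean difference and the t statistic; the function name and data are hypothetical, and the p-value (which would come from a t distribution with n-1 degrees of freedom) is deliberately not computed here.

```python
import math
from statistics import mean, stdev


def paired_t_statistic(first, second):
    """Return (mean difference, t statistic, degrees of freedom).

    first, second: paired measurements from the same subjects,
    in the same order (second minus first is analysed).
    """
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two equal-length samples with at least two pairs")
    diffs = [b - a for a, b in zip(first, second)]
    mean_diff = mean(diffs)
    se = stdev(diffs) / math.sqrt(len(diffs))  # standard error of the mean difference
    if se == 0:
        raise ValueError("all differences are identical; t is undefined")
    return mean_diff, mean_diff / se, len(diffs) - 1


# Hypothetical endurance times (min) at the first and second run-in tests
test1 = [5.0, 6.5, 7.0, 9.0, 12.0]
test2 = [5.5, 7.5, 7.0, 10.5, 12.5]
print(paired_t_statistic(test1, test2))
```

A positive mean difference with a large t statistic would point to a systematic increase between tests, in which case a two-way rather than one-way model is the appropriate choice for the ICC.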
Cycle endurance time
Of the 463 subjects, 43 (9.3%) had an ET of >1 sd below the mean (i.e. <202 s) and 72 (15.6%) had an ET >1 sd above the mean (i.e. >810 s). Patients with a low ET had a significantly lower FEV1, IC and peak incremental exercise capacity (work rate and V′O2); those with the highest ET had a better preserved resting IC and higher peak exercise capacity (table 3). The distribution of reasons for stopping exercise was also significantly different across these subgroups at the second run-in visit (p = 0.001): the low-ET group was more likely to stop primarily due to dyspnoea while the high-ET group stopped primarily due to leg discomfort (table 3).
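The subgroup definition above (more than 1 sd below or above the mean ET) can be sketched in Python as follows. The function name, labels and data are hypothetical, and the use of the sample sd (n-1 denominator) is an assumption, since the text does not specify it.

```python
from statistics import mean, stdev


def et_subgroups(et_seconds):
    """Label each ET as 'low', 'mid' or 'high' relative to mean +/- 1 sd.

    Returns (low cut-off, high cut-off, list of labels).
    """
    centre = mean(et_seconds)
    spread = stdev(et_seconds)  # sample sd; an assumption for this sketch
    low_cut, high_cut = centre - spread, centre + spread
    labels = [
        "low" if v < low_cut else "high" if v > high_cut else "mid"
        for v in et_seconds
    ]
    return low_cut, high_cut, labels


# Hypothetical endurance times (s) for seven subjects
et_s = [100, 400, 450, 500, 550, 600, 1500]
low_cut, high_cut, labels = et_subgroups(et_s)
print(round(low_cut), round(high_cut), labels)
```

With a right-skewed variable such as ET, the long upper tail tends to place more subjects above the upper cut-off than below the lower one, which matches the unequal subgroup sizes reported above.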
Cycle exercise endurance time (ET) subgroups
There was no significant relationship between ET variability across the two run-in visits (i.e. test-retest difference) and any of the following parameters: baseline FEV1 (% pred), FEV1/forced vital capacity (FVC) and IC (% pred); peak incremental V′O2 (mL·kg−1·min−1 or % pred) and peak work rate (% pred); height, weight and body mass index; or smoking history and COPD duration. Correlates of ET at the second run-in visit included (all p<0.0005) the peak incremental work rate (r = 0.370), peak incremental V′O2 (r = 0.336), pre-exercise resting IC (r = 0.313) and FEV1 (r = 0.274). The combination of the peak work rate and the resting IC accounted for 15% of the variability in ET (r = 0.38).
Dynamic lung hyperinflation
Of the 457 subjects with exercise IC measurements at the second run-in visit, 15% showed no change or an increase in IC during exercise, i.e. the rest-to-peak exercise change in IC was outside 1 sd from the mean (> -0.04 L change). The remaining 85% of the sample, who hyperinflated by at least 0.04 L, had a significantly (p<0.05) lower FEV1 (by 8%), larger resting IC (by 17%), lower exercise capacity (by 7%) and greater dyspnoea/V′O2 slopes (by 16%). Interestingly, there was no significant difference across these subgroups for baseline ET or for their primary reasons for stopping exercise.
The best correlate of the exercise-induced change in IC during the second run-in visit was the resting pre-exercise IC (r = -0.400, p<0.0005). The larger the IC, the more it decreased during exercise; those with the smallest resting IC had the least change in IC (little room to reduce inspiratory reserve volume (IRV) further). The exercise-induced change in IC also correlated with FEV1/FVC ratio (r = 0.281, p<0.0005): those with the worst ratios experienced greater reductions in IC. The combination of resting IC and FEV1 explained 40% of the variance in the exercise-induced change in IC.
DISCUSSION
This study is the first to demonstrate high test-retest repeatability of ET, ventilation, IC and dyspnoea intensity ratings during CWR exercise in a large population of patients with COPD. Our results also provide novel information on the distribution characteristics of physiological and subjective responses to CWR cycle exercise testing in this COPD cohort.
Repeatability of ET measurements
The average peak V′O2 values during incremental and CWR cycle tests were similar, thus confirming that the cycle endurance test (at 75% Wmax) is a valid test of maximal exercise performance in COPD. Peak symptom-limited V′O2 was also similar across the two run-in CWR tests (R = 0.93). The high level of reproducibility of ET reported here is in general accordance with previous small single-centre studies 9–14. Reproducibility of ET was high despite a variety of cardiopulmonary exercise testing systems being employed in multiple centres in diverse locations.
The ICC reflects the degree of variance in values between tests within a given subject, referenced to the degree of variance in values between subjects 33. Perfect agreement between tests within subjects results in an ICC of 1 and values ≥0.75 are accepted as being very highly reproducible. There are many sources of within-subject variance aside from random measurement errors. These include day-to-day variations in airway function, differences in medication use, subject motivation and extraneous factors such as subject encouragement, subject distraction and prior pre-test activity. In addition, learning or fatigue effects may contribute to within-subject variance between the two tests. Regardless of all these theoretical considerations, however, the ICC for ET in this study was 0.84, confirming high reproducibility. Confounding factors were minimised by performing tests using the same technician at the same time of day (on two separate days) with avoidance of inhaled medication and large meals prior to testing. Although the lung function inclusion criteria were generally broad, some disease heterogeneity was removed by requiring subjects to meet eligibility criteria for baseline levels of static hyperinflation. The within-subject CV was also presented. In the case of the ET measurement, the CV is likely to be less accurate than the ICC as a measure of reproducibility, since the variability was large, the variability was significantly dependent on the mean and the mean values changed significantly (albeit modestly) across run-in visits.
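For comparison with the ICC, a within-subject CV can be sketched in Python as below. The definition used here (the root of the mean within-subject variance, divided by the grand mean) is one common formulation and is an assumption, since the text does not give the formula used; the function name and data are hypothetical.

```python
import math
from statistics import mean, variance


def within_subject_cv(scores):
    """Within-subject coefficient of variation (assumed definition).

    scores: list of per-subject lists of repeated measurements.
    Returns sqrt(mean within-subject variance) / grand mean.
    """
    within_var = mean([variance(row) for row in scores])
    grand_mean = mean([x for row in scores for x in row])
    return math.sqrt(within_var) / grand_mean


# Hypothetical inspiratory capacity values (L) at two run-in visits
ic = [[2.10, 2.05], [1.60, 1.72], [2.80, 2.74], [1.95, 2.01]]
print(round(100 * within_subject_cv(ic), 1), "%")
```

Because this statistic pools a single within-subject sd across all subjects, it is only meaningful when that sd does not depend on the subject mean, which is why it is less suitable than the ICC for ET.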
The fixed work rate of 75% of each individual’s pre-determined Wmax was selected because previous experience has taught us that this was likely to provide a desirable duration of ∼6–8 min of physiological data collection in patients with moderate to severe COPD 6. Indeed, the central tendency (mean, median) for ET fell within this range. The somewhat arbitrary use of 75% Wmax for all subjects would perhaps result in greater variability in ET than the choice of a work rate based on a fixed proportion of a narrower intensity range such as between the lactate threshold (or critical power asymptote) and peak. However, selection of a CWR intensity using this latter method would not be practical in many contexts.
Unlike the normal distribution of the physiological parameters tested, the frequency distribution for ET was significantly skewed to the right; however, a log transformation of the ET data normalised this distribution 34. Of the 463 enrolled patients, 9.3% had a low ET of >1 sd below the mean (i.e. ET <202 s) and 15.6% had a high ET >1 sd above the mean (i.e. ET >810 s). Patients with a low ET had a lower FEV1, IC and peak exercise capacity (work rate and V′O2), with evidence of intolerable dyspnoea at the peak of exercise. Those with the highest ET had a better preserved resting FEV1, IC and higher peak exercise capacity in the incremental test. Patients who occupied the extremes of the frequency distribution for ET could, for the most part, be readily identified during the screening visit based on peak incremental work rate (or V′O2) and baseline resting IC. The rightward skew of the ET distribution suggests that the 16% of patients with the best exercise endurance were probably exercising at a work rate below their critical power 20. For these patients, a higher work rate (relative to maximum) may be more appropriate for repeat testing. The corollary of this is that selection of a lower CWR might be preferable for severely breathless patients with an excessively low peak incremental work rate (or V′O2) and resting IC during the screening visit.
Measurement of dynamic pulmonary hyperinflation
Although many centres had little or no experience with IC measurements during exercise, test-retest reproducibility for measurement of operating lung volumes (IC) was excellent, as were measurements of ventilation and breathing pattern. IC values at rest, isotime and peak exercise were all highly repeatable (R>0.87) between run-in tests 1 and 2. Changes in IC during exercise (from the pre-exercise resting value) reliably reflect changes in end-expiratory lung volume (EELV), provided total lung capacity (TLC) does not change during exercise 35. A number of mechanical studies have confirmed stability of TLC as reflected by consistency of peak inspiratory oesophageal pressures during serial IC manoeuvres during exercise, thereby validating this method of measuring dynamic lung hyperinflation 12, 36. Recent studies employing optoelectronic plethysmography to measure chest wall and abdominal motion during exercise have suggested that some patients with COPD do not exhibit dynamic hyperinflation during exercise, at least as inferred from lower chest wall displacement 37, 38. The current study is the first to analyse the behaviour of operating lung volumes during exercise, measured by change in IC, in a large COPD cohort. The change in IC from rest to peak exercise, inversely reflecting the change in dynamic EELV, was normally distributed. The majority (85%) of the sample showed an increase in EELV, with an average increase of 0.42 L above resting values, in accordance with previous reports 2, 3, 9, 10, 12, 19, 39. The change in dynamic IC during exercise was directly related to the resting baseline IC: those with lowest resting IC (the greatest resting hyperinflation) showed the least amount of dynamic hyperinflation.
Measurement of exertional dyspnoea
Treadmill walk tests appear to be more popular in clinical exercise testing laboratories than cycle endurance tests and may better mimic daily activity. However, precise measurements of power output and operating lung volumes are more difficult during treadmill exercise, and arterial oxygen desaturation is more pronounced during such weight-bearing activity in COPD patients 40. Previous studies have provided evidence that leg discomfort is a more prominent exercise-limiting symptom during cycle exercise than dyspnoea 21, 22. In the current study, however, the majority (over 80%) of patients stated on direct questioning that dyspnoea was the primary or co-primary symptom limiting exercise, and Borg ratings of dyspnoea intensity at peak exercise were “severe” or “very severe”.
Since dyspnoea is a subjective experience, it is reasonable to anticipate that intensity ratings would be less reproducible than measurements of objective physiological parameters. Given the stability of physiological measurements across run-in tests, the very small but statistically significant improvement in standardised measurements of dyspnoea intensity (by a mean of ∼0.2 Borg units) and the resultant small improvement in ET (by a mean of ∼30 s) over the course of the two run-in visits suggest that subjective measurements may be more susceptible to familiarisation effects on repeated testing. However, an ICC of 0.79 for Borg ratings at isotime and of 0.81 at peak exercise was reassuring. The dominance of severe dyspnoea, and the proven repeatability of its measurement during cycle exercise in COPD, suggest that this testing protocol, which includes two run-in tests before randomisation, is suitable for evaluating the impact of dyspnoea-relieving interventions.
In conclusion, this is the first large study to demonstrate that the measurement of ET, operating lung volumes, ventilation and ratings of dyspnoea intensity is highly reproducible in COPD patients during symptom-limited, high-intensity, CWR exercise. Measurement of dynamic lung hyperinflation and dyspnoea intensity is potentially clinically important and can now be reliably incorporated into cardiopulmonary exercise test protocols. This approach allows a more comprehensive characterisation of the individual patient presenting with exercise limitation. Moreover, our results attest to the feasibility of conducting accurate measurement of these key exercise performance parameters using the CWR cycle test in the setting of multicentre clinical trials in patients with COPD.
Support statement
This study received financial support from Boehringer Ingelheim GmbH, Ingelheim, Germany. D.E. O’Donnell held a career scientist award from the Ontario Ministry of Health.
Statement of interest
Statements of interest for D.E. O’Donnell, F. Maltais and H. Magnussen, and for the study itself, can be found at www.erj.ersjournals.com/misc/statements.dtl
Acknowledgments
The results of this study were presented, in part, at the European Respiratory Society Annual Congress 2004.
Footnotes
- This article has supplementary material accessible from www.erj.ersjournals.com
- Received November 7, 2008.
- Accepted February 23, 2009.
- © ERS Journals Ltd