Abstract
This document reviews 1) the measurement properties of commonly used exercise tests in patients with chronic respiratory diseases and 2) published studies on their utility and/or evaluation obtained from MEDLINE and Cochrane Library searches between 1990 and March 2015.
Exercise tests are reliable and consistently responsive to rehabilitative and pharmacological interventions. Thresholds for clinically important changes in performance are available for several tests. In pulmonary arterial hypertension, the 6-min walk test (6MWT), peak oxygen uptake and ventilation/carbon dioxide output indices appear to be the variables most responsive to vasodilators. While bronchodilators do not always show clinically relevant effects in chronic obstructive pulmonary disease, high-intensity constant work-rate (endurance) tests (CWRET) are considerably more responsive than incremental exercise tests and 6MWTs. High-intensity CWRETs need to be standardised to reduce interindividual variability. Additional physiological information and responsiveness can be obtained from isotime measurements, particularly of inspiratory capacity and dyspnoea. Less evidence is available for the endurance shuttle walk test. Although the incremental shuttle walk test and 6MWT are reliable and less expensive than cardiopulmonary exercise testing, two repetitions are needed at baseline. All exercise tests are safe when recommended precautions are followed, with evidence suggesting that no test is safer than others.
A review of exercise testing to evaluate interventions aimed to improve exercise tolerance in respiratory patients http://ow.ly/U37mQ
Introduction
From an evidence-based perspective, performance during standardised exercise tests (both laboratory and field tests) and the associated pathophysiological responses are of considerable importance in the multidimensional evaluation of most respiratory diseases [1, 2]. Exercise testing is fundamental for the accurate quantification of cardiorespiratory fitness and for identifying mechanisms underlying exercise intolerance, particularly with regard to activities having a significant aerobic-energetic requirement [3, 4]. However, none of the primary test formats could reasonably be regarded as stressing purely “aerobic” mechanisms; their symptom-limited character also confers varying degrees of “anaerobiosis”. Several indices of physiological response relate independently to major clinical outcomes such as survival and hospital admissions, thus allowing considerable improvement in prognostic stratification [1, 5–12].
Aware of its importance, regulatory agencies such as the United States Food and Drug Administration (FDA) and European Medicines Agency/Committee for Medicinal Products for Human Use have issued (in draft or final form) guidelines for the pharmaceutical industry recognising exercise testing as an efficacy end-point for interventions in chronic obstructive pulmonary disease (COPD) and pulmonary arterial hypertension (PAH) [13–15]. Direct assessment of the effects of interventions on exercise performance is also likely to be relevant for other chronic respiratory conditions.
There is now a substantial body of evidence relating to the value of different exercise indices in assessing the effects of therapeutic interventions in respiratory diseases. In 2012, the scientific committee of the European Respiratory Society (ERS) approved the constitution of a task force whose goal was to review comprehensively the value and limitations of different exercise indices as outcomes for therapeutic interventions, based upon current scientific evidence. This document summarises the work of the task force.
The primary target audience comprises clinicians and researchers who use exercise testing in the evaluation of interventions. The task force agreed that only indices with demonstrable impact on intervention-related improvement in exercise tolerance would be addressed. Aspects such as quality control, laboratory requisites and safety measures are not included, as these are well covered elsewhere. Some tests, such as the sit-to-stand test and the gait-speed test, are not addressed, since they depend substantially more on muscle strength, equilibrium and gait balance than on the mechanisms of oxygen and carbon dioxide transport to which interventions such as endurance training, bronchodilators and pulmonary vasoactive substances are directed. Other tests, such as the step test and the stair-climbing test, are mentioned only in the online supplementary material, since it was considered that there was insufficient published information at the time of writing to support their inclusion.
Methods
Studies that report the evaluation or use of the incremental (or ramp) exercise test (IET), the constant work-rate exercise test (CWRET), the 6-min walk test (6MWT), the incremental shuttle walk test (ISWT) and the endurance shuttle walk test (ESWT) in adults and children with chronic respiratory diseases were reviewed, without restrictions on study design. MEDLINE and the Cochrane Library were searched from 1990 to December 21, 2014. Selected references considered of great relevance were included up to March 2015. Reference lists of all primary studies and review articles were examined for additional citations. Only studies written in English, or for which an English translation was available, were consulted. Studies were included that referred (singly or in combination) to reported validity (i.e. the extent to which a test or variable is related to the function of a physiological system or to patient-meaningful variables such as symptoms or physical activity), precision or reproducibility, prognostic information (i.e. relationship with the natural history of the disease), discrimination (i.e. whether a variable can differentiate the severity of the disease as conventionally measured), clinically meaningful difference and test response to interventions. Reviewers excluded studies that did not meet the inclusion criteria based on title or abstract. Studies that met the inclusion criteria were retrieved in full text to determine whether they were suitable for inclusion. The articles included by the primary author of each section were approved by a second reviewer with expertise in the field. In case of discrepancies, differences were resolved by consensus.
Laboratory-based exercise tests
Laboratory-based tests are conducted on either a cycle ergometer or a motorised treadmill. Broad arrays of physiological responses are measured, most commonly on a breath-by-breath basis, throughout the test, at the limit of tolerance and in recovery [3, 4].
Incremental work-rate tests
The IET permits the evaluation of both submaximal and peak exercise responses, providing several indices relevant to the evaluation of patients with respiratory diseases [3, 4, 16]. This allows the identification of underlying mechanisms of exercise intolerance [3, 17, 18]. Use of the IET can also rule out certain medical conditions that pose a risk for exercise interventions, thus increasing their safety [3, 18].
Procedure
As the procedure is well standardised and is automated (i.e. computer-driven cycle ergometer or treadmill), there is little interoperator variability [3, 18]. The cycle ergometer has more advocates than the treadmill because it is less expensive, occupies little space, is less prone to movement artefacts (making it easier to take additional measurements), requires relatively little patient practice and (unlike the treadmill) the external power output is accurately known [3, 18]. Conversely, walking on the treadmill may be more familiar to the patient [3, 18]. Physiological responses to cycle ergometer and treadmill tests differ, as can the physiological mechanisms limiting exercise tolerance. For example, in COPD patients, cycle ergometry results in a greater likelihood of exercise intolerance resulting from leg fatigue than from dyspnoea [19] (online supplementary material). However, arterial desaturation occurs more frequently with treadmill walking than with cycling in COPD patients [20]. Imposing linear incremental work-rate (WR) profiles for treadmill exercise can be problematic, because most speed/grade increments incorporated into clinical exercise testing do not result in linear increases in WR [21]. A useful recent development is a protocol that generates a linear WR profile through continuous incrementing of both speed and grade [21]. Continuous monitoring of arterial oxygen saturation (SpO2) and heart rate, with verbal encouragement during the test, is recommended [3] (table 1).
Characteristics of the tests reviewed
It is conventional practice in healthy individuals to select the rate at which WR is incremented (ΔWR/Δt) for an IET such that the tolerable limit (tLIM) is reached within ∼10 min [3, 18]; i.e. ΔWR/Δt does not affect peak oxygen uptake (V′O2peak) [22] (online supplementary material). However, no recommendations have been developed for respiratory disease populations. Some data suggest that shorter test durations (e.g. ∼5–9 min) may be as suitable for COPD [23]. A ΔWR/Δt of 5–10 W·min−1 may be used in more severe patients to ensure a sufficient test duration [3, 18]. However, as oxygen uptake lags WR throughout the IET, peak WR (WRpeak) will be higher the more rapidly WR is incremented [22] (online supplementary material).
Most modern breath-by-breath cardiopulmonary exercise testing systems provide a variety of possibilities regarding data averaging (suitably designed mixing chamber systems can provide adequate temporal resolution for incremental testing). Differences in data averaging (i.e. number of breaths, or time intervals such as 10, 20 or 30 s) can have profound effects on some variables (table 1). Averaging of data over 20–30-s intervals has been recommended [3], but 10-s averages have also been used in some trials [24–26]. Therefore, data averaging needs to be standardised and maintained constant within any given trial.
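The dependence of a peak value on the averaging interval can be illustrated with a minimal sketch. The function names, the fixed-width binning scheme and the breath-by-breath values below are hypothetical and chosen only for illustration; commercial systems may implement averaging differently.

```python
from statistics import mean

def bin_average(times_s, values, bin_s=20.0):
    """Average breath-by-breath values over consecutive fixed-width time bins.

    Returns a list of (bin_start_s, mean_value) for every non-empty bin.
    """
    bins = {}
    for t, v in zip(times_s, values):
        index = int(t // bin_s)          # which bin this breath falls into
        bins.setdefault(index, []).append(v)
    return [(i * bin_s, mean(vs)) for i, vs in sorted(bins.items())]

def peak_value(times_s, values, bin_s=20.0):
    """Highest bin-averaged value; it depends on the chosen bin width."""
    return max(avg for _, avg in bin_average(times_s, values, bin_s))

# Hypothetical breath-by-breath V'O2 (L/min), one breath every 5 s.
times = [0, 5, 10, 15, 20, 25, 30, 35]
vo2 = [1.00, 1.10, 1.05, 1.25, 1.20, 1.40, 1.30, 1.50]

peak_10s = peak_value(times, vo2, bin_s=10.0)  # shorter bins, higher peak
peak_20s = peak_value(times, vo2, bin_s=20.0)  # longer bins, lower peak
```

With these illustrative data the 10-s average yields a higher peak than the 20-s average, which is why the averaging interval should be fixed in advance within a trial.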
Safety
Adverse events are rare during properly supervised tests (table 1). In the largest study looking at 5060 cardiopulmonary exercise tests in high-risk cardiovascular patients, including 196 PAH patients, the adverse event rate was 0.16% with no fatalities [27]. In a study of PAH in adults, there were no events in 242 tests [28]. Cardiopulmonary exercise testing appears also to be safe in the paediatric population [29]. In patients without known cardiac problems, the complication rate is even lower, with a rate of death of 2–5 per 100 000 tests [3]. Personnel conducting tests should be qualified to detect potentially life-threatening signals and follow safety recommendations [3, 18] (table 1).
Peak oxygen uptake
V′O2peak represents the highest oxygen uptake (V′O2) achieved in the IET at the subject's limit of tolerance; i.e. it is a symptom-limited measure. With good subject effort, V′O2peak is closely reflective of the subject's “maximum” V′O2, the gold-standard index of aerobic capacity [3, 18].
Validity
V′O2peak is a useful outcome (tables 1 and 2) because normal reference values have been better established than for other exercise variables [3, 18] and because, combined with other response variables obtained in the IET, characteristic pathophysiological profiles of the underlying causes of impairment can be discerned [3, 18, 30]. Sufficient aerobic capacity is required to adequately perform daily living activities [31] and V′O2peak demonstrates a good correlation (r=0.54) with daily living activity [32].
Measurement properties of the tests reviewed
With appropriate quality control measures (i.e. verification of calibration gases, training of personnel from the different laboratories on the procedures, and serial biological controls), V′O2peak is consistent among different laboratories in multisite COPD clinical trials [24, 33]. Inaccurate results from less-experienced sites [34] were considered the cause of the failure of V′O2peak to show an effect with the approved dose (100 mg) of sitaxsentan (an endothelin-A receptor antagonist) in the Strategies to Increase Confidence, Independence and Energy (STRIDE)-1 study [35]. Therefore, to use V′O2peak as an outcome in clinical trials, quality control measures should be taken, peak criteria standardised and expertise validated at all sites.
Precision
V′O2peak is remarkably repeatable, with no learning effect and coefficients of variation ranging from 3% to 9% in respiratory patients [3, 28, 36] (tables 1 and 2). Therefore, there is no need to perform more than one baseline IET in clinical trials when maximal effort criteria are standardised.
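A coefficient of variation for repeated tests can be computed in several ways, and the studies cited do not all state which one was used. The sketch below shows one common definition for duplicate measurements (within-subject standard deviation divided by the grand mean); the function name and the test-retest values are hypothetical.

```python
from math import sqrt

def within_subject_cv(test1, test2):
    """Within-subject coefficient of variation (%) from paired test-retest data.

    Assumes one common definition: within-subject SD = sqrt(sum(d^2) / 2n),
    expressed as a percentage of the grand mean of all measurements.
    """
    n = len(test1)
    sum_sq_diff = sum((a - b) ** 2 for a, b in zip(test1, test2))
    within_sd = sqrt(sum_sq_diff / (2 * n))
    grand_mean = (sum(test1) + sum(test2)) / (2 * n)
    return 100.0 * within_sd / grand_mean

# Hypothetical V'O2peak values (L/min) from two IETs in four patients.
first_test = [1.00, 1.20, 0.80, 1.50]
second_test = [1.05, 1.15, 0.85, 1.45]

cv_percent = within_subject_cv(first_test, second_test)
```

For these illustrative values the result is a little over 3%, i.e. at the lower end of the range reported above.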
Prognostic information
V′O2peak is an excellent general predictor of survival for most chronic respiratory diseases [5, 6, 9–12, 37, 38]. In one study, V′O2peak (but not 6-min walking distance (6MWD)) was able to predict clinical stability in idiopathic PAH [39]. However, to date there is little information on the impact on survival of interventions yielding improvements in V′O2peak in respiratory patients.
Discrimination
V′O2peak can gauge severity in PAH patients [40]. In COPD, V′O2peak can stratify severity by survival [5] and spirometric Global Initiative for Chronic Obstructive Lung Disease (GOLD) stages [41]. In idiopathic pulmonary fibrosis (IPF), arterial oxygen tension at V′O2peak can also help to gauge severity [2].
Clinically meaningful difference
There is very little information about what constitutes a minimal clinically important difference (MCID) in V′O2peak (table 2). In the National Emphysema Treatment Trial (NETT), 4±1 W was considered the symptoms-anchored MCID for severe COPD patients [42], translating into a V′O2peak change of ∼0.04±0.01 L·min−1.
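The translation from a work-rate MCID to a V′O2peak change can be reproduced with a short sketch. It assumes an oxygen cost of cycling of about 10 mL·min−1·W−1, a typical textbook value that is not stated in the text but is consistent with the figures quoted; the constant and function names are hypothetical.

```python
# Assumed oxygen cost of cycle ergometry (mL/min per W); a typical value,
# not one taken from the NETT analysis itself.
VO2_COST_ML_PER_MIN_PER_W = 10.0

def wr_change_to_vo2_l_min(delta_w, cost=VO2_COST_ML_PER_MIN_PER_W):
    """Convert a change in work rate (W) into a change in V'O2 (L/min)."""
    return delta_w * cost / 1000.0

mcid_vo2 = wr_change_to_vo2_l_min(4.0)    # 4 W MCID
spread_vo2 = wr_change_to_vo2_l_min(1.0)  # the +/- 1 W uncertainty
```

Under this assumption, 4±1 W corresponds to about 0.04±0.01 L·min−1, matching the value given above.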
In patients with PAH, V′O2peak has been suggested as a goal of therapy, with <10 mL·min−1·kg−1 indicating poor prognosis and the need to escalate treatment and >15 mL·min−1·kg−1 indicating better prognosis [43].
Evaluative aspects
Regarding nonpharmacological interventions, several studies including GOLD stages 2–4 COPD patients of widely ranging age (some aged >75 years) [44] have shown significant, yet modest, increases in V′O2peak after lower-limb endurance muscle training [25, 44–47]. In general, endurance training increases V′O2peak in PAH [48, 49], IPF [50], cystic fibrosis [51, 52] and asthma [53]. Reported changes following pulmonary rehabilitation in COPD patients are in the range 0.1–0.5 L·min−1 or ∼10–40% of baseline, with a mean improvement of ∼11% [54]. Rehabilitation in PAH patients increases V′O2peak by 1–1.5 mL·min−1·kg−1 [48, 49], and a similar responsiveness is observed in IPF patients [50]. Lung transplant surgery [55, 56] and successful lung volume reduction surgery [57] also consistently result in a greater post-intervention V′O2peak (table 3). V′O2peak has also been shown to be responsive to oxygen therapy both in patients with COPD and those with cystic fibrosis who desaturate during exercise [58, 59].
Responsiveness of the tests to rehabilitation
With regard to pharmacological interventions, V′O2peak can improve after short- [60] and long-acting inhaled bronchodilator therapy [60–63] in COPD patients, although the magnitude of this response tends to be modest (0.04–0.18 L·min−1) [60] and it is not always evident [60, 64]. V′O2peak has been able to detect the long-term (i.e. 3–12 months) effects of several approved pharmacological therapies for PAH. The effects seen in idiopathic PAH are of the order of 1.5–2 mL·min−1·kg−1 or ∼9–14% [65, 66]. In one PAH study, the effect observed was less deterioration than in the control group (−7% versus −16%, respectively) [67]; however, this is not the case in all studies [35]. V′O2peak has been shown to significantly improve after oxygen therapy in IPF [68] and COPD [69].
Peak work-rate
WRpeak is the highest WR achieved in an IET at the subject's limit of tolerance. It has the advantage over V′O2peak that it can be determined without a metabolic measurement system. However, a disadvantage is that it is dependent on the rate of WR increase [22] (online supplementary material).
Precision
Coefficients of variation for WRpeak in adult respiratory patients range from 3.5% to 13.8%, averaging 7.5% [3]. In children with cystic fibrosis, WRpeak appears to be slightly less reproducible, with a coefficient of variation ∼10% [70] (table 2).
Prognostic information
There is little direct information about the predictiveness of WRpeak, since most of the studies evaluating the IET have focused on V′O2peak. WRpeak in children with cystic fibrosis [71] and in adults with PAH is predictive of survival [12] (table 1).
Discrimination
WRpeak differs among COPD patients according to spirometric GOLD stage, but with significant overlap [41].
Clinically meaningful difference
From the NETT data, anchor- and distribution-based analysis suggested 4±1 W as the MCID for severe COPD patients [42] (table 2).
Evaluative aspects
WRpeak is at least as responsive as V′O2peak to rehabilitative interventions in studies including COPD patients ranging widely in age and severity [25, 44–47, 72]. In a meta-analysis of 16 studies of pulmonary rehabilitation, the mean (95% CI) pooled effect was 6.8 (1.9–11.6) W increase in WRpeak [73]. Another systematic review comparing continuous with interval training reported increases (95% CI) in WRpeak of 11 (9–13) W and 10 (8–11) W, respectively [74]. WRpeak increases after rehabilitation in PAH patients by 15–25 W [48, 49] (table 3). WRpeak was also increased after lung volume reduction surgery [42, 57]. Oxygen therapy has shown average increases of ≥10 W in patients with COPD, IPF and cystic fibrosis who desaturate during exercise [58, 59, 75].
WRpeak was an outcome in a few studies with both short-acting (salbutamol and ipratropium) and long-acting (formoterol, salmeterol and tiotropium) bronchodilators in COPD patients. Most, but not all of these studies reported significant, yet small (i.e. 3–10 W) improvements in WRpeak [60, 64] (table 4).
Responsiveness of the different tests to bronchodilators for chronic obstructive pulmonary disease (COPD), vasodilators for pulmonary arterial hypertension (PAH) or pirfenidone for idiopathic pulmonary fibrosis (IPF)
Ventilation–carbon dioxide output indices
Procedure
There is some controversy in the literature about how best to estimate the slope of the ventilation–carbon dioxide output (V′E–V′CO2) relationship (online supplementary material). We recommend that the slope estimation is confined to the demonstrably linear region of the V′E–V′CO2 relationship, i.e. excluding the curvilinear region beyond the point at which arterial (PaCO2) and end-tidal carbon dioxide tension are reduced by ventilatory compensation for the exercise-associated metabolic acidosis. Because V′E–V′CO2 minimum (i.e. the lowest value attained on the IET, typically at the respiratory compensation point) and V′E–V′CO2 at the lactate threshold (θL) (online supplementary material) are so similar (r=0.99; with limits of agreement ∼±1) and θL may be difficult to discern in some respiratory patients, V′E–V′CO2 minimum may be a more reliable measurement to use [76].
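The recommendation to restrict slope estimation to the linear region can be sketched as follows. The example assumes that the number of data points preceding the respiratory compensation point has already been identified by the investigator (its detection is not attempted here), and it interprets the minimum as the lowest ratio of V′E to V′CO2. All names and data are hypothetical.

```python
def linear_fit(x, y):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def ve_vco2_slope(vco2, ve, n_linear):
    """Slope of V'E against V'CO2 using only the first n_linear points,
    i.e. those judged to precede the respiratory compensation point."""
    return linear_fit(vco2[:n_linear], ve[:n_linear])[0]

def ve_vco2_minimum(vco2, ve):
    """Lowest V'E/V'CO2 ratio attained during the test."""
    return min(v / c for v, c in zip(ve, vco2))

# Hypothetical averaged IET data (both in L/min); the last two points
# lie in the curvilinear region beyond the compensation point.
vco2 = [0.50, 0.75, 1.00, 1.25, 1.50, 1.75]
ve = [18.0, 25.0, 32.0, 39.0, 49.0, 60.0]

slope_linear = ve_vco2_slope(vco2, ve, n_linear=4)
slope_all = ve_vco2_slope(vco2, ve, n_linear=len(vco2))
ratio_min = ve_vco2_minimum(vco2, ve)
```

In this illustration, including the curvilinear points inflates the slope from 28 to about 33, whereas the minimum ratio (31.2) is obtained without any judgement about where the linear region ends.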
Validity
Both V′E–V′CO2 slope and V′E–V′CO2 minimum are typically elevated in most respiratory and cardiac diseases, reflective of an increased dead space and/or decreased PaCO2 set-point [3, 40, 43, 65, 66] (online supplementary material). In PAH patients, V′E–V′CO2 indices or their changes with interventions correlate significantly with pulmonary artery pressure and with increased pulmonary vascular resistance (PVR) or changes in PVR [65, 77, 78], and their profiles can detect right-to-left shunt (e.g. foramen ovale) [79]. In COPD, V′E–V′CO2 slope correlates with the degree of emphysema (r=0.77) as measured by computed tomography [80]. Reference values for V′E–V′CO2 minimum have been established [76].
Precision
Test–retest reproducibility is similar for V′E–V′CO2 slope and V′E–V′CO2 minimum in healthy adults (i.e. 95% CI ±2.3) [81], although better reproducibility has been reported for V′E–V′CO2 minimum and V′E–V′CO2 at θL than for V′E–V′CO2 slope [76]. V′E–V′CO2 slope is repeatable in PAH and cardiac patients (coefficient of variation ∼5%, range 1–11%) [28, 82] (table 2).
Prognostic information
V′E–V′CO2 slope is predictive of survival in idiopathic PAH [77, 83], cystic fibrosis [71] and IPF [37]. However, changes in V′E–V′CO2 indices after interventions are not as predictive of survival and clinical events as V′O2peak in idiopathic PAH [39, 77] (table 1).
Discrimination
V′E–V′CO2 indices can detect patients with PAH and right-to-left shunt [79]. V′E–V′CO2 at θL increases in proportion to disease severity in PAH patients [40, 84]. In a small sample of moderate and severe COPD patients, V′E–V′CO2 at θL discriminated those with PAH from those without [85]. A high V′E–V′CO2 slope also identified PAH in patients with IPF [86].
Clinically meaningful difference
No MCID has been established for any of the V′E–V′CO2 indices. In PAH, values for V′E–V′CO2 at θL <45 have been proposed as a target for treatment [43].
Evaluation
V′E–V′CO2 slope did not change significantly after exercise training in PAH patients in two studies [48, 49] (table 3). V′E–V′CO2 slope was reduced significantly after surgical (thromboendarterectomy and lung transplant) intervention and the decreases were strongly related to the reduction in PVR post-surgery [87, 88]. V′E–V′CO2 indices respond to pharmacological interventions: with phosphodiesterase-5 inhibitors [89], endothelin receptor antagonists [67] and prostanoids [90–92] in idiopathic PAH; with bosentan and sildenafil in adult patients with Eisenmenger syndrome [93]; and with beraprost in patients with thromboembolic pulmonary hypertension [94]. Responses were generally between 3 and 6 units or 10–20%. However, there was no effect of nitric oxide inhalation [95]. A study of ghrelin in underweight COPD patients showed an average reduction in V′E–V′CO2 slope of ∼4 units [96]. In cystic fibrosis, oxygen therapy reduced the minimum V′E–V′CO2 [58].
High-intensity constant work-rate exercise tests
These tests are widely used to assess changes in exercise tolerance following interventions and the associated responses of key physiological and perceptual variables. The high-intensity CWRET has grown in popularity, particularly in COPD, because it can characterise exercise tolerance in a single exercise bout.
Procedure
As for the IET, CWRETs are typically performed on the treadmill [25, 97] or cycle ergometer [26, 62, 98–104], although alternative modes of exercise better suited to the patient's limitations can be used. As for the IET, CWRET implementation is automated and it is therefore less operator-dependent if the criteria for test termination are standardised (table 1). An IET must first be completed (ideally on a separate day or at least allowing a sufficient rest period) for an appropriate WR for the CWRET to be estimated [105]. The most straightforward approach is to assign this WR based on a fixed percentage of IET WRpeak, although this approach has some limitations [106, 107] (online supplementary material).
Continuous monitoring of SpO2 and ECG, with verbal encouragement throughout, is recommended [3]. Concomitant gas exchange measurement increases the value of the test by providing insight into putative mediators of the exercise limitation. With cycle ergometer tests, it is important to define a priori the criteria to determine intolerance, e.g. the maximum duration for which the patient may pedal below a minimum pedalling frequency despite encouragement. Typical criteria for termination include ≤10 s sustained below a lower-bound target frequency despite verbal encouragement (typically 60 rpm, but bounds of 50–70 rpm for COPD patients are acceptable). The point at which the patient is unable to regain the target frequency despite encouragement defines tLIM.
We recommend limiting the target duration for the pre-intervention CWRET to between 180 s and 480 s. There are some important physiological, statistical and practical reasons necessitating this relatively narrow range [26, 102, 105, 107] (online supplementary material). Exercise durations within this range are typically limited by the integrated functioning of cardiopulmonary and neuromuscular systems, rather than boredom or discomfort [3, 4, 18]. Physiologically, the relationship between WR and tLIM is not linear; therefore the magnitude of intervention-related tLIM change is better interpreted from a common baseline duration. Statistically, the variability of pre-intervention tLIM among subjects in published randomised controlled trials is typically high (coefficient of variation 20–60%, median 42%) (figures 1–3). The minimisation of interindividual variability [105, 107] can therefore economise the sample size needed to detect an effect. Practically, long exercise tests are undesirable for both the patient and the testing facility. Furthermore, elimination of long baseline tLIM values will reduce the occurrence of very long post-intervention tests (most frequently seen after muscle training), requiring premature termination by the investigators and therefore invalidating interpretation of the magnitude of intervention-related tLIM change using parametric statistics [26, 105, 107].
a) Absolute and b) relative changes in time to the limit of tolerance (tLIM) in constant work-rate exercise test studies using a cycle ergometer with nonpharmacological interventions in chronic obstructive pulmonary disease patients. FEV1: forced expiratory volume in 1 s; WR: work-rate; NIV: noninvasive ventilation; MCID: minimum clinically important difference. #: suggested mean MCID thresholds a) 105 s (lower limit of 95% CI 60 s); b) 33% (lower limit of 95% CI 22%) [102].
a) Absolute and b) percentage changes in time to the limit of tolerance (tLIM) in constant work-rate exercise test studies using a cycle ergometer in relation to placebo with short-acting β2-adrenoceptor agonists (SABA), short-acting antimuscarinics (SAMA) and their combination in chronic obstructive pulmonary disease patients. Descriptive data for individual studies are shown only once. FEV1: forced expiratory volume in 1 s; WR: work-rate; MCID: minimum clinically important difference. #: suggested mean MCID thresholds a) 105 s, b) 33%; ¶: suggested lower limit of 95% CI MCID thresholds a) 60 s, b) 22% [102].
a) Absolute and b) percentage changes in time to the limit of tolerance (tLIM) in constant work-rate exercise test studies using a cycle ergometer in relation to placebo in studies with long-acting β2-adrenoceptor agonists (LABA), long-acting antimuscarinics (LAMA), inhaled corticosteroids (ICS) and their combination in chronic obstructive pulmonary disease patients. Descriptive data for individual studies are shown only once. FEV1: forced expiratory volume in 1 s; WR: work-rate; MCID: minimum clinically important difference. #: suggested mean MCID thresholds a) 105 s, b) 33%; ¶: suggested lower limit of 95% CI MCID thresholds a) 60 s, b) 22% [102].
Typically, CWRET WRs are selected to be 75–80% of IET WRpeak. Pooled data from one retrospective [108] and two prospective [33, 109] studies (total n=2608) suggest that 75–80% WRpeak results in the target tLIM being achieved in ∼57% of COPD patients (∼25% <180 s and ∼18% >480 s) (figures 4 and 5). For patients not achieving a baseline tLIM within 180–480 s, a practical strategy is to adjust the WR (i.e. by ±5 W) to bring tLIM within the desired range, and to repeat the test [102]. Measurements of the curvature constant of the power–duration relationship in COPD patients [26, 105, 110] support this approach, and suggest that 5-W adjustments (up or down) are sufficient to reset an initial tLIM between ∼120 and ∼840 s to within the recommended range in an additional 30% of the patients [33, 108, 109]. Additionally, 5-W adjustments are within the technical capacity of most cycle ergometers.
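The effect of a 5-W adjustment on tLIM can be illustrated with the two-parameter hyperbolic form of the power–duration relationship, tLIM = W′/(WR − CP), where CP is the critical power (asymptote) and W′ the curvature constant. Using this particular model is an assumption made for illustration, and the parameter values and function names below are hypothetical rather than taken from the studies cited.

```python
def tlim_seconds(wr_w, critical_power_w, w_prime_j):
    """Predicted tolerance time (s) from the hyperbolic power-duration model."""
    if wr_w <= critical_power_w:
        raise ValueError("work rate must exceed critical power")
    return w_prime_j / (wr_w - critical_power_w)

def adjust_wr(wr_w, tlim_s, low_s=180.0, high_s=480.0, step_w=5.0):
    """Work rate for a repeat baseline test: one step up if the first test
    was too long, one step down if too short, unchanged if within range."""
    if tlim_s > high_s:
        return wr_w + step_w
    if tlim_s < low_s:
        return wr_w - step_w
    return wr_w

# Hypothetical patient: CP = 40 W, W' = 6000 J, first CWRET at 50 W.
cp, w_prime = 40.0, 6000.0
first_wr = 50.0
first_tlim = tlim_seconds(first_wr, cp, w_prime)      # 600 s, above 480 s
repeat_wr = adjust_wr(first_wr, first_tlim)           # 55 W
repeat_tlim = tlim_seconds(repeat_wr, cp, w_prime)    # 400 s, within range
```

Because WR − CP is small in many patients, a 5-W change represents a large relative change in the denominator, which is why such a modest adjustment can move tLIM by several minutes.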
Endurance time (time to the limit of tolerance (tLIM)) variability in response to cycle ergometer constant work-rate exercise test performed at a) 75% and b) 80% of peak work-rate in moderate-to-severe chronic obstructive pulmonary disease patients. a) n=463. Reproduced from [33] with permission from the publisher. b) n=92. Reproduced with permission [109].
Endurance time (time to the limit of tolerance (tLIM)) variability in response to cycle ergometer constant work-rate exercise tests performed at 75% of peak work-rate in a large sample (n=2053) of chronic obstructive pulmonary disease patients. a) Males; b) females. Reproduced and modified from [108] with permission from the publisher.
Safety
There have been no reports of adverse events while performing CWRETs. These tests are usually performed after an IET, which effectively serves as a prescreening for adverse reactions to exercise. While there is no specific information about the threshold at which arterial desaturation becomes hazardous, it has been recommended that the test should be terminated if SpO2 falls below 80% [3, 17, 18].
Tolerance time
Tolerance time for a CWRET is the duration from the WR imposition to the point of task failure. Due to the curvature of the WR–tLIM relationship, tLIM (conventionally expressed in seconds or minutes) is a particularly sensitive index of interventional change in several respiratory diseases.
Validity
In COPD patients the increase in tLIM following exercise training is likely to reflect improvements in the aerobic capacity of the trained muscles [111, 112] and a delayed onset of metabolic acidosis [98, 111, 113], as well as slowed increases in operating lung volume and breathlessness [114, 115]. Interventions designed to improve ventilatory function (i.e. bronchodilators) also lead to an improved tLIM, probably because of the latter mechanism [116], although respiratory-muscle unloading may also improve oxygen supply to the lower-limb locomotor muscles [117].
Interventions able to improve tLIM are associated with increased health-related quality of life (HRQoL) [25, 101, 102] and physical activity levels [118]. tLIM has been shown to perform consistently well as a sensitive outcome variable among different laboratories in multicentre clinical trials, provided that quality-control procedures are implemented [33] (table 1).
Precision
Repeatability for tLIM was addressed in a multicentre trial of 463 COPD patients who performed two CWRETs 5 days apart [33]. There was a small but significant (p<0.001) increase of 34 s (coefficient of variation ∼24%) in tLIM in the second CWRET, suggesting a small ordering effect. Nonetheless, the intraclass correlation coefficient (ICC) was 0.84 (95% CI 0.81–0.87) [33] (table 2). A smaller (∼12 s) nonsignificant ordering effect and similar ICC were also found in two studies of 60 [106] and 25 [102] COPD patients, in which CWRET was repeated on the same day or the next day, respectively, in single laboratories with experience in CWRET [102, 106]. The low mean difference and high ICC for repeated constant work-rate exercise tests suggest good to very good agreement for tests performed under identical conditions.
Prognostic information
To date, there are no studies establishing potential relationships between increase in tLIM and survival, healthcare costs or exacerbation rates (table 1).
Discrimination
Since one goal of the test design is to standardise tLIM and reduce intersubject variability, by design the CWRET tLIM is not discriminative. tLIM is designed to assess the efficacy of interventions.
Clinically meaningful difference
There is limited information on MCID for tLIM after interventions. In COPD, 100-s (95% CI 60–140 s) or 33% (95% CI 18–48%) change from baseline using cycle ergometry related well with positive patient-reported outcomes after pulmonary rehabilitation [102, 119]. The use of MCID as a percentage appears to be less dependent on baseline tLIM when 75% and 85% of WRpeak were compared [102] (table 2). However, data from bronchodilator studies suggest that improvements in lung function that seem to be clinically important are often associated with increases of tLIM <100 s (figures 2 and 3).
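The difference between the absolute and the percentage criterion can be made concrete with a small sketch that applies the 100-s and 33% thresholds quoted above to individual results. Treating the thresholds as inclusive (≥) is an assumption, and the function name and patient values are hypothetical.

```python
def meets_mcid(baseline_s, post_s, abs_threshold_s=100.0, rel_threshold_pct=33.0):
    """Compare a change in tLIM with the absolute and relative MCID thresholds.

    Thresholds default to the values suggested for cycle ergometry in COPD;
    using >= rather than > is an assumption of this sketch.
    """
    change_s = post_s - baseline_s
    change_pct = 100.0 * change_s / baseline_s
    return {
        "change_s": change_s,
        "change_pct": change_pct,
        "absolute": change_s >= abs_threshold_s,
        "relative": change_pct >= rel_threshold_pct,
    }

# Two hypothetical patients with different baseline durations.
long_baseline = meets_mcid(baseline_s=400.0, post_s=510.0)   # +110 s, +27.5%
short_baseline = meets_mcid(baseline_s=200.0, post_s=280.0)  # +80 s, +40%
```

The first patient exceeds only the absolute threshold and the second only the relative one, which illustrates why the baseline duration needs to be standardised before either criterion is applied.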
Evaluation
In COPD, tLIM is responsive to rehabilitative interventions, as well as to interventions aimed at unloading the respiratory system, such as breathing heliox, oxygen therapy, noninvasive ventilatory support (figure 1) and lung volume reduction surgery [120]. Most nonpharmacological interventions in COPD produce clinically important improvements in tLIM (i.e. a 100-s or 33% improvement was reached in 30 (91%) out of 33 and 27 (82%) out of 33 investigations, respectively) (figure 1). tLIM is also responsive to rehabilitation in idiopathic PAH [121, 122] and IPF [123, 124]. In one observational study of 53 IPF patients, the effect size for tLIM after rehabilitation was larger than for V′O2peak, WRpeak, 6MWD and ISWT [123] (table 3). Oxygen therapy significantly increases tLIM by an average of 162 (95% CI 118–207) s in patients with COPD who desaturate during exercise [59]. Although there are very few studies on the effects of oxygen therapy on tLIM, in two studies a substantial improvement in tLIM was found [125]. Improvements in tLIM with oxygen therapy have also been reported in cystic fibrosis [58].
High-intensity CWRETs have been used to evaluate responses to short- and long-acting bronchodilators in COPD (figures 2 and 3, table 4). Only 14 (54%) out of 26 studies of long-acting bronchodilators and three (27%) out of 11 studies of short-acting bronchodilators showed average increases in tLIM >100 s. This proportion decreased to eight (31%) out of 26 for long-acting bronchodilators if the criterion of 33% improvement was used. An effect >60 s (the one-tailed lower confidence limit of the MCID calculated from [102]) was observed in 22 (84%) out of 26 of the studies presented in figures 2 and 3. An effect >22% (the one-tailed lower confidence limit of MCID calculated from [102]) was seen in 20 (77%) out of 26 of the studies (figures 2 and 3).
“Isotime” responses
Isotime responses are measurements of variables made at specific time points, typically during a CWRET. The analysis of responses at a standardised time pre- and post-intervention has proven valuable in the physiological interpretation of tLIM changes. Importantly, unlike tLIM, isotime responses are effort-independent. They include V′O2, V′CO2, V′E, inspiratory capacity, breathing pattern, dyspnoea, leg effort, muscle fatigue, cardiac output, heart rate and arterial lactate concentration ([La−]a).
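A minimal sketch of an isotime comparison is given below. It assumes that isotime is taken as the end of the shorter of the pre- and post-intervention tests (one common convention; the text only specifies "a standardised time") and that values between sampling points can be linearly interpolated. The function names and example data are hypothetical.

```python
def value_at(samples, t):
    """Linearly interpolate a series of (time_s, value) pairs at time t."""
    samples = sorted(samples)
    if not samples[0][0] <= t <= samples[-1][0]:
        raise ValueError("time is outside the sampled range")
    for (t0, v0), (t1, v1) in zip(samples, samples[1:]):
        if t0 <= t <= t1:
            if t1 == t0:
                return v0
            return v0 + (v1 - v0) * (t - t0) / (t1 - t0)
    return samples[0][1]  # series with a single sample


def isotime_change(pre, post):
    """Return (isotime, post minus pre value at isotime).

    Assumption: isotime is the end of the shorter of the two tests.
    """
    isotime = min(max(t for t, _ in pre), max(t for t, _ in post))
    return isotime, value_at(post, isotime) - value_at(pre, isotime)


# Hypothetical V'E (L/min) sampled every 2 min during two CWRETs.
pre = [(0, 10.0), (120, 30.0), (240, 45.0)]
post = [(0, 10.0), (120, 26.0), (240, 38.0), (360, 44.0)]
isotime_s, delta_ve = isotime_change(pre, post)
```

Here the post-intervention test lasts longer, so the comparison is made at the end of the pre-intervention test, and a negative difference indicates a lower ventilation at the same exercise time.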
Isotime responses other than inspiratory capacity and dyspnoea
Validity
The physiological meaning of isotime measurements depends on the variable and intervention, but in general terms, isotime reductions in V′O2, V′CO2, V′E, leg effort, muscle fatigue and [La−]a are considered markers of increased aerobic and decreased anaerobic energy transfer in the working muscles [98, 111, 113, 126]. A decrease in isotime V′E may also result from a reduced dead space volume to tidal volume ratio, from reduced metabolic and acid–base ventilatory demands of the task, or from the patient adopting a more efficient breathing pattern [45, 101, 114, 115, 118, 127].
Precision
In a sample of 463 COPD patients in a multicentre trial, within-subject coefficient of variation for isotime V′O2 was 8.2% and for isotime V′E was 7.4% [33].
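For readers unfamiliar with the statistic, the sketch below shows one common way to obtain a within-subject coefficient of variation from duplicate measurements: the within-subject standard deviation, estimated from paired differences, divided by the overall mean. Whether the cited trial used this exact method is an assumption, and the function name and example values are hypothetical.

```python
from math import sqrt


def within_subject_cv(first, second):
    """Within-subject coefficient of variation (%) from paired measurements.

    Assumption: within-subject SD estimated as sqrt(sum(d^2) / (2n)),
    where d is the difference between the two measurements per subject.
    """
    n = len(first)
    if n != len(second) or n == 0:
        raise ValueError("need paired measurements")
    sd_within = sqrt(sum((a - b) ** 2 for a, b in zip(first, second)) / (2 * n))
    grand_mean = (sum(first) + sum(second)) / (2 * n)
    return 100.0 * sd_within / grand_mean


# Hypothetical isotime V'O2 (L/min) for three patients on two test days.
cv_pct = within_subject_cv([1.0, 1.2, 0.9], [1.1, 1.1, 1.0])
```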
Prognostic information
No specific information has been obtained on the ability of isotime measurements to predict clinical outcomes.
Discrimination
No information is available on the ability of isotime measurements to stratify patients with respiratory diseases.
Clinically meaningful difference
MCID is not established for isotime measurements.
Evaluation
Isotime V′O2, V′CO2 and V′E, as well as cardiac output and [La−]a and other variables are responsive to a number of interventions in COPD [25, 62, 98, 99, 120, 128–131] and in some other conditions such as cystic fibrosis [132] and PAH [122, 133]. Isotime comparisons are also responsive outcomes of interventions directed to train arm muscles [134–136].
Inspiratory capacity
In contrast to healthy individuals [137], in patients with expiratory airflow limitation, end-expiratory lung volume increases during exercise as expiratory time becomes reduced with increasing breathing frequency, a phenomenon called dynamic hyperinflation [137].
Inspiratory capacity is used increasingly as an outcome measure in clinical trials to test the efficacy of bronchodilators and other interventions [62, 99, 101, 114, 115, 118, 120, 129, 130].
Procedure
Subjects are required to take a deep inspiration, after normal expiration, at predetermined intervals during exercise (typically every 2 min). Dynamic hyperinflation can also be evaluated as the difference between inspiratory capacity at rest and during exercise [33, 138, 139].
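The calculation described above can be sketched as follows, taking dynamic hyperinflation as resting inspiratory capacity minus exercise inspiratory capacity at each manoeuvre. The function name, the sign convention (positive values indicate a fall in inspiratory capacity from rest) and the example values are assumptions for illustration.

```python
def dynamic_hyperinflation(ic_rest_l, ic_exercise):
    """Return (time_min, fall in IC from rest in litres) for each manoeuvre.

    ic_exercise is a list of (time_min, inspiratory capacity in litres).
    Sign convention (an assumption): positive = IC lower than at rest.
    """
    return [(t, round(ic_rest_l - ic, 3)) for t, ic in ic_exercise]


# Hypothetical manoeuvres every 2 min with a resting IC of 2.50 L.
dh = dynamic_hyperinflation(2.50, [(2, 2.40), (4, 2.25), (6, 2.10)])
```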
Validity
Peak negative oesophageal pressure during exercise inspiratory capacity manoeuvres is similar to that at rest, despite changes in inspiratory capacity [140]. This suggests that changes in inspiratory capacity during exercise are due to dynamic hyperinflation and not to reduced respiratory muscle capacity.
While exaggerated dyspnoea and exercise intolerance are multifactorial in COPD, dynamic hyperinflation is believed to be an important contributor to each [141]. Inspiratory capacity may be more sensitive than other lung function variables to changes in expiratory airflow limitation [62, 99, 142, 143]. Dynamic hyperinflation correlates with carbon dioxide retention and hypoxaemia during exercise [144, 145] and daily living activity in COPD patients [146].
It has been shown that inspiratory capacity can be reliably determined at isotime and peak exercise in multicentre clinical trials [33, 101, 118, 127].
Precision
Reported within-subject coefficient of variation is 12–20% for exercise inspiratory capacity, but precision is less for change in inspiratory capacity (58–88%) [33, 138, 147]. In one study, variability was found to be larger for manoeuvres performed at the beginning and close to the end of exercise [138].
Prognostic information
Dynamic hyperinflation during exercise predicts mortality [148].
Discrimination
On average, there is a tendency to greater dynamic hyperinflation with COPD severity [149]. However, there is significant overlap among groups [149].
Clinically meaningful difference
Changes in inspiratory capacity >0.14 L (or 4.5% predicted) are beyond the 95% confidence interval [150] and have been consistently associated with significantly increased tLIM in moderate-to-severe COPD patients (figure 6).
Relationships between the changes in a) endurance time (time to the limit of tolerance (tLIM)); b) dyspnoea at isotime; and c) isotime inspiratory capacity (IC) after the administration of bronchodilators. Mean minimal clinically important difference (MCID) thresholds of a) 105 s; b) 1 point; and c) 0.2 L. #: lower limit of 95% CI 60 s. *: p<0.05.
Dyspnoea
Dyspnoea is a characteristic symptom of most respiratory diseases. Two different approaches are used to measure dyspnoea: ratings based on daily living activities and ratings during specific exercise tasks. The information obtained by these two methods is different [151, 152]. As the task force focused on exercise testing, the latter are discussed here.
Procedure
Dyspnoea is measured using either the 10-point Borg scale (CR-10) [153] or a 100-mm visual analogue scale (VAS) [154, 155]. The CR-10 scale is derived from the original Borg 6–20-point perceived exertion scale (corresponding to a heart rate range of 60–200 bpm) [155, 156], modified to an open-ended 10-point scale including written indicators of severity to anchor specific numbers on the scale [153]. This was updated in 2010 with categories similar to the original 1982 scale, but having 19 points [155, 156]. In addition, a “modified CR-10 Borg scale”, which is not open-ended, is frequently used [155, 156]. Readers should be aware that all of them are described (and frequently misquoted) as Borg or CR-10 scales in the literature [155].
Before exercise testing, subjects must be familiarised with the CR-10 or VAS, preferably by means of written instructions [154, 155]. Dyspnoea should be measured at rest when the patient is ready to exercise, at least every 2 min during the test and at end-exercise [154, 155]. Symptom ratings should precede inspiratory capacity manoeuvres by at least five breaths to avoid interference [147]. End-exercise dyspnoea is quite variable among subjects [110, 152, 157] and frequently patients rate it similarly after interventions [26, 62, 99, 101, 118, 127, 131]; this reflects that, although exercise tolerance may have increased, exercise is often terminated at a similar intensity of dyspnoea. Thus, a more robust approach for comparing the effect of interventions on dyspnoea during IETs or CWRETs is to compare dyspnoea at isotime, or standardised for V′O2 [158] or V′E [158, 159]. Another method is comparing “dyspnoea slopes”, i.e. the rating of dyspnoea as a function of time or WR during an IET [158].
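A "dyspnoea slope" can be illustrated as the least-squares slope of dyspnoea ratings against exercise time. The sketch below assumes an ordinary linear fit, which is only one way of summarising the relationship; the function name and example ratings are hypothetical.

```python
def dyspnoea_slope(times_min, ratings):
    """Least-squares slope of dyspnoea rating versus time (units per min).

    Assumption: a simple linear fit; the cited work may model the
    rating-time relationship differently.
    """
    n = len(times_min)
    if n != len(ratings) or n < 2:
        raise ValueError("need at least two paired observations")
    mean_t = sum(times_min) / n
    mean_r = sum(ratings) / n
    sxx = sum((t - mean_t) ** 2 for t in times_min)
    if sxx == 0:
        raise ValueError("time points must not all be identical")
    sxy = sum((t - mean_t) * (r - mean_r) for t, r in zip(times_min, ratings))
    return sxy / sxx


# Hypothetical CR-10 ratings taken every 2 min during an IET.
slope = dyspnoea_slope([0, 2, 4, 6], [0.5, 1, 2, 3])
```

A lower slope after an intervention would indicate that dyspnoea rises more slowly for the same exercise time, even if the end-exercise rating is unchanged.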
Validity
The majority of respiratory patients experience dyspnoea during exercise [66, 99, 157, 160]. CR-10 or VAS dyspnoea ratings are subjective and therefore of most value when change within an individual is assessed. However, they are also used for intersubject comparisons [154, 155].
Precision
Dyspnoea ratings are reproducible in the short- and long-term [33, 154, 155, 161]. The ICCs for isotime and peak CR-10 are 0.79 and 0.81, respectively [33].
Prognostic information
To our knowledge, neither isotime nor end-exercise dyspnoea scores have been specifically studied as predictive variables.
Discrimination
Dyspnoea ratings are highly variable for the same effort among subjects with similar spirometric impairment [33, 154, 155, 157, 161] and, probably because of that, no significant correlation between peak exercise dyspnoea and spirometric severity has been detected in COPD patients [162].
Clinically meaningful difference
Using a distribution-based approach, differences of two points in the CR-10 and 10–20 mm for the VAS have been suggested as clinically relevant thresholds [154, 155]. However, some propose that since the CR-10 scale is not strictly linear (yet is designed with ratio properties), smaller changes of the order of 1.0 can be considered relevant for less intensive interventions (i.e. acute response to oxygen therapy or bronchodilator therapy) [163] (figure 6).
Evaluation
Isotime dyspnoea, as measured by the CR-10 and VAS scales, is sufficiently sensitive to evaluate therapeutic interventions in COPD patients [45, 62, 99, 101, 118, 120, 127, 129, 131]. However, the variability of improvements in patient-reported dyspnoea with bronchodilator therapy during CWRETs is probably due to measurement variability in this outcome as well as the modest numbers of patients in several of these studies [156]. Oxygen therapy significantly reduces dyspnoea (CR-10) by a pooled mean of −1.15 (95% CI −1.65–−0.65) in COPD patients who desaturated during exercise [128], and has a statistically significant, but clinically very small, effect in mildly or nonhypoxaemic COPD patients who are dyspnoeic at rest, i.e. −0.37 (95% CI −0.50–−0.24) [164]. In patients with ILD, also dyspnoeic at rest, the effects of ambulatory oxygen therapy on dyspnoea (CR-10) in hypoxaemic [165] and nonhypoxaemic subjects parallel those described for COPD [166].
Field tests
These simple exercise tests require less technical equipment than laboratory-based tests; therefore, they are cheaper, but at the expense of obtaining less physiological information. However, with recently available portable (although costly) equipment, it is now possible to perform field tests with full cardiopulmonary physiological monitoring. The most common field tests are the 6MWT, the ISWT and ESWT. Their metabolic (i.e. V′O2) profiles differ considerably (figure 7), highlighting the differences between them.
Oxygen uptake (V′O2) profile in three different exercise tests in eight chronic obstructive pulmonary disease patients. a) 6-min walk test; b) incremental shuttle walk test (ISWT); and c) incremental cycle ergometer exercise test (IET). Although the IET and ISWT involve different exercise modalities and active muscle masses, a similar peak V′O2 is attained. Reproduced and modified from [168] with permission from the publisher.
Safety
Field tests are generally safe. A specific report of complications associated with premature termination of the 6MWT indicated a 6% occurrence, with oxygen desaturation being the most common (∼5%), followed by “symptoms” (∼1%) and chest pain and tachycardia (<1%) [167]. Both 6MWT and ISWT are used in PAH patients with no severe problems being reported [66]. Nonetheless, the available evidence shows that in both COPD and ILD patients, 6MWT typically elicits V′O2 and heart rate responses as high as 85–90% of the cycle ergometer IET peak, and not infrequently, even greater values are achieved [168, 169]. This also seems to be the case in most PAH patients [170]. For ISWT, V′O2 and heart rate responses approach peak cycle ergometer IET values [168, 171] (figure 7). Therefore, there is no rationale for assuming that field testing carries a lower risk than maximal laboratory exercise tests. As such, the degree of monitoring typically undertaken for an IET (i.e. ECG monitoring (direct or telemetric) or pulse oximetry) would be expected to enhance safety in field tests (table 1).
The ESWT is usually performed after an ISWT, and so is likely to be safer than the ISWT because contraindications to exercise testing may have been identified in the prior test.
Reasons to stop the test, and contraindications to exercise, are identical to the IET recommendations [3, 18]. While there is no information about the risks of desaturation, it is recommended that tests be terminated if SpO2 falls to <80% [172].
Shuttle walk tests
Incremental shuttle walk test
The ISWT (table 1) is an externally paced incremental walking test [173]. The ISWT has been shown to be feasible in a wide variety of populations, including patients with COPD [174], ILD [175] and in children with cystic fibrosis [176].
Procedure
Subjects are required to walk around two markers 9 m apart (10 m course). Single audio cues (beeps) signal the time at which the subject is expected to turn at the marker. Walking speed is increased each minute. The ISWT has 12 levels (walking speeds) and therefore lasts a maximum of 12 min. No encouragement is given during the test: the only verbal cues provided refer to an impending increase in walking speed [173]. ISWT performance is usually defined as the distance achieved. It is recommended to measure SpO2 and heart rate [172, 177] (table 1).
Validity
Available data suggest that ISWT distance correlates well (r=0.66–0.88) with measured IET V′O2peak in a variety of respiratory conditions [172]. Furthermore, the metabolic response profile is comparable to IET values [168] (figure 7). ISWT distance has been shown to be correlated significantly with daily living activity (r=0.17–0.58) [32, 172], with quadriceps muscle strength (r=0.47) and with forced expiratory volume in 1 s (FEV1) and age in severe COPD, but not with muscle mass or body mass index [178]. ISWT distance accounted for 55% of the variability of St George's Respiratory Questionnaire total score and 53% of Short-Form (SF)-36 [179]. The ISWT has been used in a multicentre clinical trial [180].
Precision
On average, there is a small (20–25 m), but statistically significant, increase between the first two ISWTs performed on the same day or on different days [172, 181]. To ensure reproducible results when evaluating responses to interventions, the recommendation is that two ISWTs be performed at baseline and the greatest distance recorded.
Prognostic information
The ISWT is a significant predictor of survival and readmission in COPD patients [182, 183] (table 2).
Discrimination
There is no information about whether ISWT is discriminative in respiratory diseases. ISWT distance demonstrates a mild correlation with FEV1 [178]. In one study of COPD patients, walking <170 m on an ISWT was associated with greater mortality [183].
Clinically meaningful difference
The MCID value for the ISWT was set at 47 m [180, 184], with additional benefits reported at 79 m [184] (table 2).
Evaluation
In COPD the ISWT is usually sensitive to rehabilitative interventions (figure 8). In a recent meta-analysis of nine trials, a mean improvement of 38 m (95% CI 22–51 m) was found [73]. In COPD patients recovering from a stay in the intensive care unit, ISWT was found to improve by a mean of 64 m after rehabilitation [185].
Change in incremental shuttle walk test (ISWT) distance in a selection of studies using ISWT as an outcome following rehabilitation. MCID: minimal clinically important difference (47 m).
ISWT detected small improvements after oxygen therapy in COPD patients with exercise desaturation [186, 187] (table 3).
ISWT is sensitive to both short- and long-acting anticholinergic and β2-agonist bronchodilators in COPD in most [172, 177, 188–190], but not all [191] studies. The mechanisms of improvement have not been studied extensively, but are probably related to improvements in lung function [189, 190]. In non-cystic fibrosis bronchiectasis patients, treatment of exacerbations with 14 days of intravenous antibiotics significantly improved ISWT performance [192] (table 4).
Endurance shuttle walk test
The ESWT [193] is derived from the ISWT, in much the same way as the laboratory CWRET derives from the IET. Like the ISWT, it is externally paced and its intensity is tailored to the exercise tolerance of the individual patient (table 1).
Procedure
The ESWT uses the same course and auditory signal method as the ISWT; however, a constant walking cadence is maintained throughout the test. The ESWT starts with a 100-s “warm-up” at a slow pace [193], followed by the “exercise” phase at the prescribed speed (typically 80% of peak ISWT) calculated from a previous ISWT result [193]. Results are expressed in seconds or in metres.
As the considerations of the velocity–duration curve for the ESWT are essentially identical to those of the CWRET power–duration curve, interindividual variability of test duration is expected to be high, unless the duration is purposely standardised. Thus, coefficients of variation have been reported to range between 60% and 120% of the corresponding means [174, 194–198]. While no recommendation on how to adjust test duration has yet been proposed, it is reasonable to assume that the range of baseline test durations used for the CWRET is desirable (i.e. 180–480 s) (table 1).
Validity
There is little information on the mechanisms determining baseline ESWT duration. It is likely that the physiological determinants are analogous to those of CWRET (online supplementary material), although some evidence suggests that ventilatory limitation may be more prominent in walking than in cycle ergometry. In a small group of moderate COPD patients, dyspnoea was reported more frequently during the ESWT than for cycle ergometry CWRET (70% versus 41%, respectively) [174]. In COPD patients, ESWT duration moderately correlates with FEV1 (r=0.44, p<0.001), but not with measures of muscle mass or strength [178]. In one small study in older individuals, ESWT was found to be limited by a ceiling effect and could not be performed successfully in approximately half the patients [199]. The test has been used in multicentre clinical trials [198, 200].
Precision
In the original study defining the ESWT, a significant ordering effect between first and second ESWT attempts was observed (an increase of 59 s or 47 m). However, subsequent reports showed that differences between tests performed on different days were lower (e.g. 15 s) and nonsignificant [194, 201, 202]. In a recent multicentre clinical trial involving 255 moderate-to-severe COPD patients, the mean±sd differences between the first two ESWT performances (−7±72 s and −7±113 m for endurance time and distance, respectively) were not statistically significant [194] (table 2).
Prognostic information
There is no information about changes in ESWT having prognostic value for variables such as survival, exacerbations or healthcare burden.
Discrimination
Since one goal of the ESWT design is to standardise the initial duration to reduce its intersubject variability, it is anticipated that ESWT will not provide additional discriminative information to the ISWT needed to set the test speed.
Clinically meaningful difference
A few studies have addressed the MCID for change in ESWT in response to interventions in COPD. Employing an anchor-based approach, it was not possible to find a valid estimation of the MCID for ESWT following pulmonary rehabilitation [174]. For pharmacotherapies (salmeterol and ipratropium), an MCID of change in ESWT duration of 65 s was found (95% CI 45–85 s), equivalent to 95 m (95% CI 60–115 m). These ESWT changes represented 13–15% of baseline values. These figures were confirmed by the same authors in a multicentre trial including 255 patients in which changes of 56–61 s and 70–82 m in ESWT were perceived by the participants as relevant [194] (table 2).
Evaluation
ESWT is responsive to pulmonary rehabilitation in patients with COPD (the pooled mean of studies in figure 9 is ∼360 s), bronchiectasis [203] and IPF [175] (table 3). ESWT is also responsive to oxygen therapy in patients with COPD [204] and IPF [175] who desaturate during exercise.
Improvement in endurance shuttle walk test (ESWT) performance (time in seconds) in chronic obstructive pulmonary disease patients following pulmonary rehabilitation.
Experience with ESWT to evaluate response to bronchodilators is more limited than with CWRET (table 4). Significant changes are seen with short- and long-acting anticholinergics or salmeterol in COPD patients, with improvements between 70 s and 164 s [174, 195–198]. In contrast, the MCID for ESWT was not reached with ipratropium in one study of patients with mild COPD [205] and in the post hoc integrated data analysis of two studies with the fixed combination umeclidinium/vilanterol (55/22 μg) [198].
In idiopathic PAH patients, ESWT distance was not increased by sildenafil, but 6MWD did increase [206].
6-min walk test
The 6MWT is the result of the evolution of a previous test aimed to assess functional capacity by measuring distance walked in a controlled length of time. A 6-min duration was found to be the best compromise between variability and length, while remaining discriminative [207].
Procedure
The 6MWT measures the distance that an individual can walk on an indoor 30-m flat corridor for a 6-min period [172]. Tracks <15 m have been shown to reduce 6MWD [208]. Due to a distinct familiarisation effect [172, 177, 209, 210], a minimum of two tests should be performed, with at least 15 min of intervening rest, with the greatest distance in the two tests being recorded [172, 177]. 6MWD is expressed in metres or feet [172, 177]. Other variables such as minimum SpO2, peak heart rate and dyspnoea and fatigue ratings can be measured [172, 177].
Validity
In several studies of COPD, PAH and ILD patients, 6MWD correlates with IET V′O2peak and WRpeak [170, 211–215]. In the largest study to date (2906 COPD patients), the correlation between 6MWD and WRpeak was 0.67 [214]. 6MWD correlates (r=0.40–0.65) with FEV1 in COPD [213, 215–217]. In IPF, 6MWD does not appear to show significant correlation with either forced vital capacity or transfer factor of the lung for carbon monoxide [211, 218, 219] (table 1). In PAH, 6MWD correlates with resting cardiac output and PVR [212], New York Heart Association Functional Class [212] and HRQoL [220]. Similarly, 6MWD change following PAH treatment correlated with changes in cardiac index, PVR [221] and HRQoL [220]. However, it has proven very difficult to demonstrate any prognostic power of 6MWD change with therapy. In a meta-analysis of 22 PAH treatment studies, neither the change in haemodynamics nor 6MWD predicted subsequent clinical events [221]. In a pooled analysis of 10 PAH treatment studies submitted to the FDA (2404 patients), despite a significant improvement in 6MWD, this accounted for only 22.1% of the treatment effect; the authors concluded that it was, at best, a modest surrogate end-point for clinical events [222]. 6MWD also correlates with physical activity (r=0.40–0.85) [223–225] and with HRQoL and daily living dyspnoea (r=0.25–0.50) [213, 216, 219]. The 6MWT is a more sensitive test for identifying exercise-induced desaturation than the cycle ergometer IET [226]. It has been widely used in multicentre studies of respiratory patients [42, 172, 177, 219, 227, 228], and is the only test that has been used so far in IPF clinical trials [219, 228]. However, the 6MWT has a ceiling effect as the linear relationship between IET V′O2peak and 6MWD is lost in less impaired patients [229, 230].
Precision
When the 6MWT is performed twice, 6MWD is reproducible, with ICC >0.70, with a coefficient of variation range of 5–8% for COPD, ILD and cystic fibrosis [172, 177] (tables 1 and 2).
Prognostic information
Of any testing format, the 6MWT has the most comprehensive prognostic information available in most chronic respiratory diseases [172, 177] (table 1). Several of the variables measured, such as 6MWD [172, 177], desaturation in IPF [231, 232] and impaired heart rate recovery in IPF [231] and PAH [233] have been found to be predictive of survival. In COPD, 6MWD is also related to hospital admissions [234] (table 1).
Discrimination
In a large COPD cohort, 6MWD was related to the severity of obstruction, although there was considerable overlap between GOLD obstruction stages [227]. In addition, 6MWD can stratify the risk of mortality in COPD [235] and is part of the commonly recommended BODE (body mass index, airflow obstruction, dyspnoea, exercise capacity) index [1].
Clinically meaningful difference
Two recent documents issued by the ERS and American Thoracic Society concluded that the MCID lies between 25 and 33 m, independent of disease [172, 177] (table 2). Given the poor performance of 6MWD change in predicting clinical outcomes in PAH [43, 221, 222], the use of MCID may be of limited significance, and values of 6MWD associated with better prognosis (380 m as suggested in one study [236], or 440 m as suggested by the REVEAL (Registry to Evaluate Early and Long-term Pulmonary Arterial Hypertension Disease Management) registry [237]) can be better targets for assessing the effectiveness of treatment [43]. This might also be the case for other respiratory diseases if exercise is used as a surrogate of survival.
Evaluation
Nonpharmacological interventions
The 6MWT has been used frequently to assess the effects of pulmonary rehabilitation. According to a meta-analysis of rehabilitative interventions in COPD (including 38 trials), the mean effect of rehabilitation on 6MWD is 44 m (95% CI 33–55 m) comparing treatment and control groups [73]. In a review of rehabilitation after a COPD exacerbation (including six trials), the mean effect was 78 m (95% CI 12–143 m) comparing treatment and control groups [185]. Nonetheless, the limited comparative information available suggests that 6MWT is less responsive than cycle ergometer CWRET tLIM in assessing the effects of pulmonary rehabilitation in COPD [119]. Responses greater than the MCID were observed in 57% of the participants with CWRET tLIM, but only in 27% with 6MWD [119]. Other interventions in COPD for which 6MWD has been shown to be responsive are inspiratory muscle training [239], lung volume reduction surgery [240] and oxygen therapy [241], with improvements of 50–95 m in distance walked.
A meta-analysis of two trials comparing exercise training with control groups in IPF showed statistically significant improvement of 6MWD of 39 m (95% CI 15–62 m) [242], and a recent systematic review of rehabilitation in ILD reported a mean overall improvement of 44 m (95% CI 26–63 m) and 36 m (95% CI 16–55 m) in patients with IPF [50]. 6MWD was also shown to improve after rehabilitation in idiopathic and thromboembolic PAH patients [121, 243, 244]. In these studies, control groups usually deteriorated, resulting in differences between the rehabilitation and control groups of 80–110 m (table 3). In COPD, oxygen therapy improves 6MWD by 12–59 m in hypoxaemic patients. This difference is larger (17–109 m) when the control is walking with compressed air rather than breathing ambient air, because of the weight handicap [172]. In one retrospective ILD study, the 29 patients not on oxygen therapy prior to testing walked a mean distance of 81.2 m further using optimal ambulatory oxygen (p=0.01), while the 41 patients already on domiciliary oxygen walked a mean distance of 16.9 m further (p=0.02) [245]. In another retrospective study of 52 patients with ILD who desaturated on exercise, oxygen therapy increased 6MWD (∼30 m, p=0.01) [165].
Pharmacological interventions
The 6MWT has a low responsiveness for evaluating bronchodilator interventions in COPD: most studies report changes in 6MWD with bronchodilation that are below the MCID value. These range from 20 to 42 m with short-acting β2-agonists [60, 246], from 6 to 39 m with short-acting anticholinergics [60, 247], from 21 to 54 m with long-acting β2-agonists [239, 248, 249] and ∼10 m with long-acting anticholinergics [250].
In a systematic review of 26 randomised clinical trials in PAH (n=3519), pooled mean increase in 6MWD for prostacyclin analogues (nine trials) was 35 m (95% CI 17–53 m), for endothelin receptor antagonists (eight trials) 46 m (95% CI 38–54 m) and for phosphodiesterase-5 inhibitors (six trials) 34 m (95% CI 25–43 m) [251]. Pooled 6MWD increase for all three drug types (23 trials) was 38 m (95% CI 30–47 m) [251]. In one small study, 6MWD was found to be more responsive than ESWT distance or CWRET tLIM in PAH patients taking sildenafil [206].
The pooled analysis of the studies PIPF-004 and PIPF-006 showed that after 72 weeks of treatment with pirfenidone there was a statistically significant difference in 6MWD between treatment and control groups of 24 m (p<0.001), just below the MCID [228].
Conclusion
In respiratory patients, exercise testing is useful in the clinical and research setting to assess the effects of interventions. It also allows appraisal of the degree and mechanisms of impairment, and it is a strong independent prognostic factor. Several methods for evaluating exercise capacity are available. The severity and cause of exercise intolerance are best assessed by conducting standardised laboratory exercise testing in which detailed physiological measurements are made while patients perform cycle ergometry or treadmill walking. Protocols can be either constant (“endurance”) or incremental. Simpler tests are also used, although the physiological information gathered is more limited: the 6MWT is relatively simple and has been used extensively; the ISWT and ESWT are better standardised and have also been used in several clinical trials. In COPD, endurance tests (i.e. CWRET and ESWT) are more responsive to interventions, both pharmacological and nonpharmacological, than incremental tests (i.e. IET and ISWT) and the 6MWT. The cycle ergometer CWRET has been used more extensively than the ESWT. It has the advantages that 1) the WR is precisely quantified; and 2) several physiological variables can be easily measured (e.g. inspiratory capacity and isotime responses), which allows elucidation and targeting of putative mechanisms of improvement or deterioration and which are responsive to interventions. Careful standardisation is necessary to reduce interindividual variability and economise sample size in clinical trials.
While a majority of bronchodilator studies show a relevant effect in enhancing the exercise capacity of COPD patients, a clinically relevant effect is not always found. Some of the inconsistency may be explained by differences in the mode and duration of action of bronchodilators; however, considerable variations may be due to inherent differences in study design or patients studied. In particular, the best method of assessing exercise tolerance requires further investigation before those bronchodilators which consistently improve exercise endurance can be fully appreciated.
The application of exercise testing to PAH has been dominated by the 6MWT, which has been used as surrogate measurement to obtain registration for most medications now used in that condition; however, the accumulated evidence suggests that change in 6MWD has little or no prognostic power and targeted absolute values are potentially better surrogates of changes in the natural history of the disease. Both V′O2peak and indices of the V′E–V′CO2 relationship are also useful targets for treatments. Limited evidence suggests that V′O2peak may be a better predictor of the time to clinical deterioration.
Less information about exercise testing in other chronic respiratory conditions is available, but being a strong predictor of survival and a source of information to understand the mechanisms of exercise intolerance, exercise testing appears as relevant for these as it is for COPD and PAH.
Exercise testing is generally safe. The available evidence does not favour any particular format as safer, and standard precautions must be adopted in all of them.
Acknowledgements
Thanks to Thomy Tonia (Institute of Social and Preventive Medicine, University of Bern, Bern, Switzerland) for her methodological advice, Valérie Rebeaud (Scientific Activities Department, European Respiratory Society, Lausanne, Switzerland) for her logistic assistance and Vasileios Andrianopoulos (Department of Research and Education, CIRO+, Centre of Expertise for Chronic Organ Failure, Eindhoven, the Netherlands) for his kind assistance with figure 5.
Footnotes
This article has supplementary material available from erj.ersjournals.com
Conflict of interest: Disclosures can be found alongside the online version of this article at erj.ersjournals.com
- Received May 12, 2015.
- Accepted September 14, 2015.
- Copyright ©ERS 2016