Lung function decline in COPD trials: bias from regression to the mean

S. Suissa¹,²,³

¹McGill Pharmacoepidemiology Research Unit, Jewish General Hospital, Depts of ²Epidemiology and Biostatistics, and ³Medicine, McGill University, Montreal, QC, Canada.

S. Suissa, Division of Clinical Epidemiology, Royal Victoria Hospital, 687 Pine Avenue West, Ross 4.29, Montreal, QC, Canada H3A 1A1. Fax: 1 5148431493. E-mail: samy.suissa{at}clinepi.mcgill.ca

The decline in lung function over time is a fundamental measure of disease progression among patients with chronic obstructive pulmonary disease (COPD). As a result, forced expiratory volume in one second (FEV₁) decline has been used as a central outcome measure for many randomised controlled trials evaluating whether pharmacological treatments could modify the natural history of COPD. These trials, mostly focussed on assessing the benefit of inhaled corticosteroids, and their meta-analyses highlighted the complexities of analysing data from repeated FEV₁ measurements over time, particularly in the context of studying COPD patients who discontinue follow-up early and in large numbers. This has led to contradictory results and divergent conclusions 1–3. Most recently, the paper from the Towards a Revolution in COPD Health (TORCH) trial reports that the yearly decline in FEV₁ was significantly slower with fluticasone, salmeterol or both compared with placebo 4.

A common misconception in the interpretation of these studies is that the effect of the study drug on FEV₁ decline is likely to be greater than the data suggest. The rationale is that more patients receiving placebo were discontinuing the study drugs and the patients who dropped out early had a steeper decline in lung function than those who remained. Consequently, it has generally been believed that the resulting analysis “actually minimises the differences observed in the rate of FEV₁ decline” 4.

To clarify this misconception, a fundamental aspect of the design and statistical analysis of these trials is addressed, namely bias from regression to the mean resulting from the absence of an authentic intent-to-treat approach. It is described based on the TORCH trial and illustrated using data from the Canadian Optimal randomised trial.

BIAS FROM REGRESSION TO THE MEAN

Most of the trials to date are not designed for a full intent-to-treat analysis of lung function decline. The TORCH trial, for example, only measured FEV₁ until the patients discontinued treatment. It involved 6,112 moderate-to-severe COPD patients randomised to one of four treatment groups (fluticasone, salmeterol, both or placebo) and followed for 3 yrs. However, the lung function analysis involved only 5,343 subjects with 26,539 measurements of post-bronchodilator FEV₁ made twice a year during follow-up to estimate the rate of FEV₁ decline between 6 months and 3 yrs after randomisation. Consequently, 10,133 measurements were missing, so that a pure intent-to-treat analysis was not possible. These missing measurements are unlikely to be a random sample of all 36,672 possible measurements that the study could have yielded, i.e. they are not missing at random, which can generate two forms of bias from regression to the mean 5.
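The counts quoted above can be checked with a few lines of arithmetic. The sketch below assumes six scheduled visits per patient (twice a year from 6 months to 3 yrs), which is an inference from the text that happens to reproduce the stated total; the variable names are illustrative only.

```python
# Figures quoted in the text for the TORCH lung function analysis.
randomised = 6112        # patients randomised
visits_per_patient = 6   # assumption: visits at 6, 12, 18, 24, 30 and 36 months
observed = 26539         # FEV1 measurements actually analysed

possible = randomised * visits_per_patient   # measurements a complete study would yield
missing = possible - observed                # measurements never obtained
share_missing = round(100 * missing / possible, 1)

print(possible, missing, share_missing)
```

Under that assumption, roughly a quarter of the measurements that a complete follow-up would have produced are absent from the analysis.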

The first form of the bias results from excluding some subjects altogether. Nearly 18% of patients allocated to placebo did not contribute a single FEV₁ value because they discontinued placebo before the first 6-month visit when the initial FEV₁ was measured. In contrast, only 9% of patients allocated to combination therapy did not make it to this first 6-month visit. It is generally accepted that these excluded patients would have had the worst FEV₁ values at the first visit had they been available to be measured. Thus, the slope of decline in the remaining subjects with better FEV₁ values at the first visit may have been affected by regression to the mean. This phenomenon is illustrated below.

The second form of this bias results from discontinuing the follow-up of patients who have the initial FEV₁ value measured but are missing some subsequent values; these measurements are also unlikely to be missing at random. In the TORCH study, the placebo patients who discontinued before the end of follow-up had a faster decline in FEV₁ (76 mL·yr⁻¹) than those completing the trial (54 mL·yr⁻¹) 4. Here again, these slopes of decline may have been affected by regression to the mean.

ILLUSTRATION

Data from the Canadian Optimal study, a three-arm randomised trial of 449 patients with moderate or severe COPD, are used to illustrate the bias 6. Measurements of post-bronchodilator FEV₁ were made at randomisation (visit 0) and at 4, 20, 36 and 52 weeks thereafter during the 1-yr follow-up (visits 1–4). The 322 subjects who had measurements of FEV₁ for all visits were used in this illustration. FEV₁ decline was measured as of visit 1 and the rate of FEV₁ decline was estimated in two ways: 1) as the difference in FEV₁ between visits 4 and 1; and 2) using measurements from all four visits with a mixed linear regression model accounting for within-subject correlation, with the slope standardised to a 1-yr time span.

The 322 patients had a mean FEV₁ of 1,131 mL at visit 1, with a decline in FEV₁ from visit 1 to 4 of 38.7 mL. There was a significant correlation of 0.33 between this decline and the FEV₁ measure at visit 1, which indicates that patients with the highest initial FEV₁ have the greatest decline, while the patients with the lowest initial FEV₁ have the smallest decline. Figure 1⇓ depicts this correlation by showing that patients in the highest quartile of initial FEV₁ values (>1,440 mL) have the largest decline (mean decline 119 mL), while the patients in the lowest quartile of initial FEV₁ values (<770 mL) in fact show an improvement of 32 mL.

Fig. 1—

Depiction of regression to the mean. Correlation between initial forced expiratory volume in one second (FEV₁; divided in quartiles) and the change in FEV₁ from visit 1 to 4 (mean change in each quartile). Patients in the highest quartile of initial FEV₁ values have the largest decline (119 mL) while the patients in the lowest quartile of initial FEV₁ values show an improvement of 32 mL. •: initial FEV₁ <770 mL; ▪: initial FEV₁ 770–1030 mL; ○: initial FEV₁ 1040–1440 mL; □: initial FEV₁ >1442 mL.
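The pattern described for figure 1 can be reproduced with simulated data in which every patient has exactly the same true decline. The sketch below is not the Canadian Optimal data: the true decline (40 mL), the between-patient spread (350 mL) and the measurement noise (150 mL) are assumed values chosen only for illustration, and the function names are hypothetical.

```python
import random

def simulate_decline(n=50000, true_decline=40.0, sd_between=350.0,
                     sd_noise=150.0, seed=1):
    """Return (visit1, decline) pairs; every patient truly loses the same amount."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        level = rng.gauss(1130.0, sd_between)          # stable underlying FEV1 (mL)
        visit1 = level + rng.gauss(0.0, sd_noise)      # noisy first measurement
        visit4 = level - true_decline + rng.gauss(0.0, sd_noise)
        pairs.append((visit1, visit1 - visit4))        # observed decline
    return pairs

def pearson(xs, ys):
    """Pearson correlation coefficient computed from first principles."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

pairs = sorted(simulate_decline())                     # ordered by visit 1 value
r = pearson([v for v, _ in pairs], [d for _, d in pairs])

# Mean observed decline within each quartile of the visit 1 value.
q = len(pairs) // 4
quartile_means = [sum(d for _, d in pairs[i * q:(i + 1) * q]) / q for i in range(4)]

print(round(r, 2), [round(m) for m in quartile_means])
```

In such a simulation the correlation between the initial value and the observed decline is positive, and the lowest quartile shows an apparent improvement, even though no patient differs from any other in true decline. The measurement noise shared by the initial value and the computed decline is sufficient to create the pattern.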

Table 1⇓ shows that the overall 1-yr decline in FEV₁ estimated using measurements from all four visits was 38.6 mL (p = 0.001). It shows that if the 18% of patients with the lowest FEV₁ at visit 1 (<700 mL) are excluded, in keeping with the hypothesis that patients in poorer health are more likely to leave the study, the 1-yr rate of FEV₁ decline among the remaining subjects becomes 52.2 mL, a clear overestimate of the decline. If only the 9% of patients with the lowest FEV₁ at visit 1 (<630 mL) are excluded, analogous to the lower exclusion rate with combination therapy in TORCH, the 1-yr rate of FEV₁ decline among the remaining subjects is 44.1 mL, still an overestimate but less marked. Conversely, the two groups of excluded subjects have a mean increase in FEV₁ of 40.7 and 52.5 mL, respectively.

Table 1—

Estimation of decline in forced expiratory volume in one second (FEV₁) among 322 subjects across four visits during a 1-yr follow-up from the Canadian Optimal trial
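The first form of the bias, differential exclusion of the patients with the lowest initial values, can be sketched with the same kind of simulation. Again, all parameters are assumptions for illustration (identical true decline of 40 mL for every patient, noise of 150 mL), not estimates from either trial, and the helper names are hypothetical.

```python
import random

def simulate_decline(n=50000, true_decline=40.0, sd_between=350.0,
                     sd_noise=150.0, seed=1):
    """Return (visit1, decline) pairs; every patient truly loses the same amount."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        level = rng.gauss(1130.0, sd_between)
        visit1 = level + rng.gauss(0.0, sd_noise)
        visit4 = level - true_decline + rng.gauss(0.0, sd_noise)
        pairs.append((visit1, visit1 - visit4))
    return pairs

def mean_decline_after_excluding(pairs, fraction):
    """Drop the given fraction of patients with the lowest visit 1 value."""
    ordered = sorted(pairs)                        # tuples sort by visit 1 first
    kept = ordered[int(fraction * len(ordered)):]
    return sum(d for _, d in kept) / len(kept)

pairs = simulate_decline()
overall = mean_decline_after_excluding(pairs, 0.0)    # nobody excluded
after_9 = mean_decline_after_excluding(pairs, 0.09)   # exclusion rate as with combination therapy
after_18 = mean_decline_after_excluding(pairs, 0.18)  # exclusion rate as with placebo

print(round(overall, 1), round(after_9, 1), round(after_18, 1))
```

With identical true declines in both hypothetical arms, the arm that loses 18% of its lowest-valued patients appears to decline faster than the arm that loses 9%, and both appear to decline faster than the full cohort.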

Table 2⇓ addresses the issue of subjects with a FEV₁ value at visit 1, but values possibly missing thereafter. Here, it is found that if the 20% of patients who have the lowest FEV₁ at visit 2 (<740 mL) are excluded, the 1-yr rate of FEV₁ decline among the remaining subjects becomes 50.3 mL. These excluded subjects had, between the first two visits, a decline in FEV₁ of 50.6 mL over the initial 4-month period but no significant decline over all four visits (1.7 mL).

Table 2—

Estimation of decline in forced expiratory volume in one second (FEV₁) among 322 subjects across four visits during a 1-yr follow-up from the Canadian Optimal trial
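The second form of the bias, early discontinuation by patients with a low value at a later visit, can be sketched in the same spirit. The visit times (4, 20, 36 and 52 weeks) follow the Canadian Optimal design described above; everything else is assumed for illustration. In particular, per-subject least-squares slopes are used here as a simple stand-in for the mixed linear regression model, and the true decline is set to the same 40 mL·yr⁻¹ for every simulated patient.

```python
import random

WEEKS = [4, 20, 36, 52]   # visit times for visits 1-4
TRUE_DECLINE = 40.0       # mL per yr, identical for all patients (assumption)

def yearly_decline(values):
    """Least-squares slope of FEV1 on time, expressed as mL lost per 52 weeks."""
    t_mean = sum(WEEKS) / len(WEEKS)
    v_mean = sum(values) / len(values)
    sxy = sum((t - t_mean) * (v - v_mean) for t, v in zip(WEEKS, values))
    sxx = sum((t - t_mean) ** 2 for t in WEEKS)
    return -52.0 * sxy / sxx

def simulate(n=50000, seed=2):
    """Four noisy FEV1 values per patient around a common linear decline."""
    rng = random.Random(seed)
    patients = []
    for _ in range(n):
        level = rng.gauss(1130.0, 350.0)
        patients.append([level - TRUE_DECLINE * (t - WEEKS[0]) / 52.0
                         + rng.gauss(0.0, 150.0) for t in WEEKS])
    return patients

def mean(xs):
    return sum(xs) / len(xs)

patients = sorted(simulate(), key=lambda v: v[1])   # order by visit 2 value
cut = int(0.20 * len(patients))
dropped, kept = patients[:cut], patients[cut:]      # lowest 20% at visit 2 leave

overall = mean([yearly_decline(v) for v in patients])
kept_decline = mean([yearly_decline(v) for v in kept])
dropped_decline = mean([yearly_decline(v) for v in dropped])
dropped_early = mean([v[0] - v[1] for v in dropped])   # visit 1 to visit 2 change

print(round(overall, 1), round(kept_decline, 1),
      round(dropped_decline, 1), round(dropped_early, 1))
```

In this sketch the patients selected for a low visit 2 value show a steep fall between the first two visits but little decline over all four visits, while the remaining patients show an inflated 1-yr decline, the same qualitative pattern as reported for table 2.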

CONCLUSION

The intent-to-treat principle for randomised controlled trials is fundamental to avoid bias. Selection bias can occur when patients are permitted to exit the study when they discontinue the study medication. A more severe form of this bias occurs when randomised patients are excluded from the analysis altogether.

The TORCH study performed an authentic intent-to-treat analysis for the outcome of mortality by assessing the survival status of all 6,112 patients randomised for the entire 3-yr follow-up period. However, the recent TORCH analysis of lung function was not a true intent-to-treat analysis since it was based on only the 5,343 of the 6,112 randomised patients who had FEV₁ values. Moreover, twice as many patients randomised to placebo were excluded altogether compared with the combination therapy group, while many patients discontinued early with no further FEV₁ measurements. This paper shows that such differential exclusion rates can introduce selection bias if the basis for exclusion is associated with the outcome, in this case decline in FEV₁.

The current illustration can help interpret the TORCH findings. Assume, for instance, that the FEV₁ decline in the TORCH study was 38.6 mL·yr⁻¹ equally for all treatment arms. The exclusion of 18% of placebo patients, most likely with the lowest initial FEV₁ value, can result in a mean FEV₁ decline of 52.2 mL among the remaining subjects, whereas the exclusion of 9% of the treated patients changes the decline to 44.1 mL. Thus, instead of comparing 38.6 with 38.6 mL·yr⁻¹, a study based on such incomplete groups would mistakenly compare 44.1 with 52.2 mL·yr⁻¹ and infer that the declines are different. In addition, among those with an initial FEV₁ value, the TORCH study placebo patients who discontinued before the end of follow-up had a decline of 76 mL·yr⁻¹ compared with 54 mL·yr⁻¹ for those completing the trial. In the present illustration, the patients most likely to discontinue had a 4-month decline in FEV₁ of 50.6 mL (extrapolated to 152 mL over 1 yr) compared with a 1-yr FEV₁ decline among the remaining subjects of 50.3 mL, both quite different from the true 1-yr decline of 38.6 mL. Incidentally, this second form of the bias may explain the different initial spikes in FEV₁ seen just after randomisation in these trials. Such bias is not eliminated by advanced techniques of data analysis, such as mixed linear regression, that account for within-subject correlation and variable numbers of FEV₁ measurements per subject.
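The arithmetic behind this thought experiment is short enough to spell out. The figures are those quoted above from the Canadian Optimal illustration; the variable names are illustrative only.

```python
# All values in mL, taken from the illustration in the text.
true_decline = 38.6        # assumed identical in every arm
placebo_observed = 52.2    # after excluding the lowest 18% at visit 1
treated_observed = 44.1    # after excluding the lowest 9% at visit 1

true_difference = round(true_decline - true_decline, 1)
spurious_difference = round(placebo_observed - treated_observed, 1)

# Linear extrapolation of the 4-month decline among likely dropouts to 1 yr.
early_decline_4_months = 50.6
annualised = round(early_decline_4_months * 12 / 4, 1)

print(true_difference, spurious_difference, annualised)
```

An apparent between-arm difference of about 8 mL·yr⁻¹ thus arises where the true difference is zero, and the annualised early decline of about 152 mL is roughly four times the true 1-yr decline.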

While the TORCH study may provide some evidence that pharmacological therapy could modify the decline in lung function, it could also reflect an artificial effect of the phenomenon of regression to the mean. Indeed, the decline was estimated after excluding patients, in a context where the patients with the best initial forced expiratory volume in one second value generally have the greatest decline in forced expiratory volume in one second and those with the poorest initial forced expiratory volume in one second value have the smallest decline, or even an increase, in forced expiratory volume in one second. In such a context, the impression that the effect of the study drug on forced expiratory volume in one second decline is greater than the data would suggest is a misconception; the effect of the drug could even be nil. The regression to the mean phenomenon can clearly lead to bias in randomised trials of chronic obstructive pulmonary disease treatment; thus, proper attention to the intent-to-treat principle becomes crucial to avoid this bias and provide valid data. This will hopefully be addressed in the upcoming Understanding Potential Long-term Impacts on Function with Tiotropium (UPLIFT) trial 7.

Support statement

S. Suissa is a Distinguished Investigator of the Canadian Institutes of Health Research (CIHR).

Statement of interest

A statement of interest for S. Suissa can be found at www.erj.ersjournals.com/misc/statements.shtml

Acknowledgments

The author would like to thank S. Aaron (University of Ottawa, Ottawa, ON, Canada) and the Canadian OPTIMAL trial group for kindly providing data used in this editorial and P. Ernst (McGill University, Montreal, QC, Canada) who provided crucial comments.

References

1. Highland KB, Strange C, Heffner JE. Long-term effects of inhaled corticosteroids on FEV₁ in patients with chronic obstructive pulmonary disease. A meta-analysis. Ann Intern Med 2003;138:969–973.
2. Sutherland ER, Allmers H, Ayas NT, Venn AJ, Martin RJ. Inhaled corticosteroids reduce the progression of airflow limitation in chronic obstructive pulmonary disease: a meta-analysis. Thorax 2003;58:937–941.
3. Soriano JB, Sin DD, Zhang X, et al. A pooled analysis of FEV₁ decline in COPD patients randomized to inhaled corticosteroids or placebo. Chest 2007;131:682–689.
4. Celli B, Thomas NE, Anderson JA, et al. Effect of pharmacotherapy on rate of decline of lung function in COPD: results from the TORCH study. Am J Respir Crit Care Med 2008;178:332–338.
5. Bland JM, Altman DG. Some examples of regression towards the mean. BMJ 1994;309:780.
6. Aaron SD, Vandemheen KL, Fergusson D, et al. Tiotropium in combination with placebo, salmeterol, or fluticasone-salmeterol for treatment of chronic obstructive pulmonary disease: a randomized trial. Ann Intern Med 2007;146:545–555.
7. Decramer M, Celli B, Tashkin DP, et al. Clinical trial design considerations in assessing long-term functional impacts of tiotropium in COPD: the UPLIFT trial. COPD 2004;1:303–312.