Copyright ©ERS Journals Ltd 2002
Variability and lack of predictive ability of asthma end-points in clinical trials
Depts of 1 Clinical Biostatistics and 3 Clinical Research, Merck Research Laboratories, Rahway, NJ, USA. 2 Respiratory, Cell and Molecular Biology Research Division, School of Medicine, University of Southampton, Southampton, UK
CORRESPONDENCE: T. Reiss, Merck Research Laboratories, Pulmonary-Immunology, Mail code RY34B-328, Rahway, NJ 07065, USA. Fax: 1 7325947830. E-mail: theodore_reiss@merck.com
Keywords: asthma, montelukast, predictability of response, variability of asthma
Received: November 29, 2001
This study was supported by a grant from the Merck Research Laboratories.
While a consensus definition of the clinical parameters important in asthma control exists, an adequate objective definition of a response to asthma treatment and parameters for prediction of that response remain undefined. Given that asthma is a complex biological disease and that different parameters may measure dissimilar aspects of the disease status, this study assessed the relationship among several end-points of asthma control, and attempted to select a combination of variables measured before (baseline characteristics) or early in asthma therapy which would be predictive of a long-term clinical response. Data from two previously reported clinical studies which included montelukast, inhaled beclomethasone, and placebo in mild-to-moderate asthmatics (n=1,576) were analysed. The forced expiratory volume in one second (FEV1), daily symptoms score (DSS), β-agonist use, and morning peak expiratory flow (PEFAM) were recorded during the baseline period and throughout the 12-week treatment period. For the long-term response, as measured during the last 9 weeks of treatment, there was a large within-patient variability and no more than a moderate correlation between the changes in FEV1 and PEFAM; DSS and FEV1; and DSS and β-agonist use. The overall predictive values for FEV1 and DSS were 70–80%. The results showed that multiple measurements over a length of time are needed to establish a more complete profile of response, and that demographic and early treatment responses had a small but inadequate ability to predict future response. This study demonstrates the complex relationship among asthma end-points and the difficulty of reliably estimating long-term response using common, surrogate clinical markers of asthma control.

Asthma is a complex biological disease; this syndrome is hypothesised to have multiple causes, and ranges in severity from intermittent to severe-persistent.
The aim of treatment is control of asthma, which has been defined broadly in international guidelines by consensus expert panels as a decrease in chronic symptoms and the need for rescue medication use of inhaled β-agonists, an increase in airflow and the absence of worsening episodes 1, 2. While this model describes the general concepts and end-points important in asthma control, and although the response to therapy has been largely reported and measured by a number of variables (as outlined in the guidelines), a comprehensive description and understanding of the relationship among everyday measurements of clinical asthma control remains elusive 3, 4. The authors recently showed that the forced expiratory volume in one second (FEV1) and morning peak expiratory flow (PEFAM) maintained a strong correlation with each other throughout a 1-yr study, while FEV1 and the daily symptom score (DSS) or daily β-agonist use showed a weak correlation during the same time period 5; such differences suggest that these end-points may measure different aspects of asthma disease status. Nevertheless, it remains of interest to investigate if a combination of these variables could serve as predictors of a response to treatment 6. The ability to accurately and reliably predict individual long-term response would depend on the clinical end-points demonstrating both a small degree of variability within the study population and a large correlation coefficient between predictor and response variables.
To further explore relationships among changes in asthma end-points after interventions and the predictability of these clinical parameters, data from two large clinical studies were analysed 7, 8 with the objectives of: 1) determining the within-patient (over time) and between-patient variability of asthma, as indicated by objective and subjective measurements; 2) estimating the strength of univariate and multivariate correlations among such measures; and 3) testing whether an appropriate combination of baseline and early-response variables (demographic, objective and subjective) could reliably predict long-term response to therapy.
The data analysed were from two previously completed, randomised, multicentre, double-blind, placebo-controlled, parallel-group clinical studies comparing montelukast, inhaled beclomethasone and placebo in mild-to-moderate asthmatics (n=1,576) 7, 8. Each study consisted of a 2-week, single-blind placebo run-in period, a 12-week double-blind active-treatment period, and a 3-week double-blind wash-out period. Patient characteristics are described in table 1.
The factors and response variables measured were: 1) age, 2) sex, 3) duration of asthma, 4) baseline FEV1, 5) baseline FEV1 % predicted, 6) baseline FEV1 % reversibility after inhaled β-agonist, and 7) baseline PEF (L·min⁻¹). The following variables were measured postrandomisation: 1) FEV1 % change from baseline at week 3, 2) average DSS during the first 3 weeks, 3) average daily β-agonist use during the first 3 weeks, 4) average DSS change from baseline over the last 9 weeks, 5) average daily β-agonist use over the last 9 weeks, and 6) FEV1 average % change over the last 9 weeks.

The population of both studies included adult and adolescent patients with FEV1 of 50–85% of predicted FEV1 (after withholding β-agonist for at least 6 h), a minimum of 15% reversibility after β-agonist administration, a minimum average DSS of 1.14 (measured on a 0–6 point scale), and an average of at least one puff of β-agonist per day during a 2-week placebo run-in period. Patients taking concomitant oral theophylline (limited to 25% of patients in study 1) and patients taking concomitant inhaled steroids (limited to 25% of patients in study 2) were allowed to continue at a constant dose 7, 8.

The pulmonary function variable FEV1 was measured at baseline and every 3 weeks thereafter. Patient-reported measurements of DSS, daily β-agonist use, and PEFAM were recorded by the patient using a validated daily diary during the baseline period and throughout the 12-week treatment period 9. Other end-points included a patient's global evaluation (on a 7-point scale 10) and asthma-specific quality of life (QoL 11) assessment; these subjective measures provide further evidence for the impact of therapy on the patient's daily life. The average value over 3 weeks prior to a visit was used as the score at each office visit.
Statistical methods
Pair-wise correlations among variables of interest and potential predictors for late response were established by calculating Pearson's correlation coefficients on the possible predictor values prior to allocation, at baseline, and in the treatment periods. To compute within-patient and between-patient variations for single visit measurements, the values from each visit were fitted via a variance component model, with factors for treatment and time as fixed effects and the subject as a random effect.
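The two calculations named above can be illustrated with a minimal Python sketch. It is not the authors' analysis: the variance decomposition below is a simplified, balanced one-way random-effects layout estimated by the method of moments, with the patient as the only factor and no fixed effects for treatment or time. The function names and the toy numbers are hypothetical and are not study data.

```python
import math

def pearson_r(x, y):
    """Pearson's correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def variance_components(data):
    """Method-of-moments variance components for a balanced one-way layout.

    data: one list per patient, each holding the same number of visit values.
    Returns (between_patient_variance, within_patient_variance).
    Simplification: no fixed effects for treatment or time are fitted.
    """
    n = len(data)       # number of patients
    k = len(data[0])    # number of visits per patient
    means = [sum(row) / k for row in data]
    grand = sum(means) / n
    ms_between = k * sum((m - grand) ** 2 for m in means) / (n - 1)
    ms_within = sum(
        (v - m) ** 2 for row, m in zip(data, means) for v in row
    ) / (n * (k - 1))
    # E[MS_between] = var_within + k * var_between, so solve for var_between
    # and truncate at zero if sampling noise makes the estimate negative.
    var_between = max(0.0, (ms_between - ms_within) / k)
    return var_between, ms_within

# Made-up values for illustration only (not taken from the studies).
fev1_change = [5.0, 12.0, -3.0, 20.0, 8.0]
dss_change = [-0.2, -0.6, 0.1, -0.5, -0.9]
print("r =", pearson_r(fev1_change, dss_change))

visits = [[10.0, 14.0, 12.0], [25.0, 21.0, 23.0], [5.0, 9.0, 7.0], [18.0, 16.0, 20.0]]
print("between, within =", variance_components(visits))
```

Square roots of the two returned variances correspond to between-patient and within-patient sd values of the kind reported in the results.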
The within-patient and between-patient variations for the average of two measurements were obtained using the same model. The within- and between-patient variabilities for the average of three measurements were not computed directly, but were derived from a standard three-way analysis of variance model. Each observation (measurement) Yijk

A multivariate linear regression was used to establish prediction models for late responses 12. The model included demographic variables, baseline measurements, early responses, and treatment groups as independent variables. Each factor was evaluated using an F-test. The proportion of variation, r2, in the dependent variable accounted for by the independent variables in the model was used to indicate goodness of fit, with 1 indicating a perfect fit of the model.
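The goodness-of-fit measure described above can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the authors' model: it fits an ordinary least-squares regression with an intercept by solving the normal equations, omits the F-tests, and uses hypothetical function names and made-up toy data. It also shows why r2 cannot decrease as predictors are added to a nested model.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def r_squared(predictors, y):
    """r2 of a least-squares fit of y on the given predictor columns plus an intercept.

    predictors: list of columns, each a list with one value per patient.
    Assumes the columns are not collinear, so the normal equations are solvable.
    """
    n = len(y)
    cols = [[1.0] * n] + [list(c) for c in predictors]
    xtx = [[sum(u * v for u, v in zip(ci, cj)) for cj in cols] for ci in cols]
    xty = [sum(u * v for u, v in zip(ci, y)) for ci in cols]
    beta = solve(xtx, xty)
    fitted = [sum(b * c[i] for b, c in zip(beta, cols)) for i in range(n)]
    mean_y = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot

# Made-up toy data for illustration only (not taken from the studies).
early_response = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
baseline_value = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0]
late_response = [1.0, 3.5, 2.0, 5.0, 4.5, 7.0]
print("one predictor: ", r_squared([early_response], late_response))
print("two predictors:", r_squared([early_response, baseline_value], late_response))
```

Because adding a predictor can only leave the residual sum of squares unchanged or reduce it, an increasing cumulative r2 does not by itself show that each added variable is useful, which is one reason each factor is also tested separately.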
Assumptions of normality and homoscedasticity were assessed. All statistical tests were two-tailed, and a p
The variability of postrandomisation measurements and the correlation between variables were generally similar among the three treatment groups (for example, the within-patient sd values for PEFAM in the placebo, montelukast and beclomethasone groups were 22.0, 22.2, and 25.3, respectively; and the within-patient sd values for DSS were 0.43, 0.45, and 0.44). However, the between-patient sd increased slightly in the active treatment groups because of a wider range of responses (e.g. the between-patient sd values for DSS in the placebo, montelukast and beclomethasone groups were 0.63, 0.71, and 0.77, respectively). The measured variability was therefore considered characteristic of the disease itself and not of each different therapy; consequently, the data were pooled when reporting variability and correlations.
Asthma variability
These variabilities were reduced in magnitude when the average of the measurements at weeks 3+6 and at weeks 9+12 was used, and were further reduced when the average of three measurements was computed.
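The direction of this effect follows from a standard result, given here as a hedged illustration rather than the authors' derivation. The symbols are assumptions introduced for this note: \sigma_b^2 is the between-patient variance, \sigma_w^2 the within-patient variance of a single visit, and k the number of visits averaged. The result further assumes that within-patient deviations at different visits are independent with equal variance.

```latex
\operatorname{Var}\!\left(\bar{Y}_{i}^{(k)}\right)
  = \sigma_b^{2} + \frac{\sigma_w^{2}}{k},
\qquad
\mathrm{SD}_{\text{within}}^{(k)} = \frac{\sigma_w}{\sqrt{k}}
```

Under these assumptions, averaging two visits would shrink the within-patient sd by a factor of 1/\sqrt{2} (about 0.71) and averaging three by 1/\sqrt{3} (about 0.58), while the between-patient component is unchanged. If successive visits are positively correlated, the reduction would be smaller than this.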
The variability of responses to asthma therapy is represented graphically in figures 1a and b.
There was large variability in FEV1 (% change from baseline) and DSS (change from baseline) when averaged over the last 9 weeks. Patients who had an 10% increase in FEV1 at week 3 had between a 50% decrease and a 75% increase in FEV1 at week 12 (fig. 2a
In contrast, the average FEV1 (% change) at weeks 3+6 plotted against the average FEV1 (% change) at weeks 9+12 showed reduced variability (fig. 2b
Correlations
The correlations between objective and subjective measures were weak (r2=0.13–0.18). To measure the results of therapy, the correlation was calculated for each of the four variables during the last 9 weeks of treatment measured as change or per cent change from baseline. The correlation of FEV1 (% change) to PEFAM was 0.36, substantially less than the 0.76 correlation coefficient measured with baseline values. Similarly, there was a large variability and a moderate correlation (r2=0.33) between FEV1 (% change) and DSS (table 3
The relationships between
However, QoL improved in 65% (placebo group), 72.5% (montelukast group), and 77.4% (beclomethasone group) of the patients; patient global evaluation improved in 57.3%, 76.4%, and 96.2% of the patients, respectively. Thus, a significant proportion of patients with no improvement in FEV1 nonetheless showed improved QoL and global evaluations. Similar results were recorded for DSS and global evaluation and QoL (table 4
Predictive model
Twelve variables that correlated (i.e. the correlation coefficient was calculated to be significantly different from zero) with FEV1, DSS, or both are shown in table 5. Multivariate predictive models for FEV1 and DSS were fitted using the significant variables. As each predictive variable was added to the model, the cumulative coefficient of determination r2 progressively increased. The r2 for predicting the per cent change from baseline of FEV1 was 0.404 when seven variables were used. Similarly, the r2 for predicting changes in DSS progressively increased to 0.537 when four variables were used.
The relationship between the measured
For a model-predicted improvement in FEV1 of 10% change from baseline (chosen arbitrarily), the range of actual change in FEV1 improvement was from 2035%. Similarly, the measured DSS averaged over the last 9 weeks of treatment was compared with predicted DSS (figure 3b
Three individual patients were selected to illustrate the model. Their observed and model-predicted values for
Although the differences between the observed and predicted values for both of these response variables are not large, the 95% confidence intervals were large, indicating a limited certainty of prediction.
Predictive value
In addition, the authors studied a multivariate regression model that fitted (FEV1, DSS) over the other factors (data not shown); this analysis showed that the same set of covariates had similar contributions and yielded similar results to the model reported in this article.
Variability and correlations of asthma assessment variables
The National Asthma Education and Prevention Program and the Global Initiative for Asthma guidelines 1, 2 utilise the magnitude of pulmonary function measurements and the frequency of β-agonist use, daily symptoms, and night-time awakening to form the basis for establishing the severity of asthma and the response to asthma therapy. In recommending several end-points to measure asthma severity and control, these guidelines have attempted to account for the complex physiological, genetic and environmental factors that underlie asthma. While supporting the concept that asthma is a clinically complex syndrome, the results of this study suggest that the relationship among asthma end-points is equally intricate, preventing simple definitions of asthma control and individual responses. Large within- and between-patient variability of single time-point measurement of FEV1, PEFAM, DSS, and β-agonist use was observed. This variability was reduced when measurements at multiple time points were taken and averaged. Thus, changes in a single measurement at any one time point did not adequately describe the results of asthma therapy, and consequently cannot be used to accurately predict the outcomes of therapy. Furthermore, patients who showed a robust improvement in pulmonary function did not necessarily show a similar change in daily symptom scores or β-agonist use. Measures of changes in pulmonary function, which may be thought of as objective, correlated poorly with patient-reported (subjective) measures of asthma. A substantial proportion of the patients who showed essentially no change in DSS had a clear improvement in FEV1. Similarly, a substantial proportion of the patients who showed essentially no change in FEV1 showed a clear reduction in DSS. Thus, the low correlation among FEV1, DSS, and other measurements indicates that no one variable adequately describes the state of the disease.
Further, in addition to demonstrating that multiple response variables must be measured, these findings show that in order to capture a comprehensive patient profile, more than one measurement of the same variable needs to be made.
Predictive model
Pharmacogenomics: responders and nonresponders
An adequate definition clearly segregating responders from nonresponders is necessary to assess the influence of predictor variables (including genotype) as well as therapy. However, efficacy responses in asthma therapy for individual end-points have generally not shown a clear responder-nonresponder pattern (responses have been unimodal), with individual responses ranging from low to moderate to high 21. For example, Malmstrom et al. 8 recently showed that the pulmonary function response to beclomethasone and montelukast in adults with asthma followed a near normal and unimodal, not a bimodal, distribution. Similar response distributions have been reported in mild asthmatics treated with inhaled beclomethasone and a leucotriene receptor antagonist 22. Collectively, the available evidence does not appear to support the hypothesis that asthmatic patients can be readily separated into largely disjoint responder and nonresponder categories using one variable. This indicates that when building a response model with pharmacogenomic covariates (e.g. haplotypes), large variability, and therefore significant overlap, would be expected among the different haplotype subgroups. The present data and other published reports therefore suggest that a simple cut-off point, e.g. 5% for FEV1, should not be relied upon as the sole definition of response; instead, multivariate continuous variables should be used to capture the response profile.
This study assessed the relationship and correlation among several end-points of asthma control to determine if a combination of markers measured before (baseline characteristics) or early in asthma therapy could serve as predictors of a long-term response to treatment. The large within-patient variability in the end-points measured, and no more than a moderate correlation between the predictor and response variables, resulted in inadequate predictive values for the forced expiratory volume in one second and the daily symptom score. The unclear relationships among end-points and the difficulty of reliably estimating long-term response using common clinical markers of asthma control suggest that these variables measure dissimilar aspects of asthma, and lead one to question the clinical relevance of the changes measured in any one particular end-point and its value in identifying important clinical outcomes in asthma. Further studies utilising multiple measurements over a length of time are needed to understand the complex relationship among asthma end-points and to establish a more complete profile of early response that is predictive of future response. While the model described in this paper was shown to have some limited predictive value, efforts are underway to refine it further so that it becomes of significant use to the physician trying to establish therapies that provide improvements of a predictable magnitude.
The authors would like to thank W.B. Gough and A.S. Swern for their critical comments and S. Balachandra Dass for writing and editorial assistance.