Annual variability in methacholine responsiveness in nonasthmatic working adults

W.S. Beckett, P.A. Pace, S.J. Sferlazza, V.J. Carey, S.T. Weiss. ERS Journals Ltd 1997. ABSTRACT: Change in airway responsiveness is used frequently as a clinical as well as an epidemiological tool. Changes in airway responsiveness can be superior to other measures of lung function in that they are more sensitive indicators of an environmental effect. However, normal variation in test results must be defined before change can be interpreted. To characterize annual variability in airway responsiveness, we administered a high-dose methacholine challenge at 1 yr intervals for up to 4 yrs to 105 healthy, nonasthmatic working subjects. Using this high-dose protocol, the majority of tests (83%) produced at least a 20% fall in forced expiratory volume in one second (FEV1), allowing standard calculation of the provocative dose of methacholine causing a 20% fall in FEV1 (PD20). An annual change in methacholine responsiveness of one or more doubling doses was seen in at least 30% of subjects each year. The components of variance of the airway responsiveness measures were estimated to allow direct comparison of within-subject and between-subject variability. The within-subject variability in PD20 was markedly greater than the comparable within-subject variability in FEV1. Level of FEV1 and age were both significant determinants of methacholine responsiveness. Two methods of expressing methacholine responsiveness (PD20 using the full challenge up to 250 mg·mL-1 methacholine, and the dose-response slope using data up to 32 mg·mL-1 methacholine as the maximum dose) had similar annual variability in censored data and mixed-effects models. We also developed an approach to statistical analysis of "right-censored" methacholine challenge data using maximum likelihood estimation under a censored Gaussian model.
These studies of methacholine responsiveness provide normative data on annual test variability in healthy, nonasthmatic working adults, and show that a shorter low-dose challenge has comparable annual variability to a lengthier high-dose challenge. Eur Respir J 1997; 10: 2515–2521. *Occupational Medicine Program, and Division of Pulmonary and Critical Care Medicine, Yale University School of Medicine. **Channing Laboratory, Dept of Medicine, Brigham and Women's Hospital and +Pulmonary and Critical Care Division, Beth Israel Hospital, Harvard Medical School, Boston, MA, USA.

Measuring changes in airway responsiveness is clinically useful in asthma management, and epidemiologically useful in studies of groups exposed to substances that may cause asthma. As with any test, the ability to detect a significant change between two or more serial methacholine challenge tests in individuals or groups is related in part to the test characteristics of between-subject variability and within-subject variability (reliability) when the test is administered repeatedly over time. As the test variability increases and the reliability decreases, more subjects must be studied to detect a small change in response. If the number of subjects is kept constant, then as test variability increases, only larger changes in response will be detected as significant. Variance components analysis is a statistical approach which quantifies these aspects of group versus individual variability of repeated testing.
A limitation of the usefulness of airway responsiveness in studies of nonasthmatics is the lack of a response of many normals to standard bronchoconstrictor challenges. A 20% fall in forced expiratory volume in one second (FEV1) is the usual criterion for a "positive" methacholine challenge, but a large proportion of nonasthmatic normal subjects will not meet this criterion, and thus have no test "result", even with very high-dose challenge. This makes it difficult to use airway responsiveness as an indicator of environmental effects on airways of nonasthmatics.
Individuals who do not have a "positive" response create a further problem for statistical analysis, as their provocative dose causing a 20% fall in FEV1 (PD20) is "right-censored", in statistical terms. Although with a small proportion of such individuals it is possible to assign the value of the highest dose administered, this proportion is too large in a population of nonasthmatics, as it would create a non-normal (or non-Gaussian) distribution.
The purpose of the present study was threefold: 1) to characterize the annual variability in response to a full high-dose methacholine challenge given annually for up to 4 yrs to a healthy nonasthmatic workplace-based population; 2) using the same methacholine challenge data, to compare the test-to-test variability when the challenges were expressed in the two most widely used formats: the PD20 and the dose-response slope (DRS) using a methacholine challenge protocol up to a maximum concentration of 32 mg·mL-1, commonly used in clinical pulmonary function laboratories; and 3) to develop and test on these data a new, valid statistical method for analysing right-censored methacholine challenge data.

Subject population
The 105 subjects were all healthy working adults employed in a shipyard, who participated as volunteers in a study of respiratory responses to occupational exposures. The study was approved by the Human Investigations Committee of the Yale University School of Medicine, and subjects gave informed consent to participate. Criteria for inclusion were: active employment status as a shipyard draftsman, technical aide, or welder; absence of known heart disease (such as angina or arrhythmias); and no current asthma, as defined by physician diagnosis and absence of requirement for bronchodilator medication at any time in the previous year. Individuals with a past history of asthma but who had not required any medication in the previous year were included. The cohort was initially assembled and studied prospectively to determine whether occupational exposure (specifically welding) affected airway reactivity. There were two main occupational groups: draftsmen and technical aides who worked indoors on plans for ships; and welders who worked daily in both indoor and outdoor locations. Comparisons between occupational groups of spirometry and methacholine inhalational challenge showed no differences by occupation either at entry or during the period of observation, and thus the groups were pooled for analysis. Of the 105 subjects, 92 (88%) were male; 36 (34%) were never-smokers, 36 (34%) were ex-smokers, and 33 (31%) were current smokers.

Methacholine challenge
Methacholine challenge testing was performed once yearly after a usual workshift using a protocol similar to that of CHAI et al. [1], but with a very high maximum dose of methacholine (250 mg·mL-1), 10 times higher than the usual maximum dose of 25 mg·mL-1. After three baseline spirograms, subjects inhaled five inspiratory capacity breaths of phosphate-buffered saline vehicle solution and then increasing concentrations of methacholine bromide solution, using a 0.6 s timed dosimeter delivering 137.8 N·m-2 compressed air to a DeVilbiss 646 jet nebulizer (DeVilbiss Co., Somerset, PA, USA). The concentration of methacholine was progressively doubled until a 20% fall in FEV1 occurred, or the subject reached a maximum concentration of 250 mg·mL-1 methacholine. This high-dose challenge had previously been found to be acceptable to nonasthmatic subjects in a laboratory study of repeated methacholine challenge [2]. Prior to the first challenge, an initial assignment of subjects to dosing schedules was made based on clinical indicators of airway responsiveness. Those with no history of asthma and thus a very low likelihood of hyperresponsiveness were assigned to a higher starting dose (8 mg·mL-1 methacholine), while others with a past history of asthma began at 1 mg·mL-1 methacholine, as previously described by HENDRICK et al. [3]. Serial challenges were delivered at exactly 5 min intervals; subjects waited 2 min after each dose and then performed a single forced vital capacity manoeuvre [4]. The time required for each complete challenge was up to 45 min for subjects requiring the highest doses to produce a 20% fall in FEV1. Table 1 shows the sequence of dilutions of methacholine used for previously asthmatic and nonasthmatic subjects. Figure 1 illustrates three sample dose-response curves for subjects with and without a previous history of asthma.
Methacholine responsiveness from these high-dose challenges was calculated using two methods, illustrated in figures 1 and 2. In the first, the dose was calculated using cumulative breath units of methacholine, where 1 breath of a 1 mg·mL-1 concentration represents 1 breath unit. Using this nebulizer delivery system, 1 breath unit delivers approximately 8 µg methacholine to the mouth. PD20 was calculated from a log-linear plot of methacholine dose versus per cent fall from postvehicle FEV1 by linear interpolation. When the per cent fall in FEV1 at the maximum dose of methacholine was less than 20%, the maximum cumulative dose (2515 breath units) was assigned as the PD20 for statistical analysis. In the second method (fig. 2), from the same challenges, the methacholine DRS was calculated according to the method of O'CONNOR et al. [5], using data from methacholine concentrations up to only 32 mg·mL-1 and disregarding data from higher methacholine concentrations. The DRS is the slope of the line (on the same linear-log plot) connecting the origin to the per cent fall in FEV1 for the 32 mg·mL-1 methacholine dose (corresponding to 320 cumulative units of methacholine) or to the per cent fall for the highest dose attained if a PD20 occurred before the 32 mg·mL-1 dose. The units of the DRS calculation are the per cent fall from baseline FEV1 divided by the cumulative methacholine dose. Calculating the DRS from these challenges was performed as it would be for an abbreviated methacholine challenge test in which the highest dose used was 32 mg·mL-1, permitting exact comparisons of the within-subject and between-subject variability in methacholine responsiveness calculated by these two different methods (high-dose PD20, and DRS in the same group of subjects).
Table 1. Methacholine concentration and cumulative dose for steps from 16 mg·mL-1 upward, where the values are the same for both protocols
Concentration (mg·mL-1)    Cumulative dose (breath units*)
16     160
32     320
64     640
125    1265
250    2515
Those with a past history of asthma started at a lower initial concentration/dose (1 mg·mL-1 methacholine); those without past asthma started at 8 mg·mL-1. *: 1 breath unit is equivalent to an inhalation of one breath of a 1 mg·mL-1 solution.
Spirometry was performed with the subjects seated and wearing noseclips, using two identical Stead-Wells survey spirometers (Eagle II, W.E. Collins, Braintree, MA, USA) following American Thoracic Society criteria for spirometry [6], except that the few tests for which the 5% reproducibility criterion was not met with five efforts were not discarded [7]. Results were converted to body temperature and pressure saturated, and baseline spirometry values expressed as per cent predicted [8]. The spirometers were calibrated and leak tested daily before use.
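The PD20 and DRS calculations described in this section can be sketched in Python. This is a minimal illustration under stated assumptions, not the authors' code: the function names, the list-based inputs and the handling of a response at the very first dose are hypothetical, while the 20% threshold, the 2515 breath-unit ceiling and the 320 breath-unit cut-off are taken from the text.

```python
import math

MAX_CUM_DOSE = 2515.0  # highest cumulative dose (breath units) in the high-dose protocol
DRS_DOSE_CAP = 320.0   # cumulative dose reached at 32 mg/mL, the cut-off used for the DRS
THRESHOLD = 20.0       # per cent fall in FEV1 that defines a positive challenge


def pd20(cum_doses, pct_falls):
    """Return (PD20, censored) using linear interpolation on a log-dose scale."""
    for i, (dose, fall) in enumerate(zip(cum_doses, pct_falls)):
        if fall >= THRESHOLD:
            if i == 0:
                # Assumption: no lower point to interpolate from, so use the first dose.
                return dose, False
            d0, f0 = cum_doses[i - 1], pct_falls[i - 1]
            frac = (THRESHOLD - f0) / (fall - f0)
            log_pd20 = math.log(d0) + frac * (math.log(dose) - math.log(d0))
            return math.exp(log_pd20), False
    # No 20% fall was reached: assign the maximum dose and flag as right-censored.
    return MAX_CUM_DOSE, True


def dose_response_slope(cum_doses, pct_falls):
    """Per cent fall divided by cumulative dose at the last step used for the DRS."""
    last = None
    for dose, fall in zip(cum_doses, pct_falls):
        if dose > DRS_DOSE_CAP:
            break  # ignore data above the 32 mg/mL step
        last = (dose, fall)
        if fall >= THRESHOLD:
            break  # challenge ended before the cut-off
    if last is None:
        raise ValueError("no dose at or below the DRS cut-off")
    return last[1] / last[0]


# Hypothetical challenge: falls of 5%, 10% and 30% at 160, 320 and 640 breath units.
example_doses = [160.0, 320.0, 640.0]
example_falls = [5.0, 10.0, 30.0]
example_pd20, example_censored = pd20(example_doses, example_falls)
example_drs = dose_response_slope(example_doses, example_falls)
```

In this hypothetical challenge the PD20 is interpolated between the 320 and 640 breath-unit steps, whereas the DRS uses only the response at 320 breath units.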

Data analysis
Preparation of data sets was carried out with the SAS system (SAS Institute, Cary, NC, USA) and S-Plus (Stati-Sci, Seattle, WA, USA) statistical packages. SAS Procedures LIFEREG and VARCOMP were used to facilitate analysis of censored Gaussian outcomes and to obtain variance components for residuals from the censored data models. S-Plus routines for estimating mixed-effects linear models [9] were used to perform simultaneous estimation of regression and variance component estimates. PD20 and DRS outcomes were transformed to natural logarithms of the raw measures, henceforth referred to as ln PD20 and ln DRS. Because a negative DRS to inhaled methacholine is not biologically consistent and represents no responsiveness or zero slope, 14 DRS measures with small negative slopes were changed to 0.001, the lowest observed positive DRS measure in this study, before log transformation. The basic statistical models used are "mixed-effects" models for Gaussian outcomes, which have been found to be useful in longitudinal studies of lung function [10,11]. A statistical problem arises because the study produced right-censored data in normal individuals with low airway responsiveness. The sample under analysis involved 59 right-censored PD20 values in 33 subjects, and four subjects were right-censored at all four visits. The Appendix details our statistical approach to modelling censored correlated outcomes.
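The outcome transformations described above can be illustrated with a short Python sketch. The function name and the dictionary layout are hypothetical, and the analysis itself was carried out in SAS and S-Plus; only the natural-log transformation and the substitution of 0.001 for negative slopes come from the text.

```python
import math

DRS_FLOOR = 0.001  # value substituted for negative slopes, as stated in the text


def prepare_outcomes(pd20, censored, drs):
    """Return the log-transformed outcomes for one challenge test.

    pd20     -- PD20 in cumulative breath units (maximum dose if censored)
    censored -- True when no 20% fall in FEV1 was reached
    drs      -- dose-response slope (per cent fall per breath unit)
    """
    # Assumption: any slope below the floor is raised to the floor; in the study
    # this applied only to the small negative slopes.
    floored_drs = max(drs, DRS_FLOOR)
    return {
        "ln_pd20": math.log(pd20),
        "censored": censored,
        "ln_drs": math.log(floored_drs),
    }


# Hypothetical non-responder with a slightly negative slope.
example_record = prepare_outcomes(2515.0, True, -0.002)
```

Keeping the censoring flag next to ln PD20 lets a later model treat the value as a lower bound and not as an observed response.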

Results
One hundred and five subjects in the study (92 male and 13 female) completed a total of 330 high-dose methacholine challenge tests over a 4 yr period. Approximately equal numbers were employed as draftsmen/technical aides, and as structural welders. Overall attendance was 79% (330 of 420 possible tests). Missed tests were in most cases due to out-migration from the workforce, although many subjects continued in the study after leaving employment at the shipyard. Of the 330 tests with complete data on ln PD20 and covariates, 82% had sufficient response to calculate the PD20, while 59 (18%) did not and made up the "right-censored" component of the data set.
Figure legend. Graph of the two methacholine challenge protocols used for those with and without a past history of asthma, according to the method of HENDRICK et al. [3]. Subjects with a history of asthma or symptoms suggesting asthma completed the "asthmatic" protocol each year, starting at 5 cumulative breath units (1 mg·mL-1) methacholine. Subjects with no history or symptoms of asthma completed the "no history of asthma" or "Normal" protocol and received a first dose at 40 cumulative breath units (8 mg·mL-1).
Figure 3 shows the frequency distribution of challenge tests expressed as PD20, while figure 4 shows the frequency distribution of the same tests after log transformation of PD20. As can be seen, log transformation of PD20 converts the distribution to a more Gaussian one. However, the tall bar at the right end of the distribution, which represents all tests with a log-transformed value ≥7.8 (the clustering or "right censoring" of methacholine challenge tests), creates a problem in statistical analysis where a Gaussian distribution is an underlying assumption. Year-to-year variability in methacholine responsiveness, as measured by the distribution of PD20 and the change from year to year in the dose of methacholine, is described in table 2. These are shown next to the FEV1 measurements for the same years for comparison.
While the annual FEV1 was relatively stable over time for the majority of the group, approximately 30% doubled or halved (or more) their methacholine responsiveness in any given sequential annual pair of measurements.
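A change of "one doubling dose" corresponds to a twofold rise or fall in PD20 between two annual tests. The following Python sketch shows one way such changes could be counted; the function names and the example pairs are hypothetical and are not data from the study.

```python
import math


def doubling_dose_change(pd20_first, pd20_second):
    """Signed change in PD20 between two tests, in doubling doses (log base 2)."""
    return math.log2(pd20_second / pd20_first)


def proportion_changed(pairs, min_doublings=1.0):
    """Fraction of (first, second) PD20 pairs that doubled or halved (or more)."""
    changes = [abs(doubling_dose_change(first, second)) for first, second in pairs]
    return sum(change >= min_doublings for change in changes) / len(changes)


# Hypothetical PD20 pairs (breath units) from two consecutive annual tests.
example_pairs = [(100.0, 200.0), (100.0, 150.0), (400.0, 100.0), (300.0, 310.0)]
example_proportion = proportion_changed(example_pairs)
```

Working on the log base 2 scale treats a doubling and a halving symmetrically, which matches the way the change is described in the text.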
The first two columns of table 3 provide mixed-effects model estimates for FEV1/height² (ht²) and ln DRS outcomes using all available observations. The rescaling of FEV1 by the square of height (FEV1/ht²) is common in respiratory epidemiology [12], and was motivated in this sample as a remedy for heteroskedasticity (non-constant variance) observed in raw FEV1 relative to height. Model fits were examined in plots of residuals versus predictors and fitted values, and no striking departures from model assumptions were detected. Regression parameter estimates for models of FEV1/ht² and ln DRS are accompanied by model-based conditional standard errors.
The third column of table 3 provides estimates and 95% confidence intervals for effects of covariates on mean ln PD20 using maximum likelihood for censored Gaussian outcomes. Appropriate standard errors (SEs) are from an approximate "sandwich" parameter covariance estimate described in the Appendix. Of the factors tested in the model (age, sex, height, level of FEV1, smoking status, occupational status as a welder), only age and level of FEV1 were significantly associated with methacholine responsiveness. Younger age and lower level of FEV1 were associated with a higher degree of methacholine responsiveness (lower PD20).

Discussion
The usefulness of a test used longitudinally for epidemiological studies depends on many factors, including the variability of test results over time. This is in turn determined by true disease or environmental effects, normal physiological variation, and measurement error. In general, greater variation in the test result over time will necessitate a larger sample size to determine a disease or environmental effect of specified magnitude. Airway responsiveness, one important characteristic of clinical asthma, has been studied over various time intervals [13]. It has been found to have large variability over longer periods of time (months to years) relative to long-term variability of other measures of lung function such as FEV1 [14], and relative to its own short-term variability, which has been studied primarily in asthmatic patient groups [15,16].
Methacholine challenge airway responsiveness has been applied as a clinical and epidemiological tool to detect responses of the airways of exposed groups in environmental and occupational studies [17][18][19]. Use of methacholine responsiveness rather than other measures of lung function has the potential advantage for specificity in detecting changes in airway function in circumstances where other measures of lung function such as FEV1 may not be affected. The year-to-year variability in responsiveness differs between asthmatic patients and the healthy working subjects who may be the focus of workplace surveillance and epidemiological studies.
The year-to-year variability of response reported here in a large, actively employed nonasthmatic adult population may be useful in planning future studies of methacholine responsiveness, particularly for calculating study sample sizes. Age (less reactivity with higher age) and FEV1 (greater reactivity with lower FEV1) were the important modulators or predictors of methacholine responsiveness over time in this population. Measurement of FEV1 was substantially more consistent in longitudinal tracking from year-to-year than was methacholine responsiveness, regardless of the way in which methacholine responsiveness was expressed (i.e. PD20 or DRS).
The difference can be seen in the much larger intraclass correlation coefficient for FEV1 (close to the maximal value of 1.0) compared with those for ln DRS or ln PD20 (table 3).
Information describing how much variability should be expected, based on our data, between repeated measures (e.g. annual methacholine challenges) in an individual is contained in table 3, in the row labelled "within-person variance". Here, within-person variance for ln PD20 is 1.574. This can be converted to the standard deviation of change in ln PD20 by taking the square root of twice the variance, √(2×1.574), which is 1.77. Thus, one standard deviation of the year-to-year change in ln PD20 in a single normal individual is 1.77. A more clinically relevant expression of the year-to-year variability in methacholine challenge in normal subjects is given in table 2. Here it can be seen that 30% of subjects changed in airway responsiveness by at least one doubling dose of methacholine between the first and second years of testing while baseline FEV1 remained stable.
Because a multiple, high-dose methacholine challenge was used to characterize responsiveness over the four annual assessments in these subjects, it was possible to examine year-to-year variability in response comparing the measured PD20 from a high-dose protocol with the DRS responses of a low-dose protocol, one which is both less time-consuming to perform and less likely to produce cholinergic symptoms. The 82% of tests in which PD20 could be accurately characterized with a maximum dose of 250 mg·mL-1 methacholine was greater than the 34% found in the study by MALO et al. [20], where a maximum concentration of 128 mg·mL-1 methacholine was used. The difference is most likely related simply to the greater bronchoconstricting effect of the higher maximum dose used in the present study (250 mg·mL-1), and hence a higher proportion of "responders".
Table 3 footnotes. Values in parentheses are 95% confidence intervals. +: restricted maximum likelihood estimates for mixed-effects models with random intercepts for each subject supplying repeated measures. #: maximum likelihood for right-censored Gaussian outcomes. Standard errors are based on the "sandwich" covariance estimates for clustered residuals, and the variance components are estimated using restricted maximum likelihood applied to these residuals. The intraclass correlation coefficient is close to 1 when intrasubject variability is small in relation to between-subject variability, and close to 0 when intrasubject variability is great in relation to between-subject variability. FEV1 - 3.81 L and height - 175 cm are intercepts expressed as residuals. ln DRS: natural logarithm of dose-response slope calculated by the method of O'CONNOR et al. [5]; ln PD20: natural logarithm of the provocative dose of methacholine required to produce a 20% fall in FEV1. For further definitions, see legend to table 2.
The DRS method uses (for the purposes of calculating the DRS) a maximum methacholine dose of 32 mg·mL-1, similar to the method used by O'CONNOR et al. [5]. When comparing the two methods (PD20 to DRS) by year-to-year variability within and between subjects, as measured by intraclass correlation coefficients, no clear advantage was seen of one method over the other. In particular, these results predict that in sample size calculations (planning the number of subjects needed to detect an effect of given magnitude in a study of normal nonasthmatics), choosing a methacholine challenge protocol to measure PD20 is not better than using the DRS.
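The link between within-subject variance and sample size mentioned above can be made concrete with a simple normal-approximation formula for detecting a mean change between two tests in the same subjects. This is a sketch under assumptions that are not from the source: errors at the two tests are independent with equal variance, so the variance of the change is twice the within-subject variance, and the function name and default significance level and power are illustrative choices.

```python
import math
from statistics import NormalDist


def n_for_paired_change(within_var, delta, alpha=0.05, power=0.80):
    """Approximate number of subjects needed to detect a mean change `delta`.

    within_var -- within-subject variance of the outcome (e.g. on the ln PD20 scale)
    delta      -- mean change to detect, in the same units as the outcome
    """
    z_alpha = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # two-sided test
    z_beta = NormalDist().inv_cdf(power)
    var_change = 2.0 * within_var  # variance of a difference of two tests
    return math.ceil((z_alpha + z_beta) ** 2 * var_change / delta ** 2)


# Illustrative inputs only: unit within-subject variance, unit change to detect.
example_n = n_for_paired_change(1.0, 1.0)
```

On the ln PD20 scale, a change of one doubling dose corresponds to `delta = math.log(2)`; a larger within-subject variance increases the required number of subjects in direct proportion.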
The appropriateness of the random intercept model to analyse methacholine responsiveness over time was verified by graphical examination of estimated random intercept distributions, which did not differ markedly from Gaussian structure. The absence of regression effects of gender, job status, or smoking status in the estimated random intercept distributions is further evidence that a random intercept formulation is adequate for this study.
In summary, 105 nonasthmatic adult working subjects completed annual methacholine challenges for up to 4 yrs. Annual variability in methacholine responsiveness was strikingly great in relation to annual variability in forced expiratory volume in one second. Intraclass correlation coefficients for methacholine responsiveness were considerably lower than for forced expiratory volume in one second. Annual methacholine challenge testing in nonasthmatics as a surveillance tool for environmental airways effects is limited by marked variability in within-subject responses. Test results appear to have similar reliability regardless of whether a simpler lower-dose challenge dose-response slope protocol or a lengthier multiple-dose high concentration provocative dose causing a 20% fall in forced expiratory volume in one second protocol is used.

Mixed-effects model
A model was fitted with fixed effects at the individual level for sex, smoking, employment status, (FEV1 - 3.81 L), and (height - 175 cm), plus a random effect for remaining between-subject variation.
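A random-intercept model of this kind is usually written as below. The notation is a sketch: β, α i and eij follow the symbols used in this Appendix, while the covariate vector x, the intercept β0 and the variance symbols σ are assumed here for illustration, with i indexing subjects and j indexing annual visits.

```latex
y_{ij} = \beta_0 + \mathbf{x}_{ij}^{\top}\boldsymbol{\beta} + \alpha_i + e_{ij},
\qquad
\alpha_i \sim N\!\left(0, \sigma_{\alpha}^{2}\right),
\quad
e_{ij} \sim N\!\left(0, \sigma_{e}^{2}\right)
```

Under this form, the variance of α i is the between-subject variance and the variance of eij is the within-subject variance.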

Modelling censored correlated outcomes
Our approach to this problem uses residuals from maximum likelihood models for censored Gaussian observations to estimate between-individual heterogeneity. Specifically, we wish to use all available (censored) information for estimation of fixed effects in the main model. Therefore, we use maximum likelihood estimation under a censored Gaussian model, assuming all observations to be independent, to obtain unbiased estimates of β. The standard errors of estimation must be modified to account for dependence among repeated observations, and the "information sandwich" method described by ROYALL [21] is used for reporting on these estimates. The fixed effects estimates are then used to form residuals for all uncensored observations. The variance of αi is then estimated using restricted maximum likelihood, and intraclass correlation coefficients are derived in the customary manner [22]: ρ = variance(αi) / (variance(αi) + variance(eij)). Appropriate standard errors (SEs) are from an approximate "sandwich" parameter covariance estimate which accounts for clustering in the outcome, and 95% confidence intervals represent ±2 SE. These SEs are obtained by combining the information matrix corresponding to the censored outcome likelihood model with an empirical covariance matrix based on clustered residuals calculated from the censored model. These approximate SEs are wider by 10-33% than those obtained ignoring clustering.
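The censored Gaussian likelihood referred to above has a simple form: an observed value contributes the normal density, and a right-censored value contributes the probability of exceeding the censoring point. The Python sketch below writes this out under the working assumption of independent observations; the function name and argument layout are hypothetical, and the study itself used SAS Procedure LIFEREG.

```python
import math


def censored_gaussian_loglik(y, censored, mu, sigma):
    """Log-likelihood of right-censored Gaussian data, observations independent.

    y        -- observed values, or the censoring point where censored
    censored -- booleans, True where only "true value >= y" is known
    mu       -- fitted means (e.g. from the fixed effects), one per observation
    sigma    -- common residual standard deviation
    """
    total = 0.0
    for yi, ci, mi in zip(y, censored, mu):
        z = (yi - mi) / sigma
        if ci:
            # Upper-tail probability of a standard normal beyond z.
            total += math.log(0.5 * math.erfc(z / math.sqrt(2.0)))
        else:
            # Normal log-density of the observed value.
            total += -0.5 * math.log(2.0 * math.pi) - math.log(sigma) - 0.5 * z * z
    return total


# Single-observation checks at the mean of a standard normal.
loglik_observed = censored_gaussian_loglik([0.0], [False], [0.0], 1.0)
loglik_censored = censored_gaussian_loglik([0.0], [True], [0.0], 1.0)
```

Maximizing this function over the regression coefficients and sigma gives the point estimates; the sandwich correction described in the text is then needed because repeated tests within a subject are not independent.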
Restricted maximum likelihood estimation of random intercepts was performed for residuals of all the models in table 3. In the case of FEV1/ht² and ln DRS outcomes, between-subject variance is estimated simultaneously with regression coefficients and error variance; for censored ln PD20 outcomes, the variance of random intercepts was estimated using restricted maximum likelihood on residuals from the censored Gaussian model. Corresponding intraclass correlation coefficient estimates for spirometry and methacholine challenge outcomes are also provided in table 3 [23].
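Once the two variance components are available, the intraclass correlation coefficient is their ratio as defined in this Appendix. A minimal Python sketch follows; the function name and the example variances are hypothetical and are not values from table 3.

```python
def intraclass_correlation(between_var, within_var):
    """Between-subject variance as a fraction of total variance."""
    if between_var < 0 or within_var < 0:
        raise ValueError("variance components must be non-negative")
    total = between_var + within_var
    if total == 0:
        raise ValueError("total variance must be positive")
    return between_var / total


# Hypothetical components: between-subject variance 3.0, within-subject variance 1.0.
example_icc = intraclass_correlation(3.0, 1.0)
```

Values near 1 indicate that most of the variability lies between subjects, and values near 0 that most of it lies within subjects from test to test.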