Abstract
The lung clearance index (LCI) has strong intra-test repeatability; however, the inter-test reproducibility of the LCI is poorly defined.
The aim of the present study was to define a physiologically meaningful change in LCI in preschool children, which discriminates changes associated with disease progression from biological variability.
Repeated LCI measurements from a longitudinal cohort study of children with cystic fibrosis and age-matched controls were collected to define the inter-visit reproducibility of the LCI. Absolute change, the coefficient of variation, Bland–Altman limits of agreement, the coefficient of repeatability, intra-class correlation coefficient, and percentage changes were calculated.
LCI measurements (n=505) from 71 healthy and 77 cystic fibrosis participants (aged 2.6–6 years) were analysed. LCI variability was proportional to its magnitude, such that reproducibility defined by absolute changes is biased. A physiologically relevant change for quarterly LCI measurements in health was defined as exceeding ±15%. In clinically stable cystic fibrosis participants, the threshold was higher (±25%); however, for measurements made 24 h apart, the threshold was similar to that observed in health (±17%).
A percentage change in LCI greater than ±15% in preschool children can be considered physiologically relevant and greater than the biological variability of the test.
Biological variability of lung clearance index is dependent on magnitude; % change is better for tracking patients http://ow.ly/tgbX30dBbCX
Introduction
The lung clearance index (LCI), measured by multiple breath washout (MBW), may be a useful physiological test to clinically monitor lung function in patients with cystic fibrosis. The LCI identifies early obstructive lung disease [1] and is feasible to perform in young children [2, 3]. In addition, the LCI has good intra-test repeatability [4–6]; however, there are limited longitudinal data describing inter-test reproducibility.
Investigators report differing findings for the inter-test reproducibility of the LCI due to variability in protocol designs, study population and MBW methodology [5–14]. Most notably, time intervals between repeated tests range from hours to years, and none report reproducibility for time intervals typically used to clinically track patients with cystic fibrosis (e.g. quarterly) [6, 11, 15]. Comparison between studies is further limited by the various methods used to define reproducibility: the coefficient of variation (%CV) [5, 7, 8], the intra-class correlation coefficient (ICC) [6], the limits of agreement from Bland–Altman plots [5, 9–11] or the coefficient of repeatability [12, 13]. Reproducibility has most commonly been reported as Bland–Altman limits of agreement or coefficient of repeatability. Both of these methods assume that the measurement variability is independent of its magnitude, and they have only been evaluated in health or for a narrow range of LCI values in participants with cystic fibrosis [12, 15]. If this assumption is not valid across the spectrum of values observed in clinical practice, the use of the limits of agreement or coefficient of repeatability may introduce bias in the interpretation of clinically relevant between-visit changes in the LCI.
An additional challenge in defining outcome reproducibility arises when only data from healthy individuals are used. This approach accounts for biological variability of the test in a healthy population, but may be too sensitive in disease as it assumes that any variability observed outside this range is driven by disease progression. In contrast, in disease it is difficult to define stability and to ensure that observed changes between measurements are not attributable to natural progression; therefore, biological variability based on disease groups may be too conservative.
In this study, we aim to use longitudinal LCI data collected at multiple time intervals (e.g. 1 month, 3 months, 6 months, 9 months and 12 months) to define inter-visit reproducibility in health and stable cystic fibrosis. Furthermore, we aim to define reproducibility for repeated measurements made within 24 h to determine whether variability in LCI is driven by the biological variability of the test, as changes observed within this time period are unlikely to be associated with changes in the disease state.
Methods
Study population
Data were collected as part of a prospective multi-centre observational study of preschool children (aged 2.5–6 years) with cystic fibrosis and age-matched healthy controls from three North American cystic fibrosis centres between January 2013 and June 2015 [3]. MBW was measured at enrolment and 1, 3, 6, 9 and 12 months. All participants, both healthy and with cystic fibrosis, were free of respiratory infection for at least 4 weeks prior to the enrolment. Symptoms (cough) and treatments (oral or intravenous antibiotics) were recorded for all subsequent tests. Any measurements for which participants had either cough or pulmonary exacerbation (defined as respiratory symptoms and treatment with antibiotics) were excluded from these analyses. Thus, these data should reflect clinically stable visits. The study was approved by the Research Ethics Board at the Hospital for Sick Children (REB # 1000036303), Riley Children's Hospital (1401277863), and the Office of Human Research Ethics at University of North Carolina at Chapel Hill (13-1258). Informed written consent was obtained from the parent/guardian for all participants.
Reproducibility was also defined from the placebo arm of a randomised cross-over interventional trial (NCT 02276898) that included participants with cystic fibrosis aged ≥7 years randomised to a single dose of either hypertonic saline or isotonic saline [14]. MBW was performed at baseline and 24 h later with at least 1 week washout between treatment arms [14]. The study was approved by the Research Ethics Board at the Hospital for Sick Children (REB # 1000024909) and informed written consent was obtained from the parent/guardian for all participants.
MBW outcomes
MBW tests were performed with the Exhalyzer D (EcoMedics AG, Duernten, Switzerland). Adaptations were made for testing preschool children [3]. Participants were enrolled if they could complete a single MBW trial, whereas an MBW test was considered successful if there were at least two technically acceptable trials. The LCI was calculated as the cumulative expired volume (CEV) divided by the functional residual capacity (FRC) at 1/40 of the starting gas concentration. LCI5 was calculated at 1/20 of the starting gas concentration. All MBW traces were reviewed for technical quality and appropriate breathing pattern [16]. Full details are previously published [3].
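As an illustration of the definition above, the following minimal Python sketch computes a per-trial LCI as CEV divided by FRC and requires at least two acceptable trials per test. The function names and input values are hypothetical, and averaging the trials to obtain a per-visit value is an assumption made here for illustration; it is not stated in the text above.

```python
from statistics import mean


def trial_lci(cev_litres, frc_litres):
    """LCI for one trial: cumulative expired volume divided by FRC."""
    if frc_litres <= 0:
        raise ValueError("FRC must be positive")
    return cev_litres / frc_litres


def visit_lci(trials):
    """Per-visit LCI from a list of (CEV, FRC) tuples.

    Requires at least two technically acceptable trials, as in the text.
    Averaging across trials is an assumption of this sketch.
    """
    if len(trials) < 2:
        raise ValueError("at least two technically acceptable trials are required")
    return mean(trial_lci(cev, frc) for cev, frc in trials)


# Hypothetical values for illustration only (not study data).
example = visit_lci([(7.0, 1.0), (15.0, 2.0)])
```

In this sketch a test with a single acceptable trial raises an error rather than returning a value, mirroring the success criterion described above.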
Spirometry
Spirometry was performed after MBW, according to the American Thoracic Society/European Respiratory Society standards [17]; different devices were used at each of the three sites. Absolute values of forced expiratory volume in 1 second (FEV1) were standardised for height, age, sex and ethnicity using the Global Lung Function Initiative reference equations [18]. Mid-expiratory forced expiratory flow (FEF25–75%) was reported from the effort with the highest sum of FEV1 and forced vital capacity.
Statistical analysis
Table 1 summarises the methods applied to calculate the inter-visit reproducibility for each pair-wise comparison of the six LCI measurements. An average reproducibility was calculated with the pooled pair-wise results of all time points using the mixed-effects model; the residual standard deviation from the random effects was used with the mean difference to calculate 95% confidence bands (mean differences ±1.96×standard deviation). To account for repeated observations in the same individuals, the mixed-effects model was specified with subject as the random factor. An exchangeable correlation structure was used. LCI values were right skewed, therefore non-parametric summaries are presented, whereas percentage change in LCI followed a normal distribution, and parametric summaries were used. The same analyses were conducted for FEV1 and FEF25–75% z-scores and % predicted. The sample size was a convenience sample size based on available data from the primary research study [3]. All statistical analyses were performed using Stata version 14 (StataCorp, College Station, TX, USA).
Summary of statistical methods used to calculate the inter-visit reproducibility
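The pair-wise summaries named in the statistical analysis can be sketched for a single pair of visits as follows. This is a minimal illustration under stated assumptions, not the study's Stata code: the function name and data are hypothetical, the coefficient of repeatability uses one common definition (1.96 × √2 × within-subject standard deviation), the %CV is taken per pair and then averaged, and percentage change is expressed relative to the first measurement. The pooled mixed-effects model is not reproduced here.

```python
from math import sqrt
from statistics import mean, stdev


def reproducibility_summary(first, second):
    """Summarise agreement between paired measurements from two visits."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need at least two complete pairs")
    pairs = list(zip(first, second))
    diffs = [b - a for a, b in pairs]
    mean_diff = mean(diffs)
    sd_diff = stdev(diffs)  # sample SD of the differences
    # Bland-Altman 95% limits of agreement: mean difference +/- 1.96 SD.
    limits = (mean_diff - 1.96 * sd_diff, mean_diff + 1.96 * sd_diff)
    # Within-subject SD estimated from pairs: sqrt(sum(d^2) / (2n)).
    sw = sqrt(sum(d * d for d in diffs) / (2 * len(diffs)))
    # Coefficient of repeatability (assumed definition): 1.96 * sqrt(2) * sw.
    cor = 1.96 * sqrt(2) * sw
    # Per-pair %CV: SD of the two values (|d| / sqrt(2)) over their mean.
    cv_pct = mean(100 * (abs(b - a) / sqrt(2)) / ((a + b) / 2) for a, b in pairs)
    # Percentage change relative to the first measurement.
    pct_change = [100 * (b - a) / a for a, b in pairs]
    return {
        "mean_diff": mean_diff,
        "limits_of_agreement": limits,
        "coefficient_of_repeatability": cor,
        "cv_percent": cv_pct,
        "pct_change": pct_change,
    }


# Hypothetical LCI values for three participants (not study data).
summary = reproducibility_summary([7.0, 8.0, 10.0], [7.0, 8.8, 9.0])
```

The absolute summaries (limits of agreement, coefficient of repeatability) are in LCI units, whereas the %CV and percentage change are scaled by the measurement itself, which is the distinction the Results turn on.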
Results
77 participants with cystic fibrosis and 71 healthy preschool children had at least two MBW measurements performed during the study period. Of the 324 technically acceptable MBW measurements in healthy participants, 295 (91%) were asymptomatic; whereas, of the 343 technically acceptable MBW tests in cystic fibrosis participants, 210 (61%) were considered asymptomatic and clinically stable. The LCI was significantly higher in the cystic fibrosis group (median 8.9 (range 6.4–16.2)) at the enrolment visit compared with the healthy children (median 7.1 (range 6.1–8.1)). The reproducibility, defined as the absolute difference of LCI between two repeated measurements, was close to zero in both health (mean difference: −0.03) and cystic fibrosis (mean difference: −0.05), regardless of the time interval between measurements; whereas the within-subject between-test %CV of the LCI was approximately twice as high in cystic fibrosis (7.7%) as in healthy children (4.3%) (table 2). The ICC, representing the correlation between two measurements in the same individual, was much lower for healthy children (average 0.4) than for cystic fibrosis participants (average 0.7).
Reproducibility of lung clearance index between each pair of repeated measurements in preschool children
A meaningful change in LCI can be defined as 0.9 units (table 2) based on the Bland–Altman limits of agreement (figure 1) and coefficient of repeatability calculated from healthy data. Thus, the probability that a change of more than 0.9 LCI units is due to chance is less than 5%, and such a change can be considered clinically meaningful. The threshold is more than double (2 LCI units) when derived from measurements in clinically stable cystic fibrosis participants. However, both the limits of agreement and coefficient of repeatability assume that the within-subject standard deviation is independent of the magnitude of the measurement, which we observed not to be the case (figure 2). The bias was not as apparent for healthy children in whom the range of LCI values is narrow, but this bias increases in cystic fibrosis, especially at higher LCI values (figure 2). Thus, using the limits derived in healthy children in cystic fibrosis would lead to over-estimation of clinically relevant changes in those with higher LCI values (i.e. greater disease severity).
Bland–Altman plot of the difference in repeated lung clearance index (LCI) measurements. The difference between LCI measurements is greater at higher LCI values. CF: cystic fibrosis.
The within–subject between-test standard deviation of the lung clearance index (LCI) was found to be proportional to the magnitude, especially in cystic fibrosis (CF) subjects.
The reproducibility of repeated LCI measurements for each of the time intervals, calculated as a percentage change, was, on average, very close to zero in both health and stable cystic fibrosis (table 2) and independent of the magnitude of LCI (figure 3). To assess the percentage change in LCI that would be considered a threshold beyond the biological variability of the test, we used the average of the pair-wise repeated measurements; 95% of the percentage changes observed in healthy preschool children were within ±15%. In cystic fibrosis participants, the limits were higher (±25%). The percentage change was similar for measurements made 1 month apart compared with those made 3 months apart, in both health and cystic fibrosis (table 2). In addition, the percentage change in LCI was similar for each of the pair-wise comparisons across the 12-month period in healthy children and cystic fibrosis participants, although the range was much wider for the cystic fibrosis group (figure 4). LCI5, at 1/20 of the starting concentration, was more variable than the standard LCI (at 1/40 of the starting gas concentration), with 95% confidence bands in health ranging from −35% to 34%. The observed percentage changes for FEV1 were similar to those observed for LCI (95% confidence bands in health (−19–16%) and cystic fibrosis (−29–26%)); however, the variability of FEF25–75% was more than double that observed for FEV1 and LCI in both health and cystic fibrosis (table 3). Since interpretation of % predicted can be biased, we include the absolute changes in z-scores, which were also more variable in cystic fibrosis than in health, and were more variable for zFEF25–75% than for zFEV1.
The percentage change in lung clearance index (LCI) between two test occasions was found to be independent of the magnitude of the LCI.
Percentage changes between LCI measurements were independent of the time interval between measurements in both health and in stable cystic fibrosis disease. Boxplots indicate the median value (centre line); inter-quartile range (box) and minimum and maximum values, excluding outliers greater than three-times the lower quartile (error bars).
Reproducibility of FEV1 and FEF25-75% z-scores for absolute change (95% limits of agreement) and % predicted for percentage change (95% confidence bands)
LCI reproducibility was also defined for measurements in cystic fibrosis participants made 24 h apart. The average percentage change was −1.3% (95% confidence bands ±17%). When the limits defined in healthy preschool children were applied, 16 (94%) out of 17 measurements made 24 h apart in cystic fibrosis participants had a percentage change in LCI within ±15% of the baseline measurement and 100% within ±25%. In contrast, using an absolute change, only 9 (53%) out of 17 of the 24 h measurements in cystic fibrosis participants were within 0.9 units in LCI (the coefficient of repeatability derived from the healthy preschool children).
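The contrast between the two kinds of threshold can be sketched with a few lines of Python. The thresholds (±15% and 0.9 units) are those reported above; the function names and the four baseline/follow-up pairs are hypothetical and chosen only to show how the classifications diverge as baseline LCI rises.

```python
def within_percent(baseline, follow_up, threshold_pct=15.0):
    """True if the change relative to baseline is within +/- threshold_pct."""
    return abs(100.0 * (follow_up - baseline) / baseline) <= threshold_pct


def within_absolute(baseline, follow_up, threshold_units=0.9):
    """True if the absolute change is within +/- threshold_units."""
    return abs(follow_up - baseline) <= threshold_units


# Hypothetical (baseline, follow-up) LCI pairs, not study data.
pairs = [(7.0, 7.5), (9.0, 10.0), (12.0, 13.5), (16.0, 14.0)]

n_within_pct = sum(within_percent(a, b) for a, b in pairs)
n_within_abs = sum(within_absolute(a, b) for a, b in pairs)
```

With these illustrative values every pair falls within ±15%, but only the pair with the lowest baseline falls within 0.9 units, which is the pattern described for the 24 h measurements.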
Discussion
We comprehensively assessed the biological variability of repeated LCI measurements at time points relevant for clinical care of cystic fibrosis patients. These data highlight that the interpretation of LCI in terms of absolute change (e.g. 1 unit) is prone to bias, especially in cystic fibrosis. A percentage change in LCI of ±15% could be considered greater than the biological variability of the test in healthy children, and is comparable to measurements made 24 h apart in cystic fibrosis. In clinically stable cystic fibrosis patients measured at longer time intervals (1–3 months), the biological variability was within ±25%. The greater variability likely represents asymptomatic and spontaneous variations in disease not captured by clinical symptoms; however, differentiating biological variability from changes in disease state in this population is challenging and requires further investigation. Our previous work highlighted the population-level changes in LCI in preschool children over time and identified that changes in LCI are associated with lower respiratory symptoms [3]. The in-depth analyses of the individual-level biological variability of the LCI presented here provide further evidence and tools to interpret clinical changes in LCI in young children with cystic fibrosis.
Changes in LCI >15% can be considered physiologically relevant and greater than the biological variability of the test in health. At the healthy end of the LCI spectrum (i.e. LCI=7), a 15% change in LCI reflects a change of approximately 1 unit, which coincides with the Bland–Altman limits of agreement and coefficient of repeatability defined in this study. The inter-visit %CV and absolute change in LCI are consistent with previous literature; however, the range of observed LCI values in previous studies was narrower, thus the relationship between the variability and magnitude may not have been apparent [6, 12, 15]. For instance, Singer et al. [12] reported a coefficient of repeatability of 0.96 units for measurements in cystic fibrosis made 24 h apart, which was similar to what we observed in healthy participants. The LCI values observed by Singer et al. [12] ranged from 7.3 to 11.5 units in cystic fibrosis; thus, they are not representative of the spectrum of lung disease observed in patients with cystic fibrosis. In our data, at the severe end of the spectrum of cystic fibrosis lung disease (i.e. LCI=16), the LCI would need to change by more than 2.5 units to be considered physiologically relevant, highlighting the limitations of using absolute LCI as a threshold for a relevant change. We also observed greater variability for the LCI5 between visits than for LCI (at 1/40 the starting N2 concentration), which is opposite to the within-visit variability [3] but corresponds to the effects observed for interventional studies [19].
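The conversion from a percentage threshold to LCI units in the paragraph above can be written out as a short worked calculation. The symbol Δ_p is introduced here purely for illustration and denotes the absolute change, in LCI units, that corresponds to a relative change of p at a given baseline LCI.

```latex
\begin{align*}
\Delta_{p} &= p \times \mathrm{LCI}_{\text{baseline}} \\
\Delta_{15\%}(\mathrm{LCI}=7) &= 0.15 \times 7 = 1.05 \text{ units} \\
\Delta_{15\%}(\mathrm{LCI}=16) &= 0.15 \times 16 = 2.4 \text{ units} \\
\Delta_{25\%}(\mathrm{LCI}=16) &= 0.25 \times 16 = 4.0 \text{ units}
\end{align*}
```

The first two results correspond to the approximately 1 unit and roughly 2.5 units quoted above; the last applies the wider ±25% limit observed in clinically stable cystic fibrosis to the same baseline.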
The observed within-subject between-test reproducibility of LCI was relatively high (±15%), but consistent with FEV1 reproducibility defined using similar methods in the same study population. Absolute z-score changes ranged from −1.3 to 1 z-scores, corresponding to approximately −19% to 16% percentage change when using % predicted. This z-score range is similar to a previous study in children reporting reproducibility for measurements made 12 months apart [20]. Furthermore, multiple studies demonstrate that the “signal”, such as treatment differences or changes related to worsening symptoms, from both of these lung function tests is much greater than the “noise” or biological variability and can be useful for tracking cystic fibrosis lung disease over time [1, 3, 21–24]. The similar within-subject variability observed for LCI and FEV1 is in contrast to the lower between-subject variability observed for LCI compared to FEV1 in interventional studies [7, 8, 25]. This paradox is not unexpected, since changes at a group level are often lower than changes observed within an individual [26, 27]. Compared with the reproducibility of the LCI and FEV1, the reproducibility of the FEF25-75% was rather poor (absolute z-score changes ranged from −2.5 to 1.8 z-scores, which corresponds to approximately −55% to 65% percentage changes when using % predicted). This is consistent with previous literature which suggests both the within- [28] and between-subject variability [29] of the FEF25-75% is too high to track individuals over time, despite the sensitivity of FEF25-75% to detect differences at the population level [30].
Numerous statistical approaches exist to calculate the biological variability of a test, each with its own advantages, disadvantages and statistical assumptions that may be specific to an outcome. Generally, less variable measurements are more precise, and thus better at tracking individuals over time [31]. The Bland–Altman limits of agreement are widely used to define a threshold of measurement error and biological variability, but assume that the error is independent of the measurement itself [32]. Our findings clearly demonstrate that, for LCI, this assumption does not hold true. The effect was less obvious in healthy participants; therefore, Bland–Altman limits of agreement defined in health are likely to bias the interpretation of changes observed in cystic fibrosis, where the effect is more pronounced. The discordance with previous studies is not unexpected since these were limited either to healthy subjects or to a narrow range of LCI values [12, 15]. In some previous studies, within-test repeatability criteria of the FRC and LCI have been applied, such that higher LCI measurements with greater variability were excluded [6]. MBW data in our study relied on a standardised quality control protocol that examined trials for technical quality and variability of the breathing pattern; thus, no exclusions were made based on measurement variability.
Previous studies define inter-test reproducibility as the %CV [5, 7, 8] or the ICC [6]. Both of these are useful parameters to summarise consistency and agreement, but do not provide a meaningful limit of reproducibility [33]. Furthermore, both the %CV and ICC are better suited for situations where there are more than two repeated measurements in the same individual [34] which were not available for most of the previously published data sets. The %CV observed in our study was comparable to values previously published [5, 7, 8]; whereas the ICC was much lower. O'Neill et al. [6] report that the ICC of LCI measurements made in cystic fibrosis participants approximately 8 months apart (range 67–614 days) was 0.96. The discordance in results likely reflects the inherent sensitivity of the ICC to the magnitude (or spread) of values in the dataset [33]. Indeed, the ICC calculated from the healthy children in our study was much lower than the 0.6 limit suggested as a threshold for a meaningful test [35], but the spread of values in health was smaller. A similar bias can be expected if results in cystic fibrosis participants represent only a small segment of the spectrum of disease.
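The sensitivity of the ICC to the spread of values can be illustrated with a small Python sketch. It uses a one-way random-effects ICC for two measurements per subject, which is an assumption made here because the ICC form used in the study is not specified in the text above. The two hypothetical data sets share identical within-pair differences and differ only in how far apart the subjects are from one another.

```python
from statistics import mean


def icc_oneway(pairs):
    """One-way random-effects ICC for k=2 measurements per subject."""
    n = len(pairs)
    k = 2
    grand = mean(v for p in pairs for v in p)
    subject_means = [mean(p) for p in pairs]
    # Between-subject mean square.
    msb = k * sum((m - grand) ** 2 for m in subject_means) / (n - 1)
    # Within-subject mean square.
    msw = sum(
        (v - m) ** 2 for p, m in zip(pairs, subject_means) for v in p
    ) / (n * (k - 1))
    return (msb - msw) / (msb + (k - 1) * msw)


# Hypothetical LCI pairs (not study data): same within-pair differences,
# narrow versus wide between-subject spread.
narrow = [(6.8, 7.1), (7.0, 6.8), (7.2, 7.4), (7.1, 6.9)]
wide = [(6.8, 7.1), (9.0, 8.8), (12.2, 12.4), (16.1, 15.9)]

icc_narrow = icc_oneway(narrow)
icc_wide = icc_oneway(wide)
```

Although the within-subject agreement is the same in both sets, the ICC is markedly lower for the narrow set, which is the effect invoked above to explain the low ICC in healthy children.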
Limitations
Our findings are limited by the narrow age range of preschool data, and may not be generalisable to older patients with more advanced lung disease. These observations were based on patients followed at three centres and there is a need to validate them in different study populations. Further longitudinal studies in older cystic fibrosis patients across a range of disease severity are necessary to better interpret changes seen in the clinical setting. The lack of a gold standard to define the variability of LCI (or FEV1) in cystic fibrosis limited our ability to distinguish variability due to asymptomatic disease from inherent variability of ventilation inhomogeneity in those with cystic fibrosis; the 24 h reproducibility data from a small number of older patients do suggest that biological variability of LCI is similar in health and cystic fibrosis, and that the poorer reproducibility observed at longer time intervals is associated with progression of lung disease. Additional studies, potentially including imaging of both structural changes and ventilation (e.g. hyper-polarised gas magnetic resonance imaging), may help to better understand LCI variability observed in cystic fibrosis patients who clinically appear stable. Finally, the spirometry collected in preschool children was limited to a smaller number of observations, and thus the results may be more variable than those observed in older children and adults. These were included as a reference and comparison, and were not meant to define reproducibility of preschool spirometry.
Conclusion
Our findings suggest that the reproducibility of LCI between two consecutive measurements should be interpreted as percentage changes, where changes greater than ±15% represent changes greater than the biological variability of the test and are physiologically relevant.
Acknowledgements
We would like to thank the children and families that participated in this research study, as well as Hailey Webster (Sick Kids, Toronto, ON, Canada), Miriam Davis (Riley Hospital for Children, Indianapolis, IN, USA), Robin C. Johnson (University of North Carolina at Chapel Hill, Chapel Hill, NC, USA), Renee Jensen (Sick Kids, Toronto, ON, Canada), Maria Ester Pizarro (Sick Kids, Toronto, ON, Canada), Mica Kane (Sick Kids, Toronto, ON, Canada), Charles C. Clem (Riley Hospital for Children, Indianapolis, IN, USA) and Leah Schornick (Riley Hospital for Children, Indianapolis, IN, USA) for their role in recruiting participants, collecting and interpreting the multiple breath washout data.
Author contributions were as follows. Substantial contributions to the conception or design of the work, or the acquisition, analysis, or interpretation of data for the work: E. Oude Engberink, F. Ratjen, S.D. Davis, G. Retsch-Bogart, R. Amin and S. Stanojevic. Drafting the work or revising it critically for important intellectual content: E. Oude Engberink, F. Ratjen, S.D. Davis, G. Retsch-Bogart, R. Amin and S. Stanojevic. Final approval of the version to be published: E. Oude Engberink, F. Ratjen, S.D. Davis, G. Retsch-Bogart, R. Amin and S. Stanojevic. Agreement to be accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved: E. Oude Engberink, F. Ratjen, S.D. Davis, G. Retsch-Bogart, R. Amin and S. Stanojevic. S. Stanojevic is the guarantor and accepts full responsibility for the work and/or the conduct of the study, had access to the data, and controlled the decision to publish.
Footnotes
Support statement: This study was funded by the National Heart, Lung, and Blood Institute, grant number R01HL116232-04. Funding information for this article has been deposited with the Crossref Funder Registry.
Conflict of interest: None declared.
- Received February 28, 2017.
- Accepted July 6, 2017.
- Copyright ©ERS 2017
This ERJ Open article is open access and distributed under the terms of the Creative Commons Attribution Non-commercial Licence 4.0