Abstract
Expressing forced expiratory volume in 1 s (FEV1) as % predicted relies on the assumption of proportional variability and generalisability of prediction equations that may be unrealistic, especially for elderly people. We evaluated the prognostic implications of alternative ways of expressing FEV1.
We enrolled 318 patients with chronic obstructive pulmonary disease (COPD) and 475 controls in the Salute Respiratoria nell'Anziano (SARA) study. The risk for 5-, 10- and 15-year mortality associated with FEV1 was studied by expressing FEV1 % pred, standardised by height cubed (FEV1·Ht−3) and as a multiple of the sex-specific first percentile (FEV1 quotient (FEV1Q)).
In the group with COPD, the incidence rate ratio for 5-year mortality for the worst versus the best quintile of FEV1Q was 4.65 (95% CI 2.33–10.37), compared to 2.98 (1.53–6.27) for FEV1 % pred and 3.95 (2.01–8.45) for FEV1·Ht−3. The corresponding incidence rate ratios at 15 years were 4.52 (2.84–7.43), 3.16 (2.02–5.07) and 3.52 (2.25–5.63), respectively. In the control group, even a moderate reduction of FEV1Q was associated with long-term mortality, while FEV1 % pred was not associated with the outcome.
FEV1Q may be more informative about prognosis in an elderly population compared to FEV1 % pred.
Reduction of forced expiratory volume in 1 s (FEV1) is primarily a measure of bronchial obstruction. The extent of airways obstruction has important prognostic implications not only in chronic obstructive pulmonary disease (COPD) or asthma patients, but also in people with pulmonary restriction [1], or without an established diagnosis of respiratory disease [2]. FEV1 is influenced in healthy people by age, sex and body size; therefore, it is commonly expressed as a fraction of a “normal” value that is predicted using equations derived in samples of healthy people. This approach is based on the assumption that the variability is proportional to the predicted values across ages; its validity also depends on the availability of standards developed in a reference normal population that is as similar as possible to the population being studied. Both these assumptions are often unrealistic, especially for the elderly population. First, there is a paucity of data on normal spirometric values in this age group, as reference standards are commonly derived from equations developed in populations composed of mainly young and adult subjects [3]. Secondly, even insofar as the decline of height with age is accounted for in predicted values, height declines more rapidly in COPD patients, due to the strong association between COPD and osteoporosis [4], so that predicted values may underestimate the true normal value in patients. All this makes it difficult to rely upon a reference standard for judging whether a given FEV1 value is normal in elderly subjects. Accordingly, the information contained in the FEV1 might be of limited value for both clinical and epidemiological purposes if FEV1 is standardised in a suboptimal way. Given these problems, alternative methods for standardising FEV1 are of special interest.
Two of these methods, dividing FEV1 by height cubed (FEV1·Ht−3) and expressing FEV1 as a multiple of the sex-specific first percentile (FEV1 quotient (FEV1Q)), deserve special consideration because they proved more effective than traditional FEV1 % predicted in predicting long-term survival in a large unselected population [5]. Interestingly, in 1976 Fletcher and Peto [6] reported that relating FEV1 to height cubed was the best way of following lung function decline in COPD over an 8-year period.
We therefore decided to compare the prognostic implications of three different standards for FEV1 in an unselected elderly population enrolled in the Salute Respiratoria nell'Anziano (SARA) study [7]. This geriatric population offered the unique opportunity of testing the alternative FEV1 definitions in subjects aged >64 years, both with COPD and without respiratory disease, followed up for 15 years. This allowed us to verify whether the prognostic implications of individual FEV1 definitions change as a function of disease status and the duration of follow-up.
METHODS
Study population and follow-up
Between January 1996 and July 1999 a total of 1970 participants were recruited from 24 departments of geriatrics or respiratory medicine within the context of the SARA study. Details on the SARA project are available elsewhere [7]. This is a multicentre Italian project investigating various aspects of chronic airway diseases in the elderly population (age ≥65 years) attending pulmonary or geriatric outpatient clinics. Researchers had extensive training in both respiratory function testing in the elderly and multidimensional geriatric assessment. Enrolment was on a consecutive basis. Data from individual centres were collected by a coordinating centre at the Cattedra di Malattie dell'Apparato Respiratorio of the University of Palermo (Palermo, Italy), which was also responsible for the quality control, the retrieval and the final processing of data. The study design was approved by the ethical committees of the participating institutions. From this dataset, we selected those with valid anthropometric and post-bronchodilator spirometric data (n=1316), and excluded those with incomplete personal data precluding administrative follow-up (n=68). Participants were followed up until December 2010, with regard to their vital status, by contacting the registry office of the last municipality of residence. Information on vital status at 15 years was obtained for 1086 (86.5%) participants using administrative registries. Follow-up time was calculated from the date of recruitment (first visit) until the date of death or December 31, 2010, whichever came first. Data on vital status were collected by the Department of Epidemiology, Lazio Regional Health Service (Rome, Italy).
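The follow-up rule described above (time from recruitment to death, or to the administrative end of follow-up) can be sketched as follows. This is a minimal illustration only: the function name `follow_up`, the conversion to years using 365.25 days, and the example dates are assumptions and are not taken from the study.

```python
from datetime import date

# Administrative end of follow-up stated in the text.
STUDY_END = date(2010, 12, 31)

def follow_up(recruited, died=None, study_end=STUDY_END):
    """Return (years of follow-up, death observed?) for one participant.

    Hypothetical helper: deaths after study_end are treated as censored.
    """
    if died is not None and died <= study_end:
        end, event = died, True
    else:
        end, event = study_end, False
    years = (end - recruited).days / 365.25  # assumed day-to-year conversion
    return years, event

# Hypothetical participants: one death during follow-up, one survivor.
years_a, event_a = follow_up(date(1996, 1, 1), died=date(2000, 1, 1))
years_b, event_b = follow_up(date(1999, 7, 1))
```

Participants still alive on December 31, 2010 contribute censored follow-up time, which is what survival methods such as the product-limit estimator expect.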
Pulmonary function tests
All the centres were equipped with an identical fully computerised water-sealed Stead-Wells spirometer (Baires System; Biomedin, Padua, Italy) that met the standards of the American Thoracic Society recommendations for diagnostic spirometry. Tests were performed with a standardised technique in all centres and a quality control process was successfully implemented; all the centres achieved a high-quality performance in spirometry [7].
Sample selection
We considered as having COPD those participants with bronchial obstruction, defined as post-bronchodilator FEV1/forced vital capacity (FVC) below the lower limit of normal (LLN), and without asthma (n=318). Asthma was defined as either FEV1 ≥80% pred and a history of wheezing in the last year, or FEV1 <80% pred with an increase in FEV1 of ≥12% after inhalation of fenoterol (Boehringer Ingelheim Italy, Regello, Italy) [8]. As a control group we included people with post-bronchodilator FEV1/FVC equal to or greater than LLN and without asthma or respiratory symptoms (n=475).
Analytic approach
Post-bronchodilator FEV1 was standardised using three different methods: 1) % predicted estimated using the equations proposed by García-Río et al. [9] and developed in an elderly population (FEV1 % pred); 2) FEV1 divided by the cube of height (FEV1·Ht−3); 3) FEV1 divided by the sex-specific first percentile of the FEV1 distribution (FEV1 quotient (FEV1Q)). The FEV1Q index approximates the number of remaining turnovers of a lower survivable limit of FEV1. The first-percentile values used (0.5 L for males and 0.4 L for females) were derived from a large and heterogeneous population [5] to avoid the bias inherent in the use of values derived from our relatively small sample.
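The three standardisations can be illustrated with a short sketch. The function names are hypothetical, height is assumed to be in metres and FEV1 in litres, and the predicted value for FEV1 % pred is taken as an input rather than computed, because the reference equations [9] are not reproduced here.

```python
# Sex-specific first-percentile FEV1 values (L) quoted in the text.
FIRST_PERCENTILE_L = {"male": 0.5, "female": 0.4}

def fev1_percent_predicted(fev1_l, predicted_fev1_l):
    """FEV1 as a percentage of an externally supplied predicted value (L)."""
    return 100.0 * fev1_l / predicted_fev1_l

def fev1_ht3(fev1_l, height_m):
    """FEV1 divided by height cubed (L per cubic metre; units are an assumption)."""
    return fev1_l / height_m ** 3

def fev1_q(fev1_l, sex):
    """FEV1 as a multiple of the sex-specific first percentile."""
    return fev1_l / FIRST_PERCENTILE_L[sex]

# Hypothetical man: FEV1 1.5 L, height 1.70 m, predicted FEV1 2.5 L.
example = {
    "pct_pred": fev1_percent_predicted(1.5, 2.5),
    "ht3": fev1_ht3(1.5, 1.70),
    "q": fev1_q(1.5, "male"),
}
```

For this hypothetical subject, FEV1Q is 3, meaning his FEV1 is three times the first-percentile value for males; only FEV1 % pred depends on an external prediction equation.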
All the measures were categorised using the quintiles of their distribution as cut-off points. The agreement between the categorised measures was evaluated using Cohen's κ statistic, with disagreements weighted by their squared distance from perfect agreement.
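A minimal sketch of this agreement analysis is given below. It assumes rank-based quintile assignment (ties broken by input order), which may differ slightly from cut-offs computed on the observed distribution, and it uses the usual quadratic disagreement weights; the function names are illustrative.

```python
def quintile_labels(values):
    """Assign each value to a quintile by rank (1 = lowest fifth)."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    labels = [0] * n
    for rank, i in enumerate(order):
        labels[i] = rank * 5 // n + 1
    return labels

def concordance(a, b):
    """Proportion of subjects placed in the same category by both measures."""
    return sum(x == y for x, y in zip(a, b)) / len(a)

def weighted_kappa(a, b, k=5):
    """Cohen's kappa with squared-distance (quadratic) disagreement weights."""
    n = len(a)
    obs = [[0.0] * k for _ in range(k)]  # observed joint proportions
    for x, y in zip(a, b):
        obs[x - 1][y - 1] += 1.0 / n
    row = [sum(obs[i]) for i in range(k)]
    col = [sum(obs[i][j] for i in range(k)) for j in range(k)]
    observed = expected = 0.0
    for i in range(k):
        for j in range(k):
            w = (i - j) ** 2 / (k - 1) ** 2
            observed += w * obs[i][j]         # weighted observed disagreement
            expected += w * row[i] * col[j]   # weighted chance disagreement
    return 1.0 - observed / expected
```

With quadratic weights, a subject placed in adjacent quintiles by two measures is penalised far less than one placed in opposite quintiles, so the weighted κ can be high even when exact concordance is modest.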
The risk for 5-, 10- and 15-year mortality was calculated using the product-limit method. To evaluate the increase in risk across categories of FEV1 we calculated incidence rate ratios (IRRs) using the upper quintile as the reference category. The overall diagnostic performance was evaluated using the C-statistic derived from Cox proportional hazard models [10]. The C-statistic can be interpreted in the same way as the area under the receiver operating characteristic curve and, as such, takes values between 0.5 (no discriminative capacity) and 1 (perfect discriminative capacity).
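The three quantities in this paragraph can be sketched in a few lines. The code below is illustrative only: the confidence interval for the IRR uses a Wald approximation on the log scale, which is an assumption and not necessarily the method used for the intervals reported here, and the concordance function takes an arbitrary risk score (for example, the negative of an FEV1 measure) rather than the linear predictor of a fitted Cox model.

```python
import math

def product_limit_risk(times, events, horizon):
    """Kaplan-Meier (product-limit) cumulative risk of death by `horizon`."""
    surv = 1.0
    death_times = sorted({t for t, e in zip(times, events) if e and t <= horizon})
    for t in death_times:
        at_risk = sum(1 for u in times if u >= t)
        deaths = sum(1 for u, e in zip(times, events) if e and u == t)
        surv *= 1.0 - deaths / at_risk
    return 1.0 - surv

def incidence_rate_ratio(deaths_a, py_a, deaths_b, py_b):
    """IRR of group A versus reference group B with an approximate 95% CI.

    py_a and py_b are person-years of follow-up; the Wald interval is an assumption.
    """
    irr = (deaths_a / py_a) / (deaths_b / py_b)
    se = math.sqrt(1.0 / deaths_a + 1.0 / deaths_b)
    return irr, irr * math.exp(-1.96 * se), irr * math.exp(1.96 * se)

def c_statistic(times, events, risk_scores):
    """Share of usable pairs in which the subject who died first has the higher score."""
    concordant = usable = 0.0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is usable when subject i died before subject j's last follow-up.
            if events[i] and times[i] < times[j]:
                usable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # ties in score count as half
    return concordant / usable
```

A score that ranks every earlier death above every later survivor gives a value of 1, while a score unrelated to survival gives a value close to 0.5.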
The measures analysed in this study are affected differently by demographic and anthropometric characteristics. For example, FEV1 % pred takes into account age, sex and height, while FEV1Q only takes into account sex. Although the aim of the project was to evaluate the prognostic significance of the FEV1 definitions “as is”, to provide a broader view of their relevance we also evaluated the risk for mortality associated with quintiles of FEV1 adjusted for the demographic factors not taken into account by the definition itself and for smoking, using Cox regression models.
RESULTS
The two groups under study were of similar age (table 1), while the proportion of males was higher in the COPD group (78.6% versus 43.6% in controls); in the COPD group the mean FEV1 % pred was 65%. The agreement between quintiles of FEV1 % pred and FEV1·Ht−3 was good, with an overall concordance of 60% and a weighted κ=0.86, while agreement between quintiles of FEV1 % pred and FEV1Q was much poorer, with 42% concordance and a weighted κ=0.66. Finally, the overall concordance between FEV1·Ht−3 and FEV1Q was 48.5%, with a weighted κ=0.80. The mortality at 5 years was 28.6% and 13.5% in COPD patients and controls, respectively. The corresponding figures were 53.1% and 29.5% for 10-year mortality, and 65.8% and 43.8% for 15-year mortality. Figures 1, 2 and 3 show the risk of mortality by quintiles of FEV1 % pred, FEV1·Ht−3 and FEV1Q, respectively.
Results in subjects with COPD
In subjects with COPD (table 2), only a relatively large reduction of FEV1 was associated with 5-year mortality, regardless of the definition, with FEV1Q having the strongest association with the outcome (IRR (95% CI) for the worst versus best quintile: 4.65 (2.33–10.37)). At longer follow-up times, the stronger association between FEV1Q and mortality was more evident: at 10 years, the IRR (95% CI) of the worst quintile compared to the best quintile was 5.44 (3.18–9.84) for FEV1Q, 4.21 (2.53–7.33) for FEV1·Ht−3 and 3.4 (2.07–5.79) for FEV1 % pred. The corresponding IRRs (95% CI) for mortality at 15 years were 4.52 (2.84–7.43), 3.52 (2.25–5.63) and 3.16 (2.02–5.07), respectively. The overall predictive power, expressed by the C-statistic, was similarly low (between 0.63 and 0.66) for the three methods, regardless of follow-up time.
As shown in table 3, even after correction for potential confounders, FEV1·Ht−3 and FEV1Q were still associated with the outcome, with little change in the hazard ratio (HR) estimates. At 5 years, the HR (95% CI) for FEV1 % pred (worst quintile versus best quintile, adjusted for smoking exposure) was 3.02 (1.49–6.1); the corresponding figures for FEV1·Ht−3 (adjusted for age, sex and smoking exposure) and FEV1Q (adjusted for age and smoking exposure) were 3.99 (1.95–8.15) and 4.53 (2.14–9.57), respectively. At 10 years, the adjusted HRs (95% CI) for FEV1 % pred, FEV1·Ht−3 and FEV1Q were 3.49 (2.07–5.87), 4.42 (2.58–7.55) and 5.5 (3.1–9.73), respectively. The corresponding figures for 15-year mortality were 3.28 (2.06–5.24), 3.58 (2.24–5.7) and 4.45 (2.72–7.27).
Results in controls
There was no association between FEV1 % pred and mortality in the control group regardless of follow-up time, and only the worst quintile of FEV1·Ht−3 was associated with 10- and 15-year mortality. However, FEV1Q was significantly associated with both 10- and 15-year mortality with an increase in risk evident even at the third quintile. The IRR (95% CI) for the worst quintile compared to the best quintile was 2.8 (1.55–5.38) and 2.7 (1.75–4.32) for 10- and 15-year mortality, respectively. In this group, the overall predictive power was lowest for FEV1 % pred (C=0.53), while for FEV1·Ht−3 and FEV1Q it was slightly lower than the one calculated in the COPD group (C=0.59–0.62). After adjustment for confounders, the association between FEV1 and mortality in this group was weak. The 5-year mortality HR was 1.43 (0.68–3) for FEV1 % pred (adjusted for smoking exposure), 1.4 (0.63–3.07) for FEV1·Ht−3 (adjusted for age, sex and smoking exposure), and 0.56 (0.23–1.34) for FEV1Q (adjusted for age and smoking exposure). The corresponding figures were 1.18 (0.72–1.92), 1.61 (0.9–2.9) and 1.05 (0.57–1.92) for 10-year mortality, and 1.16 (0.77–1.74), 1.75 (1.06–2.88) and 1.14 (0.7–1.83) for 15-year mortality.
DISCUSSION
We found that in subjects with COPD the FEV1 % pred, the most commonly used indicator of disease severity, had a weaker association with mortality compared to both FEV1·Ht−3 and FEV1Q, the latter having the best predictive power. The association between FEV1Q and mortality, however, was more affected by the correction for potential confounders compared to FEV1·Ht−3, indicating that the latter measure is less affected by age. However, even after adjustment, FEV1Q outperformed FEV1·Ht−3 in predicting survival. Furthermore, FEV1Q and, to some extent, FEV1·Ht−3 had some predictive power towards mortality in people without COPD. However, this finding was not confirmed after adjustment for potential confounders.
Our findings confirm those by Miller and co-workers [5, 11] of suboptimal predictive power of FEV1 % pred with respect to mortality, and expand the knowledge on this topic providing information pertaining to an elderly population. In line with the above-mentioned studies, we also found a better performance of FEV1Q compared to FEV1·Ht−3.
In our sample FEV1Q and, to a lesser extent, FEV1·Ht−3, are associated with mortality in subjects without COPD, while FEV1 % pred was not associated with mortality in this group. This is in contrast to a number of studies showing that FEV1 % pred is associated with an increased risk for cardiovascular mortality [12], and also with lung cancer [13], regardless of smoking habit. Furthermore, this association seems to be retained also in older age [14]. It should be noted, however, that in the study by Miller and Pedersen [5], FEV1Q outperformed FEV1 % pred in predicting mortality in the group without bronchial obstruction. Thus, our data confirm the finding that the use of reference equations to standardise FEV1 is less informative than alternative indexes such as FEV1Q and FEV1·Ht−3 in subjects without bronchial obstruction.
We decided to use the equations proposed by García-Río et al. [9] because they were developed in a population similar to ours, but a different choice might have produced different results. For example, it has been shown that the commonly used equations proposed by the European Respiratory Society/European Community of Coal and Steel [15] underestimate the FEV1 compared to other reference equations [16]; as a result, these equations would classify as “normal” people who would be classified as having a reduction of pulmonary function using other equations. Another problem with the use of reference equations is that the coefficient of variation is age-dependent, and in adults increases considerably with age [3]. Therefore, the use of per cent of predicted inevitably introduces an age-related bias. These findings may also be important with regard to the diagnosis of COPD. Most current guidelines [1] recommend the use of the LLN of the ratio FEV1/FVC, calculated using reference equations, to identify subjects with bronchial obstruction. While our results cannot be directly extrapolated to the diagnostic field, it seems sensible to wonder whether we should reconsider the criteria currently proposed, at least for elderly people. However, the association of both FEV1Q and FEV1·Ht−3 with mortality was not confirmed after correction for age. This indicates that these measures are sensitive to the decline of FEV1 with age, and that their role as a diagnostic tool in the elderly is worthy of further investigation.
This study has some limitations, the most important being the age- or disease-related decrease of height. FEV1·Ht−3 is obviously more prone to this bias compared to FEV1 % pred, while FEV1Q does not take height into account at all. In persons with COPD, the prevalence of vertebral fractures has been estimated to be around 25% [17] and, at least in females, height reduction is associated with increased mortality independently of incident vertebral fractures [18]. Vertebral fractures are also associated with reduced FVC [19] and in patients with vertebral fractures, FVC is increased after vertebroplasty [20]. Thus, use of actual height versus “true” height in FEV1 measures may introduce a conservative bias, with people at higher risk for death having higher estimated values of FEV1·Ht−3.
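The direction of this bias can be illustrated with a small worked example. All values below are hypothetical and chosen only for illustration: a subject whose FEV1 is unchanged but whose measured height has fallen by 5 cm because of vertebral compression.

```python
def fev1_ht3(fev1_l, height_m):
    """FEV1 divided by height cubed (height in metres is an assumption)."""
    return fev1_l / height_m ** 3

fev1 = 1.2              # L, hypothetical
true_height = 1.65      # m, hypothetical height before vertebral compression
measured_height = 1.60  # m, hypothetical height measured at the visit

index_true = fev1_ht3(fev1, true_height)
index_measured = fev1_ht3(fev1, measured_height)

# Relative overestimation of the index caused by using the measured height.
relative_inflation = index_measured / index_true - 1.0
```

Because height enters as a cube, a height loss of about 3% inflates FEV1·Ht−3 by somewhat under 10% in this example, which would move a high-risk subject towards a better category.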
Furthermore, causes of death were not analysed. It is likely that mortality attributed to respiratory causes is more frequent in people with COPD, although it has been shown that the most common cause of mortality in this group is cardiovascular disease [21].
In conclusion, we have shown that, for different reasons, FEV1Q and FEV1·Ht−3 are appealing ways of standardising the FEV1 measurement. First, they are able to take into account variability coming from body size (especially FEV1·Ht−3) and sex (especially FEV1Q). Secondly, they do not rely on statistical assumptions about the distribution of the FEV1. This lack of assumptions probably makes the use of normative values obtained from external populations (i.e. different from the population that originated the individuals at hand) less prone to bias. Thirdly, normative values (e.g. percentiles) for these measures are more easily calculated. As a consequence, FEV1 % pred does not seem to be the best prognostic indicator in elderly people with COPD. Studies focused on health status outcomes, e.g. decline of exercise capacity and personal independence, are needed to verify whether a move to using FEV1Q or FEV1·Ht−3 as the preferred measure of severity of airflow obstruction is justified.
Footnotes
Statement of Interest
None declared.
- Received January 14, 2012.
- Accepted June 20, 2012.
- ©ERS 2013