Abstract
Newer methods of grading severity of airflow limitation perform better than the percent predicted of the FEV1 and deserve consideration in both prognostic models and individual patient assessment http://ow.ly/oQB030kvVEW
There is broad consensus that airflow limitation, the primary physiological abnormality of chronic obstructive pulmonary disease (COPD), is best defined by a significant reduction in the ratio of forced expiratory volume in 1 s (FEV1) to forced vital capacity (FVC) or to the slow vital capacity. There is also abundant evidence that this reduction is most accurately identified as an individual value less than the lower limit of the normal range (LLN) specific to that individual, as determined from an appropriate healthy, non-smoking, reference population [1]. Because the FEV1/FVC ratio declines normally with age, using a non-individualised cut-off, such as 0.70, has been shown to cause an unacceptable level of misclassification with age and sex bias; this leads to over-diagnosis of 30% or more of older men and under-diagnosis of younger women [2–5]. There is much less consensus, however, on methods to indicate disease severity or to assess the likelihood of future outcomes. The most commonly used index is the percent of the predicted value of FEV1, with various cut-off points proposed for categories of severity, and this was endorsed in the 2005 American Thoracic Society (ATS)/European Respiratory Society (ERS) pulmonary function documents [1]. The use of standardised residuals (z-scores) to establish the normal range was recommended by the ERS in 1993 [6] and, more recently, they have also been evaluated as an index of severity [7]. Their use in pulmonary function reporting, particularly as part of a visual scale, has been endorsed in a current ATS technical statement [8]. Both percent of predicted and z-score depend upon the predicted value of FEV1, but predicted values have inherent uncertainty and may not accurately reflect some individuals. Other indices, based upon the absolute value of FEV1, and thus not dependent upon a reference value, have been proposed but are not yet in wide use. Recently in the European Respiratory Journal, Huang et al. [9] reported an evaluation of seven methods to categorise reductions in FEV1, comparing their correlations to the outcomes of acute exacerbations and mortality. While this study in a Taiwanese population with confirmed COPD was relatively small (n=296) and predominantly male (94%), the results draw attention to some of the less commonly used indices and may cause us to rethink our dependence on percent of predicted.
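The difference between a fixed FEV1/FVC cut-off of 0.70 and an individual LLN, described above, can be sketched in a few lines. This is only an illustration: the two patients and their LLN values are hypothetical numbers chosen to show the direction of the misclassification, and in practice the LLN would come from an appropriate reference equation rather than being typed in.

```python
# Illustrative sketch only: the ratios and LLN values below are hypothetical,
# not taken from any reference equation or from the studies cited in the text.

def obstructed_by_fixed_ratio(fev1_fvc, cutoff=0.70):
    """Airflow limitation judged by a non-individualised cut-off."""
    return fev1_fvc < cutoff


def obstructed_by_lln(fev1_fvc, lln):
    """Airflow limitation judged against the individual's own LLN."""
    return fev1_fvc < lln


# Hypothetical older man: the ratio has declined normally with age.
older_man = {"fev1_fvc": 0.67, "lln": 0.62}
# Hypothetical younger woman: the ratio is above 0.70 but below her LLN.
younger_woman = {"fev1_fvc": 0.72, "lln": 0.75}

for label, p in (("older man", older_man), ("younger woman", younger_woman)):
    fixed = obstructed_by_fixed_ratio(p["fev1_fvc"])
    individual = obstructed_by_lln(p["fev1_fvc"], p["lln"])
    print(label, "fixed 0.70:", fixed, "LLN:", individual)
```

With these assumed values the fixed cut-off flags the older man and misses the younger woman, while the LLN does the opposite, which is the pattern of age and sex bias the text describes.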
For each of the methods studied, the population was divided into four stages of severity and the reliability of the index was assessed by the correlation of these strata to increasingly worse outcomes. This analysis tests both the index and the appropriateness of the cut-off points dividing the stages. For each of five indices, the population was divided into quartiles, which has the advantage of a symmetrical distribution among the stages and focuses differences on the index itself. Percent of predicted was also evaluated by the widely used cut-off points of the Global Initiative for Chronic Obstructive Lung Disease (GOLD) [10] and the reference-independent index FEV1/height² was also evaluated by previously published cut-off points [11]. The outcomes studied were frequency of severe acute exacerbations (SAE) at 1 and 2 years and all-cause mortality at a median follow-up of ∼4 years. For each of these outcomes, the reference-based indices (percent predicted, by both quartiles and GOLD cut-off points, and z-scores) were out-performed by the reference-independent methods. Of these, the FEV1 quotient (FEV1Q) performed best. This index, proposed by Miller and Pedersen [12], is simply the absolute value of the FEV1 divided by 0.5 L for men (or by 0.4 L for women), where the denominator represents the first percentile of FEV1 values from a large clinical population. As an FEV1Q value approaches 1 it is tracking a patient's course toward the minimum FEV1 value compatible with survival (or, at least, with being able to come to a lab and be tested), so it should not be surprising that low values correlate well with mortality. Most COPD patients die of comorbidities or during an exacerbation well before reaching this minimal level of lung function, but the risk of such an outcome would still be expected to increase as this index decreases. Interestingly, the FEV1Q has also been shown to correlate with mortality even in a subpopulation having FEV1 within the normal range [12].
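The FEV1Q calculation described above is simple enough to sketch directly. The function name, the dictionary and the sex labels are illustrative choices; the sketch assumes FEV1 is supplied in litres, and only the sex-specific denominators come from the text.

```python
# Minimal sketch of the FEV1 quotient (FEV1Q) as described in the text.
# Assumption: FEV1 is given in litres. Names here are illustrative only.

FIRST_PERCENTILE_FEV1_L = {"male": 0.5, "female": 0.4}


def fev1q(fev1_litres, sex):
    """Return absolute FEV1 divided by the sex-specific first-percentile value."""
    if fev1_litres <= 0:
        raise ValueError("FEV1 must be a positive volume in litres")
    try:
        denominator = FIRST_PERCENTILE_FEV1_L[sex]
    except KeyError:
        raise ValueError("sex must be 'male' or 'female'") from None
    return fev1_litres / denominator


# A man with an FEV1 of 1.5 L has three times the first-percentile value.
print(round(fev1q(1.5, "male"), 2))
# The same 1.5 L in a woman gives a higher quotient, because her denominator is smaller.
print(round(fev1q(1.5, "female"), 2))
```

No predicted value, age or height enters the calculation, which is what makes the index independent of a reference population.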
The study by Huang et al. [9] confirms the findings of Miller and Pedersen [12] that the FEV1Q correlated better with mortality than two other reference-independent indices proposed earlier, FEV1/height² and FEV1/height³, and shows in addition that FEV1Q correlated well with three interrelated SAE outcomes. The percent of predicted by quartiles also correlated well with mortality, but not with SAEs, while percent predicted by GOLD correlated less well with both, and z-scores showed a good correlation only to an adjusted model for mortality. Most of the indices showed an inversion with a higher likelihood of mortality or SAE or both in the third quartile than in the fourth (lowest) quartile. This may reflect an anomaly in this relatively small population, but this inversion of quartiles was not seen for FEV1Q in any of the four outcomes studied.
Epidemiological studies can use an index of severity as a continuous variable or with population-specific cut-off points, but clinical use has typically placed deficits in staging categories such as mild, moderate and severe, which require predetermined cut-off points. While the analysis by quartiles has the advantage for this study of allowing like-for-like comparisons between the methods, the cut-off points that result may not translate to other populations, although they do inform the choice of more generic cut-offs. For example, in the clinical COPD population reported by Huang et al. [9], the quartile breaks for FEV1 percent predicted occurred at 69, 53 and 41%, while the GOLD divisions at 80, 50 and 30% placed 81% of the patients in the two mid-levels with only 12% in the highest and 7% in the lowest. Note that because this study defined obstruction by the Global Lung Function Initiative 2012 LLN for FEV1/FVC [13], rather than 0.70, the false-positives usually seen by GOLD would not be included. For FEV1Q the quartile cut-off points in the clinical population of the current study were 3.4, 2.5 and 1.9. The study by Miller and Pedersen [12] combined the data sets of a large pulmonary function laboratory (n=11 972), a general population (n=13 900) and a clinical COPD population (n=1095). The distribution of results in this mixed adult population (figure 1) shows that 6% had an FEV1Q <2 with median survival of <5 years, 24% had an FEV1Q of 2–4 with median survival of about 10 years, 38% had an FEV1Q of 4–6 with median survival of about 20 years, and 31% had an FEV1Q of >6 with 75% surviving more than 20 years. Pending other clinical population studies, these data suggest that the integer values 6, 4 and 2 would make a convenient four-part scale of mortality risk.
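The two sets of predetermined cut-off points discussed above can be written as small lookup functions. This is a sketch, not a validated staging tool: the function names are illustrative, and the handling of values that fall exactly on a boundary (for example an FEV1Q of exactly 4) is an assumption, since the text gives only the ranges <2, 2–4, 4–6 and >6.

```python
# Sketch of the two four-part scales mentioned in the text.
# Assumption: boundary values are assigned as coded below; the source text
# does not say which band an FEV1Q of exactly 4 belongs to.

# Survival figures as reported in the text for the mixed adult population [12].
FEV1Q_BAND_SURVIVAL = {
    "<2": "median survival <5 years",
    "2-4": "median survival about 10 years",
    "4-6": "median survival about 20 years",
    ">6": "75% surviving more than 20 years",
}


def fev1q_band(q):
    """Place an FEV1Q value on the suggested integer scale (6, 4, 2)."""
    if q < 2:
        return "<2"
    if q < 4:
        return "2-4"
    if q <= 6:
        return "4-6"
    return ">6"


def gold_grade(percent_predicted):
    """GOLD grade from FEV1 percent predicted, using divisions at 80, 50 and 30%."""
    if percent_predicted >= 80:
        return 1
    if percent_predicted >= 50:
        return 2
    if percent_predicted >= 30:
        return 3
    return 4


band = fev1q_band(2.5)
print(band, "->", FEV1Q_BAND_SURVIVAL[band])
print("GOLD grade at 53% predicted:", gold_grade(53))
```

Applied to the quartile breaks reported for the Huang et al. population (69, 53 and 41% predicted), `gold_grade` returns 2, 2 and 3, which shows how the GOLD divisions concentrate that population in the two middle levels.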
The various expressions of FEV1 have differing utility and can be complementary. The placement of an individual's test result in relationship to the normal population, i.e. determining the likelihood that the result is abnormal, is best done by z-score, where the LLN is −1.645, equivalent to the fifth percentile of the healthy reference population. But both z-score and percentiles have limitations as an index of severity. For z-score, the unit of measurement (the standardised residual of the normal population) is the same for all individuals, so any negative z-score represents a greater proportionate decrement for an older, smaller individual than for a younger, taller one. When expressed as a percentile of the healthy population, all abnormal FEV1 values are clustered below the fifth percentile with 40–50% of COPD patients falling below the 0.5th [7]. The percent of predicted is not well suited for diagnosing abnormality because the value representing the population LLN varies with the individual and with the test; however, it does provide an intuitive estimate of any decrement in lung function and is easily understood by patients. While percent of predicted shows the loss from a predicted normal value, FEV1Q shows how much function is left before reaching a minimal value, so it may be most helpful to clinicians in evaluating the later stages of disease. Because FEV1Q is independent of a reference source it may be particularly helpful in older individuals where there is more uncertainty in the predicted values. This has been demonstrated at age ≥65 years by a better correlation with mortality for FEV1Q than percent predicted for both a group with COPD and one without airflow limitation [14], and in a population-based sample age ≥80 years for mortality and other adverse outcomes [15].
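The point that one z-score can represent very different proportionate losses is easier to see with numbers. The sketch below uses the simplest form of a standardised residual, (observed − predicted)/RSD, with a single residual standard deviation for everyone, as the text describes; the predicted values, the RSD of 0.5 L and the two patients are hypothetical, and modern reference equations compute z-scores differently.

```python
# Illustrative comparison of z-score, percent predicted and FEV1Q.
# Assumptions: a constant residual standard deviation (RSD) of 0.5 L and
# hypothetical predicted values; both patients are taken to be men.

LLN_Z = -1.645  # fifth percentile of the healthy reference population


def z_score(observed, predicted, rsd):
    """Standardised residual: distance below predicted in units of RSD."""
    return (observed - predicted) / rsd


def percent_predicted(observed, predicted):
    """Observed value as a percentage of the predicted value."""
    return 100.0 * observed / predicted


RSD_L = 0.5
patients = {
    "younger, taller man": {"observed": 3.0, "predicted": 4.0},
    "older, smaller man": {"observed": 1.0, "predicted": 2.0},
}

for label, p in patients.items():
    z = z_score(p["observed"], p["predicted"], RSD_L)
    pct = percent_predicted(p["observed"], p["predicted"])
    q = p["observed"] / 0.5  # FEV1Q for a man: absolute FEV1 divided by 0.5 L
    print(label, "z =", z, "below LLN:", z < LLN_Z, "% predicted =", pct, "FEV1Q =", q)
```

Under these assumptions both men have a z-score of −2 and are equally abnormal relative to the reference population, yet one retains 75% of his predicted FEV1 and the other 50%, and their distance from the minimal value differs threefold.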
Indices based upon the FEV1 are the most commonly used measures of lung function in staging or prognosis of COPD and other obstructive diseases, but others may also have a role. The FEV1 itself, expressed by absolute value, has been shown to correlate better with exertional dyspnoea than when “corrected” for sex, size and age by percent of predicted [16]. FVC has been shown to correlate better than FEV1 with all-cause mortality in general populations [17], most likely reflecting the large role of cardiovascular and other diseases as causes of death. While the large population variability of the forced expiratory flow at 25–75% of FVC (FEF25–75) has made it unhelpful for the diagnosis of airflow obstruction [18], there is interest in trending its change, or that of the ratio of FEF25–75/FVC, in individual patients at risk for transplant-related obliterative bronchiolitis or early in the course of cystic fibrosis [19, 20]. Efforts to improve clinical staging or predict future outcomes will add other risk factors, such as age and smoking history, to an index of lung function. A GOLD update added history of exacerbations and symptoms to the FEV1 percent predicted cut-off points shown above, but this did not improve its prognostic validity [10, 21]. The BODE index added body mass index, a dyspnoea score and walking distance [22]. Other candidates include frailty [23], phenotypes [24, 25] and biomarkers [26, 27]. These more complex prognostic models may be improved by including FEV1Q, and are a valuable research tool, but the clinician dealing with an individual patient needs a simple, easily calculated guide to severity. The demonstrated value of the FEV1Q as a predictor of clinically important outcomes in population research suggests that it deserves attention for individual patient assessment as well.
Footnotes
Conflict of interest: None declared.
- Received May 30, 2018.
- Accepted June 13, 2018.
- Copyright ©ERS 2018