Abstract
We studied the fit of the Global Lung Function Initiative (GLI) all-age reference values to Norwegians, compared them with currently used references (European Coal and Steel Community (ECSC) and Zapletal) and estimated the prevalence of obstructive lung disease.
Spirometry data collected in 30 239 subjects (51.7% females) aged 12–90 years in three population-based studies were converted to z-scores.
We studied healthy non-smokers comprising 2438 adults (57.4% females) aged 20–90 years and 8725 (47.7% female) adolescents aged 12–19 years. The GLI-2012 prediction equations fitted the Norwegian data satisfactorily. Median±sd z-scores were respectively 0.02±1.03, 0.01±1.04 and −0.04±0.91 for forced expiratory volume in 1 s (FEV1), forced vital capacity (FVC) and FEV1/FVC in males, and −0.01±1.02, 0.07±0.97 and −0.12±0.82 in females. The ECSC and Zapletal references significantly underestimated FEV1 and FVC. Stricter criteria of obstruction (FEV1/FVC <GLI-2012 lower limit of normal (LLN)) carried a substantially higher risk of obstructive characteristics than FEV1/FVC <0.7 and ≥GLI-2012 LLN. The corresponding comparison for myocardial infarction showed a four-fold higher risk for women.
The GLI-2012 reference values fit the Norwegian data satisfactorily and are recommended for use in Norway. Correspondingly, the FEV1/FVC GLI-2012 LLN identifies higher risk of obstructive characteristics than FEV1/FVC <0.7.
GLI-2012 all-age reference values for spirometric indices fit Norwegians aged 12–85 years http://ow.ly/WP9D3034svA
Introduction
Spirometry is pivotal for the diagnosis and follow-up of patients with obstructive lung diseases such as asthma and chronic obstructive pulmonary disease (COPD). Results are often reported as percentage predicted, where predicted values are obtained from a healthy reference population. However, predicted values from different sources may differ widely, and as the variability of measurements varies with age, the use of percentage predicted leads to an age bias [1]. The latter can be circumvented by the use of sex-, age-, height- and ethnicity-specific z-scores, which define how many standard deviations a measurement differs from the predicted value [2]. It is important to use prediction equations that fit the population. In Norway, equations have been developed for adults by Gulsvik et al. [3] (study population 18–23 years (n=480)), Johannessen et al. [4] (study population 26–82 years (n=515)) and Langhammer et al. [5] (study population 20–80 years (n=908)). Nevertheless, the equations from the European Coal and Steel Community (ECSC) (study population 18–70 years) [6] are widely used despite having been found to underestimate normal lung function in Caucasians [4, 5, 7–9]. For children and adolescents, the equations developed by Zapletal et al. [10] (study population 6–17 years (n=111)) and Polgar and Promadhat [11] (study population 6–18 years) are commonly used. The ECSC and Polgar equations were constructed from published reference equations rather than from actual measurements. Use of different reference values within and between pulmonary function laboratories introduces inconsistencies in reported predicted values for individual patients, as do the disjunctions between reference values developed for different age groups [12].
In 2012, the Global Lung Function Initiative (GLI) published reference values for the age range 3–95 years for several ethnic groups. For Caucasians, the values were based on 57 395 individuals [2]. The fit of these reference values has been tested in some populations, and reported results have been conflicting [13–17]. Ben Saad et al. [14] reported that GLI-2012 overestimated forced expiratory volume in 1 s (FEV1) and forced vital capacity (FVC) in a North-African adult population, possibly because of the contribution of sub-Saharan ancestry in Berbers, while recent studies found that GLI-2012 underestimates both FEV1 and FVC in adults in Finland and Sweden [13, 17]. Few studies have evaluated the fit of GLI-2012 over a large age span; a large study including Australasian Caucasians aged 4–80 years reported the differences to be less than the within-test variation accepted in spirometry testing [16].
Use of different predicted values also results in different estimates of prevalence of lung disease. Peradzynska et al. [18] found that in children referred to hospital for respiratory symptoms or disease, GLI-2012 resulted in a higher proportion with lung function abnormalities compared to use of Polish 1998 reference values. Transition from Hankinson et al. [9] and Wang et al. [19] equations to GLI-2012 led to grossly similar prevalence rates of abnormally low values for FEV1, FVC and FEV1/FVC in hospital-based settings, but disparate results [20, 21] when compared with equations from Polgar and Promadhat [11], and Zapletal et al. [10].
We aimed to study the fit of GLI-2012 in a reference sample of healthy never-smoking Norwegians, first by developing prediction values based on a reference sample (Nor-2015), and then by evaluating how these reference values agree with GLI-2012 in comparison with other prediction equations currently used in Norway. Additionally, we aimed to estimate the prevalence of respiratory and cardiovascular symptoms and disease in the general population using different criteria for bronchial obstruction.
Methods
Study population and measurements
We generated a reference sample of participants randomly selected from three Norwegian population-based studies: the Tromsø 6 Study [22], the Hordaland County Cohort Study (HCCS) [23], and the Nord-Trøndelag Health Study (HUNT) with HUNT2 and HUNT3 for adults, and YoungHUNT1, YoungHUNT2 and YoungHUNT3 for adolescents [24, 25] (table 1). In total, 15 615 persons aged 12–19 years and 14 624 persons aged 20–90 years performed pre-bronchodilator spirometry. In all studies, the participants answered comprehensive questionnaires regarding life style, symptoms and diseases. Height and weight were measured with light clothing and without shoes, height to the nearest centimetre in HCCS, to the nearest half centimetre in HUNT2/YoungHUNT1 (1995–1997) and YoungHUNT2 (2000–2001), and to one decimal place in Tromsø and HUNT3/YoungHUNT3 (2006–2008). Age at spirometry was registered without decimals in Tromsø and with one decimal in HUNT and HCCS.
Persons who had participated more than once had spirometry and corresponding data included only from their latest participation. HUNT3 invited the entire adult population of the county, and among the 50 807 participants, the HUNT Lung Study invited to spirometry a 10% random sample of all participants, together with persons reporting symptoms, a diagnosis or use of medication for asthma or COPD.
Three different types of spirometers were used for pre-bronchodilator FEV1 and FVC (table 1). In the HCCS and the Tromsø studies, the measurements were performed at centralised examination stations [26], while in HUNT, adolescents performed spirometry at schools and adults at examination stations in all 24 municipalities of the county.
All centres performed spirometry according to the 1994 ATS criteria [27]. All spirometers were calibrated each morning and afternoon with a 1-L or 3-L syringe; the participants were seated, wore nose clips and received standardised instruction from the technician. At least three spirometric manoeuvres were performed and the test was considered satisfactory when the two largest FEV1 and FVC differed by less than 200 mL; the Tromsø Study, however, aimed at differences below 150 mL. The highest FEV1 and FVC were selected and used to calculate the FEV1/FVC ratio. Room temperature ranged from 19 to 24°C.
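The acceptance rule described above can be sketched as follows. This is a minimal illustration, not the study's software: the function name `summarise_session`, the litre units and the error raised for fewer than three manoeuvres are assumptions made for the example.

```python
from typing import Sequence, Tuple


def summarise_session(
    fev1_l: Sequence[float],
    fvc_l: Sequence[float],
    max_diff_l: float = 0.200,
) -> Tuple[float, float, float, bool]:
    """Return (best FEV1, best FVC, FEV1/FVC, repeatable) for one session.

    Volumes are in litres. `max_diff_l` is the largest allowed gap between
    the two highest values (0.200 L here; the Tromsø Study aimed for 0.150 L).
    """
    # Assumption for this sketch: sessions with fewer than three manoeuvres
    # are rejected outright.
    if len(fev1_l) < 3 or len(fvc_l) < 3:
        raise ValueError("at least three manoeuvres are required")

    fev1_sorted = sorted(fev1_l, reverse=True)
    fvc_sorted = sorted(fvc_l, reverse=True)

    # Repeatability: the two largest FEV1 and the two largest FVC must each
    # differ by less than the threshold.
    repeatable = (
        fev1_sorted[0] - fev1_sorted[1] < max_diff_l
        and fvc_sorted[0] - fvc_sorted[1] < max_diff_l
    )

    # The highest FEV1 and FVC (possibly from different manoeuvres) give the ratio.
    best_fev1, best_fvc = fev1_sorted[0], fvc_sorted[0]
    return best_fev1, best_fvc, best_fev1 / best_fvc, repeatable
```

Passing `max_diff_l=0.150` would mimic the tighter target used in Tromsø.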
Quality control
In HUNT, the spirometry software provided feedback on acceptability of technique and repeatability. For spirometry from HUNT3/YoungHUNT3, curves were scored grade A–F partly in line with a recent study by Hankinson et al. [28]. All curves graded A–C were included in the study, meaning at least two acceptable blows with less than 200 mL difference, but we did not apply the end-of-test criterion of exhalation >6 s. Among adolescents, short expiration time made evaluation of end of test difficult; thus, only smoothly ending flow–volume curves with FEV1/FVC ≤0.95 were accepted. Thorough quality assessments were also performed in HCCS and the Tromsø Study; in the latter, inter- and intra-observer agreements showed excellent results [29].
Statistics
The healthy reference sample comprised persons without self-reported respiratory disease, cardinal respiratory symptoms, smoking history and other respiratory symptoms [23]. Z-scores of lung indices according to predicted values by GLI-2012 were calculated for all participants and means and standard deviations reported.
If the GLI-2012 equations are appropriate for healthy Norwegians, mean±sd z-scores should approximate 0±1 across the entire age and height range studied. Data were stratified by sex and reported by study site, age and height.
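As a rough illustration of this check, the sketch below summarises z-scores within strata so that each stratum can be compared against the expected 0±1. The function name `z_summary_by_group` and the `(label, z)` record layout are hypothetical and chosen only for this example.

```python
from collections import defaultdict
from statistics import mean, median, stdev
from typing import Dict, Iterable, Tuple


def z_summary_by_group(
    records: Iterable[Tuple[str, float]],
) -> Dict[str, Dict[str, float]]:
    """Summarise z-scores per stratum.

    `records` holds (stratum_label, z_score) pairs, for example one pair per
    participant with the label encoding sex, study site or age band.
    """
    groups = defaultdict(list)
    for label, z in records:
        groups[label].append(z)

    summary = {}
    for label, zs in groups.items():
        summary[label] = {
            "n": len(zs),
            "mean": mean(zs),
            "median": median(zs),
            # The sample sd needs at least two observations.
            "sd": stdev(zs) if len(zs) > 1 else float("nan"),
        }
    return summary
```

A stratum whose mean drifts away from 0, or whose sd departs clearly from 1, would point to a poorer fit of the reference equations in that subgroup.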
Predicted values derived from the total Norwegian dataset of healthy subjects (Nor-2015) were estimated in the same manner as GLI-2012 using the statistical software R (version 2.15.1; www.rproject.org/) with the generalised additive model for location, scale and shape package (GAMLSS). We applied the LMS method: a statistical method to normalise skewed distributions using a smoothing trend curve; the resulting Box–Cox–Cole Green power lambda (L), and the similarly smoothed median mu (M) and coefficient of variation sigma (S) summarise the distribution of the data [30].
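To make the roles of L, M and S concrete, the following sketch converts a measurement to a z-score with the standard LMS transformation and derives the fifth-percentile lower limit of normal. It assumes that L, M and S have already been obtained for the person's sex, age and height (for example from a fitted model or look-up table); the function names are illustrative and the numeric values in the usage note are invented.

```python
import math

Z_5TH_PERCENTILE = -1.645  # standard normal deviate at the 5th percentile


def lms_z(observed: float, L: float, M: float, S: float) -> float:
    """z-score of `observed` given skewness L, median M and coefficient of variation S."""
    if observed <= 0 or M <= 0 or S <= 0:
        raise ValueError("observed, M and S must be positive")
    if abs(L) < 1e-12:
        # Limiting case L -> 0: the distribution is log-normal.
        return math.log(observed / M) / S
    return ((observed / M) ** L - 1.0) / (L * S)


def lms_value_at_z(z: float, L: float, M: float, S: float) -> float:
    """Inverse transformation: the measurement that corresponds to a given z-score."""
    if abs(L) < 1e-12:
        return M * math.exp(S * z)
    # Valid as long as 1 + L*S*z stays positive, which holds for typical L and S.
    return M * (1.0 + L * S * z) ** (1.0 / L)


def lms_lln(L: float, M: float, S: float) -> float:
    """Lower limit of normal, taken as the 5th percentile."""
    return lms_value_at_z(Z_5TH_PERCENTILE, L, M, S)
```

For instance, with the made-up values L=1.0, M=4.0 L and S=0.12, an observed FEV1 of 3.52 L gives a z-score of about −1.0.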
Predicted values and z-scores ((observed minus predicted) divided by the residual standard deviation) were also calculated according to the equations from Johannessen et al. [4], Gulsvik et al. [3], Langhammer et al. [5], ECSC [6] and Zapletal et al. [10], restricted to the relevant age range. The agreement with GLI-2012 and other reference values was visualised by Bland–Altman plots.
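The two calculations in this paragraph can be sketched as below: a z-score for a conventional prediction equation, and the quantities usually drawn in a Bland–Altman plot (pairwise means, pairwise differences, the mean difference and 95% limits of agreement). The function names and the returned dictionary layout are assumptions for the example; the 1.96 multiplier is the conventional choice for limits of agreement.

```python
from statistics import mean, stdev
from typing import Dict, List, Sequence, Union


def linear_z(observed: float, predicted: float, rsd: float) -> float:
    """z-score for a prediction equation with residual standard deviation `rsd`."""
    if rsd <= 0:
        raise ValueError("rsd must be positive")
    return (observed - predicted) / rsd


def bland_altman(
    a: Sequence[float], b: Sequence[float]
) -> Dict[str, Union[float, List[float]]]:
    """Quantities for a Bland-Altman plot comparing two sets of predicted values.

    `a` and `b` hold the predictions of two reference equations for the
    same individuals, in the same order.
    """
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two equally long series with at least two pairs")
    means = [(x + y) / 2.0 for x, y in zip(a, b)]  # x-axis of the plot
    diffs = [x - y for x, y in zip(a, b)]          # y-axis of the plot
    bias = mean(diffs)
    sd = stdev(diffs)
    return {
        "means": means,
        "diffs": diffs,
        "bias": bias,
        "lower": bias - 1.96 * sd,  # lower 95% limit of agreement
        "upper": bias + 1.96 * sd,  # upper 95% limit of agreement
    }
```

A positive bias for `a` minus `b` means that equation `a` predicts higher values on average, so equation `b` would tend to underestimate relative to `a`.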
We looked at two definitions for airflow obstruction, FEV1/FVC <0.70 (which is commonly used in Norwegian clinical practice) and FEV1/FVC < lower limit of normal (LLN) as recommended by the American Thoracic Society (ATS)/European Respiratory Society (ERS) [31] in order to compare their COPD prevalence and associations with symptoms.
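The two definitions can be applied side by side, as in the sketch below. It assumes the person-specific LLN for FEV1/FVC has already been computed from the chosen reference equations; the function name and the category labels are hypothetical.

```python
def classify_obstruction(
    fev1_fvc: float, lln_ratio: float, fixed_cutoff: float = 0.70
) -> str:
    """Classify one FEV1/FVC ratio against the LLN and the fixed cut-off.

    Returns one of:
      "below_lln"  - ratio below the person-specific lower limit of normal
      "fixed_only" - ratio below the fixed cut-off but at or above the LLN
      "neither"    - ratio at or above both limits
    """
    if fev1_fvc < lln_ratio:
        # The LLN can lie above 0.70 (typically in younger persons), so this
        # group may include ratios that the fixed cut-off would not flag.
        return "below_lln"
    if fev1_fvc < fixed_cutoff:
        return "fixed_only"
    return "neither"
```

The "fixed_only" category corresponds to FEV1/FVC <0.7 and ≥LLN, the comparison group used when the two definitions are contrasted.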
The proportion of males and females in the general population with an obstructive pattern was estimated by use of GLI-2012 and Nor-2015 in all participants in HUNT3, based on inverse weighting of the probability of having been selected for spirometry (figure S1). Further, the age-adjusted risk of reporting symptoms or diseases by the stricter criterion of obstruction, FEV1/FVC <LLN, compared with FEV1/FVC <0.7 and ≥LLN, was estimated by binary logistic regression models.
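The idea behind inverse probability weighting can be illustrated with a simple weighted proportion. This is a sketch under stated assumptions, not the weighting scheme of figure S1: it supposes that each person with spirometry has a known selection probability (for example 0.1 for the random sample and 1.0 for those invited because of symptoms), and the function name is invented for the example.

```python
from typing import Sequence


def weighted_prevalence(
    has_condition: Sequence[bool], selection_prob: Sequence[float]
) -> float:
    """Prevalence estimate with each person weighted by 1 / selection probability."""
    if len(has_condition) != len(selection_prob) or not has_condition:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(p <= 0 or p > 1 for p in selection_prob):
        raise ValueError("selection probabilities must lie in (0, 1]")

    # A person selected with probability p stands in for 1/p persons
    # of the source population.
    weights = [1.0 / p for p in selection_prob]
    weighted_cases = sum(w for w, flag in zip(weights, has_condition) if flag)
    return weighted_cases / sum(weights)
```

Without weighting, oversampling of symptomatic persons would inflate the crude proportion with obstruction; the weights restore the balance of the source population.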
Results
In total, 1403 females and 1035 males aged 20–90 years met the selection criteria for the reference sample; corresponding figures for age group 12–19 years were 4404 girls and 4654 boys (table 2). The mean age of included adult women and men was lower in HUNT (49 and 45 years) compared with Hordaland (56 and 52 years) and Tromsø (63 and 61 years). The percentages of the total sample included in the healthy reference population were 15.1% in Tromsø, 15.9% in Hordaland, 18.7% in HUNT and 62.4% in YoungHUNT (table S1).
In the reference sample of healthy individuals, the distributions of z-scores for the different indices stratified by sex and age below and above 20 years were close to normal. There was no relevant correlation in the reference population between the z-scores for FEV1, FVC and FEV1/FVC according to GLI-2012 and age and height (explained variance 0.08%–1.8%), but the FEV1 and FVC tended to be somewhat below predicted in the smallest boys (figure 1). The median±sd z-scores for FEV1, FVC and FEV1/FVC in females were −0.01±1.02, 0.07±0.97 and –0.12±0.82, and in males 0.02±1.03, 0.01±1.04 and −0.04±0.91, respectively. Stratification by age 20 years revealed closer fit among adolescents than adults with z-scores for all indices close to zero except for median z-FEV1/FVC (–0.11) in girls. In adults, the fit to GLI-2012 was closer in men than in women, but for both sexes, a slightly higher positive z-FVC than z-FEV1 resulted in a negative median z-FEV1/FVC. Corresponding medians and 5th and 95th percentiles are reported in table 3.
Plots of GLI-2012 z-scores by study site, age and height categories showed minor differences between study sites, but confirmed a slightly lower z-FEV1/FVC in women except in HUNT (figure 1). In women, z-FEV1 and z-FVC were slightly higher at age 40–79, while FEV1/FVC z-scores were slightly lower in all age groups. In men, there was a good fit of GLI-2012 for all indices, independent of age. In women with a height of 155–185 cm and men taller than 165 cm, there was good fit of GLI-2012. The Bland–Altman plots showed that predicted values from Zapletal for adolescents and ECSC for adults underestimated the lung indices, while there was fairly close agreement between Nor-2015 (12–90 years) and GLI-2012 (figure 2). Similar plots comparing Norwegian reference values (Gulsvik, Langhammer and Johannessen) with GLI-2012 show rather good agreement for women for FEV1 and FVC. In men, however, the plots reveal somewhat higher levels of predicted FEV1 from the Norwegian sets, but rather inconsistent patterns and relatively large differences for predicted FVC (figure S2).
Proportion with lung indices below LLN in the healthy and the general population
Ideally, about 5% of observations should fall below the GLI-2012 fifth percentile in the healthy population. GLI-2012 fitted adolescents rather well, but adults less so, especially for FVC (table 3). For FEV1, 2.4% of observations in males and 3.1% in females were below the LLN; the corresponding figures for FEV1/FVC were 3.3% and 3.7%.
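The check described here amounts to counting z-scores below the fifth-percentile cut-off. A minimal sketch follows, assuming the LLN corresponds to a z-score of −1.645; the function name is hypothetical.

```python
from typing import Iterable, Tuple

LLN_Z = -1.645  # z-score taken to mark the 5th percentile


def fraction_below_lln(
    z_scores: Iterable[float], lln_z: float = LLN_Z
) -> Tuple[int, float]:
    """Return (count, fraction) of z-scores that fall below the LLN."""
    zs = list(z_scores)
    if not zs:
        raise ValueError("no z-scores supplied")
    n_below = sum(1 for z in zs if z < lln_z)
    return n_below, n_below / len(zs)
```

In a healthy reference sample that the equations fit well, the returned fraction should be close to 0.05; clearly lower values suggest that the reference LLN is set too low for that group.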
Based on estimates from the weighted sample (figure S1), in the age range associated with the highest prevalence of COPD (40–80 years), 13.5% of men and 9.3% of women had pre-bronchodilator FEV1/FVC <0.70, and 6.6% of both sexes had FEV1/FVC <LLN defined by GLI-2012. Among those with pre-bronchodilator FEV1/FVC <LLN according to Nor-2015 (n=3353) and GLI-2012 (n=3526), median±sd z-scores for FEV1 were −1.68±1.19 and −1.62±1.11, and for FEV1/FVC −1.99±0.70 and −2.08±0.62, respectively.
More women and men were defined as obstructive according to FEV1/FVC <0.70 compared with FEV1/FVC <LLN according to Nor-2015 and GLI-2012 (table 4). In subjects identified with an obstructive pattern by GLI-2012 and Nor-2015 there was close agreement with self-reported obstructive lung disease in men, whereas in women disease and symptom prevalence were somewhat lower for GLI-2012 than for Nor-2015 (table 4). About one-third reported a doctor diagnosis of asthma and one-fifth a doctor diagnosis of COPD, emphysema or chronic bronchitis.
For both sexes, the age-adjusted odds ratio for having obstructive lung disease-related attributes was significantly higher in those with FEV1/FVC <LLN compared with those with FEV1/FVC <0.7 but ≥LLN (table 5). The mean age for the first category was 49.8 years in women and 52.8 years in men, and corresponding figures for the last category were 70.8 and 67.3 years. The same comparison for cardiovascular diseases found the odds ratio for myocardial infarction increased in those with FEV1/FVC <LLN but only for women.
Discussion
The main finding in this study is that the GLI-2012 equations fit a Norwegian population; differences between predicted and observed values are small, they are unrelated to age and height, and the scatter is close to that in GLI-2012 equations except for the FEV1/FVC ratio in females, for whom it is smaller. The reference group of adolescents outnumbers that of adults nearly fourfold (table 2), so that the closer fit in adolescents might relate to sample size. This study also confirms previous reports that the ECSC- and Zapletal-predicted values do not fit a healthy white population and lead to misclassification of subjects.
In men, the weighted prevalence of respiratory symptoms, diagnoses and use of medication among all HUNT3 participants were similar among those defined with an obstructive pattern according to GLI-2012 and Nor-2015. In females, the corresponding prevalence was lower for GLI-2012 than for Nor-2015 defined obstruction; this is to be expected, as reference values derived from the population will fit that population better than external reference values.
The strength of the current study is the population-based design, recruiting randomly selected persons who had performed spirometry according to international recommendations. Additionally, data were available on respiratory disease, symptoms and smoking history needed for selection of a healthy never-smoking reference group. The number of adolescents and elderly adults was large compared with other studies. In line with GLI-2012's recommendation, height and age were recorded to one decimal place where the study protocols allowed, to limit potential bias related to inaccuracies in these measures [32]. Another strength of this study is the availability of reported symptoms, diagnosis and use of asthma medication for the entire HUNT3 population (n=50 807), from which subsamples reporting symptoms or a diagnosis of obstructive lung disease were additionally invited to spirometry. This provided an opportunity to estimate symptom prevalence among persons identified as obstructed by each criterion. The prevalence of self-reported symptoms and diseases in the weighted sample agreed well with the source data for all participants (table S2). We found that using FEV1/FVC <LLN to define airflow obstruction identified subjects with relevant respiratory symptoms and disease attributes better than the fixed ratio of <0.7. Our finding of a four-fold increased risk of myocardial infarction in women meeting the LLN definition compared with those above LLN and below the fixed ratio requires further investigation, as opposite results have been reported by other studies [33]. Among limitations, it is a challenge to obtain good-quality spirometry curves in a large study carried out in different centres over many years.
Many adolescents had a rather abrupt end of the exhalation and, in order to avoid underestimation of FVC and overestimation of FEV1/FVC, we excluded persons with a ratio above 0.95; however, this selection bias might explain the negative association of FEV1 and FVC with height in younger subjects (figure 1). In addition, a low participation rate of 54% for adults in HUNT3 could point to selection bias. A non-participation study reported some healthy selection bias; chronic diseases, such as diabetes, cardiovascular disease and COPD, were more prevalent in the general population compared with participants [34]. This should, however, not lead to bias in the healthy reference population, but will contribute to some underestimation of obstructive pattern in the entire population of the county. A disjunction in measured values from adolescence to adulthood probably arose from a selection bias; in the 20–29-year age range, the participation rate was only 26% and 38% in males and females, respectively. Comparisons of the prevalence estimates for COPD between the present and other studies should take into consideration our use of pre-bronchodilator spirometry and not post-bronchodilator spirometry as recommended by the Global Initiative for Chronic Obstructive Lung Disease [35].
During the last decades many prediction equations for spirometry have been published with large differences between them. This has partly been explained by technical, procedural and biological differences, and by sampling errors as well as differences in selection of reference groups [36]. Some disagreements are related to different statistical software and modelling facilities; new statistical methods such as GAMLSS, however, have improved modelling by taking into account the non-linear relationship between age, height and spirometric indices, including a smooth transition from adolescence to adulthood, as well as modelling the age dependence of measurement variability. The present study found only minor differences between study sites and correspondingly between different spirometers. Such differences might arise from sampling error. The different patterns revealed when comparing previous Norwegian reference values with GLI-2012, especially among men, emphasise the importance of representative and large reference samples.
The reference sample included in the present study is large enough to develop prediction values for Norwegians aged 12–90 years, in line with previous recommendations of updated and regionally developed reference values. In subjects with respiratory symptoms, there was fair agreement in z-scores whether derived from GLI-2012 or Nor-2015 equations. A new equation specifically for Norwegians will exploit the idiosyncrasies of the data and lead to a small improvement in the fit to this particular population sample at the expense of a smaller valid age range and applicability to other ethnic groups. Norway includes a Sami population of about 40 000 persons, and there is increasing immigration from other ethnic groups, challenging the choice of reference values in daily clinical practice. Whereas reference intervals and LLN provide some guidance, there is no sharp demarcation of measurements in healthy and pathological conditions, so that decision limits also hinge on clinical considerations [37]. In order to simplify comparative studies between countries, avoid confusion due to age-related gaps in reference values and ease the conversion to reference values for different ethnic groups, we recommend implementation of GLI-2012 in healthcare at all levels in Norway. Although no data from children younger than 12 years were available, there is no reason to believe that GLI-2012 would not fit that age range.
Conclusion
GLI-2012 fits the Norwegian reference sample satisfactorily and shows superiority compared to the ECSC and Zapletal reference values that are widely used in Norway. Advantages are a large age span and inclusion of reference values for different ethnic groups. Therefore, we recommend implementation of GLI-2012 in Norwegian healthcare.
Acknowledgements
All participants in the three population-based studies are thanked for their contribution.
The Tromsø studies were carried out by the Department of Community Medicine at the Arctic University of Norway in collaboration with the Norwegian Institute of Public Health, the University Hospital of Northern Norway (UNN) and Tromsø City Council.
The Nord-Trøndelag Health Study (The HUNT Study) is a collaboration between HUNT Research Centre (Faculty of Medicine, Norwegian University of Science and Technology NTNU), Nord-Trøndelag County Council, Central Norway Health Authority, and the Norwegian Institute of Public Health.
Footnotes
Editorial comment in: Eur Respir J 2016; 48: 1535–1537.
This article has supplementary material available from erj.ersjournals.com
Support statement: The Hordaland County Cohort Study was funded by the Royal Norwegian Council for Scientific and Industrial Research and the Norwegian Research Council.
Conflict of interest: Disclosures can be found alongside this article at erj.ersjournals.com
- Received March 1, 2016.
- Accepted August 2, 2016.
- Copyright ©ERS 2016