1 Dept of Public Health Sciences, King's College London, London, UK, 2 Dept of Thoracic Medicine, Haukeland Hospital, Bergen, Norway.
CORRESPONDENCE: S. Chinn, Dept of Public Health Sciences, King's College London, 5th Floor Capital House, 42 Weston Street, London SE1 3QD, UK. Fax: 44 2078486605. E-mail: sue.chinn{at}kcl.ac.uk
Keywords: Height, populations, reference values, spirometry
Received: June 22, 2005
Accepted: December 15, 2005
| ABSTRACT |
|---|
Data were analysed for 6,323 never-smoking adults who did not report wheeze or asthma, from 42 centres participating in the European Community Respiratory Health Survey. Means and components of variance were estimated for males and females aged 20–24 yrs, and the relationships with age and height were examined in those aged 25–44 yrs.
Mean lung function for those aged 20–24 yrs differed between centres, but variation could not be wholly attributed to differences in population or equipment. The maximum difference in means by equipment type was 101 mL for FVC in males. Equipment differences were not statistically significant adjusted for country, but differences in mean forced expiratory volume in one second and forced vital capacity by country, adjusted for instrument, were statistically significant in males. Differences between centres in relation to age and height had less influence on predicted values.
In conclusion, there are unexplained differences in lung function between ethnically similar nonsmoking symptom-free populations. Neither national reference curves nor those based on the same ethnic group can be guaranteed to give accurate norms of lung health.
Spirometric lung function measurements are used clinically for diagnosis and monitoring, and have many uses in research. In clinical use, forced expiratory volume in one second (FEV1) and forced vital capacity (FVC) are usually each expressed as a percentage of predicted value for height, age and sex. For example, the Global Initiative for Chronic Obstructive Lung Disease (GOLD) criteria classify patients with chronic obstructive pulmonary disease into categories according to post-bronchodilator FEV1/FVC and FEV1 % predicted 1. Researchers often analyse FEV1 % pred as an outcome variable 2, 3 or select patient groups according to FEV1 % pred 4.
The GOLD criteria do not specify the reference equations that should be used to obtain predicted values, despite evidence that the published predicted values for FEV1 and FVC differ considerably 5, for both males and females. Recent recommendations for lung function testing have suggested that reference values should be derived from lung function measurements from a population including the "age range, sex and ethnic group of individuals to be tested" 6. This assumes that predicted values should depend only on these three parameters. Each of the equations compared by Roca et al. 5 was a regression equation linear in height and age fitted to data for adults, mostly from those aged 25 yrs 7–11. Hence, reference equations that are usually constructed using data from adults have four elements: the mean value at the youngest age considered fully mature; the adjustment for height; the relationship with age; and the residual standard deviation. Each of these may differ between populations and over time.
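The four elements listed above can be illustrated with a minimal sketch of a reference equation that is linear in height and age. All coefficients below are hypothetical placeholders chosen only for illustration; they are not taken from any published equation or from the present data, and the class and field names are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReferenceEquation:
    """Linear reference equation (all numeric values are hypothetical)."""
    mean_at_reference: float   # L, mean at ref_age_yr and ref_height_m
    height_slope: float        # L per metre of height
    age_slope: float           # L per year of age (negative for decline)
    residual_sd: float         # L, residual standard deviation
    ref_height_m: float = 1.79
    ref_age_yr: float = 25.0

    def predicted(self, height_m: float, age_yr: float) -> float:
        """Predicted value for a person of the given height and age."""
        return (self.mean_at_reference
                + self.height_slope * (height_m - self.ref_height_m)
                + self.age_slope * (age_yr - self.ref_age_yr))

    def percent_predicted(self, observed_l: float, height_m: float,
                          age_yr: float) -> float:
        """Observed value expressed as a percentage of the predicted value."""
        return 100.0 * observed_l / self.predicted(height_m, age_yr)


# Two hypothetical equations that differ only in the mean at the reference age.
eq_a = ReferenceEquation(4.6, 4.3, -0.03, 0.5)
eq_b = ReferenceEquation(4.4, 4.3, -0.03, 0.5)

observed = 3.6  # L, hypothetical FEV1 for a male of 1.79 m aged 45 yrs
pct_a = eq_a.percent_predicted(observed, 1.79, 45.0)
pct_b = eq_b.percent_predicted(observed, 1.79, 45.0)
```

With these placeholder values the same observed FEV1 is about 90% of predicted under one equation and about 95% under the other, which shows how a difference in the young-adult mean alone can shift % predicted.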
The reasons for the differences in predicted values were not explored by Roca et al. 5, but the regression coefficients in the published equations suggest that these include variation in the relationship with height or age between populations. Calculations show some variation in predicted values for those aged 25 yrs and those of average height.
Roca et al. 5 found variation between the mean values for adults aged 20–44 yrs from 34 centres participating in the European Community Respiratory Health Survey (ECRHS). The variation was not partitioned between the mean values in young adults, or different relationships to age and height, the knowledge of which can inform debate about appropriate reference values. The analysis presented in the current study describes variation in FEV1 and FVC in healthy nonsmoking adults aged 20–24 yrs in 42 centres taking part in the ECRHS I. The relationships with height and age in adults aged 25–44 yrs are also estimated.
| METHODS |
|---|
150,000 people. At stage one, where possible, an up-to-date sampling frame was used to randomly select at least 1,500 males and 1,500 females aged 20–44 yrs, who were sent a self-completed postal questionnaire. A random sample of responders to stage one was invited to stage two, which included an administered questionnaire and measurement of lung function. The administered questionnaire in stage two included the question: "Have you ever smoked for as long as a year?" Stage two was carried out from 1990 to 1995 across 42 centres, which included the 34 centres in the analysis of Roca et al. 5, and eight contributing data later. The centre in Bombay (India) was omitted, as were those in Aarhus (Denmark) and Wroclaw (Poland) where the equipment used was unknown. Most centres were in Western Europe, plus three in New Zealand, one in Australia, six in Canada and one in the USA. Ethnic origin was not recorded, but participants were known to be almost exclusively White.
Spirometry
The equipment used was Biomedin spirometer (Biomedin, Padua, Italy; 16 centres), SensorMedics spirometer (SensorMedics, Yorba Linda, CA, USA; nine centres), Spirotech spirometer (Spirotech, Bilthoven, the Netherlands; eight centres), Jaeger pneumotach (Jaeger, Hoechberg, Germany; three centres), Morgan spirometer (Morgan, Haverhill, MA, USA; three centres), Morgan pneumotach (Morgan; one centre), Fleisch pneumotach connected to a Hewlett-Packard lung function analyser (Massachusetts, USA; one centre) and Vitalograph spirometer (Vitalograph, Buckingham, UK; one centre), each of which complied with American Thoracic Society (ATS) standards. The maximum FEV1 and maximum FVC of up to five technically acceptable blows were determined, and whether FEV1 and FVC each met the ATS criterion for reproducibility 13. Height was recorded prior to spirometry. Of the 42 centres, height was measured in 31 and self-reported in five; in the remaining six it was not recorded whether height was measured or self-reported. Each European centre took part in a centrally organised training day, and was visited by a central organiser who checked that the common protocol for lung function testing was observed.
Statistical analysis
All analyses were carried out for never-smokers who did not report ever having had asthma, or wheeze in the last 12 months, for males and females separately. Data were divided a priori between participants aged 20–24 yrs and those aged 25–44 yrs, since 25 yrs has been reported as the age from which lung function starts to decline 11, at least in males 8. Falaschetti et al. 14 demonstrated a plateau up to around age 25 yrs for FEV1 and FVC in males, and for FVC in females. Variation of FEV1 and FVC in those aged 20–24 yrs was analysed for heterogeneity between centres using Bartlett's test. The variation between individuals and between centres was estimated without and with adjustment for height. This was carried out using multilevel models with participant at level one and centre at level two, with height included as a covariate at level one for the height-adjusted components. The regression coefficients for height from the models were used to calculate height-adjusted values. Values were analysed according to whether they met the ATS reproducibility criterion or not, i.e. that the two largest values did not differ by >0.2 L 13. Height-adjusted centre means were analysed in relation to response rate, type of instrument used and country. A meta-analysis method was used to estimate heterogeneity of means and relationships with height and age between centres, for those aged 25–44 yrs 15. The percentage of total variation across centres due to chance was calculated 16.
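As a rough sketch of two of the steps above, the following code adjusts a lung function value to a common height using a regression coefficient, and estimates the percentage of between-centre variation attributable to chance. The text does not spell out the formulae, so this assumes inverse-variance weights, Cochran's Q, and the percentage due to chance taken as the degrees of freedom divided by Q (capped at 100%). Function names and the example numbers are hypothetical.

```python
from typing import Sequence


def height_adjusted(value_l: float, height_m: float, height_slope: float,
                    ref_height_m: float) -> float:
    """Shift a measurement to the value expected at ref_height_m."""
    return value_l - height_slope * (height_m - ref_height_m)


def cochran_q(estimates: Sequence[float], std_errors: Sequence[float]) -> float:
    """Cochran's Q for centre estimates with inverse-variance weights."""
    if len(estimates) != len(std_errors) or len(estimates) < 2:
        raise ValueError("need at least two estimates with matching standard errors")
    weights = [1.0 / se ** 2 for se in std_errors]
    pooled = sum(w * y for w, y in zip(weights, estimates)) / sum(weights)
    return sum(w * (y - pooled) ** 2 for w, y in zip(weights, estimates))


def percent_due_to_chance(estimates: Sequence[float],
                          std_errors: Sequence[float]) -> float:
    """Assumed definition: 100 * df / Q, capped at 100 when Q <= df."""
    q = cochran_q(estimates, std_errors)
    df = len(estimates) - 1
    if q <= df:
        return 100.0
    return 100.0 * df / q


# Hypothetical centre means (L) and their standard errors
centre_means = [4.0, 4.2, 4.4]
centre_ses = [0.1, 0.1, 0.1]
chance_pct = percent_due_to_chance(centre_means, centre_ses)
```

For these made-up values Q is 8 on 2 degrees of freedom, so about 25% of the variation between the three centre means would be attributed to chance under this definition.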
| RESULTS |
|---|
Table 2 shows the mean, average within-centre variation and between-centre variation for males and females. Without adjustment for height, differences between centre means accounted for, at most, 10% of the between-person variation, as shown by the intraclass correlation coefficients (ICCs). This centre variation increased the standard deviation of a single lung function measurement by at least 3%, for females, but at most 6%, for FEV1 in males, as shown by the ratios of the total single-determination standard deviation to the within-centre standard deviation in table 2. No statistically significant variation between centres in the relationship of FEV1 or FVC with height was detected, either in males or females. The mean±SD height of the young males was 1.79±0.07 m and 1.66±0.07 m for the young females. Adjustment for height reduced each component of variation, but had little effect on the ICCs (table 2). Although the ICCs were small, the differences between centres were highly statistically significant (p-value for heterogeneity <0.0001 in each case). Chance accounted for less than half of the observed variation between centres (FEV1 27.6% in males, 32.6% in females; FVC 42.5% and 26.0%, respectively).
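The link between the ICC and the ratio of total to within-centre standard deviation can be sketched as follows. The variance components used here are hypothetical, not the values of table 2, and the function names are assumptions made for illustration.

```python
import math


def intraclass_correlation(between_var: float, within_var: float) -> float:
    """Share of the total variance that lies between centres."""
    return between_var / (between_var + within_var)


def sd_ratio(between_var: float, within_var: float) -> float:
    """Total single-determination SD divided by the within-centre SD."""
    return math.sqrt((between_var + within_var) / within_var)


# Hypothetical variance components (L^2)
within = 0.25
between = 0.025

icc = intraclass_correlation(between, within)
ratio = sd_ratio(between, within)

# The two quantities are tied together: ratio == 1 / sqrt(1 - ICC)
ratio_from_icc = 1.0 / math.sqrt(1.0 - icc)
```

With these placeholder components the ICC is about 0.09 and the total standard deviation is about 5% larger than the within-centre standard deviation, which illustrates why a small ICC translates into only a modest widening of a reference range.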
Relationship of centre means at ages 20–24 yrs with response rate and type of instrument
There was no evidence for a relationship of height-adjusted centre mean FEV1 (p = 0.72) or FVC (p = 0.57) in males or FVC in females (p = 0.17) with overall centre response rate, but there was some evidence of an increase for females with response rate for FEV1 (0.021 L per 10% increase in response rate; 95% CI 0.010–0.041; p = 0.040). Mean FEV1 by instrument ranged from 4.33 L (Vitalograph) to 5.14 L (Morgan pneumotach) in males, and from 3.40 L (Fleisch) to 3.72 L (Morgan pneumotach) in females (fig. 1). There were corresponding differences in FVC. The variation in means was statistically significant for FVC in males (p = 0.021), accounting for 29% of the centre variation. The differences by instrument type were not statistically significant for FEV1 in males (p = 0.06; 20% variation explained), FEV1 in females (p = 0.44; <1%) or for FVC in females (p = 0.81; 0%). Comparing the three makes of spirometer that were used in more than one country, Biomedin, SensorMedics and Spirotech, there were no significant differences between mean FEV1 or FVC by make with adjustment for country, but country differences adjusted for make were statistically significant for males (FEV1 p = 0.0005; FVC p = 0.0027). Divided, a priori, into Biomedin, other spirometer and other type of instrument, there were significant differences between the groups in mean FEV1 in males, adjusted for country (p = 0.009), but not in the other measures or in means unadjusted for country except for FVC in males (p = 0.017).
| DISCUSSION |
|---|
Although there was evidence for variation between centres in the relationship of lung function with age, more than half of the observed variation could be attributed to chance. For most uses, in diagnosis and patient selection, cross-sectional reference curves are appropriate. Several authors have found differences between cross-sectional associations with age and longitudinal decline with age 17, but not all in a consistent direction 18, 19. The longitudinal decline with age should represent the true mean decline due to the ageing process, but estimates may be affected by selective participation in multiple surveys, learning effects and healthy survivor effects. Cross-sectional relationships with age will encompass cohort effects as well as healthy survivor effects, but are less affected by participation bias than longitudinal estimates. These different influences may explain the discrepancies between the cross-sectional and longitudinal findings. Clearly, it is desirable to allow for pure cohort effects in reference equations, but part of the decline with age may be due to increasing ill health, and earlier cohorts may have poorer health than later cohorts. Hence, full age adjustment may lead to underdiagnosis of lung disease. Although never-smokers without wheeze or asthma were selected, in common with most studies that reported reference equations, asymptomatic disease could not be ruled out. The increasing decline with age, observed in the current females (data not shown) and reported by others 17, 18, 20, may also be due to effects of poorer health.
The current authors chose to compare means in those aged 20–25 yrs since several studies have estimated the decline after that age 8, 11, while others have modelled means from an earlier age 14, 20–23. Studies that have modelled variation from subjects aged 20 yrs do not show a consistent age of maximum lung function. Falaschetti et al. 14 have found an earlier decline in FEV1 in females than in males but an extended plateau in FVC, Gulsvik et al. 20 have shown an apparent maximum lung function in male subjects aged 30 yrs and stated that curves in males and females were parallel, and Langhammer et al. 23 have demonstrated a decline in male subjects from 20 yrs but later in females. These differences may be an artefact of the various forms of equations used to model mean lung function. It is unlikely that the sample sizes available, 6,000 in the Health Survey for England across the age range of 16–85 yrs and over 14, would be adequate to distinguish between these models.
By far the most difficult issue is what mean value of each measure of lung function in subjects aged 20–25 yrs, or at the age of maximum lung function, should be used in reference equations. Although, considered as random effects, the centre differences increased the residual standard deviation, and hence the width of a reference range, by at most 6% (table 2), the differences in means by type of instrument were not negligible (fig. 1). Compared with between-centre variation, differences between the eight types of equipment were not generally statistically significant, but this analysis of 42 mean values does not rule out important real differences. Assuming a normal distribution, and, therefore, using the mean minus 1.64 total standard deviations of FEV1 from table 2 14, 21, the estimated 5th centiles of FEV1 for males and females of average height aged 20–25 yrs are 3.94 L and 2.87 L, respectively, but range 3.48–4.29 L for males and 2.73–3.05 L for females using the minimum and maximum means from figure 1 with within-centre variation from table 2. The ATS 13 and the European Respiratory Society 11 each state that calibration of equipment should achieve readings to within ±50 mL.
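The 5th-centile arithmetic described above, the mean minus 1.64 standard deviations under a normal distribution, can be sketched as follows. The instrument means are those quoted in the text from figure 1. The within-centre standard deviations (0.518 L for males, 0.409 L for females) are not quoted in the text; they are assumptions back-calculated here from the quoted centile ranges, so they should be read as approximate.

```python
Z_5TH_CENTILE = 1.64  # multiplier used in the text for the 5th centile


def fifth_centile(mean_l: float, sd_l: float) -> float:
    """Lower 5th centile of a normal distribution: mean - 1.64 * SD."""
    return mean_l - Z_5TH_CENTILE * sd_l


# Assumed within-centre SDs (L), back-calculated from the quoted ranges
SD_MALES = 0.518
SD_FEMALES = 0.409

# Minimum and maximum mean FEV1 by instrument type (L), as quoted in the text
male_range = (fifth_centile(4.33, SD_MALES), fifth_centile(5.14, SD_MALES))
female_range = (fifth_centile(3.40, SD_FEMALES), fifth_centile(3.72, SD_FEMALES))

# Width of the range of 5th centiles, in mL, for comparison with +/-50 mL
male_width_ml = 1000.0 * (male_range[1] - male_range[0])
female_width_ml = 1000.0 * (female_range[1] - female_range[0])
```

Because the same standard deviation is applied at both ends, the width of each range equals the difference between the extreme instrument means (810 mL in males and 320 mL in females), which is large relative to the ±50 mL calibration tolerance.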
Although participation bias could be ruled out as unlikely to be a major cause of the centre variation, true population differences could not be fully separated from instrument variation. The latter comprises spirometer versus pneumotach difference, make, model and machine within-model variation, and inconsistent calibration and operation. It is not possible to ascertain whether corrections to body temperature, ambient pressure, saturated with water vapour conditions are comparable between different manufacturers. In so far as the same type of instrument was used in several centres in different countries, there is evidence that there are true population differences and that differences between spirometers may be less important, but neither of these can be quantified from the current study. However, there is evidence that even devices of the same type, used under carefully controlled conditions and calibration, may give differing results 24, 25.
The current results suggest that centre variation was more likely to be due to true population differences. Population differences may be due to genetic differences or to differences in health that are not removed by restricting the data to nonsmokers without wheeze or asthma. In the present data, differences remained after adjusting for variation between countries in height and also for variation in body mass index. As participants in the ECRHS were almost exclusively White, the country differences lead to the conclusion that reference curves cannot be guaranteed to be applicable to a population of the same ethnic group as that from which they were derived, as recently recommended 6. Conversely, national reference curves cannot be used in epidemiological studies that seek to compare populations and, if used to select patients with lung function below a given percentage of predicted value in multicentre trials across different countries, may result in heterogeneity in severity of disease in those chosen.
Without a large international study comparing several instruments within each centre, it is impossible to fully separate instrument from true population differences. Studies that wish to compare population values between centres must be prepared to invest in standardised equipment. In order to show whether population differences in lung function represent differences in health, it would be necessary to compare mortality and morbidity between populations in relation to lung function measured in a standardised way. Neither national reference curves nor ones based on the same ethnic group can be guaranteed to give accurate norms of lung health.