Abstract
Interpretation of spirometry involves comparing lung function parameters with predicted values to determine the presence/severity of the disease. The Global Lung Function Initiative (GLI) derived reference equations for healthy individuals aged 3–95 years from multiple populations but highlighted India as a “particular group” for whom further data are needed. We aimed to derive predictive equations for spirometry in a rural Western Indian adult population.
We used spirometry data previously collected (2008–2012) from 1258 healthy adults (aged 18 years and over) by the Vadu Health and Demographic Surveillance System. We constructed sex-stratified prediction equations for forced expiratory volume in 1 s (FEV1), forced vital capacity (FVC), and FEV1/FVC using the Generalised Additive Model for Location, Scale and Shape (GAMLSS) method to derive the best fitting model of each outcome as a function of age and height.
When compared with GLI Ethnicity Codes 1 (White Caucasian) and 5 (Other/Mixed), the Western Indian adult population appears to have lower lung volumes on average, though the FEV1/FVC ratio is comparable. Both age and height were predictive of mean FEV1 and FVC; and for females, the variability of response was also dependent on age. FEV1/FVC appears to have a very strong age effect, highlighting the limitations of using a fixed 0.7 cut-off value.
The use of GLI normal values may result in overdiagnosis of lung disease in this population. We recommend that the values and equations generated from this study should be used by physicians in their routine practice for diagnosing disease and its severity in adults from the Western Indian population.
Abstract
The Western Indian adult population appears to have lower lung volumes compared to the Euro-American population. Use of GLI normal values may result in overdiagnosis of respiratory disease and locally derived equations should be used in clinical practice. https://bit.ly/3aMAN4s
Background
Spirometry is the gold standard for accurate and repeatable measurement of lung function. It is central to detecting obstructive airway diseases (OADs) such as asthma and chronic obstructive pulmonary disease (COPD) as well as suspecting restrictive lung diseases [1]. Interpretation of spirometry involves assessing lung function parameters such as forced expiratory volume in 1 s (FEV1), forced vital capacity (FVC) and FEV1/FVC in comparison with reference predicted values and the absolute values of FEV1/FVC to determine the presence and severity of OADs. Reference predicted values are typically calculated from thousands of healthy individuals and vary according to not only their age, sex and height, but also their ethnicity [2].
The Global Lung Function Initiative (GLI) reports normative reference values derived from over 160 000 data points in combined datasets from 72 centres in 33 countries [2]. Reference equations were derived for healthy individuals aged 3–95 years, principally for European-American (n=57 395), African-American (n=3545), North East Asian (n=4992) and South East Asian (n=8255) populations. For individuals not represented by these four groups, or of mixed ethnic origins, a composite equation taken as the average of the above equations has been provided to facilitate interpretation until a more appropriate solution is developed. India is highlighted as a “particular group” in whom further data are needed, as the limited existing literature suggests the predicted values may be “at least as low as those found in subjects of African origins” [2].
In the absence of predictive values specific to the Indian population, physicians and researchers depend on estimates for non-Indian populations such as GLI [2] or European Community Respiratory Health Survey (ECRHS) [3, 4] with a correction factor to interpret spirometry. This may lead to misclassification of disease in the Indian population if people of Indian origin differ in ethnicity and body habitus from other populations. To overcome this, there is a need to develop reference predictive values specific to the Indian population as well as to geographically diverse sub-populations within India (see figure 1 for an overview). Previous studies that have tried to do this have been based on very small numbers [5–7], a limited age range [8], or populations from northern or eastern India [8, 9]. In this study, we aimed to develop reference equations for lung function parameters in the Western Indian adult population.
Methodology
We analysed spirometry data collected in two population surveys conducted in the Vadu Health and Demographic Surveillance System (Vadu HDSS) population of the Vadu Rural Health Programme, KEM Hospital Research Centre, Pune, India, in the period 2008 to 2012. One of the two original surveys aimed to estimate the prevalence of OADs in a rural adult population from Western India [4]; the other aimed to study the phenotypic characterisation of the adult healthy population using Ayurveda science (a system of medicine with historical roots in the Indian subcontinent) [10]. Here we have used the data to develop reference equations for lung function parameters for this population.
Ethical consideration
Ethical approval for the secondary data analysis was obtained from the Ethics Committee of KEM Hospital Research Centre, Pune, who also provided approvals for the primary surveys in which the spirometry data were collected. All participants had provided written informed consent at the time of data collection for the original survey.
Study population
The study area was Vadu HDSS, which is located in Pune district of Maharashtra State in Western India. Western India also includes Gujarat, Madhya Pradesh, Goa and Karnataka states. Although this grouping is based on geography (figure 1) and not on the ethnicity of the population, the study population is distinct in terms of physique, nutrition, climate and culture compared with southern, northern and eastern populations [11–15]. There is some variation within the region in terms of local dialects and staple food as well as differences in terrain, rainfall and temperature. Defined geographically, the population of these states is approximately 20% of the total Indian population.
Study procedures
13 600 adults (men and women, aged 18 years and above) were selected for the two surveys using random number schemes from the Vadu HDSS sampling frame out of the total adult population of Vadu HDSS. We ensured that no participant was part of both population surveys. We had spirometry data from 2598 adults: 2083 able to provide spirometry from the OAD study and 515 randomly selected from the non-respiratory study. We performed spirometry using calibrated EasyWare NDD spirometers. Spirometry was performed by study staff from Vadu HDSS who were trained to conduct spirometry according to the American Thoracic Society (ATS) and European Respiratory Society (ERS) guidelines [16]. Quality checks of the spirometry data generated were performed by S. Salvi; (respiratory physicians Chest Research Foundation, Pune, India). Height and weight of the study participants were assessed by trained field workers using a calibrated stadiometer and adult weighing scale to a precision of 0.1 cm and 0.1 kg, respectively. Age and ethnicity were reported by the participants. Age was recorded in completed years, as exact birth dates for some older participants were not available.
Creation of dataset for current analysis
By virtue of routine surveillance of the population, we had detailed medical, personal, biomass usage, smoking status and socio-demographic information on all sampled individuals. Based on these data, we classified participants as healthy or non-healthy. Healthy participants were defined as never smokers and non-drinkers of alcohol with no occupational exposure to known hazards. They also had to be free of any self-reported illness/disease conditions or respiratory symptoms in the last 12 months, with no previous history of any lung disease and no previous hospitalisation for respiratory illness. Information to define healthy participants was collected using the Burden of Obstructive Lung Disease Initiative (BOLD) study questionnaire (refer to supplementary file 1 for BOLD questions used to define “healthy”) [17]. We also excluded anyone with a FEV1/FVC ratio of <0.5 (even if they had no respiratory symptoms and were apparently disease free) as this represents clear “non-healthy” lung function by comparison with any reference guidelines.
Statistical analysis
We performed the analysis using the Stata v15 software and R software version 3.5.3 [18]. Descriptive statistics were performed for all demographic and spirometry parameters. Following Quanjer et al. [2] and Cole et al. [19], we applied the GAMLSS methodology to fit flexible statistical models to each of the outcomes FEV1, FVC and FEV1/FVC; with age and/or height included as predictors. No assumption was made regarding the similarity of models across sex strata, and separate models were fitted within male and female groups (i.e. we used a completely stratified analysis). Full details about the statistical models fitted can be found in supplementary file 2.
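As a rough sketch of the model class only (the models actually fitted are described in supplementary file 2), a GAMLSS with a normally distributed response lets both the mean and the spread of the outcome depend on covariates. The symbols β and γ, the choice of a normal response and the log link for σ are illustrative assumptions here, not the authors' final specification.

```latex
\begin{aligned}
Y_i &\sim \mathcal{N}\!\left(\mu_i,\ \sigma_i^{2}\right), \\
\mu_i &= \beta_0 + \beta_1\,\mathrm{age}_i + \beta_2\,\mathrm{height}_i, \\
\log \sigma_i &= \gamma_0 + \gamma_1\,\mathrm{age}_i .
\end{aligned}
```

Setting γ1 = 0 in this sketch recovers an ordinary linear regression with constant variance.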
Models were fitted using the GAMLSS package [20] in R software version 3.5.3 [18]. The model predicted values, 5th centile values and 2.5th centile values, were each presented graphically and in look-up tables. Akaike's Information Criterion and the Schwarz Bayesian Criterion were calculated to assess model fit. Cox and Snell's pseudo R-squared values allowed us to assess the improvement in model goodness-of-fit compared with a null model with no covariates.
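The three fit statistics named above can all be computed from model log-likelihoods. The following is a minimal Python sketch of the standard formulae, not the authors' R code; the function names and the example log-likelihood values are hypothetical.

```python
import math


def aic(log_lik: float, n_params: int) -> float:
    """Akaike's Information Criterion: -2*logL + 2*k (lower is better)."""
    return -2.0 * log_lik + 2.0 * n_params


def sbc(log_lik: float, n_params: int, n_obs: int) -> float:
    """Schwarz Bayesian Criterion: -2*logL + k*ln(n) (lower is better)."""
    return -2.0 * log_lik + n_params * math.log(n_obs)


def cox_snell_r2(log_lik_null: float, log_lik_model: float, n_obs: int) -> float:
    """Cox and Snell pseudo R-squared relative to a null model with no covariates."""
    return 1.0 - math.exp(2.0 * (log_lik_null - log_lik_model) / n_obs)


# Hypothetical log-likelihoods, for illustration only (not results from this study).
n = 564
ll_null, ll_model = -600.0, -450.0
print(aic(ll_model, 4), sbc(ll_model, 4, n), cox_snell_r2(ll_null, ll_model, n))
```

Because the SBC penalty grows with ln(n), it favours simpler models than AIC in samples of this size.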
Internal validation of the derived models was accomplished using a bootstrap method. The regression coefficient (calibration slope) for the relationship between the bootstrap model predicted values and observed outcome values was calculated. We also calculated mean values of 95% limits of agreement for the bootstrap values compared to observed outcome values.
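One plausible reading of this bootstrap procedure is sketched below in Python: refit the model on each resample, predict for the original sample, and regress observed on predicted values to obtain a calibration slope. The exact procedure used by the authors is not specified in this section, so the resampling scheme, the simple one-predictor model, the helper names and the synthetic data are all assumptions.

```python
import random
from statistics import mean


def fit_line(x, y):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b


def bootstrap_calibration_slopes(x, y, n_boot=200, seed=1):
    """Calibration slope (observed regressed on predicted) for each bootstrap refit."""
    rng = random.Random(seed)
    n = len(x)
    slopes = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # resample with replacement
        a, b = fit_line([x[i] for i in idx], [y[i] for i in idx])
        predicted = [a + b * xi for xi in x]  # predict for the original sample
        _, slope = fit_line(predicted, y)
        slopes.append(slope)
    return slopes


# Synthetic data, for illustration only (not the study dataset).
rng = random.Random(0)
age = [rng.uniform(18, 80) for _ in range(300)]
fev1 = [4.0 - 0.03 * a + rng.gauss(0, 0.4) for a in age]
slopes = bootstrap_calibration_slopes(age, fev1)
print(round(mean(slopes), 3))
```

A mean slope close to 1 indicates that the refitted models neither over- nor under-predict systematically across the range of predicted values.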
In addition, lower limit of normal values (2.5th and 5th centiles) were computed corresponding to the GLI Ethnicity Code 5 (other/mixed) models using supplementary material from Quanjer et al. [2]. We then calculated the proportion of our population sample that was below the GLI model-derived 2.5th and 5th centiles, and compared these proportions to those using our Western Indian population models. We also calculated the proportion of participants below the GLI centile limits who were then reclassified above and below our new centile limits.
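The comparison described above amounts to flagging each participant against two sets of limits and cross-tabulating the flags. A minimal Python sketch follows; the function name, dictionary keys and the five example values are hypothetical and are not study data.

```python
def below_lln_summary(observed, lln_reference, lln_local):
    """Proportions below each lower limit of normal, and reclassification between them."""
    n = len(observed)
    below_ref = [o < r for o, r in zip(observed, lln_reference)]
    below_loc = [o < l for o, l in zip(observed, lln_local)]
    n_ref = sum(below_ref)
    # Below the reference limit but not below the locally derived limit.
    reclassified = sum(r and not l for r, l in zip(below_ref, below_loc))
    return {
        "prop_below_reference": n_ref / n,
        "prop_below_local": sum(below_loc) / n,
        "prop_reclassified_above": reclassified / n_ref if n_ref else 0.0,
    }


# Made-up FEV1 values (L) and per-participant limits, for illustration only.
fev1 = [2.1, 2.6, 3.0, 1.8, 2.4]
lln_gli = [2.5, 2.5, 2.7, 2.2, 2.3]
lln_local = [2.0, 2.1, 2.3, 1.9, 2.0]
summary = below_lln_summary(fev1, lln_gli, lln_local)
print(summary)
```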
Results
From 2598 individuals, we had unique, acceptable and reproducible spirometry data for 2007 participants. Spirometry data of 591 (22.7%) participants were excluded as they did not meet ATS/ERS criteria. After screening for healthy criteria, we had 1258 records available for final analysis (694 females and 564 males). The demographic characteristics of the study population and descriptive analysis of the key spirometry parameters (FEV1, FVC and FEV1/FVC) are presented in table 1.
Derived prediction equations
Prediction equations were developed using the GAMLSS package as recommended by the GLI [2]. According to the Schwarz Bayesian Criterion, linear regression models were found to be sufficient for modelling the outcomes in males, although not for modelling FEV1 and FVC in females. The model predicted values, 5th centile and 2.5th centile values (lower limit of normal) are shown graphically in a line plot for FEV1/FVC (figure 2). We used contour plots for the models that included both age and height (FEV1 and FVC) (supplementary file 3) because the results are in three dimensions. The model predicted values, 5th centile and 2.5th centile (lower limit of normal) values are shown in supplementary tables S1 to S18. The guideline-recommended fixed ratio of 0.7 [1] is shown on the FEV1/FVC plots for reference (figure 2).
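For a model with a normally distributed response, such as the linear regression models found sufficient for males, a centile is simply the predicted mean shifted by a multiple of the residual standard deviation. The Python sketch below illustrates this under that normality assumption; it does not apply to any non-normal model, and the predicted mean and standard deviation used are hypothetical rather than values from table 2.

```python
from statistics import NormalDist


def centile(mean_pred: float, sd: float, p: float) -> float:
    """p-th centile (0 < p < 1) of a normal response with the given mean and SD."""
    return mean_pred + NormalDist().inv_cdf(p) * sd


def z_score(observed: float, mean_pred: float, sd: float) -> float:
    """Standardised distance of an observed value from the predicted mean."""
    return (observed - mean_pred) / sd


# Hypothetical predicted FEV1 (L) and residual SD, for illustration only.
mu, sigma = 2.80, 0.40
lln5 = centile(mu, sigma, 0.05)    # 5th centile
lln25 = centile(mu, sigma, 0.025)  # 2.5th centile
print(round(lln5, 3), round(lln25, 3), round(z_score(2.0, mu, sigma), 2))
```

The 5th centile corresponds to a z-score of about −1.645 and the 2.5th centile to about −1.96, so the 2.5th centile is the stricter lower limit of normal.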
The prediction models for FEV1/FVC in males and females were linear regression models with age as a single predictor. FEV1/FVC appears to have a very strong age effect, as found in over 50 other studies [21]. This suggests that using a fixed 0.7 cut-off value regardless of age is not appropriate and will tend to result in under-diagnosis of people under the age of 40 years and over-diagnosis of people over 45 years (supplementary file 3). Prediction equations derived using the GAMLSS analysis are shown in table 2.
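To illustrate why a fixed cut-off misclassifies in both directions, the Python sketch below compares an age-dependent lower limit of normal with the fixed value of 0.7. The intercept, slope and standard deviation are hypothetical and chosen only for illustration; they are not the coefficients reported in table 2, and a normal response is assumed.

```python
from statistics import NormalDist

# Hypothetical linear model for FEV1/FVC as a function of age (not from table 2).
INTERCEPT, SLOPE_PER_YEAR, SD = 0.905, -0.0025, 0.06
Z5 = NormalDist().inv_cdf(0.05)  # about -1.645


def lln_ratio(age: float) -> float:
    """Age-specific 5th centile of FEV1/FVC under the hypothetical model."""
    return INTERCEPT + SLOPE_PER_YEAR * age + Z5 * SD


def crossing_age(cutoff: float = 0.7) -> float:
    """Age at which the 5th centile equals the fixed cut-off."""
    return (cutoff - INTERCEPT - Z5 * SD) / SLOPE_PER_YEAR


for age in (30, 45, 60):
    side = "above" if lln_ratio(age) > 0.7 else "below"
    print(age, round(lln_ratio(age), 3), f"LLN is {side} 0.7")
print(round(crossing_age(), 1))
```

Below the crossing age the lower limit of normal sits above 0.7, so a person can be abnormal for their age yet pass the fixed cut-off; above it the lower limit falls below 0.7, so healthy older adults can be flagged by the fixed cut-off.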
Overall, the bootstrap results confirmed the validity of the models (table 3). The mean calibration slopes are very close to 1 for all models except possibly for FVC in females and FEV1/FVC in females. The pseudo R-squared values for these models show a similar pattern; with relatively low R-squared values for FVC and FEV1/FVC in females (table 4). However, the alternative models we considered did not show any clear improvement in model performance (models indicated with + in tables 3 and 4).
Comparison with other studies
The comparison of predicted values (FEV1, FVC and FEV1/FVC) generated from this study with other previously published Indian studies and with GLI values (Ethnicity Code 1: Caucasian population; and Code 5: other/mixed population) for both males and females is illustrated in figure 3. Regarding the comparison of lower limit of normal values between this study and GLI Ethnicity Code 5 (other/mixed population), a much higher proportion of the study population sample was considered to be below the GLI limits than when using the limits generated from this study for FEV1 and FVC (table 5). Indeed, a high proportion of the sample participants were reclassified to be above the limits based on our new models for FEV1 and FVC when compared with the GLI limits, and a small proportion for FEV1/FVC (table 5). This suggests that GLI Ethnicity Code 5 has the potential to overdiagnose respiratory disease in the Western Indian population, particularly in terms of FEV1 and FVC. This finding is confirmed by comparison of the absolute values of the model residuals (differences between observed and predicted values), which shows that the absolute model residuals of GLI Code 5 are between 0.25 and 0.27 greater on average for FEV1 and FVC compared with the models used in this study (figures A and B in supplementary file 4). Bland–Altman plots also showed clear differences between the predicted values (figure 4a and b), particularly for those participants with relatively high values of FEV1 and FVC.
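The two comparisons in this paragraph, mean absolute residuals and Bland–Altman agreement between two sets of predictions, can be sketched in Python as follows. The function names and the four example values are hypothetical and are not study data; the 1.96 multiplier is the conventional choice for 95% limits of agreement.

```python
from statistics import mean, stdev


def mean_abs_residual(observed, predicted):
    """Average absolute difference between observed and predicted values."""
    return mean(abs(o - p) for o, p in zip(observed, predicted))


def limits_of_agreement(pred_a, pred_b):
    """Bland-Altman bias and 95% limits of agreement between two sets of predictions."""
    diffs = [a - b for a, b in zip(pred_a, pred_b)]
    bias = mean(diffs)
    spread = 1.96 * stdev(diffs)
    return bias - spread, bias, bias + spread


# Made-up FEV1 values (L), for illustration only.
observed = [2.0, 2.5, 3.0, 3.5]
pred_reference = [2.4, 2.8, 3.4, 3.8]  # e.g. from an external reference equation
pred_local = [2.1, 2.4, 3.1, 3.4]      # e.g. from a locally derived equation

extra_error = mean_abs_residual(observed, pred_reference) - mean_abs_residual(observed, pred_local)
lower, bias, upper = limits_of_agreement(pred_reference, pred_local)
print(round(extra_error, 2), round(lower, 2), round(bias, 2), round(upper, 2))
```

A positive bias here means the reference equation predicts higher values than the local equation, which is the direction that would push healthy participants below a reference-based lower limit of normal.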
Discussion
Clinicians rely on accurate reference values for lung function parameters to define normal lung function and to identify values that suggest disease, most commonly asthma or COPD. Normal ranges for lung function are dependent on a number of factors such as age, sex, height and also ethnicity. Our findings, from population surveys of healthy individuals, present prediction equations and predictive values for the Western Indian adult population. Using GAMLSS models, it was found that age and height were the main predictors of the FEV1 and FVC spirometry parameters in both sexes. For FEV1/FVC, height was not a significant predictor of outcome and so was not included in our models.
Comparison with data from other surveys
When compared with the data from other Indian studies and the GLI, the Western Indian adult population had lower lung volumes on average than the European-American population [2] or populations from Northern India [9] (refer to figure 3 for comparison) [2, 22, 23]. Indeed, we found that many of our participants would have been classified as being below the lower limit of normal using the GLI Ethnicity Code 5 (other/mixed populations) limits. This is a significant finding as it indicates that using reference predictive values derived from non-Indian (or Northern Indian) populations will overdiagnose some lung conditions in the Western Indian adult population, though the FEV1/FVC ratio, which identifies OADs, is similar to GLI. Figure 1 indicates the location of previous studies in India.
In contrast to other studies, we did not find that height was a significant predictor of FEV1/FVC. This may be because we did not have data from children, restricting the range of the explanatory variable (height). We also observe that the relationships between height and outcome were very similar for FEV1 and FVC individually (table 2), and so when we take the ratio there is no residual effect of height.
The question arises whether the smaller lung volumes observed in our study compared to other populations are due to inherent ethnic differences or are the result of environmental influences on the lungs. For our analysis, we defined the healthy population as rigorously as possible by selecting individuals with no respiratory symptoms, no apparent illnesses and no exposure to smoking or known occupational hazards. This excluded more elderly women than elderly men (see demography in table 1), possibly because the ubiquitous exposure to biomass fuel preferentially affects women (and girls from a young age) as they are responsible for cooking. We also excluded individuals with clearly abnormally low FEV1/FVC ratios even if they had no symptoms or history of disease. However, clinical examinations were not performed to confirm reported health status. Our findings are derived from adults, and from one rural region of India, limiting generalisability.
There are multiple factors that influence lung development (e.g. exposure to biomass fuel, vehicular pollution, nutritional deficiencies, (passive) smoking, sedentary lifestyle, obesity) which will play out differently in rural and urban communities. Children living in rural communities have smaller lung volumes than their urban counterparts [21] but the situation in adults may be less clear cut as young people from rural communities move into cities and experience a lifetime of exposure to environmental factors. However, comparison with a small study in a slightly different population (from Mumbai) showed similar lung volumes to our rural study [5]. In addition, a survey from the USA confirmed similarly reduced lung volumes in Asian Indians (specific ethnic groups were not identified) living in Chicago, IL [24], and assumed not to have been exposed to biomass fuel or high levels of pollution (figure 3 illustrates the comparison). Our findings may thus have relevance to physicians caring for migrant Western Indian populations in the USA, Europe (especially the UK), and the Gulf States, albeit with the caveat that the US study did not report the ethnic group of the Asian Indian migrants or duration of their residence in the USA.
Our study has many strengths. Our sample size of 1258 healthy participants is four times that of previous studies in the Western Indian population, though it remains small for creating generalisable equations. This is also the first study in the Indian population which has used GAMLSS models to derive reference equations and predictive values. This meant that we were not restricted to certain types of models (e.g. linear regression models) with their implicit assumptions. Instead, our aim was to take a flexible parametric approach via GAMLSS models in order to maximise the fit of the models to the data and thereby improve predictions. Nevertheless, for the lung function outcomes in males, our final models were, in fact, linear regression models with age and/or height, confirming the validity of using such models, at least in the Western Indian male population. We also performed internal validation of the statistical models by calculating model diagnostics and using bootstrap methods. Pseudo R-squared values were low, particularly for FEV1/FVC in females. Note that this does not suggest that the models were inferior: our thorough modelling approach means that it is unlikely that better fitting models are possible, at least based on the data we collected. It does mean, however, that there is still a large amount of unexplained variability in response and using the models to try to make exact predictions is inadvisable (especially for FEV1/FVC). This highlights the importance of using both the lower limit of normal values and the predictive values when assessing lung function.
We realise that to define those who were truly healthy, we would need to examine the individuals in greater detail. Even then, it is difficult to confirm if someone is truly healthy with the complete absence of any pathology. We thus made best use of the available data to define healthy individuals.
Conclusion
Reflecting the ethnically diverse populations in India, Western Indian individuals from rural populations have small lung volumes relative to European-Americans or populations from Northern India. Use of GLI normal values is thus potentially misleading in this population and may result in overdiagnosis of lung disease. We recommend that the values and equations generated from this study should be used by physicians in their routine practice for diagnosing disease and its severity in adults from Western Indian populations. There are large Western Indian migrant communities in Europe (principally in the UK) and the USA for whom these data may also be relevant. We will be sharing our data with the GLI network.
Supplementary material
Please note: supplementary material is not edited by the Editorial Office, and is uploaded as it has been supplied by the author.
Supplementary file 1: BOLD questions used to define “healthy” ERJ-02129-2019.Supplementary_file_1
Supplementary file 2: Statistical models fitted ERJ-02129-2019.Supplementary_file_2
Supplementary file 3: Contour plots ERJ-02129-2019.Supplementary_file_3
Supplementary file 4: Bland-Altman plots of predicted values and residuals ERJ-02129-2019.Supplementary_file_4
Supplementary tables ERJ-02129-2019.Supplementary_tables
Acknowledgments
We gratefully acknowledge all the study participants for their participation in the study. We also acknowledge the Field Research Assistants of Vadu HDSS for data collection and performing spirometry of study participants. R. Parker is partly supported in this work by NHS Lothian via the Edinburgh Clinical Trials Unit.
Footnotes
This article has supplementary material available from erj.ersjournals.com
Author contributions: S. Juvekar and H. Pinnock led the development of the study. D. Agarwal wrote the first draft of the manuscript which was critically reviewed and refined by R. Parker, S. Juvekar, H. Pinnock, S. Roy, D. Ghorpade, S. Salvi, and P. Khatavkar. R. Parker performed the statistical analysis. RESPIRE UMC members provided advice and contributed to discussions. All authors approved the final version.
The RESPIRE collaboration: The University of Edinburgh, Edinburgh, UK: Harry Campbell, Harish Nair, Steve Cunningham, Monica Fletcher, Liz Grant, Aisha Holloway, Aziz Sheikh, Pam Smith; The Allergy & Asthma Institute, Islamabad, Pakistan: Osman Yusuf, Shahida Yusuf; Maternal Neonatal and Child Health Research Network, Islamabad, Pakistan: Hana Mahmood; University of Malaya, Malaysia: Wong LP; KEM Hospital Research Centre, Pune, India: Anand Kawade; Aga Khan University, Karachi, Pakistan: Sajid Bashir Soofi; Christian Medical College, Vellore, India: Rita Isaac; Victoria University of Wellington, New Zealand: Colin Simpson.
Conflict of interest: R. Parker has nothing to disclose.
Conflict of interest: H. Pinnock reports grants from NIHR Global Health Research (ref: 16/136/109, NIHR Global Respiratory Health (RESPIRE) Unit), during the conduct of the study.
Conflict of interest: S. Roy reports grants from NIHR Global Health Research (ref: 16/136/109, NIHR Global Respiratory Health (RESPIRE) Unit), during the conduct of the study.
Conflict of interest: D. Ghorpade has nothing to disclose.
Conflict of interest: S. Salvi has nothing to disclose.
Conflict of interest: P. Khatavkar has nothing to disclose.
Conflict of interest: S. Juvekar reports grants from NIHR Global Health Research (ref: 16/136/109, NIHR Global Respiratory Health (RESPIRE) Unit), during the conduct of the study.
Conflict of interest: D. Agarwal reports grants from NIHR Global Health Research (ref: 16/136/109, NIHR Global Respiratory Health (RESPIRE) Unit), during the conduct of the study.
Support statement: This study was funded by the NIHR Global Health Research Unit in Respiratory Health (RESPIRE) at the Usher Institute of Population Health Sciences and Informatics. RESPIRE was commissioned by the National Institute for Health Research using Official Development Assistance (ODA) funding. The views expressed are those of the author(s) and not necessarily those of the NHS, the NIHR or the Department of Health and Social Care. Funding information for this article has been deposited with the Crossref Funder Registry.
- Received August 27, 2019.
- Accepted April 24, 2020.
- Copyright ©ERS 2020