Abstract
The aim of this study was to quantify the global prevalence of chronic obstructive pulmonary disease (COPD) by means of a systematic review and random effects meta-analysis.
PubMed was searched for population-based prevalence estimates published during the period 1990–2004. Articles were included if they: 1) provided total population or sex-specific estimates for COPD, chronic bronchitis and/or emphysema; and 2) gave method details sufficiently clearly to establish the sampling strategy, approach to diagnosis and diagnostic criteria.
Of 67 accepted articles, 62 unique entries yielded 101 overall prevalence estimates from 28 different countries. The pooled prevalence of COPD was 7.6% from 37 studies, of chronic bronchitis alone (38 studies) was 6.4% and of emphysema alone (eight studies) was 1.8%. The pooled prevalence from 26 spirometric estimates was 8.9%. The most common spirometric definitions used were those of the Global Initiative for Chronic Obstructive Lung Disease (13 estimates). There was significant heterogeneity, which was incompletely explained by subgroup analysis (e.g. age and smoking status).
The prevalence of physiologically defined chronic obstructive pulmonary disease in adults aged ≥40 yrs is ∼9–10%. There are important regional gaps, and methodological differences hinder interpretation of the available data. The efforts of the Global Initiative for Chronic Obstructive Lung Disease and similar groups should help to standardise chronic obstructive pulmonary disease prevalence measurement.
Chronic obstructive pulmonary disease (COPD) is a leading cause of death worldwide 1. In addition to generating high healthcare costs 2, COPD imposes a significant burden in terms of disability and impaired quality of life 3. Unlike many leading causes of death and disability, COPD is projected to increase in much of the world as smoking frequencies rise and the population ages 4, 5. Despite the importance of this disease, the general perception is that the prevalence of COPD is not well measured. Accurate prevalence information is important for several reasons, including documentation of COPD's impact on disability, quality of life and costs, and for helping to inform public health planning 6. It is also important to establish baseline prevalence rates so that researchers can monitor trends, including the success or failure of control efforts.
Previous publications have reviewed the literature qualitatively, but not quantitatively 7, 8. These reviews identified potential sources of interstudy variation that could affect reported prevalence estimates. Historically, COPD has been defined symptomatically as chronic bronchitis (CB), anatomically as emphysema, or, most recently, physiologically as airway obstruction 9. The physiological definition has become the most common 10, 11, although studies using other case definitions are still published. Even with growing consensus on the use of spirometry as a physiological criterion, spirometric cut-off points for establishing airflow obstruction differ significantly 12. Since lung function declines with age, COPD prevalence estimates are highly dependent upon the age range and distribution of subjects included. As smoking is the primary risk factor for COPD, prevalence estimates may also vary by underlying smoking frequencies. With the rise in smoking frequencies in females, there are ongoing controversies as to the relative impact of smoking on the development of COPD in males and females. Finally, the contribution of other inhaled exposures (e.g. occupational smoke or dust, ambient air pollution, and biomass fuel) to population prevalence rates has yet to be determined for most countries.
In order to describe the global prevalence of COPD quantitatively, a systematic review and meta-analysis of the published medical literature was conducted.
METHODS
PubMed was searched for population-based prevalence estimates published during the period 1990–2004. The search terms included “chronic obstructive pulmonary disease”, “COPD”, “chronic bronchitis”, “emphysema”, “airway obstruction”, “epidemiology” and “prevalence”. Details of the search strategy are presented in Appendix 1.
Articles were included if they: 1) provided total population or sex-specific estimates for COPD, CB and/or emphysema; and 2) gave method details sufficiently clearly to establish the sampling strategy, approach to diagnosis and diagnostic criteria used by the investigators. Sampling strategy was assessed to determine whether or not the study could be generalised to the rest of the country or region (i.e. whether a representative sample of the population was selected). Studies that provided data on only specific subpopulations (e.g. smokers or occupational studies) were excluded, as were non-English language studies with duplicate publications in English.
Based on these explicit criteria, two researchers reviewed a random 10% sample of abstracts identified by the search strategy. Inter-rater agreement was assessed using the kappa statistic, and the remaining abstracts were split evenly between the reviewers once a sufficient level of agreement was achieved (kappa >0.7). The full text of all accepted publications was obtained and their content reviewed for final inclusion. Non-English language articles were translated into English. The references of all English language articles with primary or secondary COPD prevalence estimates were also reviewed in order to identify additional estimates that may have been missed by the initial search strategy.
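The agreement check described above can be illustrated with a minimal sketch of Cohen's kappa for two reviewers. The function name and the ten include/exclude decisions are hypothetical and are not data from this review; the specific kappa variant used by the authors is not stated, so the standard two-rater form is assumed.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters' categorical decisions on the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Observed proportion of items on which the two raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's marginal frequencies.
    freq_a = Counter(rater_a)
    freq_b = Counter(rater_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical category throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical screening decisions on ten abstracts (illustration only).
reviewer_1 = ["include", "exclude", "exclude", "include", "exclude",
              "exclude", "exclude", "include", "exclude", "exclude"]
reviewer_2 = ["include", "exclude", "exclude", "include", "exclude",
              "exclude", "include", "include", "exclude", "exclude"]

kappa = cohen_kappa(reviewer_1, reviewer_2)
sufficient_agreement = kappa > 0.7  # threshold stated in the text
```

In this hypothetical sample the reviewers agree on nine of ten abstracts, and the chance-corrected agreement exceeds the 0.7 threshold.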
For each accepted study, the following data, when available, were abstracted: author, year of publication, year of data collection, sample size, percentage prevalence (or number of COPD cases), age range and mean age of study subjects, percentage males, percentage smokers (combined smokers and ex-smokers), country, study setting (rural, urban or mixed), response rate, diagnosis (COPD, CB or emphysema), and diagnostic criterion (chronic productive cough, spirometry, patient-reported diagnosis, physician diagnosis or physical/radiographic findings). Data were also collected on quality of study design and quality of data analysis, which were classified as good, average or poor. Information about spirometric quality was collected when appropriate. The guidelines used for assessing study quality are presented in Appendix 2.
For each study, sex-, smoking- and age-specific prevalence estimates were abstracted when reported. If not specifically reported, these estimates were calculated based on the data provided. For smoking status, estimates for smokers, ex-smokers and nonsmokers were included. For consistency, estimates in which ex-smokers were combined with smokers or nonsmokers were excluded. Since the majority of studies did not report mean age, prevalence estimates were assigned to an age category based upon judgment of which age group was most appropriate. Age-specific estimates were grouped into two age categories with a cut-off of 40 yrs; the ≥40-yrs age group was further subdivided into 40–64 yrs and ≥65 yrs.
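The age strata described above can be sketched as follows. The authors assigned estimates by judgment; this hypothetical helper only identifies which strata wholly contain a reported age range (in whole years), and returns an empty list when the range straddles the 40-yr cut-off and a judgment would be needed. The names `STRATA` and `containing_strata` are assumptions for illustration.

```python
import math

# Age strata from the text, as inclusive bounds in whole years.
STRATA = {
    "<40 yrs": (0, 39),
    ">=40 yrs": (40, math.inf),
    "40-64 yrs": (40, 64),
    ">=65 yrs": (65, math.inf),
}


def containing_strata(low, high=math.inf):
    """Return the strata that wholly contain the age range [low, high]."""
    if low < 0 or high < low:
        raise ValueError("invalid age range")
    return [name for name, (lo, hi) in STRATA.items()
            if lo <= low and high <= hi]


# Hypothetical study age ranges (illustration only).
middle_aged = containing_strata(45, 60)  # fits >=40 yrs and 40-64 yrs
elderly = containing_strata(65)          # open-ended range from 65 yrs
straddling = containing_strata(20, 70)   # spans the cut-off: needs judgment
```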
For the meta-analysis, the conservative random-effects empirical Bayesian method of Hedges and Olkin 13 was used to pool the estimated effects. Within-group heterogeneity was evaluated using Cochran's Chi-squared test (also called the Q test) 14 and the I-squared statistic 15. For the Q test, significance was set at p<0.10. For subgroup analyses, the heterogeneity between groups was also calculated using the Q test. Since many studies provided multiple prevalence estimates using various definitions, double-counting from the same study was avoided by using a hierarchical ranking system based on diagnostic criteria (Appendix 3).
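As an illustration of random-effects pooling with the Q and I-squared statistics, the following minimal sketch uses the DerSimonian-Laird moment estimator on untransformed proportions. This is a substitute chosen for brevity and is not the empirical Bayesian method of Hedges and Olkin used in the present analysis; the function name and the three study estimates are hypothetical.

```python
def pool_random_effects(estimates):
    """Pool (prevalence, sample size) pairs with a DerSimonian-Laird model.

    Prevalence is a proportion strictly between 0 and 1. Returns the pooled
    proportion, Cochran's Q, its degrees of freedom, the between-study
    variance tau2 and I-squared (as a percentage).
    """
    if len(estimates) < 2:
        raise ValueError("at least two estimates are required")
    if any(not 0.0 < p < 1.0 or n <= 0 for p, n in estimates):
        raise ValueError("proportions must lie in (0, 1) and sizes be positive")

    props = [p for p, _ in estimates]
    # Binomial sampling variance of each proportion.
    variances = [p * (1.0 - p) / n for p, n in estimates]
    weights = [1.0 / v for v in variances]
    sum_w = sum(weights)

    # Fixed-effect (inverse-variance) mean, needed to compute Q.
    fixed = sum(w * p for w, p in zip(weights, props)) / sum_w
    q = sum(w * (p - fixed) ** 2 for w, p in zip(weights, props))
    df = len(estimates) - 1

    # Moment estimate of the between-study variance, truncated at zero.
    c = sum_w - sum(w * w for w in weights) / sum_w
    tau2 = max(0.0, (q - df) / c)

    # Random-effects weights add tau2 to each within-study variance.
    re_weights = [1.0 / (v + tau2) for v in variances]
    pooled = sum(w * p for w, p in zip(re_weights, props)) / sum(re_weights)

    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return {"pooled": pooled, "q": q, "df": df, "tau2": tau2, "i2": i2}


# Hypothetical study estimates (illustration only).
result = pool_random_effects([(0.05, 1000), (0.08, 2000), (0.12, 1500)])
```

The p-value for the Q test would be obtained by comparing `q` with a Chi-squared distribution on `df` degrees of freedom, which is omitted here to keep the sketch free of external libraries.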
RESULTS
A detailed diagram of the review process is presented in figure 1. The initial search identified 5,464 studies of potential interest, including 978 non-English language articles. After title and abstract review, 5,108 studies were excluded. Of 356 studies meeting the initial inclusion criteria, 64 were accepted for data abstraction. Articles were excluded due to duplicate publication, lack of adequate data for meta-analysis or inclusion/exclusion criteria that made the study unrepresentative of the population. Three additional articles were identified through hand-searches of relevant bibliographies, bringing the total number of accepted articles to 67.
Among the 67 accepted articles, several presented data from the same study group or survey. In these cases, the data were merged, leaving a total of 63 unique entries in the meta-analysis. A total of 62 studies reported 101 overall prevalence estimates from 28 different countries, and one additional study limited to females provided a sex-specific estimate (table 1). Two studies reported data collected as part of the European Community Respiratory Health Survey; these included data from multiple European countries. The 101 overall estimates included some duplicate estimates from the same study (e.g. patient-reported and spirometrically determined COPD).
Pooled prevalence estimates for all diagnostic groups are presented in table 2. After eliminating duplicate estimates from the same study, 37 estimates for COPD (including studies that reported a combined rate for CB and emphysema) yielded a pooled prevalence estimate of 7.6%. Objective definitions tended to produce higher prevalence estimates than patient-reported diagnoses. For example, spirometric criteria resulted in a higher prevalence estimate compared with patient-reported COPD (9.2 versus 4.9%, respectively). The pooled prevalence of CB alone was 6.4% from 38 studies. Eight studies reported emphysema alone, with a pooled prevalence of 1.8%.
Diagnostic criteria for spirometry-based prevalence estimates from 26 studies are presented in table 3. Of the 26 spirometric COPD estimates, five studies excluded asthma 27, 48, 54, 57, 67. A sensitivity analysis excluding these five studies did not appreciably affect the pooled prevalence estimate. The most common spirometric definitions were based upon criteria developed by the Global Initiative for Chronic Obstructive Lung Disease (GOLD; 13 estimates) 11. A few studies used older versions of criteria published by the European Respiratory Society in 1995 (two estimates) 82 and American Thoracic Society (ATS) in 1987 (two estimates) 83. All of these guidelines suggest that post-bronchodilator values should be used to define obstruction; however, only nine studies reported any type of post-bronchodilator measurement. Of 10 studies using GOLD criteria, only one study used post-bronchodilator values in the analysis 53. There was wide variation in the reporting of spirometric quality control. For example, 81% of studies identified the type of spirometer used, but less than half (46%) mentioned reproducibility criteria or made any mention of calibration procedures or frequency.
As expected, there was significant heterogeneity in all analyses. In order to address this, analyses limited to a diagnosis of COPD were performed, examining subgroups defined by age group, smoking status, sex, World Health Organization (WHO) region, study setting (urban versus rural) and study quality (table 4). Pooled prevalence estimates were significantly higher in strata containing persons aged ≥40 yrs (9.0%), smokers (15.4%), males (9.8%) and persons with urban residence (10.2%). Prevalence did not vary significantly by WHO region, although these results should be interpreted with caution since only the European region had more than four estimates. Results were not appreciably affected by study quality.
DISCUSSION
The present report provides the first quantitative summary of the world literature on COPD prevalence, with high-quality estimates for COPD in important subgroups defined by age, smoking status and sex. The available data suggest that the prevalence of physiologically defined COPD in adults aged ≥40 yrs is 9–10%. This is consistent with the range of 4–10% cited in a previous qualitative review 7. These results highlight the lack of good quality prevalence data from outside Europe and North America. It was not possible to locate any spirometric studies reporting COPD prevalence in the African or Eastern Mediterranean regions. In addition, only three or four reports each were found from the American, South-East Asian and Western Pacific regions. Much of the available literature from Africa is limited to CB, and has been well summarised by Chan-Yeung et al. 8. Tan et al. 84 used a statistical model to estimate the prevalence of moderate-to-severe COPD in the Asia–Pacific region, with a regional estimate of 6.3% and projected country-specific rates of 3.5–6.7%, which are generally consistent with the pooled estimates presented here.
Significant heterogeneity was found in prevalence measures, which was incompletely explained by subgroup analyses. Although prevalence differences among countries are not unexpected, it is important to explore potential sources of heterogeneity. One such source is the diversity of diagnostic definitions. Clinical diagnoses or, more properly, patient-reported diagnoses clearly appear to underestimate disease prevalence. Spirometry can provide better estimates, but is not without limitations. Even among studies that used spirometric definitions of COPD, the most common diagnostic criterion, GOLD stage II, was used in only a quarter of studies. Pooled prevalence estimates varied widely by definition, from 5.5% (GOLD stage II) to >20% (ATS, 1987), a wider range than might be expected from methodological differences alone 7. However, the efforts of the GOLD are clearly having an effect. The definition proposed by the GOLD, forced expiratory volume in one second (FEV1)/forced vital capacity (FVC) of <0.70, has been adopted as an epidemiological case definition by the Burden of Obstructive Lung Disease (BOLD) initiative and the Latin-American Project for the Investigation of Pulmonary Obstruction (PLATINO), both of which measure COPD prevalence in multiple countries 6, 85. Although new prevalence measurements have been produced by both groups, they were not available in print during the period covered by this review. Movements toward a consistent spirometric criterion should help reduce the diversity reflected in the literature 11, 86.
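The fixed-ratio case definition cited above (FEV1/FVC <0.70) can be expressed as a minimal sketch. The function names and the four pairs of measurements are hypothetical; the sketch assumes that the values supplied are the ones intended for classification (e.g. post-bronchodilator values where the guideline calls for them) and ignores severity staging.

```python
GOLD_FIXED_RATIO = 0.70  # FEV1/FVC cut-off cited in the text


def is_obstructed(fev1_litres, fvc_litres, threshold=GOLD_FIXED_RATIO):
    """True if the FEV1/FVC ratio falls below the fixed threshold."""
    if fev1_litres <= 0 or fvc_litres <= 0:
        raise ValueError("volumes must be positive")
    return fev1_litres / fvc_litres < threshold


def crude_prevalence(measurements):
    """Proportion of (FEV1, FVC) pairs classified as obstructed."""
    if not measurements:
        raise ValueError("no measurements supplied")
    cases = sum(is_obstructed(fev1, fvc) for fev1, fvc in measurements)
    return cases / len(measurements)


# Hypothetical (FEV1, FVC) pairs in litres (illustration only).
sample = [(2.1, 3.5), (3.0, 3.8), (1.5, 2.6), (2.8, 3.6)]
prevalence = crude_prevalence(sample)
```

Changing `threshold`, or adding a severity requirement, alters which subjects count as cases, which is one way in which differing spirometric definitions produce differing prevalence estimates from the same data.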
Some of the variation in COPD prevalence may reflect technical issues related to the collection of spirometric data. At the most basic level, the quality of spirometric testing can affect the assignment of a diagnostic label. An inadequate FVC, for example, can lead to overestimation of the FEV1/FVC ratio and thus underestimation of prevalence. It was not possible to grade the quality of spirometry, but the reporting of spirometric quality criteria, which varied widely, was examined. Both the BOLD initiative and PLATINO have embraced systematic quality control criteria for spirometry as an essential component of their programmes 6, 85. Between-study differences in the handling of substandard spirometric results may also affect prevalence estimates. The likelihood of producing reproducible spirometric measurements decreases with increasing severity of lung disease 87. Thus the exclusion of nonreproducible tests is likely to selectively exclude a higher proportion of persons with obstructive disease, leading to prevalence underestimation. Another source of variation may be the use of post-bronchodilator lung function testing. Most of the major COPD guidelines indicate that post-bronchodilator results should be used to identify obstruction. From the present spirometric studies, however, only approximately a third administered a bronchodilator to any of the subjects tested, and half of these only gave a bronchodilator to subjects with abnormal results during the initial reading. The impact of post-bronchodilator testing on COPD prevalence estimates can be substantial 88.
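The effect of an inadequate FVC manoeuvre noted above can be shown with a short worked example. All volumes are hypothetical, and the 0.70 cut-off is the fixed ratio cited in this report; the sketch shows how an incomplete exhalation can move a subject across the threshold.

```python
THRESHOLD = 0.70  # fixed FEV1/FVC cut-off cited in the text


def fev1_fvc_ratio(fev1_litres, fvc_litres):
    """FEV1/FVC ratio from volumes in litres."""
    if fev1_litres <= 0 or fvc_litres <= 0:
        raise ValueError("volumes must be positive")
    return fev1_litres / fvc_litres


# Hypothetical subject: FEV1 is unaffected, but the second manoeuvre stops
# early, so the measured FVC is smaller than the complete one.
fev1 = 2.0
complete_fvc = 3.0
truncated_fvc = 2.7

ratio_complete = fev1_fvc_ratio(fev1, complete_fvc)    # about 0.67
ratio_truncated = fev1_fvc_ratio(fev1, truncated_fvc)  # about 0.74

case_with_complete_fvc = ratio_complete < THRESHOLD
case_with_truncated_fvc = ratio_truncated < THRESHOLD
```

With the complete manoeuvre the subject is counted as a case; with the truncated FVC the ratio rises above the cut-off and the subject is missed, which lowers the measured prevalence.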
Other important sources of heterogeneity include known rate relationships within epidemiologically important subgroups, with age strata perhaps the most important. There was a wide diversity of age ranges across the studies in the present review, and few papers reported summary age statistics or age distribution data that might have allowed mathematically robust age comparisons. As a result, the definition for age subgroups was imprecise. The cut-off at age 40 yrs was chosen to reflect the methodology proposed by the BOLD initiative 6. Indeed, the pooled estimate of 10% for adults aged ≥40 yrs may be the most useful parameter to emerge from the present study.
Subgroup analyses also showed that, as expected, rates were higher in smokers, males and urban residents. However, reporting of prevalence estimates for these subgroups was imperfect. For example, only 73% of studies provided separate prevalence estimates for males and females, and 46% provided separate estimates for smokers. Since these subgroups were not the primary interest, however, several studies that limited their study population to smokers alone were excluded. Similarly, several studies limited to various high-risk occupational settings were excluded. It was not possible to examine true interactions between age, sex and smoking status due to the limitations of the meta-analytical technique, as well as the limited details of results reported in most publications.
In order to avoid double-counting, a hierarchical system was used to choose between multiple estimates drawn from the same population. In doing so, assumptions were made that might have introduced bias. In order to evaluate this, these hierarchical results were compared with models using the lowest (conservative) and highest (liberal) prevalence estimate within each subgroup (data not shown). In most subgroups, the pooled prevalence estimate for the hierarchical model lay between the conservative and liberal estimates.
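The comparison described above can be sketched as a selection step applied before pooling. The ranking in `CRITERION_RANK` is a hypothetical ordering for illustration only, since Appendix 3 is not reproduced here; the study names and prevalence values are likewise invented, and the unweighted mean is a crude stand-in for the pooled estimate.

```python
# Hypothetical ranking (lower = preferred); NOT the ranking in Appendix 3.
CRITERION_RANK = {
    "spirometry": 0,
    "physician diagnosis": 1,
    "patient-reported diagnosis": 2,
    "chronic productive cough": 3,
}


def select_estimates(study_estimates, strategy):
    """Keep one (criterion, prevalence) pair per study."""
    selected = {}
    for study, candidates in study_estimates.items():
        if strategy == "hierarchical":
            chosen = min(candidates, key=lambda c: CRITERION_RANK[c[0]])
        elif strategy == "conservative":
            chosen = min(candidates, key=lambda c: c[1])  # lowest estimate
        elif strategy == "liberal":
            chosen = max(candidates, key=lambda c: c[1])  # highest estimate
        else:
            raise ValueError(f"unknown strategy: {strategy}")
        selected[study] = chosen
    return selected


def crude_mean(selected):
    """Unweighted mean prevalence; a stand-in for a pooled estimate."""
    return sum(prev for _, prev in selected.values()) / len(selected)


# Hypothetical studies with more than one estimate each (illustration only).
studies = {
    "study_a": [("spirometry", 0.10), ("patient-reported diagnosis", 0.05)],
    "study_b": [("physician diagnosis", 0.06), ("chronic productive cough", 0.08)],
}

hierarchical = select_estimates(studies, "hierarchical")
conservative = select_estimates(studies, "conservative")
liberal = select_estimates(studies, "liberal")
```

In a full analysis, each of the three selections would be pooled with the random-effects model and the results compared within each subgroup.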
Articles published prior to 1990 were excluded in order to avoid temporal bias in smoking/COPD trends, which meant excluding several population-based prevalence estimates from the USA that were conducted in the 1960s, 1970s and 1980s. In addition, although the US National Health Interview Survey is conducted annually, only the most recent publication from the survey was included. Consequently, the results over-represent European studies in comparison with North American studies.
Conclusions
Although prevalence estimates for chronic obstructive pulmonary disease are being published for many areas of the world, high-quality estimates are lacking for key regions, and differences in measurement methodology hinder meaningful comparisons of published studies. Efforts by groups such as the Global Initiative for Chronic Obstructive Lung Disease, Burden of Obstructive Lung Disease initiative and the Latin-American Project for the Investigation of Pulmonary Obstruction may help standardise chronic obstructive pulmonary disease measurements, thus improving understanding of the global burden of this major disease.
Appendix 2: Criteria for study quality assessment
Appendix 3: Hierarchical ranking system
- Received October 25, 2005.
- Accepted March 25, 2006.
- © ERS Journals Ltd