The 2005 guidelines of the American Thoracic Society/European Respiratory Society recommend the use of race- and/or ethnic-specific reference standards for spirometry. Yet definitions of the key variables of race and ethnicity vary worldwide. The purpose of this study was to determine whether researchers defined race and/or ethnicity in studies of lung function and how they explained any observed differences.
Using the methodology of the systematic review, we searched PubMed in July 2008 and screened 10 471 titles and abstracts to identify potentially eligible articles that compared “white” to “other racial and ethnic groups”.
Of the 226 eligible articles published between 1922 and 2008, race and/or ethnicity was defined in 17.3%, with the proportion increasing to 70% in the 2000s for those using parallel controls. Most articles (83.6%) reported that “other racial and ethnic groups” have a lower lung capacity compared to “white”; 94% of articles failed to examine socioeconomic status. In the 189 studies that reported lower lung function in “other racial and ethnic groups”, 21.8% and 29.4% of explanations cited inherent factors and anthropometric differences, respectively, whereas 23.1% of explanations cited environmental and social factors.
Even though researchers sought to determine differences in lung function by race/ethnicity, they typically failed to define their terms and frequently assumed inherent (or genetic) differences.
Spirometry is used to measure lung capacity in a variety of medical and public health contexts, as well as in speciality clinics. Increasingly, primary care physicians worldwide employ spirometric evaluation in their offices, and lung capacity is considered a key indicator of physical health. Lung function values obtained with the spirometer are corrected for age, height, sex and race according to information on the individual patient provided by the clinician.
Leading medical societies have long incorporated race or ethnic “correction” or “adjustment” of lung capacity measurement into their clinical practice guidelines, generally for people considered to be “black”. By 1990, the application of a correction factor (generally 6–12%) or the use of population-specific standards, both of which could be programmed into the spirometer, was commonplace in pulmonary training programmes in the USA.
The most recent guidelines, published in 2005 by the Joint Working Party of the American Thoracic Society (ATS)/European Respiratory Society (ERS), recommend the use of race- and ethnic-specific reference standards, rather than an adjustment factor, using self-identification to determine race. In 2006, Kiviranta and Haahtela questioned the racial classification systems employed in the lung capacity literature and proposed that the racial and ethnic groups used be revisited. Yet, race and ethnicity are defined differently in various national contexts, and the selection of appropriate reference equations in spirometry and the issue of “race correction” or population-specific norms remain topics of discussion [6, 7].
To assess the evidence contributing to the ATS/ERS guidelines on race we undertook a systematic review of the biomedical literature that compared lung function in different racial and ethnic groups published prior to and immediately after the publication of the guidelines in 2005. Specifically, the purpose of this study was to determine 1) whether spirometry researchers have defined race and/or ethnicity in their studies, and 2) how they explained any observed differences among racial and ethnic groups.
Eligible articles described the results of primary research that explicitly compared lung function in “other race and ethnic groups” to “white” groups (i.e. used the terms “race” and/or “ethnicity” or “stock”, or terms such as “Caucasian” or “Mongoloid”). Included articles either made direct comparisons between groups or included comparative statements regarding group difference. We excluded articles that only made within-group comparisons. For example, articles that compared Indians to Europeans and specifically used the terms race and/or ethnicity to refer to these groups were included, whereas articles that compared Indians in different regions of India (e.g. north versus south) were not included. Eligible articles described studies measuring forced vital capacity (FVC), vital capacity (VC), and/or forced expiratory volume in 1 s (FEV1). Excluded articles included review articles, task force statements and non-English language articles.
For our analyses, we classified the groups studied as either “white” or “other race and ethnic groups”. Commonly used terms for groups in this literature that we classified as “white” by frequency of usage were: white, Caucasian, European or European descent, American, Western and several UK populations. Additional terms and groups used infrequently specified Italian, occidentals, Scotch–Irish of pre-Revolutionary stock, Anglo-, Scandinavian, Danish Caucasian and foreigners. We intentionally did not classify the wide range of “other race and ethnic groups” into conventional race and/or ethnic categories, or continental groupings. Use of the terms “white” and “other race and ethnic groups” is appropriate for this particular study but does not imply any consistent social, historical or biological distinctions between, or uniformities within either of these categories.
Search strategy and data extraction
With the assistance of a science librarian, we conducted a search of PubMed on July 24, 2008, using the search strategy outlined in online supplementary table S1 and retrieving 10 471 articles (fig. 1). L. Braun screened the title and abstract of each article. If the title and abstract suggested that the article might meet the above criteria, L. Braun conducted a full text review (421 articles were identified for full text review). M. Wolfgang examined a 10% sample of the 10 471 (first 10 citations in each group of 100 chronologically ordered citations) and identified no additional articles meeting our criteria. After full text review, 165 articles met our inclusion criteria. Through a review of the reference section of the 165 articles, we identified an additional 62 eligible articles. At the time of abstraction, we excluded one additional article because it was not relevant to our analysis. Thus, we included 226 articles in our systematic review (see online supplementary bibliography for a full list of papers).
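The screening arithmetic and the 10% verification sample described above can be checked with a short sketch. The function names (`systematic_sample`, `included_total`) and the use of zero-based positions are illustrative assumptions, not a record of how the authors actually drew the sample.

```python
def systematic_sample(n_citations, block=100, take=10):
    """Zero-based positions of the first `take` citations in each block of `block`.

    Assumes the citations are already in chronological order.
    """
    return [i for i in range(n_citations) if i % block < take]


def included_total(full_text_eligible, from_references, excluded_at_abstraction):
    """Combine the counts reported for the final stages of screening."""
    return full_text_eligible + from_references - excluded_at_abstraction


sample = systematic_sample(10471)
sample_fraction = len(sample) / 10471      # close to the stated 10%
total = included_total(165, 62, 1)         # articles in the final review

print(len(sample), round(100 * sample_fraction, 1), total)
```

Under these assumptions the sample contains 1050 citations, and the final count matches the 226 articles reported.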
The primary investigator (L. Braun), M. Wolfgang and two research assistants abstracted key data, including: the populations compared and whether investigators included a description of the procedures they employed to assign individuals to a race or ethnic group in the Methods section of the paper; whether FVC, VC, and FEV1 were measured; and whether racial and/or ethnic differences in lung function were observed between study groups. Explanations of any differences observed were extracted word for word. We also recorded: year of article publication; country and discipline of the corresponding author, or first author when no corresponding author was indicated; age and sex of study participants; sample size, defined as the number of people on whom lung function was measured in the study; type of comparison group; use of the term Caucasian; and whether socioeconomic status was assessed in conjunction with race and/or ethnicity.
The identified articles were sorted by publication year. L. Braun and M. Wolfgang used preset coding rules to code independently the explanations for observed differences of the first 40 articles and compared their results by discussion, resolving any differences. L. Braun and M. Wolfgang then independently coded the explanations provided in the remaining 186 articles. The few minor differences were resolved by discussion.
Coding rules for all other abstracted data items were finalised after testing and verification by K. Dickersin, who examined a ∼10% sequential random sample of articles from each decade (total n=26). L. Braun independently examined and coded the same 26 articles; K. Dickersin and L. Braun compared their coding and resolved disagreements regarding the coding rules, and coding rules were finalised. Data items were abstracted by one person (L. Braun or M. Wolfgang) and verified by L. Braun. Once the abstraction and coding process was completed, L. Braun re-reviewed all data for accuracy. Data were initially entered into Microsoft Excel 2004 (Microsoft Corp., Redmond, WA, USA) and subsequently analysed using SAS software (Version 9.2; SAS Institute Inc., Cary, NC, USA).
Explanations for difference
Explanations for observed racial/ethnic differences in lung capacity that we abstracted included both explanations provided by the authors and explanations provided by others if they were cited in the Results or Discussion sections of included articles. We coded explanations into one of seven categories: 1) inherent differences between racial/ethnic groups; 2) anthropometric differences; 3) environmental and social factors; 4) mechanical factors; 5) technical factors; 6) other; or 7) no explanation (online supplementary table S2). With the exception of anthropometric differences, categories were mutually exclusive. We created category 2, anthropometric differences, because some investigators considered anthropometric differences (e.g. height) as a “fixed” or inherent biological feature of individuals or groups and others as changeable in response to environmental (e.g. nutritional) or social factors.
We categorised the comparison populations as either a “parallel” comparison group (i.e. a population drawn concurrently and compared with the study population by the authors) or as a “historical” comparison group. Historical comparisons included: 1) data collected on a population in a previous study by the same investigators (“cohort from the same investigator”); 2) data from one or more “literature comparison groups with presentation of data” (e.g. one or more studies by different investigators), and displayed in the study article’s figures or tables; and 3) information from one or more “literature comparison groups without presentation of data” (e.g. information about other populations, presented in the Results or Discussion section of the study article). We subsequently combined the literature comparison groups described in 2) and 3) into one group (“literature comparison group”).
Characteristics of articles identified for systematic review
The 226 eligible articles were published between 1922 and 2008, with a marked increase beginning in the 1960s (table 1). Corresponding authors were from 39 countries, with the highest proportion from the USA (68 out of 226; 30.1%), followed by India (31 out of 226; 13.7%) and the UK (26 out of 226; 11.5%).
Only one-third (33.2%) of the studies included a parallel comparison group, with about two-thirds of the studies comparing the population under study to a historical group from another “cohort from the same investigators” (3.5%) or “from the literature” (63.3%). Study sample sizes ranged from 13 to 65 086, with the majority (59.3%) including 100–999 participants (does not include data presented in either of the literature comparison groups). About one-quarter of all studies involved children only (23.9%) or males only (27.0%) (table 2).
Definitions of race and/or ethnicity
We found that 39 out of 226 (17.3%) of the included articles defined race and/or ethnicity and the proportion of articles providing a definition increased over time. In articles published before 1980, 10 out of 87 (11.5%) defined race and/or ethnicity, whereas in articles published in 1980 and after, 29 out of 139 (21%) provided a definition. Considering only studies with stronger study designs (i.e. parallel controls), seven out of 10 (70%) articles in the 2000s stated how race was defined, compared to five out of 10 (50%) and four out of nine (44%) studies conducted during the 1980s and 1990s, respectively (fig. 2).
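As a worked check of the proportions in this paragraph, the following minimal sketch recomputes them from the reported counts. The helper `percent` and the dictionary layout are hypothetical conveniences; only the counts come from the text.

```python
def percent(numerator, denominator, digits=1):
    """Express numerator/denominator as a percentage rounded to `digits` places."""
    return round(100 * numerator / denominator, digits)


# (articles defining race and/or ethnicity, articles published) per period,
# using the counts reported in the text
defined_by_period = {
    "before 1980": (10, 87),
    "1980 and after": (29, 139),
}

overall_defined = sum(defined for defined, _ in defined_by_period.values())
overall_total = sum(total for _, total in defined_by_period.values())

for period, (defined, total) in defined_by_period.items():
    print(period, percent(defined, total))
print("overall", overall_defined, overall_total,
      percent(overall_defined, overall_total))
```

The two periods sum to the 39 out of 226 articles reported overall. At one decimal place the later period comes to 20.9%, which the text rounds to 21%.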
In the 39 articles in our study that did define race, methods used to classify a study participant's race and/or ethnicity varied, including “participant self-identification” (13 out of 39) or “variations on classification by the investigator-observers” (visual inspection, surname, “lineage”, records, language, tribal membership, birthplace, residence or a combination of participant and observer identification) (26 out of 39). Authors frequently used the terms “race” and “ethnicity” to group individuals with a shared country or region of birth or residence (e.g. northern or southern Indians), a putatively common culture, a common language or a particular skin colour. In our study sample, the term “Caucasian” was used commonly to represent the “white” or European population (97 out of 226 papers; 42.9%).
By design, our review examined articles that compared at least two populations, one of which was “white”. The “other race and ethnic groups” represented 94 groups from many countries (online supplementary table S3). Of the 226 articles examined, 189 (83.6%) claimed that “other race and ethnic groups” had lower spirometric values, 13 (5.8%) reported no difference or similar values, 10 (4.4%) reported higher values and 14 (6.2%) reported variable spirometric values of lung capacity as compared to “white” groups (table 3). Only 14 (6.2%) of the articles examined socioeconomic status in conjunction with race and/or ethnicity, and we did not observe any meaningful changes over time.
Explaining racial and ethnic difference in lung capacity
We found a wide range of proposed explanations for racial and ethnic differences in lung capacity (table 4). Of the 189 articles in which “other race and ethnic groups” were reported to have lower lung function than “white”, the most common explanations suggested were anthropometric differences (93 out of 316; 29.4%) followed by environmental differences (73 out of 316; 23.1%), and inherent differences (69 out of 316; 21.8%) (authors could provide more than one explanation). In 46 out of 189 (24.3%) articles, investigators provided no explanation for the differences observed (fig. 3).
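The percentages in this paragraph use two different denominators: explanation shares are calculated per explanation mentioned (n=316), whereas the “no explanation” figure is calculated per article (n=189). The sketch below makes that distinction explicit. The variable names are illustrative assumptions; the counts are those reported in the text.

```python
TOTAL_MENTIONS = 316   # explanations mentioned across the 189 articles
ARTICLES_LOWER = 189   # articles reporting lower values in "other" groups
NO_EXPLANATION = 46    # articles offering no explanation

mentions = {
    "anthropometric differences": 93,
    "environmental and social factors": 73,
    "inherent differences": 69,
}

# Shares of explanations: the denominator is mentions, not articles,
# because one article could offer several explanations.
mention_share = {
    category: round(100 * count / TOTAL_MENTIONS, 1)
    for category, count in mentions.items()
}

# Share of articles with no explanation: the denominator is articles.
no_explanation_share = round(100 * NO_EXPLANATION / ARTICLES_LOWER, 1)

# Categories ordered from most to least frequently mentioned.
ranked = sorted(mention_share, key=mention_share.get, reverse=True)
print(mention_share, no_explanation_share, ranked)
```

Because the two percentages rest on different denominators, they should not be added together or compared directly.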
The reported explanations for racial/ethnic differences varied over time and place, and included inherent and anthropometric differences, environmental and social factors, technical factors, and others (see online supplementary table S2 for examples of each). The authors of four articles published in the 1920s suggested that differences were simply due to “a racial factor” (coded as inherent difference) [8, 23–25]; one article suggested anthropometric difference and another provided no explanation. In contrast, McCloy found “white” and Chinese to be similar. The author’s explanation for similarity was complex, pointing to changing levels of physical activity among Chinese students and noting that any difference would be eliminated if lung function was interpreted in relation to surface area.
The authors of some articles published between 1930 and 1944 and conducted in India questioned the racial/ethnic paradigm established in the 1920s, providing environmental explanations for difference, such as climate and “modes of life and habits”. By the 1960s, most researchers included a long list of potential explanations, such as “environment” or nutrition, in addition to or in conjunction with biological or genetic factors. Some investigators raised environmental explanations only to rule them out. Schoenberg et al., for example, stated that “the lower FVC and FEV1 values among blacks are related to genetic rather than to environmental variables”. Conversely, Myers of South Africa concluded that “the available evidence does not support a clear thesis of racial or ethnic differences in spirometric lung functions”. Some authors viewed anthropometric difference as a changeable factor over time, yet others viewed it as fixed. The most recent article in our study emphasised anthropometric differences along with social factors: “Differences in upper body segment explained more of the ethnic differences in lung function than SH [standing height], particularly among Black Caribbean/African subjects. Social correlates had a smaller but significant impact”. As indicated in figure 3, inherent difference remains an explanation for racial/ethnic differences in lung function to the present day.
In this article, we used the methodology of the systematic review to address two questions fundamental to biomedical research on race/ethnicity and lung function. 1) How was race and/or ethnicity defined in the literature? 2) How were differences explained and interpreted over time? The interpretation of race and ethnicity in research depends on their appropriate definition in the original research study. Our systematic review found that only 17.3% of articles defined race and/or ethnicity. Over time, there was a clear increase in the proportion of studies defining race and/or ethnicity in the Methods section. Yet, even in the 2000s, when the current clinical practice guidelines regarding “ethnic adjustment” of spirometry data were published, only 29.4% of studies defined race and/or ethnicity.
Despite this lack of definition, 51.2% of the explanations offered in studies reporting lower lung capacity in “other race and ethnic groups” as compared to “white” cited inherent or anthropometric characteristics. This was a consistent finding from the earliest years. For example, in explaining that their “findings suggest a possible racial factor…Poverty, environment and social status, with the ensuing advantages and disadvantages, do not seem to influence the lung capacity of children”, the earliest paper in our review, published in 1922, laid the foundation for inherent difference. Importantly, studies of all designs and from earlier decades continue to be cited in the contemporary literature as evidence of racial difference.
Kiviranta and Haahtela note that “the term Caucasian…is solely a historical term in anthropological science”. In our study, authors’ use of the 18th century racial term “Caucasian” was common, even after 1994, when its use was proscribed by the Council of Biology Editors. Our review also found that about two-thirds of the articles examining racial/ethnic differences in lung function used literature comparison groups from different times and places, a weak study design for demonstrating a reliable association.
Over the decades, pulmonologists have discussed extensively the myriad technical and interpretive issues related to proposed racial/ethnic differences in spirometry test results. Yet until recently, few researchers have explicitly considered the underlying assumptions about race and ethnicity that informed pulmonary research. Notable exceptions were the South African scientists Myers and White and, more recently, Kiviranta and Haahtela. Our study is the first to examine the complex history of how race and ethnicity were defined in studies of lung function and how the differences reported were interpreted, both of which have influenced the current policy of “race correction”.
Our objective in this review was not to determine whether the association between race and/or ethnicity and lung function is valid. Indeed, validity is difficult to assess accurately, given the lack of definition of race and ethnicity in the literature and their different meanings in various national contexts. Moreover, selective publication of research findings (reporting biases), or a deliberate selection of a “literature comparison group” which provides the desired contrast, could have influenced our findings. We also recognise that not all aspects of our systematic review follow methodological standards suggested by, for example, the Cochrane Collaboration. We searched only one electronic database, included only English language articles, used a single person to review titles, abstracts and full length articles, and used a single abstractor for most data items. However, we did employ sampling to check the reliability of our classifications, and we used standardised definitions developed through consultation with all authors. We may have missed some studies that would affect our estimates of prevalence. For our findings to be invalidated, however, we would have had to have missed a large number of articles with parallel comparisons, with definitions of race and/or ethnicity and examination of socioeconomic status, and showing no evidence of association between race and/or ethnicity and lung function, all of which we believe is unlikely based on our findings. We also recognise that by giving each mention of an explanation equal weight in our count of explanation frequency, we may not have captured the full meaning and/or intent of authors' explanations.
Finally, in keeping the classification of anthropometric differences separate from inherent difference, we may have underestimated the extent to which researchers considered lung function differences among groups to be inherent. In the literature we reviewed, anthropometric measures, such as difference in the sitting height to standing height ratio (a commonly used explanation for racial and ethnic differences), were frequently considered fixed (inherent) characteristics of individuals and groups (table 4). Conversely, some investigators were clear in considering anthropometric data, such as height (and thus differences in height), as changeable factors in individuals and populations (e.g. due to improved nutrition). It was not possible to discern the meaning intended by other authors.
Almost 94% of the studies we examined failed to consider race and/or ethnicity in the context of socioeconomic status, even in recent years. Socioeconomic status has variable meanings within and across time periods, and its examination in the context of a research study is limited by the dataset. Given that lower lung function is associated with lower socioeconomic status, descriptors of socioeconomic status should be taken into account when analysing lung function. Using data from the third National Health and Nutrition Examination Survey (NHANES III), a recent analysis of the contribution of socioeconomic status to racial difference in lung function found that high school completion was associated with increases in lung function values that were racially patterned.
Our study found that since the 1920s and 1930s, some researchers explained racial difference as environmental. Recent studies confirm the view that environmental exposures, such as air pollution, affect lung function and respiratory health. Moreover, racial and ethnic minorities are exposed to higher levels of environmental, including respiratory, pollutants. It is thus critical that environmental differences and socioeconomic factors be examined in more detail as possible explanations for difference.
Our data indicate that in the 2000s, studies were more rigorously designed and they defined race and/or ethnicity more often. Nonetheless, only some journal editors require that race be defined in articles using the term and there is no consensus on how to do this. We found that most studies in the 2000s that defined race and/or ethnicity used self-identification to assign a participant's race/ethnicity. Self-identification will probably continue to be the norm as long as it is recommended by the ATS/ERS. Self-identification, however, changes over time and place, in part because the categories offered to respondents change. In the US census, which is influential in health research, racial categories are fluid, changing every decade since census-taking began in 1790 [5, 38, 39]. The changing nature of race over time provides compelling evidence that race is a social category. While self-identification of race and ethnicity is appropriate for understanding social experiences in certain research contexts, it is inappropriate for genetic causality studies. Thus, there remains a disconnect in the lung function literature between the use of the social definition of self-report and genetic explanations for difference. Nor does the recent turn to racial “ancestry” circumvent the problems associated with defining and interpreting racial and ethnic difference. Ancestry itself is not consistently defined and, in the case of lung function, the use of ancestry is built on traditional race-based models [6, 41, 42].
The concept of race, particularly as it pertains to causal explanations for racial and ethnic difference and disparity in disease, is currently the topic of vibrant international debate. The debate taking place in the scholarly literature, including arguments for greater emphasis on understanding social determinants of disease, often reflects views of race from the perspective of the USA [43–48], even though the issue is broadly relevant. The present study contributes to the debate by showing that most of the literature comparing lung function in white subjects with other racial groups has ignored the importance of socioeconomic factors.
The studies we reviewed included 94 groups from very different social contexts. The fact that the key variable of race and/or ethnicity used to frame comparative studies on lung capacity was rarely defined over a period of nearly 90 years should, at the very least, raise questions about the reliability of research that reports an association between inherent or genetic racial difference and lung function, and the scientific evidence that underpins the practice of “race correction”. While the view that races and ethnic groups differ in the capacity of their lungs is widely accepted in pulmonary medicine, the continued practice of explaining racial and ethnic difference in lung function as rooted in inherent and fixed anthropometric difference has critical health policy implications. Importantly, it could divert attention from much-needed research into the physiological mechanisms by which specific social and physical environments influence lung function. Of note in this regard is recent work by Burney and Hooper, which raises questions about the “normality” of lower lung function in African Americans and argues against the use of ethnic-specific norms for prognosis.
There is no simple way to alter an established practice, such as race correction. There is, however, much to be learned about the concept of race, how its historical use in biomedicine shapes current scientific research design, and how white USA and European norms came to be the standard of “normal”. The idea of racial difference in lung capacity stems from the work of the southern USA plantation physician S. Cartwright. This idea was later affirmed in the work of B.A. Gould in an anthropometric study of soldiers in the Union Army at the end of the American Civil War and followed up by F. Hoffman in Race Traits and Tendencies of the American Negro. There are many problems with this early research, one of which is its failure to allow for differences between black and white subjects in height, age or socioeconomic status. Yet, researchers continue to cite Gould’s article uncritically.
Despite the challenges, continued dialogue in biomedical journals and at conferences on how underlying assumptions about race and ethnicity shape design and interpretation of research is of great importance. To this end, we suggest that an international workshop be convened, comprising pulmonologists who use spirometry and historians, anthropologists and sociologists from the global north and south who study the concepts of race and ethnicity. Its aims would be to assess past research on lung function as it pertains to racial and ethnic disparities, to explore how and why outdated assumptions about race persist in the scientific literature, and to develop methodologies to guide future research. Such a workshop would build upon the questions we and others pose about race correction and provide a less Anglo-American perspective on race. For example, while Indian researchers have compared lung function among regional groups within India, they have historically drawn more extensively on environmental explanations. Only through continued dialogue will we gain clarity on the relationship of lung function measurements with race, ethnicity and socioeconomic indicators.
We give special thanks to D. Kern (Maine Veterans Administration Medical Center, Augusta, ME, USA), who has provided expert advice on this project, D. Fullwilley (Stanford University, Stanford, CA, USA) for helpful discussions on Ancestry Informative Markers, and P. Brown (Northeastern University, Boston, MA, USA) and C. Bliss (Brown University, Providence, RI, USA) for their thoughtful suggestions. We also thank M. Gerdes and K. Saxton (formerly of Brown University) for their help in abstracting articles, M.C. Trimbur (Montefiore Medical Center, New York, NY, USA) for extensive library work, J. Crager (Brown University Sciences Library) for her expert assistance in using PubMed and B. Hollingsworth, J. Wood and E. Coogan (all of Brown University) for their assistance and patience in fulfilling our numerous interlibrary loan requests. We also are grateful to anonymous reviewers for their helpful comments.
For editorial comments see page 1249.
This article has supplementary material available from www.erj.ersjournals.com
This work was supported by a National Science Foundation Scholar Award (SES-0846552) to L. Braun. L. Braun was independent from the funding agency. The funding agency had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
- Received June 9, 2012.
- Accepted July 20, 2012.
- ©ERS 2013