Clinical COPD phenotypes: a novel approach using principal component and cluster analyses

P-R. Burgel; J-L. Paillasseur; D. Caillaud; I. Tillie-Leblond; P. Chanez; R. Escamilla; I. Court-Fortune; T. Perez; P. Carré; N. Roche

doi:10.1183/09031936.00175109

Abstract

Classification of chronic obstructive pulmonary disease (COPD) is usually based on the severity of airflow limitation, which may not reflect phenotypic heterogeneity. Here, we sought to identify COPD phenotypes using multiple clinical variables.

COPD subjects recruited in a French multicentre cohort were characterised using a standardised process. Principal component analysis (PCA) was performed using eight variables selected for their relevance to COPD: age, cumulative smoking, forced expiratory volume in 1 s (FEV₁) (% predicted), body mass index, exacerbations, dyspnoea (modified Medical Research Council scale), health status (St George’s Respiratory Questionnaire) and depressive symptoms (hospital anxiety and depression scale). Patient classification was performed using cluster analysis based on PCA-transformed data.

322 COPD subjects were analysed: 77% were male; median (interquartile range) age was 65.0 (58.0–73.0) yrs; FEV₁ was 48.9 (34.1–66.3)% pred; and 21, 135, 107 and 59 subjects were classified in Global Initiative for Chronic Obstructive Lung Disease (GOLD) stages 1, 2, 3 and 4, respectively. PCA showed that three independent components accounted for 61% of variance. PCA-based cluster analysis resulted in the classification of subjects into four clinical phenotypes that could not be identified using GOLD classification. Importantly, subjects with comparable airflow limitation (FEV₁) belonged to different phenotypes and had marked differences in age, symptoms, comorbidities and predicted mortality.

These analyses underscore the need for novel multidimensional COPD classification for improving patient care and quality of clinical trials.

Chronic obstructive pulmonary disease (COPD) is a major cause of mortality and disability worldwide 1. The disease is characterised by airflow limitation that is not fully reversible. Classification of COPD is usually based on the severity of airflow obstruction, as assessed using the forced expiratory volume in 1 s (FEV₁) 1. In recent years, it has emerged that COPD is a complex disease with multiple clinical manifestations and that COPD subjects cannot be described by only using the severity of airflow limitation. Thus, many other independent predictors of outcomes have been identified, including worsening dyspnoea, frequency and severity of exacerbations, malnutrition, depression and health-related quality of life (HRQoL) impairment 2. Furthermore, comorbidities (e.g. cardiovascular diseases and cancer) are major causes of death and hospitalisations in COPD subjects 3, 4.

Large clinical trials performed in COPD subjects have shown that current treatments improve several outcomes (e.g. exacerbations, dyspnoea and HRQoL), but the authors reported disappointing data on mortality and rates of decline in FEV₁ 5, 6. One explanation may be that COPD subjects are heterogeneous and that not all subjects benefit from the same therapy. This point has been best exemplified by the National Emphysema Treatment Trial, in which some phenotypic characteristics were associated with increased mortality after lung volume reduction surgery, whereas this therapy reduced mortality in other COPD subjects 7. Thus, disentangling phenotypes appears to be one of the major current challenges in subjects with COPD.

Phenotypic characterisation of COPD subjects may rely on clinical manifestations, assessment of patient-related outcomes (e.g. depression and HRQoL) using validated questionnaires, imaging and biological measurements 8. Many studies are currently trying to identify biomarkers related to severity or prognosis of COPD subjects 9. Adequate clinical categorisation of subjects would be of utmost importance in these studies. Furthermore, identification of phenotypes using clinical variables would be useful in primary care, where imaging and biological measurements are not widely used.

Identification of clinical COPD phenotypes was described as early as the 1950s, when Dornhorst 10 proposed the distinction between pink puffers and blue bloaters. These descriptions were based on rather subjective clinical assessment of subjects. In recent years, it has been proposed that statistical methods can be applied to clinical medicine for examining phenotypic heterogeneity. Cluster analysis, which seeks to organise information so that heterogeneous groups of variables can be classified into relatively homogeneous groups, has been proposed to examine phenotypic heterogeneity in airway diseases 11. In the present study, we used this method to analyse clinical data obtained in a well-characterised group of COPD subjects recruited throughout France 12. Because information obtained using clinical data and validated questionnaires contained redundancy, cluster analysis was performed using principal component analysis (PCA)-transformed data. This original methodology allowed us to test the hypothesis that COPD subjects could be grouped into clinical phenotypes.

METHODS

Subjects

The present study is based on a cross-sectional analysis of a cohort of COPD subjects (Initiatives BPCO study group) recruited between January 2005 and August 2008 in 17 pulmonary units in university hospitals located throughout France 12. Respiratory physicians prospectively recruited subjects in stable condition (no history of exacerbation requiring medical treatment for the previous 4 weeks) with a diagnosis of COPD based on the presence of a post-bronchodilator FEV₁/forced vital capacity (FVC) ratio <70% 1. Subjects with a main diagnosis of bronchiectasis, asthma or any other significant respiratory disease were excluded. The study was approved by the Ethics Committee of Versailles, France and all subjects provided informed written consent.

Data collection

We used a standardised characterisation process that covered demographic data, cumulative tobacco smoking and COPD characteristics (including symptoms, spirometry and therapy) in stable condition. Pulmonary function tests were performed according to international standards 13. Severity of airflow obstruction was evaluated according to Global Initiative for Chronic Obstructive Lung Disease (GOLD) classification 1. The number of acute exacerbations of COPD during the previous year was determined from patients' self-reports. Comorbidities (including congestive heart failure, coronary artery disease, systemic hypertension and diabetes mellitus) were identified from the patient files. We calculated the multidimensional BOD index (body mass index (BMI), obstruction (FEV₁% pred) and dyspnoea evaluated on the modified Medical Research Council (MMRC) scale), which was reported to be a better predictor of mortality than FEV₁ 14, 15.
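As an illustration of the two severity measures named above, the following Python sketch assigns a GOLD stage and a BOD score to a single subject. The cut points are not stated in this article: the GOLD thresholds (FEV₁ ≥80, 50–79, 30–49 and <30% pred) are assumed from the GOLD guidelines, and the BOD points are assumed from the published BODE index with the exercise item omitted. The function names and the example subject are hypothetical.

```python
def gold_stage(fev1_pct_pred):
    """Return GOLD stage 1-4 from post-bronchodilator FEV1 (% predicted).

    Assumes FEV1/FVC < 0.70 has already been confirmed; the thresholds
    are assumed from the GOLD guidelines, not taken from this article.
    """
    if fev1_pct_pred >= 80:
        return 1
    if fev1_pct_pred >= 50:
        return 2
    if fev1_pct_pred >= 30:
        return 3
    return 4


def bod_index(bmi, fev1_pct_pred, mmrc):
    """Return an illustrative BOD score (0-7); higher predicts higher mortality.

    Point values are assumed from the BODE index without the exercise item.
    """
    if mmrc not in (0, 1, 2, 3, 4):
        raise ValueError("MMRC grade must be an integer from 0 to 4")
    body = 0 if bmi > 21 else 1                      # B: body mass index
    if fev1_pct_pred >= 65:                          # O: obstruction
        obstruction = 0
    elif fev1_pct_pred >= 50:
        obstruction = 1
    elif fev1_pct_pred > 35:
        obstruction = 2
    else:
        obstruction = 3
    dyspnoea = {0: 0, 1: 0, 2: 1, 3: 2, 4: 3}[mmrc]  # D: dyspnoea (MMRC)
    return body + obstruction + dyspnoea


# Hypothetical subject: FEV1 48.9% pred, BMI 24 kg/m2, MMRC grade 2.
print(gold_stage(48.9), bod_index(24, 48.9, 2))
```

Two subjects with the same GOLD stage can receive different BOD scores because BMI and dyspnoea enter the score independently of FEV₁.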

The hospital anxiety and depression (HAD) scale was used to examine mood disorders. This 14-item self-questionnaire has two seven-item subscales for anxiety (HAD-A) and depression (HAD-D). Scores range from 0 to 21 for each subscale, and a score of 8 or higher on a subscale is conventionally used to define anxiety or depression, respectively 16. A score of 11 or higher on a subscale is even more closely associated with the presence of the corresponding mood disorder. HRQoL was evaluated using the St George's Respiratory Questionnaire (SGRQ) 17.
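The HAD cut-offs described above can be written as a small Python helper. This is only a sketch: the category labels ("possible", "probable") are hypothetical names chosen for illustration, and the total score is taken to be the sum of the two subscales.

```python
def had_subscale_category(score):
    """Classify one HAD subscale score (HAD-A or HAD-D, range 0-21)."""
    if not 0 <= score <= 21:
        raise ValueError("HAD subscale scores range from 0 to 21")
    if score >= 11:
        return "probable"         # hypothetical label for the stricter cut-off
    if score >= 8:
        return "possible"         # hypothetical label for the conventional cut-off
    return "below threshold"


def had_total(had_a, had_d):
    """HAD total score (0-42), assumed to be the sum of the two subscales."""
    return had_a + had_d


# Hypothetical subject with HAD-A = 9 and HAD-D = 12.
print(had_subscale_category(9), had_subscale_category(12), had_total(9, 12))
```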

Statistical analysis plan

Statistics followed a step-by-step process detailed below, beginning with selection of relevant clinical variables by the Scientific Committee. Subjects with complete information for these variables were analysed. Correlations between GOLD classes 1 and other variables were assessed using Kendall τ_b rank correlation or logistic regression, as appropriate. Next, correlations within the group of selected variables were studied using cluster analysis. These analyses were useful in determining whether information provided by each clinical variable was independent from the others. Because redundancy was found, PCA was performed on these variables as a method for reducing interaction between variables. Then, cluster analysis based on the main components of the PCA was performed to search for COPD phenotypes. Data are presented as median (interquartile range) or % unless otherwise specified. A p-value <0.05 was considered statistically significant. Analyses were performed using the SAS package version 9.01 (SAS Institute, Cary, NC, USA).
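For readers unfamiliar with Kendall τ_b, the statistic can be sketched in plain Python as follows. This is an illustrative pairwise implementation, not the SAS routine used in the study, and the example data are hypothetical. With C concordant pairs, D discordant pairs, and T_x and T_y pairs tied on only one variable, τ_b = (C − D)/√((C + D + T_x)(C + D + T_y)).

```python
from itertools import combinations
from math import sqrt


def kendall_tau_b(x, y):
    """Kendall tau-b rank correlation for two equal-length sequences."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    concordant = discordant = ties_x = ties_y = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        dx, dy = x1 - x2, y1 - y2
        if dx == 0 and dy == 0:
            continue              # tied on both variables: ignored by tau-b
        if dx == 0:
            ties_x += 1           # tied on x only
        elif dy == 0:
            ties_y += 1           # tied on y only
        elif dx * dy > 0:
            concordant += 1
        else:
            discordant += 1
    denom = sqrt((concordant + discordant + ties_x)
                 * (concordant + discordant + ties_y))
    if denom == 0:
        raise ValueError("tau-b is undefined when a variable is constant")
    return (concordant - discordant) / denom


# Hypothetical data: GOLD stage and MMRC grade for four subjects.
gold = [1, 1, 2, 2]
mmrc = [0, 1, 2, 3]
print(round(kendall_tau_b(gold, mmrc), 3))
```

The tie correction matters here because an ordinal variable such as GOLD stage takes only four values, so many pairs of subjects are tied.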

Selection of variables for analyses

The scientific committee selected eight variables for their relevance to pulmonary and/or extrapulmonary manifestations of COPD. The variables were: age (years), tobacco smoking (pack-years), severity of airflow obstruction (assessed by FEV₁ % pred), exacerbations (number per patient per year), nutritional status (assessed by BMI kg·m⁻²), dyspnoea (assessed by the MMRC scale), HRQoL (assessed by the SGRQ total score), and anxiety and depression (assessed by the HAD total score).

The cohort contained 584 individual subjects at the time of analysis. Complete data for these eight variables, which were necessary for principal component and cluster analyses, were available for 322 subjects. Most of the remaining 262 subjects were excluded from analyses due to the lack of data on SGRQ or HAD questionnaires. The two populations did not differ in terms of age, cumulative tobacco smoking, FEV₁, MMRC scale, BMI and exacerbations (online supplementary material). Male subjects represented 76.7% of included subjects versus 84.0% of excluded subjects (p = 0.03, Chi-squared test).
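The sex comparison above can be checked approximately with a short Python sketch of the Pearson Chi-squared test for a 2×2 table. The counts are an assumption: they are inferred by rounding the reported percentages (76.7% of 322 and 84.0% of 262), and the test is computed without continuity correction, which the article does not specify.

```python
from math import erfc, sqrt


def chi_squared_2x2(a, b, c, d):
    """Pearson chi-squared test (no continuity correction) for [[a, b], [c, d]].

    Returns (statistic, p_value) with 1 degree of freedom.
    """
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        raise ValueError("every row and column must contain at least one subject")
    chi2 = n * (a * d - b * c) ** 2 / denom
    # For 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    return chi2, erfc(sqrt(chi2 / 2))


# Counts inferred from the reported percentages (an assumption):
# 76.7% of 322 included -> 247 men; 84.0% of 262 excluded -> 220 men.
included_men, included_women = 247, 322 - 247
excluded_men, excluded_women = 220, 262 - 220
chi2, p = chi_squared_2x2(included_men, included_women,
                          excluded_men, excluded_women)
print(f"chi2 = {chi2:.2f}, p = {p:.3f}")
```

Under these assumed counts the statistic is about 4.75 and the p-value rounds to 0.03, in line with the reported value.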

Correlations between clinical variables

Relationships between the eight selected variables were studied by cluster analysis, using the VARCLUS procedure. This procedure, which organises a set of numeric variables into hierarchical clusters, can be used to examine redundancy between variables. Results were presented in a dendrogram showing variables in each grouping, and the distance between groupings.

Identification of COPD phenotypes

Because we found that information obtained using these clinical variables was not independent from one another, we transformed clinical data using PCA 18. Linear combinations of the eight selected variables were used to form eight new independent variables (eigenvectors) called components 19. The eigenvalue of each component is a measure of its variability. A component with an eigenvalue <1 contributes little to explaining the relationships between original variables and thus is not subjected to further analysis. Next, we performed a cluster analysis based on significant components identified in the PCA (i.e. with an eigenvalue >1). Cluster analysis was performed using Ward's method. In this method, grouping was based on a quantitative measure of similarity (minimum within-cluster sum of squares), such that subjects in the same cluster were more similar to each other than to subjects in another cluster. We used pseudo-F and pseudo-t² statistics to determine the optimal number of clusters in the data. Relatively large pseudo-F values were considered to indicate a stopping point. For the pseudo-t² statistic, we moved down the column until we found the first value markedly larger than the previous value and moved back up the column by one cluster.
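The clustering step can be illustrated with a minimal Python sketch of Ward's method and the pseudo-F statistic. It is not the SAS procedure used in the study: the PCA step and the pseudo-t² statistic are omitted, the component scores are hypothetical, and all function names are illustrative. At each step the two clusters whose merger least increases the within-cluster sum of squares are joined.

```python
from itertools import combinations


def centroid(points):
    """Mean of a non-empty list of equal-length coordinate tuples."""
    n = len(points)
    return [sum(p[d] for p in points) / n for d in range(len(points[0]))]


def sum_of_squares(points):
    """Sum of squared Euclidean distances from the points to their centroid."""
    c = centroid(points)
    return sum((p[d] - c[d]) ** 2 for p in points for d in range(len(c)))


def ward_cost(a, b):
    """Increase in within-cluster sum of squares if clusters a and b merge."""
    ca, cb = centroid(a), centroid(b)
    sq_dist = sum((u - v) ** 2 for u, v in zip(ca, cb))
    return len(a) * len(b) / (len(a) + len(b)) * sq_dist


def ward_clusters(points, k):
    """Agglomerate singleton clusters with Ward's criterion until k remain."""
    clusters = [[p] for p in points]
    while len(clusters) > k:
        i, j = min(combinations(range(len(clusters)), 2),
                   key=lambda ij: ward_cost(clusters[ij[0]], clusters[ij[1]]))
        merged = clusters[i] + clusters[j]
        clusters = [c for idx, c in enumerate(clusters) if idx not in (i, j)]
        clusters.append(merged)
    return clusters


def pseudo_f(clusters):
    """Pseudo-F: (between SS / (k - 1)) / (within SS / (n - k)).

    Requires 1 < k < n and a non-zero within-cluster sum of squares.
    """
    k = len(clusters)
    all_points = [p for c in clusters for p in c]
    n = len(all_points)
    within = sum(sum_of_squares(c) for c in clusters)
    between = sum_of_squares(all_points) - within
    return (between / (k - 1)) / (within / (n - k))


# Hypothetical scores on two retained components for six subjects.
scores = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0), (10.0, 0.0), (10.0, 0.2)]
for k in (2, 3):
    print(k, round(pseudo_f(ward_clusters(scores, k)), 1))
```

In this toy example the pseudo-F value is far larger for three clusters than for two, which is the kind of relative peak used as a stopping point.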

RESULTS

Classification of COPD subjects according to GOLD classification

Clinical characteristics of the 322 COPD subjects according to GOLD classification are presented in table 1. Variables correlated with increasing GOLD stages included FEV₁, FVC and BMI (inverse correlations), and MMRC, BOD score, SGRQ total score and numbers of exacerbations per patient per year (positive correlations). Systemic hypertension was less prevalent in subjects with GOLD stage 3 and 4 COPD, and a similar trend existed for coronary disease. There was no significant correlation between GOLD stage and age, smoking history, HAD total score, and other comorbidities (i.e. chronic heart failure and diabetes mellitus).

Table 1– Characteristics of the 322 chronic obstructive pulmonary disease subjects according to Global Initiative for Chronic Obstructive Lung Disease (GOLD) stages

Long-acting β-agonists and inhaled corticosteroids were prescribed in ∼50% of subjects in GOLD stages 1 and 2, and in up to 84% of subjects in GOLD stage 4. No significant difference was observed among GOLD classes for tiotropium prescription.

Relationships between clinical variables

Cluster analysis of the eight selected variables resulted in a dendrogram (fig. 1), which illustrates how these variables relate to each other. This analysis showed that clinical information obtained using these eight variables could be grouped into three clusters (fig. 1), indicating that these variables were not completely independent.

Figure 1– Dendrogram illustrating the results of the cluster analysis of clinical variables. Clinical variables obtained in 322 chronic obstructive pulmonary disease subjects were classified using VARCLUS cluster analysis. If the patterns of response to two variables were similar for most individuals, these variables were grouped, whereas different response patterns suggested variables were rated more independently. Each horizontal line represents an individual variable and the length of horizontal lines represents the degree of similarity between variables. The original variables could be grouped into three major clusters. MMRC: modified Medical Research Council; SGRQ: St George’s Respiratory Questionnaire; HAD: hospital anxiety and depression; BMI: body mass index; FEV₁: forced expiratory volume in 1 s; % pred: % predicted.

PCA of clinical variables

We performed PCA to transform data contained in the eight selected variables into eight independent components. The first three components, which contributed significantly to explaining the relationships among the eight selected variables (eigenvalues >1), accounted for 61% of the total variance. Correlations of the selected variables with these three independent components are shown in table 2. Component 1 correlated with SGRQ total score and MMRC score, and inversely correlated with FEV₁ (% pred), but was independent from age. Component 2 highly correlated with age and cumulative tobacco smoking, but was independent from FEV₁ (% pred) and SGRQ score. Component 3 mostly correlated with BMI and FEV₁ (% pred). Components 4–8 explained little variability of the original data (eigenvalues <1; online supplementary material) and therefore were not subjected to further analysis.
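The eigenvalue >1 rule can be made concrete with a short Python sketch. It assumes that the PCA was run on standardised variables, in which case the eight eigenvalues sum to 8 and a share of 61% corresponds to a sum of about 4.9 for the first three. The eigenvalues below are hypothetical and chosen only to illustrate the arithmetic; the study's values are in the online supplementary material.

```python
def retained_components(eigenvalues, threshold=1.0):
    """Return (number retained, fraction of total variance they explain)."""
    kept = [ev for ev in eigenvalues if ev > threshold]
    return len(kept), sum(kept) / sum(eigenvalues)


# Hypothetical eigenvalues for eight standardised variables (they sum to 8);
# these are NOT the study's values.
eigenvalues = [2.4, 1.4, 1.1, 0.9, 0.8, 0.6, 0.5, 0.3]
n_kept, explained = retained_components(eigenvalues)
print(n_kept, f"{explained:.0%}")
```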

Table 2– Correlations of the eight original variables with the three main components derived from the principal component analysis in the 322 chronic obstructive pulmonary disease subjects

Classification of COPD subjects using cluster analysis

Classification of the 322 COPD subjects using cluster analysis based on the first three components identified in the PCA resulted in a dendrogram that showed the progressive joining of the clustering process (fig. 2). Pseudo-F and pseudo-t² statistics determined that the data could be optimally grouped into four clusters (phenotypes). Clinical characteristics of the 322 COPD subjects according to these four phenotypes are presented in table 3.

Figure 2– Dendrogram illustrating the results of the cluster analysis in 322 chronic obstructive pulmonary disease subjects. Subjects were classified using agglomerative hierarchical cluster analysis based on principal component analysis-transformed clinical variables (see Methods section). Each horizontal line represents an individual subject and the length of horizontal lines represents the degree of similarity between subjects. The vertical line identifies the optimal number of clusters in the data, as determined by pseudo-F and pseudo-t² statistics (see Methods section). Data can be optimally grouped into four clusters (phenotypes). Characteristics of subjects in each phenotype are presented in tables 3 and 4.

Table 3– Characteristics of the 322 chronic obstructive pulmonary disease subjects according to the four phenotypes identified using principal component analysis-based cluster analysis

Major differences were found among groups. First, two extreme phenotypes were identified. The first phenotype contained young subjects (n = 44, median age 58 yrs) with severe airflow limitation (GOLD stage 3 and 4), low BMI, severe dyspnoea, frequent exacerbations, anxiety, depression and severely impaired HRQoL. Cardiovascular comorbidities were infrequent in this group of subjects. The second phenotype was composed of older subjects (n = 89, median age 68 yrs) with mild airflow limitation (GOLD stage 1 or 2 in 85.4% of subjects), low dyspnoea, low levels of anxiety and depression, almost no exacerbations and mild impairment in HRQoL, and who were mildly overweight. These older subjects had a higher prevalence of comorbidities, including hypertension (57.5%), coronary artery disease (19.5%), diabetes mellitus (17.5%) and chronic heart failure (12.8%).

Phenotypes 3 and 4, which were composed of subjects with moderate to severe airflow limitation (GOLD stages 2 and 3 in about three-quarters of subjects), could not be distinguished based on FEV₁, but differed in terms of age, symptoms and comorbidities: significant differences (all p<0.05) were found for all variables presented in table 3 except for sex ratio, FEV₁, FVC, HAD anxiety subscale and percentage of subjects treated with tiotropium. Compared with subjects in phenotype 3, subjects in phenotype 4 were older, and had higher prevalence of depressive symptoms and other comorbidities, including cardiovascular comorbidities (especially chronic heart failure). Furthermore, subjects in phenotype 4 had higher BMI and more severe dyspnoea, which were responsible for increased BOD scores.

A summary of these four COPD phenotypes is presented in table 4.

Table 4– Summary of chronic obstructive pulmonary disease phenotypes identified using principal component analysis-based cluster analysis

Relationship between BOD index and phenotypes

We examined predicted mortality using the BOD index, which was not used to build our classification. BOD scores were markedly different among phenotypes, but BOD index was not sufficient for the discrimination of these phenotypes (fig. 3).

Figure 3– BOD scores (body mass index, obstruction (forced expiratory volume in 1 s % predicted) and dyspnoea evaluated on the modified Medical Research Council scale) in 322 chronic obstructive pulmonary disease (COPD) subjects grouped by phenotypes. COPD subjects (n = 322) were grouped in four phenotypes, according to the results of the principal component analysis-based cluster analysis. BOD scores were calculated as described previously 14, 15. Increased BOD score predicts increased mortality. Each box plot is composed of five horizontal lines that display minimum and maximum values, and 25th, 50th and 75th percentiles of the variable.

DISCUSSION

We used an original statistical approach to analyse clinical data obtained in a large group of COPD subjects. In this heterogeneous COPD population defined with the current FEV₁-based GOLD classification, this methodology resulted in the identification of four COPD phenotypes, as follows. Phenotype 1: young subjects with predominant severe to very severe respiratory disease; phenotype 2: older subjects with mild airflow limitation, mild symptoms and mild age-related comorbidities; phenotype 3: young subjects with moderate to severe airflow limitation, but few comorbidities and mild symptoms; and phenotype 4: older subjects with moderate to severe airflow limitation and severe symptoms ascribed, at least in part, to major comorbidities (e.g. chronic heart failure). Importantly, our results indicated that age, dyspnoea, HRQoL, exacerbations and comorbidities (e.g. chronic heart failure and depression) were markedly different among subjects with the same GOLD classification, underscoring the need for multidimensional assessment of COPD subjects.

We searched for COPD phenotypes using cluster analysis, a method for classifying heterogeneous groups of variables into relatively homogeneous groups 11. Previously, few studies used cluster analysis for assessing phenotypes in patients with airway diseases. These studies were performed in a mixed population of 27 asthmatics and 22 COPD subjects 11, in 175 subjects from a community-based study 20, and in three different populations of asthmatic subjects 21. Our study is original because we applied this method to a large cohort of well-characterised COPD subjects. This exploratory statistical approach allowed the identification of several COPD phenotypes. Among these, some have been suggested using conventional methods. Indeed, phenotype 1 (subjects with severe respiratory disease and nutritional depletion) and phenotype 4 (mildly overweight subjects with moderate to severe airflow limitation) would correspond to classical descriptions of severe respiratory disease in pink puffers and blue bloaters, respectively 10. We also identified groups of subjects with milder phenotypes. Thus, phenotype 2 was composed of older subjects in whom mild airflow limitation was accompanied by mild symptoms, few exacerbations and relatively preserved HRQoL. These older subjects were mostly classified as GOLD stage 2, suggesting that severity of airflow limitation in this population is independent from age. Finally, subjects in phenotype 3 were young subjects with moderate to severe airflow limitation and few comorbidities. Longitudinal follow-up will be necessary to improve our knowledge of the natural history of subjects within these phenotypes.

We used PCA as a means of transforming variables included in the cluster analysis. This methodology, which was applied for the first time in airway diseases, is especially useful for the elimination of noisy variables that may corrupt the cluster structure 18, 22. To confirm the yield of PCA as a preliminary step before cluster analysis, we performed another cluster analysis based on the total number of initial variables, i.e. without previously using PCA for variable reduction. This analysis identified three phenotypes which overlapped markedly for several variables, including age, BMI, dyspnoea and SGRQ (complete data are provided in the online supplementary material). Such overlaps suggest that phenotypes identified by clustering without initial PCA may be less clinically useful, confirming that PCA is a useful way of transforming variables before cluster analysis.

This study has important strengths. First, clinical data were collected prospectively by respiratory physicians. Secondly, the diagnosis of COPD was based on GOLD criteria and validated questionnaires were used to measure patient-related outcomes. Thirdly, the studied population contained subjects in all GOLD classes. Finally, the statistical methods used allow unbiased analyses that are not based on any a priori assumptions. Some limitations also have to be taken into account when interpreting the results. Subjects with incomplete datasets were excluded from the analyses, which necessitated complete data. Importantly, no clinically significant difference was found between included and excluded subjects (see Methods section), except for sex, with female subjects representing 23.2% of included subjects versus 16.0% of excluded subjects, suggesting that females are more likely than males to answer questionnaires. Subjects were recruited in university hospitals and may represent a specific population of COPD subjects. However, the entire range of GOLD severity stages of COPD was represented. Assessment of comorbidities was based on diagnosed comorbidities and not on systematic diagnostic work-up, preventing us from taking clinically occult diseases into account. Exacerbations were analysed as self-reported exacerbations, which may result in underestimation of exacerbation numbers 23. However, this approach corresponds to what happens in real life when a physician characterises a patient. Phenotyping was exclusively based on clinical variables, spirometry and questionnaires, but no imaging or biomarker data were analysed. Our approach was appropriate for identification of clinically based phenotypes that can be used in daily practice. It is possible that inclusion of other variables relevant to the pathogenesis of COPD (e.g. bronchodilator reversibility, peak flow variability, atopy, α₁-antitrypsin status, emphysema or sputum production, eosinophilic airway inflammation, exhaled nitric oxide fraction or other biomarkers) may have increased our ability to identify phenotypes. It is also possible that current treatments have disease-modifying effects that affected our phenotypes and/or that some treatment responses vary depending on the clinical phenotype. Further studies will be required to explore these hypotheses.

In the present study, we defined COPD using a fixed FEV₁/FVC ratio <0.7, although this ratio is sex- and age-dependent. The purposes of this choice were to stay in line with the current GOLD guidelines 1, to use the criterion most often referred to in routine practice, and to allow comparison of the data with previous literature. It may have resulted in the exclusion from our cohort of young subjects with mild airflow obstruction (FEV₁/FVC less than the lower limit of normal (LLN) but >0.7). Furthermore, it resulted in the inclusion of subjects with an FEV₁/FVC <0.7 but greater than the LLN (n = 56, 17.3% of subjects). In order to examine whether the definition of airflow limitation using the LLN instead of the fixed ratio had an impact on our conclusion, we performed another cluster analysis based on the 266 patients in this cohort with an FEV₁/FVC less than the LLN (results are provided in the online supplementary material). An important and reassuring finding was that this analysis identified four phenotypes that were very comparable to those identified by our previous set of analyses in the 322 patients with FEV₁/FVC <0.7. Thus, although our “mild phenotype in older patient” largely disappeared (due to the removal of older patients with mild airflow limitation using LLN criteria), the other phenotypes were very similar and our main conclusion remains: patients with similar airflow limitation (FEV₁) had different symptoms (dyspnoea) and outcomes (exacerbation numbers and predicted mortality), and differed in terms of age and comorbidities, further indicating the robustness of our analyses.

There was a strong inverse correlation between GOLD classification and BMI. These data confirm and extend findings by Vestbo et al. 24, who reported that BMI was reduced in GOLD stage 4 subjects compared with subjects with milder airflow obstruction. However, the results of our PCA-based cluster analysis suggested that the relationship between BMI and FEV₁ is affected by age, with higher BMI found in older subjects.

We found major differences in the levels of dyspnoea in subjects classified at similar GOLD stages. Cluster analysis of the relationships between clinical variables indicated that dyspnoea showed only moderate correlation with FEV₁, confirming previous studies 25. Comparing subjects who could not be differentiated based on FEV₁ (phenotypes 3 and 4), we found that subjects with increased dyspnoea (phenotype 4) were older, had increased prevalence of chronic heart failure, and were mildly overweight. We speculate that chronic heart failure and reduced physical activity may explain, at least in part, increased dyspnoea in these subjects. Indeed, other studies found that daily activity (which was not assessed in our cohort) is decreased even in subjects with mild to moderate airflow limitation, and that this decrease is associated with increased dyspnoea, chronic heart failure and increased mortality 26, 27.

Although BOD scores (and, therefore, predicted mortality) differed significantly between the four identified phenotypes, a marked overlap was observed. Indeed, these phenotypes take into account important patient characteristics that are not captured in the BOD score (e.g. age, exacerbation history, quality of life, comorbidities and depression).

These data have important implications for patient care. International guidelines recommend adaptation of therapies based on severity of COPD, as assessed by post-bronchodilator FEV₁ and symptoms 1. We suggest that this strategy is appropriate for subjects with predominant respiratory disease (phenotype 1), but not in subjects with important extrapulmonary disease, i.e. comorbidities (phenotype 4). The development of patient-oriented rather than single-disease guidelines may prove very useful for management of subjects with multiple chronic diseases.

These data also have major implications for clinical trials. We showed that subjects with similar GOLD stages had very different clinical characteristics, including symptoms, comorbidities and predicted mortality (as determined using BOD index). We speculate that the inclusion of subjects with similar FEV₁, but different risks or causes of mortality, may have contributed to the negative results reported in large therapeutic trials of inhaled therapies in COPD subjects 5, 6. Indeed, relative mortality risk reduction depends not only on the beneficial effect of treatment but also on the distribution of baseline mortality risk in the population and on the causes of mortality in each subgroup of subjects 28. This has been best underscored in subjects with high blood pressure for whom mortality risk depends not only on blood pressure levels but also on various coexisting conditions (e.g. age, diabetes, high cholesterol level and smoking) 28. We suggest that future clinical studies should analyse results based on risk assessment (e.g. mortality risk) rather than on a single parameter (e.g. FEV₁).

In summary, current FEV₁-based GOLD classification appears inappropriate for guiding therapy and for stratification of subjects in clinical trials, because it does not discriminate subjects with markedly different phenotypes. Our study described an original statistical methodology that allowed the identification of clinical COPD phenotypes. This methodology could be applied to other COPD cohorts to examine whether similar, or different, phenotypes are present in different populations. The prognostic value of these phenotypes should also be evaluated in longitudinal studies. Such studies will provide data on the relevance of these new phenotypes and will allow comparisons of outcome prediction between phenotypes and the GOLD classification or composite indices. We propose that dissemination of this original approach could result in better phenotypic characterisation, which may prove useful in daily practice and clinical trials. We further propose that data from large clinical trials should be re-analysed using this methodology for classification of patients according to their clinical characteristics at study entry.

Acknowledgments

The Initiatives BPCO study group: G. Brinchault-Rabin (Rennes), P-R. Burgel (Cochin, Paris), D. Caillaud (Clermont-Ferrand), P. Carré (Carcassonne), P. Chanez (Marseille), A. Chaouat (Vandoeuvre les Nancy), I. Court-Fortune (Saint-Etienne), A. Cuvelier (Rouen), R. Escamilla (Toulouse), C. Gut-Gobert (Brest), G. Jebrak (Paris), F. Lemoigne (Nice), P. Nesme-Meyer (Lyon), T. Perez and I. Tillie-Leblond (Lille), C. Perrin (Cannes), C. Pinet (Toulon), C. Raherison (Bordeaux) and N. Roche (Hôtel Dieu, Paris, France).

Footnotes

For editorial comments see page 472.
This article has online supplementary material available from www.erj.ersjournals.com
Support Statement
This work was funded by unrestricted grants from Boehringer Ingelheim France (Paris, France) and Pfizer France (Paris, France).
Statement of Interest
A statement of interest for this study can be found at www.erj.ersjournals.com/misc/statements.dtl

Received November 5, 2009.
Accepted December 24, 2009.

REFERENCES

1. Rabe KF, Hurd S, Anzueto A, et al. Global strategy for the diagnosis, management, and prevention of chronic obstructive pulmonary disease: GOLD executive summary. Am J Respir Crit Care Med 2007; 176: 532–555.
2. Celli BR, Cote CG, Lareau SC, et al. Predictors of survival in COPD: more than just the FEV₁. Respir Med 2008; 102 Suppl. 1: S27–S35.
3. Mannino DM, Thorn D, Swensen A, et al. Prevalence and outcomes of diabetes, hypertension and cardiovascular disease in COPD. Eur Respir J 2008; 32: 962–969.
4. McGarvey LP, John M, Anderson JA, et al. Ascertainment of cause-specific mortality in COPD: operations of the TORCH Clinical Endpoint Committee. Thorax 2007; 62: 411–415.
5. Calverley PM, Anderson JA, Celli B, et al. Salmeterol and fluticasone propionate and survival in chronic obstructive pulmonary disease. N Engl J Med 2007; 356: 775–789.
6. Tashkin DP, Celli B, Senn S, et al. A 4-year trial of tiotropium in chronic obstructive pulmonary disease. N Engl J Med 2008; 359: 1543–1554.
7. Criner GJ, Sternberg AL. National Emphysema Treatment Trial: the major outcomes of lung volume reduction surgery in severe emphysema. Proc Am Thorac Soc 2008; 5: 393–405.
8. Celli BR. Roger S. Mitchell lecture. Chronic obstructive pulmonary disease phenotypes and their clinical relevance. Proc Am Thorac Soc 2006; 3: 461–465.
9. Turino GM. COPD and biomarkers: the search goes on. Thorax 2008; 63: 1032–1034.
10. Dornhorst AC. Respiratory insufficiency (Frederick Price Memorial Lecture). Lancet 1955; 1: 1185–1187.
11. Wardlaw AJ, Silverman M, Siva R, et al. Multi-dimensional phenotyping: towards a new taxonomy for airway disease. Clin Exp Allergy 2005; 35: 1254–1262.
12. Burgel PR, Nesme-Meyer P, Chanez P, et al. Cough and sputum production are associated with frequent exacerbations and hospitalizations in COPD subjects. Chest 2009; 135: 975–982.
13. Quanjer PH, Tammeling GJ, Cotes JE, et al. Lung volumes and forced ventilatory flows. Eur Respir J 1993; 6 Suppl. 16: 5–40.
14. Celli B, Jones P, Vestbo J, et al. The multidimensional BOD: association with mortality in the TORCH trial. Eur Respir J 2008; 32 Suppl. 52: 42s–
15. Celli BR, Cote CG, Marin JM, et al. The body-mass index, airflow obstruction, dyspnea, and exercise capacity index in chronic obstructive pulmonary disease. N Engl J Med 2004; 350: 1005–1012.
16. Ng TP, Niti M, Tan WC, et al. Depressive symptoms and chronic obstructive pulmonary disease: effect on mortality, hospital readmission, symptom burden, functional status, and quality of life. Arch Intern Med 2007; 167: 60–67.
17. Jones PW, Quirk FH, Baveystock CM, et al. A self-complete measure of health status for chronic airflow limitation. The St. George's Respiratory Questionnaire. Am Rev Respir Dis 1992; 145: 1321–1327.
18. Vogt W, Nagel D. Cluster analysis in diagnosis. Clin Chem 1992; 38: 182–198.
19. Casanova C, Cote C, de Torres JP, et al. Inspiratory-to-total lung capacity ratio predicts mortality in patients with chronic obstructive pulmonary disease. Am J Respir Crit Care Med 2005; 171: 591–597.
20. Weatherall M, Travers J, Shirtcliffe PM, et al. Distinct clinical phenotypes of airways disease defined by cluster analysis. Eur Respir J 2009; 34: 812–818.
21. Haldar P, Pavord ID, Shaw DE, et al. Cluster analysis and clinical asthma phenotypes. Am J Respir Crit Care Med 2008; 178: 218–224.
22. Ben-Hur A, Guyon I. Detecting stable clusters using principal component analysis. In: Brownstein MJ, Kohodursky A, eds. Functional Genomics: Methods and Protocols. Totowa, Humana Press, 2003: pp. 159–182.
23. Seemungal TA, Donaldson GC, Paul EA, et al. Effect of exacerbation on quality of life in patients with chronic obstructive pulmonary disease. Am J Respir Crit Care Med 1998; 157: 1418–1422.
24. Vestbo J, Prescott E, Almdal T, et al. Body mass, fat-free body mass, and prognosis in patients with chronic obstructive pulmonary disease from a random population sample: findings from the Copenhagen City Heart Study. Am J Respir Crit Care Med 2006; 173: 79–83.
25. Curtis JR, Deyo RA, Hudson LD. Pulmonary rehabilitation in chronic respiratory insufficiency. 7. Health-related quality of life among patients with chronic obstructive pulmonary disease. Thorax 1994; 49: 162–170.
26. Garcia-Aymerich J, Serra I, Gomez FP, et al. Physical activity and clinical and functional status in COPD. Chest 2009; 136: 62–70.
27. Watz H, Waschki B, Boehme C, et al. Extrapulmonary effects of chronic obstructive pulmonary disease on physical activity: a cross-sectional study. Am J Respir Crit Care Med 2008; 177: 743–751.
28. Kent DM, Hayward RA. Limitations of applying summary results of clinical trials to individual patients: the need for risk stratification. JAMA 2007; 298: 1209–1212.

Vol 36 Issue 3
