Abstract
Our goal was to identify subgroups of adults with cystic fibrosis (CF) at low risk of death within 10 years.
Factor analysis for mixed data followed by Ward's cluster analysis was conducted using 25 variables from 1572 French CF adults in 2005. Rates of death by subgroups were analysed over 10 years. An algorithm was developed using CART (classification and regression tree) analysis to provide rules for the identification of subgroups of CF adults with low rates of death within 10 years. This algorithm was validated in 1376 Canadian CF adults.
Seven subgroups were identified by cluster analysis in French CF adults, including two subgroups with low (∼5%) rates of death at 10 years: one subgroup (22% of patients) was composed of patients with nonclassic CF, the other subgroup (17% of patients) was composed of patients with classic CF but low rates of Pseudomonas aeruginosa infection and diabetes. An algorithm based on CART analysis of data in 2005 allowed us to identify most French adults with low rates of death. When tested using data from Canadian CF adults in 2005, the algorithm identified 287 out of 1376 (21%) patients at low risk (10-year death: 7.7%).
Large subgroups of CF adults share low risk of 10-year mortality.
Large subgroups of adults with cystic fibrosis have very low rates of death at 10 years http://ow.ly/PbS830n2Tkm
Introduction
Cystic fibrosis (CF) is a genetic disease affecting at least 70 000 patients worldwide [1], and is characterised as a multisystemic disease involving primarily the lung, pancreas and liver [2]. CF used to be a devastating disease with premature death in children, but prognosis has improved over the past decades. Recent studies performed in countries with multidisciplinary CF care [3] revealed that almost all CF children now reach adult age [4–6] and that paediatric mortality has become very rare [7, 8]. Despite these improvements, CF adults remain at increased risk of respiratory failure, resulting in death [8] or the need for lung transplantation [9, 10]. Although CF adults aim to achieve normal lives, they often face socioeconomic barriers in their daily lives. Among these barriers, CF adults are usually excluded from access to loan insurance, because insurance companies and banks assume that all CF adults are at high risk of premature death.
Several studies have examined factors associated with short-term (up to 5 years) prognosis using cohorts of CF patients usually obtained from national CF registries [11–14]. These studies have identified individual factors (e.g. low forced expiratory volume in 1 s (FEV1), low body mass index (BMI), pancreatic insufficiency, colonisation with Pseudomonas aeruginosa and/or Burkholderia cepacia, massive haemoptysis, and pneumothorax) associated with poor prognosis that were sometimes combined into prognostic scores [13, 15–18]. Importantly, previous studies were generally designed to identify subgroups at high risk of death and/or lung transplantation and there is a lack of data regarding the clinical profile of CF adults at low risk of poor outcome.
Cluster analysis is a generic term for several statistical methods that allow grouping patients who share multiple characteristics [19]. Such exploratory analyses have been successfully implemented in large groups of patients with asthma [20], chronic obstructive pulmonary disease [21, 22] and non-CF bronchiectasis [23], in which they have identified specific subgroups of patients sharing clinical outcomes (phenotypes) and/or biological pathways (endotypes). In CF adults, we identified only one study that used cluster analysis in 211 patients for identification of patient subgroups [24].
In the present study, we sought to identify subgroups of adult CF patients at low risk of death at 10 years. This objective was based on a request from the French government and private insurance companies to identify criteria that would allow proposing 10-year access to bank loans for adults with CF, who are currently not entitled to loan insurance because of their disease. Our strategy was to use cluster analysis to identify subgroups of French adults with CF experiencing low death rates over 10 years and to use CART (classification and regression tree) analysis for developing an algorithm that would allow the identification of these subgroups. We then externally tested this algorithm using data from Canadian CF adults.
Methods
Patients
The present study was conducted using the French CF Registry, which contains longitudinal data from at least 90% of CF patients in France [25]. The registry collects data once a year in CF patients followed in the network of accredited CF centres. Eligible patients were adult (≥18 years) CF patients living without lung transplantation in the 2005 registry. Patients listed for lung transplantation in 2005 were excluded due to the possibility that they underwent transplantation between their last visit (when the data were sent to the registry) and the end of the year. Patients with any history of cancer were excluded from the analysis because this comorbidity may have affected prognosis. Patients with missing data for critical prognosis factors (spirometry and/or BMI) were also excluded from the analyses.
Data were further obtained from the Canadian CF Registry, which represents >95% of Canadians with CF. Data were extracted from 2005 and selection of patients was performed by applying the aforementioned methodology developed for the French CF Registry.
Statistical analysis plan
First, French CF adults were classified into subgroups based on the results of a cluster analysis of data obtained in the cohort in 2005. The prognostic relevance of the identified subgroups was established by examining their association with death from any cause as of December 31, 2015 (10-year death). We next used CART analysis to develop an algorithm aimed at the identification of clusters containing patients with low rates (∼5%) of death at 10 years in the French CF Registry. This algorithm was externally tested using data obtained in the Canadian CF Registry. Survival and transplantation-free survival were analysed using Kaplan–Meier curves and the log-rank test. Risk of any death or risk of the combined event (death without transplantation or occurrence of lung transplantation) was analysed using Cox models. Data are presented as median (interquartile range (IQR)) or number (percentage). Analyses were performed using SAS version 9.2 (SAS Institute, Cary, NC, USA).
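As an illustration of the Kaplan–Meier (product-limit) estimate mentioned above, the following is a minimal Python sketch. It is not the SAS code used in the study; the function name and the toy follow-up data are hypothetical and were chosen only to show how the survival probability is updated at each observed death.

```python
from collections import Counter

def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time (product-limit estimator).

    times  : follow-up time of each patient (e.g. years since study entry)
    events : 1 if death was observed at that time, 0 if the patient was censored
    """
    deaths = Counter(t for t, e in zip(times, events) if e)
    exits = Counter(times)  # every patient leaves the risk set at their own time
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = deaths.get(t, 0)
        if d:
            # Deaths at time t are counted before patients censored at time t.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        at_risk -= exits[t]
    return curve

# Toy data, NOT taken from the registry: 10 hypothetical patients, 10-year follow-up.
toy_times = [2, 3, 3, 5, 6, 8, 10, 10, 10, 10]
toy_events = [1, 0, 1, 1, 0, 1, 0, 0, 0, 0]
curve = kaplan_meier(toy_times, toy_events)
for t, s in curve:
    print(f"year {t}: S(t) = {s:.3f}")
```

Patients who are alive at the end of follow-up are treated as censored: they stay in the risk set until their last observed time but never lower the survival estimate.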
Cluster analysis
Variables were selected for inclusion in the cluster analysis based on clinical knowledge (i.e. previously described association with prognosis in CF patients) and were usual descriptors of adult CF patients. A complete description and definition of the 25 variables selected in this analysis is provided in supplementary table S1. Briefly, these variables included sex, age as of December 31, 2005, BMI (kg·m⁻²), FEV1 % pred, CF transmembrane conductance regulator (CFTR) mutations, pancreatic status, diabetes mellitus, liver cirrhosis, haemoptysis, pneumothorax, airway infection, and variables related to healthcare utilisation and/or therapeutic management (long-term oxygen therapy, noninvasive ventilation (NIV), treatment with oral steroids for >3 months, insulin, number of intravenous antibiotic courses in 2005 and number of hospitalisations ≥48 h in 2005). Identification of homogeneous subgroups of adults with CF was achieved using FAMD (factor analysis for mixed data) [26, 27], followed by classification of patients using Ward's agglomerative hierarchical cluster analysis [21, 28]. Briefly, FAMD transformed the 25 selected variables into 25 new independent variables called “components”, each of which is a linear combination of the original variables (defined by an eigenvector). The eigenvalue of each component is a measure of the variability it captures. A component with an eigenvalue <1 contributes little to explaining the relationships between the original variables and was thus not subjected to further analysis. We then performed a cluster analysis based on the significant components (i.e. those with an eigenvalue >1) identified in FAMD. Cluster analysis was performed using Ward's method, in which grouping is based on minimising the within-cluster sum of squares, such that subjects in the same cluster were more similar to each other than to subjects in another cluster.
We used pseudo-F and pseudo-t² statistics, and visual assessment of the dendrogram, to determine the optimal number of clusters in the data [21].
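The two steps described above (retaining components with an eigenvalue >1, then Ward's agglomerative clustering on the retained component scores) can be sketched in plain Python as follows. This is a naive illustration under stated assumptions: the FAMD step itself is not shown, the eigenvalues and component scores are toy values rather than study data, and the function names are hypothetical.

```python
def significant_components(eigenvalues):
    """Indices of components kept for clustering (eigenvalue > 1)."""
    return [k for k, ev in enumerate(eigenvalues) if ev > 1.0]

def ward_clusters(points, n_clusters):
    """Naive Ward agglomerative clustering; returns clusters as lists of row indices."""
    # Each cluster is stored as (member indices, centroid).
    clusters = [([i], list(p)) for i, p in enumerate(points)]

    def merge_cost(a, b):
        # Increase in total within-cluster sum of squares if a and b are merged.
        na, nb = len(a[0]), len(b[0])
        dist2 = sum((x - y) ** 2 for x, y in zip(a[1], b[1]))
        return na * nb / (na + nb) * dist2

    while len(clusters) > n_clusters:
        # Merge the pair whose fusion increases the within-cluster sum of squares least.
        i, j = min(
            ((i, j) for i in range(len(clusters)) for j in range(i + 1, len(clusters))),
            key=lambda ij: merge_cost(clusters[ij[0]], clusters[ij[1]]),
        )
        a, b = clusters[i], clusters[j]
        na, nb = len(a[0]), len(b[0])
        centroid = [(na * x + nb * y) / (na + nb) for x, y in zip(a[1], b[1])]
        merged = (a[0] + b[0], centroid)
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return [sorted(members) for members, _ in clusters]

# Toy values only (hypothetical eigenvalues and 2-D component scores).
kept = significant_components([3.2, 1.8, 1.1, 0.9, 0.4])
toy_scores = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.3), (5.0, 5.1), (5.2, 4.9), (4.8, 5.0)]
groups = sorted(ward_clusters(toy_scores, 2))
print(kept, groups)
```

In this sketch the number of clusters is passed in by hand; in the study it was chosen from the pseudo-F and pseudo-t² statistics and the dendrogram.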
Prognostic outcomes
To examine the prognostic relevance of the identified subgroups (clusters) of patients, we examined vital status in the French CF Registry, which contains reliable data on the occurrence of death in CF patients in France [8]. We first analysed data using rates of any death at 5 and 10 years. As lung transplantation has become the standard of care in CF patients with severe respiratory insufficiency, we also examined the occurrence of lung transplantation and death rates according to transplantation status (death without lung transplantation versus death after lung transplantation). Finally, we examined a combined outcome of death without lung transplantation or the occurrence of lung transplantation, as described previously [13].
CART analysis
The development of an algorithm to assign French adult CF patients to the subgroups at low rates of 10-year death (cluster 1 and cluster 2) was achieved using CART analysis, as described previously [22]. Briefly, variables included in this analysis were those selected for the cluster analysis (n=25 variables). Variables retained by the analysis and threshold values of these variables were obtained by CART analysis (a detailed explanation can be found in the supplementary material). This algorithm was then externally tested using data from the Canadian CF Registry.
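To give an idea of how a CART analysis retains variables, the sketch below scores candidate yes/no variables by the weighted Gini impurity of the two groups they create and keeps the best one. The Gini criterion is an assumption made for illustration (the text does not state which splitting criterion was used), thresholds for continuous variables are not handled, and the toy patients and variable names are hypothetical.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)

def best_binary_split(rows, labels, features):
    """Return (feature, weighted impurity) for the yes/no feature that best separates labels."""
    best = None
    n = len(rows)
    for f in features:
        yes = [y for r, y in zip(rows, labels) if r[f]]
        no = [y for r, y in zip(rows, labels) if not r[f]]
        score = (len(yes) * gini(yes) + len(no) * gini(no)) / n
        if best is None or score < best[1]:
            best = (f, score)
    return best

# Toy data, NOT from the registry. Label 1 = patient belongs to a low-risk cluster.
toy_rows = [
    {"p_aeruginosa": False, "sex_female": True},
    {"p_aeruginosa": False, "sex_female": False},
    {"p_aeruginosa": True, "sex_female": True},
    {"p_aeruginosa": True, "sex_female": False},
    {"p_aeruginosa": False, "sex_female": True},
    {"p_aeruginosa": True, "sex_female": False},
]
toy_labels = [1, 1, 0, 0, 1, 0]
split = best_binary_split(toy_rows, toy_labels, ["sex_female", "p_aeruginosa"])
print(split)
```

A full tree is obtained by applying the same search recursively within each resulting group until a stopping rule is met.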
Results
Patients
The French CF Registry contained 1942 adult CF patients as of December 31, 2005. A total of 370 subjects were excluded from the analysis; reasons for excluding these patients included a previous history of lung transplantation, being on a waiting list for transplantation in 2005, death in 2005, previous history of cancer, and missing data for lung function and/or BMI. A flowchart describing the study population is presented in figure 1. Thus, further analyses were conducted on 1572 adult CF patients; clinical characteristics are presented in table 1. From December 31, 2005 to December 31, 2015, lung transplantation was performed in 402 (25.6%) patients and death occurred in 232 (14.8%) patients, including 134 (8.5%) patients who had not undergone lung transplantation.
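The counts and percentages reported above can be checked with a few lines of Python; the helper name `pct` is introduced here for illustration only.

```python
registry_adults = 1942   # adult CF patients in the registry as of December 31, 2005
excluded = 370           # patients excluded from the analysis
analysed = registry_adults - excluded

def pct(count, total):
    """Percentage rounded to one decimal place."""
    return round(100 * count / total, 1)

transplanted = pct(402, analysed)        # lung transplantation over 10 years
died = pct(232, analysed)                # death from any cause
died_without_tx = pct(134, analysed)     # death without lung transplantation
print(analysed, transplanted, died, died_without_tx)
```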
Cluster analysis
Cluster analysis was performed on 25 variables (listed in supplementary table S1), using FAMD followed by Ward's hierarchical classification. The dendrogram resulting from this analysis is shown in supplementary figure S1. Based on pseudo-F and pseudo-t² statistics, and visual assessment of the dendrogram, the data could be optimally classified into seven clusters. Table 2 presents the clinical characteristics of the 1572 adult CF patients according to these seven clusters.
Cluster 1 was composed of young adults (median age 21.8 years) with classic CF characterised by diagnosis early in life and pancreatic insufficiency, but with mild to moderate impairment in lung function, low rates of chronic P. aeruginosa airway infection and no diabetes. Cluster 2 was composed of older adults (median age 30.2 years) with nonclassic CF characterised by late diagnosis, high rates of class IV or V CFTR mutations or incomplete genotype, high rates of pancreatic sufficiency, mild to moderate impairment in lung function and moderate rates of chronic P. aeruginosa airway infection. Cluster 3 was composed of patients with classic CF, moderate to severe lung function impairment, high rates of P. aeruginosa airway infection; two-thirds of these patients had i.v. antibiotics and one-third had diabetes mellitus. Cluster 4 was composed of patients with classic CF, high rates of pneumothorax (and thoracic surgery) and high rates of airway nontuberculous mycobacteria (NTM). Cluster 5 was composed of patients with classic CF and very high rates of respiratory insufficiency treated with long-term oxygen therapy or NIV. Cluster 6 was composed of patients with classic CF and chronic B. cepacia airway infection. Cluster 7 was composed of patients with classic CF, very high rates of treated aspergillosis and diabetes mellitus; all these patients were on oral steroids.
Prognostic relevance of the clusters
To assess the prognostic relevance of the seven clusters identified in the cluster analysis, we next examined rates of death and/or lung transplantation over 10 years in each cluster. Kaplan–Meier curves of survival (outcome: any death) or transplantation-free survival (outcome: death without lung transplantation/occurrence of lung transplantation) are shown in figure 2. Patients in cluster 1 (classic CF/low P. aeruginosa) and cluster 2 (nonclassic CF/late diagnosis) had the best prognosis with ∼5% death at 10 years; 10–15% of these patients underwent lung transplantation. Patients in cluster 3 (classic CF/moderate) and cluster 4 (classic CF/pneumothorax/NTM) had <10% death at 5 years, which increased to 13.5% and 20.6% at 10 years, respectively. Patients in cluster 5 (classic CF/respiratory insufficiency) had very high rates (67%) of lung transplantation and 45.6% death at 10 years. Patients in cluster 6 (classic CF/B. cepacia) had 51.4% death at 10 years with only 32.4% lung transplantation. Patients in cluster 7 (classic CF/allergic bronchopulmonary aspergillosis) had 37.3% death at 10 years with 31.4% lung transplantation. Table 3 summarises the main descriptors of patients in each cluster and prognostic outcomes in each cluster. Comparisons of rates of any death or rates of death without transplantation/occurrence of lung transplantation are shown in figure 3.
Algorithm for the identification of CF adults at low rates of 10-year death
As cluster 1 and cluster 2 were shown to have low (∼5%) and comparable death rates at 10 years, we next developed an algorithm that would allow us to identify these patients (n=613 patients: 262 patients in cluster 1 and 351 patients in cluster 2) using data obtained at study entry in 2005. The algorithm is presented in figure 4: patients with at least one factor negatively affecting the prognosis (see inset list in figure 4) were not considered at low risk; patients who did not have these negative factors were at low risk if they had no P. aeruginosa or, when they had P. aeruginosa, if they were pancreatic sufficient. The algorithm identified 515 out of 613 patients (84% of patients in clusters 1 and 2) and the rate of death in these 515 patients was 3.9% at 10 years. Kaplan–Meier analysis comparing 10-year death at lower risk versus higher risk according to the algorithm in French patients is presented in figure 5a.
Next, we tested this algorithm using data from adult patients in the Canadian CF Registry in 2005 (characteristics of patients are given in supplementary tables S3 and S4). Three variables necessary for excluding patients at higher risk according to the algorithm were unavailable in the Canadian CF Registry in 2005: infection with NTM, use of systemic steroids for >3 months and use of NIV. The algorithm was thus used on Canadian data without considering these variables: it identified 287 out of 1376 (21%) patients at low risk (10-year death: 7.7%). Kaplan–Meier analysis comparing 10-year death at lower risk versus higher risk according to the algorithm in Canadian patients is presented in figure 5b.
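The decision rule described above can be written as a short function. This is a sketch under stated assumptions: the full inset list of negative factors from figure 4 is not reproduced in the text, so the default list below contains only the three exclusion variables named above and must be completed from the figure before any real use; the field names are hypothetical.

```python
# Partial placeholder list: only the three exclusion variables named in the text.
# The complete list of negative factors is given in the inset of figure 4.
KNOWN_NEGATIVE_FACTORS = ("ntm_infection", "systemic_steroids_over_3_months", "niv")

def is_low_risk(patient, negative_factors=KNOWN_NEGATIVE_FACTORS):
    """Apply the rule: no negative factor, and (no P. aeruginosa or pancreatic sufficient)."""
    if any(patient.get(name, False) for name in negative_factors):
        return False                      # at least one negative factor: not low risk
    if not patient["p_aeruginosa"]:
        return True                       # no negative factor and no P. aeruginosa
    return bool(patient["pancreatic_sufficient"])

# Hypothetical patients, for illustration only.
a = {"p_aeruginosa": False, "pancreatic_sufficient": False}
b = {"p_aeruginosa": True, "pancreatic_sufficient": True}
c = {"p_aeruginosa": True, "pancreatic_sufficient": False}
d = {"p_aeruginosa": False, "pancreatic_sufficient": True, "niv": True}
print([is_low_risk(p) for p in (a, b, c, d)])
```

When a variable is unavailable, as in the Canadian data, it can simply be left out of `negative_factors`; the rule then runs on the remaining factors, which may classify some higher-risk patients as low risk.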
Discussion
In the present study, we sought to identify subgroups of adults with CF at low risk of mortality at 10 years. We performed a cluster analysis on data from adult CF patients contained in the French CF Registry in 2005 and found that a large proportion (39%) of CF adults had low (∼5%) death rates at 10 years. We next used CART analysis to develop an algorithm that allowed the identification of 84% of these patients, using data collected in 2005. The algorithm was then externally tested using data from the Canadian CF Registry in 2005 and showed consistent findings. These results indicate that many CF adults have a favourable long-term prognosis, a finding that has important socioeconomic consequences.
An important finding was that 39% of adult CF patients, grouped in cluster 1 (n=262 patients (17%)) and in cluster 2 (n=351 patients (22%)), shared rather good prognoses with 1% death at 5 years and 5% death at 10 years. These data indicate that a large proportion of adult CF patients have improved prognosis in the current era, compared with previous decades, which was in part related to lung transplantation (which occurred in 10–15% of patients in these clusters). Most patients with at least one class IV or V mutation were found in cluster 2, confirming previous data showing that patients with residual function mutations often have nonclassic CF (characterised by late diagnosis and milder clinical features) and lower risk of death [15, 29]. Cluster 2 was also composed of a large group of patients with incomplete CFTR genotypes, in whom CF diagnosis could be questioned; this finding was in agreement with a previous report showing that patients found in CF registries and with less well-documented CF diagnosis usually had milder disease [30]. An important and novel finding of our study was that patients in cluster 1, who had a prognosis comparable to patients in cluster 2, had classic CF characterised by high occurrence of two loss-of-function CFTR alleles, diagnosis early in life and pancreatic insufficiency. Interestingly, these latter patients were younger than patients in other clusters, had higher lung function and nutritional status, very low rates of chronic P. aeruginosa infection, and no diabetes. These findings suggest that the prognosis is currently improving in some adults with classic CF. We speculate that these findings are related to improvement in lifelong care (e.g. nutrition and strategies for eradication of P. aeruginosa infection).
The study also highlighted the fact that despite comprehensive CF care and lung transplantation, ∼10% of patients belonged to subgroups with poor prognoses at 10 years. Specifically, subjects with respiratory insufficiency (cluster 5: 8% of patients) and subjects with B. cepacia infection (cluster 6: 2% of patients) had comparable mortality rates at 10 years (45.6% and 51.4%, respectively). Of note, the rate of lung transplantation in cluster 6 was about half that in cluster 5 (32.4% versus 67%), confirming that B. cepacia complex infection is still often considered a contraindication to lung transplantation [9].
The present study has several strengths. It was conducted using a nationwide registry covering >90% of the French CF population, with a limited amount of missing data, which captures reliable data on CF patients (including patients who underwent lung transplantation) and survival [8]. Subgroups of CF patients were constructed using factorial and cluster analyses of variables with clinical relevance in the description of CF adults; importantly, the relevance of these subgroups was established using prognostic data (e.g. rates of death and/or lung transplantation) that did not participate in the construction of subgroups. CART analysis allowed identification of a relatively simple algorithm that highlighted patients at low rates of 10-year death in France; importantly, this algorithm was externally tested using data from the Canadian CF Registry.
We also recognise limitations. The present study was designed to identify subgroups of individuals sharing comparable characteristics and long-term prognosis, but was not intended to determine prognosis at the individual level in adults with CF. Thus, belonging to a group considered “not at low risk” does not necessarily mean that the individual prognosis of the patient was poor, but rather that the 10-year risk of death in this group of patients was considered too high for providing loan insurance by banks and insurance companies. Although our algorithm identified a subgroup of adult CF patients at low risk of death, it did not identify all adults at low risk of death (supplementary tables S5 and S6). Some of the data (i.e. infection with NTM, use of systemic steroids for >3 months and use of NIV) necessary to run the algorithm were unavailable in the Canadian CF Registry in 2005, which may have resulted in the selection by the algorithm of a limited number of patients at higher risk of death. However, the algorithm consistently identified patients with low rates (7.7%) of death at 10 years in Canadian CF adults despite this limitation. The effect of missing variables was likely limited by the low prevalence of patients treated with steroids, having NTM infection or treated with NIV and by the fact that some of these patients were probably excluded by other variables (e.g. patients treated with NIV usually have low FEV1 and/or oxygen therapy). Finally, the study was limited to CF adults and excluded the paediatric population. This choice was related to the fact that rates of death and lung transplantation have become extremely low in paediatric CF patients in France [8], as in other developed countries [7]. Although our results were obtained using data from France and Canada, they are likely relevant to other countries with comparable access to specialised CF care and to lung transplantation.
The present study highlighted the recent evolution of prognosis in CF adults. Although studies performed 10–15 years ago suggested that intrinsic patient characteristics (e.g. CFTR genotype associated with residual CFTR function) were associated with improved prognosis [15, 29], the present study extends these findings by suggesting that improved prognosis is observed in some adults (cluster 1) with classic CF who were exposed to appropriate CF care (e.g. eradication strategies for P. aeruginosa [31]) and who can benefit from lung transplantation. The finding that our algorithm was able to detect a large proportion of patients with low rates of 10-year death in France and in Canada led to a recent agreement between the French government and insurance companies to provide access to loan insurance for CF adults at low risk of 10-year death identified with this algorithm. Although the agreement is limited to a fraction of the overall CF population, it constitutes an important first step on the road to providing normal access to mortgage insurance for all patients with CF [32].
In conclusion, CF has progressively evolved from a devastating paediatric disease, causing early death in children, to a disease where almost all patients reach adult age in developed countries. The present study further indicated that large subgroups of adults with CF have improved longevity with low rates of 10-year death, confirming that prognosis continues to improve in the CF adult population. These latter data further indicate that lifelong specialised CF care results in better health status and prolonged survival, even in subjects with classic CF. We speculate that the future population of CF adults will be a mix of patients with milder disease (especially younger adults who had lifelong exposure to high-quality CF care) and older patients with more severe disease, often due to lack of intensive CF care in their early years. Of note, these results were obtained at a time when novel drugs targeting defects in CFTR were not widely available. Future use of these disease-modifying agents will likely further reduce the survival gap that still exists between CF patients and the general population.
Supplementary material
Supplementary material ERJ-01943-2018_Supplement
Footnotes
This article has supplementary material available from erj.ersjournals.com
Conflict of interest: P-R. Burgel reports grants and personal fees from Boehringer Ingelheim, and personal fees from AstraZeneca, Chiesi, GSK, Novartis, Vertex, Zambon and MSD, outside the submitted work.
Conflict of interest: L. Lemonnier has nothing to disclose.
Conflict of interest: C. Dehillotte has nothing to disclose.
Conflict of interest: J. Sykes has nothing to disclose.
Conflict of interest: S. Stanojevic has nothing to disclose.
Conflict of interest: A.L. Stephenson receives a stipend from Cystic Fibrosis Canada for her work as Director of the Canadian CF Registry, outside the submitted work.
Conflict of interest: J-L. Paillasseur has nothing to disclose.
Support statement: This study was funded by the French CF association Vaincre la Mucoviscidose. Funding information for this article has been deposited with the Crossref Funder Registry.
- Received November 19, 2017.
- Accepted December 14, 2018.
- Copyright ©ERS 2019