Abstract
Sarcoidosis is a rare disease of unknown cause with wide heterogeneity in clinical features and outcomes. We aimed to explore sarcoidosis phenotypes and their clinical relevance with particular attention to extrapulmonary subgroups.
The Epidemiology of Sarcoidosis (EpiSarc) study is a French retrospective multicentre study. Sarcoidosis patients were identified through national hospitalisation records using appropriate codes from 11 hospital centres between 2013 and 2016 according to a standardised protocol. Medical charts were reviewed. The phenotypes of sarcoidosis were defined using a hierarchical cluster analysis.
A total of 1237 patients were included (562 men and 675 women). The mean age at sarcoidosis diagnosis was 43.5±13 years. Hierarchical cluster analysis identified five distinct phenotypes according to organ involvement and disease type and symptoms: 1) erythema nodosum, joint involvement and hilar lymph nodes (n=180); 2) eye, neurological, digestive and kidney involvement (n=137); 3) pulmonary involvement with fibrosis and heart involvement (n=630); 4) lupus pernio and a high percentage of severe involvement (n=41); and 5) hepatosplenic, peripheral lymph node and bone involvement (n=249). Phenotype 1 was associated with being European/Caucasian and female and with non-manual work, phenotype 2 with being European/Caucasian, and phenotypes 3 and 5 with being non-European/Caucasian. The labour worker proportion was significantly lower in phenotype 5 than in the other phenotypes.
This multicentre study confirms the existence of distinct phenotypes of sarcoidosis, with a non-random distribution of organ involvement. These phenotypes differ according to sex, geographical origin and socioprofessional category.
Abstract
There are five distinct phenotypes of sarcoidosis, with a non-random distribution of organ involvement. These five phenotypes differ according to sex, geographical origin and socioprofessional category. https://bit.ly/3iCurZK
Introduction
Sarcoidosis is a rare, heterogeneous, multisystemic granulomatous disease of unknown cause. It preferentially affects the lungs and intrathoracic lymph nodes. Extrapulmonary involvement is observed in 30%–50% of patients and most frequently affects the skin, eyes, lymphatic organs, liver and spleen [1]. Sarcoidosis has a variable clinical presentation and heterogeneous outcomes, which remain unexplained.
The first clinical phenotype of sarcoidosis was described by the Swedish pulmonologist Sven Löfgren in 1946 [2]. So-called Löfgren syndrome is mostly seen in young European/Caucasian patients with acute-onset erythema nodosum, bilateral hilar lymphadenopathy, fever or migratory polyarthritis and typically has a favourable outcome. Several studies have since described the heterogeneity of sarcoidosis according to ethnicity, sex and age [3–7] and identified phenotypes, although others have yet to be recognised. The determination of sarcoidosis phenotypes was recently supported by the clustering approach of the Genotype–Phenotype Relationship in Sarcoidosis (GenPhenReSa) study, which included Caucasian patients from an international panel of centres [8]. This approach identified five subgroups of patients characterised by pulmonary and extrapulmonary organ involvement. Subgroups of extrapulmonary involvement have also been reported in a multicentre American cohort, in which skin lesions were the most important feature [9]. More recently, 18F-2-fluoro-2-deoxy-d-glucose (FDG) positron emission tomography (PET) coupled with cluster analysis identified an ordered stratification into four phenotypes in a small sample of patients [10]. However, further studies are necessary to confirm these findings and to allow a better understanding of sarcoidosis, phenotype expression and treatment guidance [11].
The aim of the Epidemiology of Sarcoidosis (EpiSarc) study was to identify and explore sarcoidosis phenotypes using an unsupervised analytical approach in a well-characterised cohort of French sarcoidosis patients with extrathoracic involvement.
Patients and methods
The EpiSarc study was a multicentre (11 French centres) retrospective cross-sectional study. Patients were identified using the French hospitalisation database (Programme de Médicalisation des Systèmes d'Information), which records the medical codes of hospitalisations (according to the International Classification of Diseases, 10th revision). Selected patients were hospitalised between January 2013 and December 2016 with a medical code corresponding to sarcoidosis (D86, G532 and/or M633), as newly diagnosed or chronic disease cases. Medical charts were manually reviewed to verify the inclusion criteria. The inclusion criteria were as follows: 1) a sarcoidosis diagnosis according to the American Thoracic Society and World Association for Sarcoidosis and Other Granulomatous Diseases (WASOG) recommendations, including compatible clinico-radiological findings, histological evidence of noncaseating granuloma and the exclusion of other granulomatous diseases [12]; and 2) the involvement of at least one extrapulmonary organ. Patients with Löfgren syndrome, defined by the association of fever, erythema nodosum, arthralgia and bilateral hilar lymphadenopathy with or without histological evidence, were also included. Demographic, clinical, biological and imaging data were extracted from medical records. Organ involvement was recorded according to the WASOG organ assessment instrument and was classified as present if it was observed at any time in the sarcoidosis history [13]. We studied the organ involvement listed in the extrapulmonary Physician Organ Severity Tool (ePOST) score and pulmonary involvement [14].
The analysis also considered the sociodemographic characteristics of the patients (age, self-reported geographical origin and occupation at the time of diagnosis, classified according to the Institut National de la Statistique et des Etudes Economiques (INSEE) tool (www.insee.fr/fr/information/2497958) and the International Standard Classification of Occupations (ISCO) 2008), comorbidities (thromboembolic disease, neoplasm, autoimmune disease, chronic infectious disease), treatments and outcomes. Occupations were categorised into six groups according to the INSEE classification and ISCO 2008: craft and trade-related workers (ISCO 2008 group 7); labour workers (ISCO 2008 groups 8 and 9); farmers and skilled agricultural workers (ISCO 2008 group 6); clerical support workers (ISCO 2008 group 4); intermediary occupations (ISCO 2008 groups 0, 3 and 5); and managers and higher intellectual professions (ISCO 2008 groups 1 and 2).
Statistical analysis
Variables are presented as percentages (categorical variables) or mean±sd (continuous variables), as appropriate. The phenotypes of sarcoidosis were defined using hierarchical cluster analysis. For agglomerative hierarchical clustering, preferential variable associations were obtained after including the organ involvement variables listed in the ePOST score, in addition to the following variables referring to intrathoracic involvement: mediastinal lymph nodes, parenchymal involvement and lung fibrosis. Clustering was represented in a dendrogram (“pvclust” package in R version 3.3; R Project for Statistical Computing, Vienna, Austria) generated by Ward's method. Principal component analysis was applied to select, according to statistical significance, the final clinical variables to include in the model used to identify phenotypes. The population was then separated into clusters (“FactoMineR” package in R version 3.3) [15]. Unsupervised hierarchical cluster analysis was performed using the Euclidean distance to create the distance matrix and Ward's minimum variance method to merge clusters. Differences between sarcoidosis phenotypes were tested using the Chi-squared test or Fisher's exact test when necessary (categorical variables) and a t-test, ANOVA, Wilcoxon test or Kruskal–Wallis test when appropriate (continuous variables). In addition, multinomial logistic regression was used to confirm the relationships of the different clusters to the set of independent variables. The multinomial logistic regression models were adjusted for sex, geographical origin and socioprofessional activities. All tests were two-sided, with a type I error rate of 5%, and 95% confidence intervals are reported. The analyses were performed in R software, version 3.3. This study was approved by the Commission Nationale Informatique et Libertés, in accordance with French legislation.
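To make the clustering step concrete, the following is a minimal Python sketch of agglomerative clustering with Ward's minimum variance criterion on Euclidean geometry. It is not the study's code: the study used R packages, whereas this illustration uses plain Python, and the organ list, the eight binary patient vectors and the function names (`centroid`, `ward_cost`, `ward_clusters`) are all hypothetical.

```python
from itertools import combinations

def centroid(rows):
    """Mean vector of a list of equal-length numeric rows."""
    return [sum(col) / len(rows) for col in zip(*rows)]

def ward_cost(rows_a, rows_b):
    """Increase in within-cluster sum of squares if the two clusters merge."""
    ca, cb = centroid(rows_a), centroid(rows_b)
    sq_dist = sum((x - y) ** 2 for x, y in zip(ca, cb))  # squared Euclidean distance
    na, nb = len(rows_a), len(rows_b)
    return na * nb / (na + nb) * sq_dist

def ward_clusters(data, k):
    """Merge the cheapest pair of clusters until k remain; returns row indices."""
    clusters = [[i] for i in range(len(data))]
    while len(clusters) > k:
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda pair: ward_cost(
                [data[r] for r in clusters[pair[0]]],
                [data[r] for r in clusters[pair[1]]],
            ),
        )
        merged = clusters[i] + clusters[j]
        clusters = [c for idx, c in enumerate(clusters) if idx not in (i, j)]
        clusters.append(merged)
    return [sorted(c) for c in clusters]

# Hypothetical toy data: 1 = organ involved, 0 = not involved.
organs = ["erythema nodosum", "joint", "lupus pernio", "ENT", "liver", "spleen"]
patients = [
    [1, 1, 0, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [0, 1, 0, 0, 0, 0],
    [0, 0, 1, 1, 0, 0],
    [0, 0, 1, 0, 0, 0],
    [0, 0, 0, 0, 1, 1],
    [0, 0, 0, 0, 1, 1],
    [0, 0, 0, 0, 0, 1],
]

phenotypes = ward_clusters(patients, k=3)
print(phenotypes)
```

On this toy input the three groups recovered are patients 0 to 2, 3 to 4 and 5 to 7, that is, patients sharing similar organ involvement. The exhaustive pair search is adequate for a handful of rows only; a cohort of realistic size would call for an optimised library routine.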
Results
Demographic and clinical characteristics
During the study period, 2090 patients were screened. Among them, 853 patients were excluded (figure 1) because of the absence of histological documentation (n=321, 15%), the absence of extrapulmonary organ involvement (n=275, 13%), insufficient data in medical records (n=95, 5%), the absence of a sarcoidosis diagnosis (n=155, 7%) or redundant follow-up in two participating centres (n=7, 0.3%). The final study population included 1237 patients, 562 (45.4%) male and 675 (54.6%) female. The mean age at sarcoidosis diagnosis was 43.5±13 years. Most of the patients were European/Caucasian (n=541, 44%), followed by Afro-Caribbean (n=314, 25.5%). Detailed clinical characteristics of the population are described in table 1.
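The screening arithmetic in this paragraph can be re-derived from the reported counts. The short Python check below uses only counts stated in the text; the variable names are ours, and the choice of denominators (all screened patients for exclusions, all included patients for sex) is an assumption about how the published percentages were obtained.

```python
# Counts taken from the Results text; names are illustrative.
screened = 2090
excluded = {
    "no histological documentation": 321,
    "no extrapulmonary organ involvement": 275,
    "insufficient data in medical records": 95,
    "no sarcoidosis diagnosis": 155,
    "redundant follow-up in two centres": 7,
}

total_excluded = sum(excluded.values())
included = screened - total_excluded

# Share of each exclusion reason among all screened patients, in percent.
exclusion_pct = {
    reason: round(100 * n / screened, 1) for reason, n in excluded.items()
}

# Sex distribution among included patients, in percent.
men, women = 562, 675
sex_pct = {
    "men": round(100 * men / included, 1),
    "women": round(100 * women / included, 1),
}

print(total_excluded, included)
print(exclusion_pct)
print(sex_pct)
```

The exclusion counts sum to the 853 excluded patients and leave the 1237 included patients, and the male and female counts together account for the whole included population.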
Figure 1. Flow chart. Description of the EpiSarc study sample enrolled from the French hospitalisation database (Programme de Médicalisation des Systèmes d'Information (PMSI)) in the 11 participating centres, from January 2013 to December 2016.
Clinical presentation of patients in the EpiSarc study
Clinical characteristics of the patients are described in table 2. Organ involvement mainly consisted of hilar lymphadenopathy (n=1115, 90%), parenchymal lung involvement (n=792, 64%), pulmonary fibrosis (n=176, 14%), peripheral lymphadenopathy (n=518, 42%), skin involvement (n=396, 32%) and joint involvement (n=350, 28%). The mean number of organs involved was 3.4±1.5, and 520 patients (42%) had more than three organs involved.
Occupational activities, geographical origins and outcomes of patients in the EpiSarc study
Sarcoidosis phenotypes
Hierarchical cluster analysis identified five groups of patients, corresponding to five sarcoidosis phenotypes. The clinical characteristics of the population according to these five phenotypes are described in table 2. The five phenotypes were as follows: 1) a higher frequency of hilar lymph nodes (99%) and joint involvement (90%), a lower frequency of parenchymal lung involvement (39%) and the presence of erythema nodosum (67%) (n=180); 2) a higher frequency of neurological (31%), digestive (24%) and kidney (12%) involvement and a lower frequency of mediastino-pulmonary involvement (33%) (n=137); 3) a higher frequency of parenchymal lung involvement (72%), including pulmonary fibrosis (19%), cardiac involvement (19%) and cutaneous sarcoids (37%) (n=630); 4) an aggregation of all patients with lupus pernio (95%) and a high frequency of ear, nose and throat (ENT) (68%) and parenchymal lung involvement (88%), including pulmonary fibrosis (29%) (n=41); and 5) a high frequency of parenchymal lung involvement (82%), peripheral nodes (68%) and hepatic (68%), splenic (68%) and bone (29%) involvement (n=249). Parenchymal lung involvement, including pulmonary fibrosis, was present in clusters 3, 4 and 5. Cluster 3 associated neurological and abdominal involvement, cluster 4 aggregated cases of lupus pernio and ENT involvement, and cluster 5 comprised peripheral lymph nodes and hepatosplenic and bone involvement, in addition to parenchymal localisation. There was no difference in age at diagnosis regarding phenotype status. To confirm reproducibility and the stability of the results, the cluster analysis was performed twice, once after including patients from seven centres (corresponding to 1081 patients), then after including all 11 centres (1237 patients), leading to a concordance of 85% of the results, with the same clustering.
In addition to cluster separation, non-supervised preferential associations of organ involvement are shown in figure 2. Erythema nodosum was associated with joint involvement. Peripheral nodes were associated with bone and hepatosplenic involvement. Cardiac localisations were associated with parenchymal lung involvement and fibrosis. Lupus pernio was associated with ENT manifestations, uveitis and neurological involvement. These associations strengthened the results of the cluster distributions.
Figure 2. Preferential association of organ involvement in sarcoidosis in the EpiSarc study. a) Hierarchical cluster analysis. The bold horizontal line represents the section of the population allowing the smallest number of statistically valid clusters. b) Factor map. c) Dendrogram, which includes organ involvement listed in the extrapulmonary Physician Organ Severity Tool (ePOST) score and pulmonary involvement.
Treatments and outcomes
Treatments and outcomes according to the five phenotypes are described in table 2. The mean duration of follow-up was 8.1±8 years, and was longest in cluster 4. The most common treatment regimens were corticosteroids (n=907, 73%), methotrexate (n=388, 31%), hydroxychloroquine (n=283, 23%), azathioprine (n=161, 13%), tumor necrosis factor-α (TNF-α) antagonists (n=113, 9%) and cyclophosphamide (n=61, 5%). During follow-up, 23 patients died (2%).
Treatment regimens (i.e. corticosteroids, methotrexate, azathioprine, TNF-α antagonists and/or cyclophosphamide) were significantly different among the five phenotypes.
Phenotype 1 was treated less than other phenotypes with corticosteroids (n=95, 53% versus 77%, p<0.001), with methotrexate (n=34, 19% versus 34%, p<0.001), with azathioprine (n=13, 7% versus 14%, p=0.02), with TNF-α antagonists (n=7, 4% versus 10%, p=0.01) and with cyclophosphamide (n=3, 2% versus 5%, p=0.046). Phenotype 2 was treated more than other phenotypes with cyclophosphamide (n=15, 11% versus 4%, p=0.02) and treated less with azathioprine (n=10, 7% versus 14%, p=0.046). Phenotype 3 was treated more than other phenotypes with corticosteroids (n=493, 78% versus 68%, p<0.001). Phenotype 4 was treated more than other phenotypes with corticosteroids (n=36, 88% versus 73%, p=0.05), methotrexate (n=31, 76% versus 30%, p<0.001), azathioprine (n=14, 34% versus 12%, p<0.001) and TNF-α antagonists (n=17, 41% versus 8%, p<0.001). Phenotype 5 was treated less than the other clusters with TNF-α antagonists (n=12, 5% versus 10%, p=0.01). The number of treatments also differed between phenotypes (p<0.001) (figure 3). Patients with phenotype 1 received fewer treatments (1.1±1.3) than those with phenotype 2 (1.6±1.3), phenotype 3 (1.7±1.2) or phenotype 5 (1.6±1.2). The number of treatments was highest among patients with phenotype 4 (3.6±1.7). The proportion of deaths also differed significantly between clusters, with no deaths in phenotype 1; 0.5%–3% in phenotypes 2, 3 and 5; and 5% in phenotype 4 (p=0.01).
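As a worked illustration of one such comparison, the sketch below recomputes the corticosteroid contrast between phenotype 1 and all other patients with a Pearson chi-squared test on a 2×2 table. Two points are assumptions rather than statements from the text: that the "other phenotypes" counts are obtained by subtracting phenotype 1 from the cohort totals (907 of 1237 patients on corticosteroids), and that the chi-squared test without continuity correction, rather than Fisher's exact test, was the test applied here. The function name `chi2_2x2` is ours.

```python
import math

def chi2_2x2(a, b, c, d):
    """Pearson chi-squared statistic (no continuity correction) and p-value
    for the 2x2 table [[a, b], [c, d]], with 1 degree of freedom."""
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # With 1 degree of freedom, the upper tail probability is erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value

# Counts reported in the text.
total_patients, total_on_steroids = 1237, 907
p1_patients, p1_on_steroids = 180, 95

# Assumption: the comparison group is everyone outside phenotype 1.
other_patients = total_patients - p1_patients
other_on_steroids = total_on_steroids - p1_on_steroids

stat, p_value = chi2_2x2(
    p1_on_steroids, p1_patients - p1_on_steroids,
    other_on_steroids, other_patients - other_on_steroids,
)
pct_p1 = round(100 * p1_on_steroids / p1_patients)
pct_other = round(100 * other_on_steroids / other_patients)

print(pct_p1, pct_other, round(stat, 1), p_value < 0.001)
```

Under these assumptions the recomputed proportions match the reported 53% versus 77%, and the p-value falls below 0.001. The same subtraction can be applied to the other treatment comparisons.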
Figure 3. Treatments administered in the five phenotypes of sarcoidosis. Percentages of patients receiving corticosteroids, methotrexate, azathioprine, hydroxychloroquine, cyclophosphamide and tumor necrosis factor-α (TNF-α) antagonists in the five phenotypes of sarcoidosis.
Socioeconomic factors related to sarcoidosis phenotypes
The phenotypes varied according to socioeconomic characteristics (table 3). Females were represented differently according to the phenotypes (p<0.002), and the female to male ratio was higher in phenotypes 1 (66% women) and 4 (68% women) than in phenotypes 2 (56% women), 3 (51% women) and 5 (53% women). There was no difference in the age at diagnosis of sarcoidosis between phenotypes.
Factors related to clusters of sarcoidosis of patients in the EpiSarc study
The occupational distribution differed significantly between the five phenotypes (p=0.04). Labour workers (40.5%) were the most frequent occupational category in the whole cohort. The proportion of labour workers was higher for phenotype 3 (46%), phenotype 2 (44%) and phenotype 5 (35%).
The distribution of geographical origin differed significantly between phenotypes (p=0.01). Patients of European/Caucasian origin (44%) were the most represented geographical category in the study population. The proportion of patients of European/Caucasian origin was higher for phenotype 1 (51%) and phenotype 2 (62.5%) than for phenotypes 3 (42%), 5 (37.5%) or 4 (20%).
The adjusted multinomial analysis confirmed the results of the univariate analysis. The female proportion was significantly higher for phenotype 1 than for the other phenotypes (OR 0.6, 95% CI 0.4–0.9, p=0.008); the labour worker proportion was significantly lower for phenotypes 1 (OR 0.6, 95% CI 0.4–0.7, p=0.03) and 5 (OR 0.6, 95% CI 0.4–0.9, p=0.01) than for the other phenotypes. The proportion of patients of European/Caucasian origin was significantly higher for phenotype 1 (OR 1.7, 95% CI 1.1–2.5, p=0.01) and phenotype 2 (OR 2.2, 95% CI 1.4–3.4, p<0.001) than for the other phenotypes.
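For readers less familiar with the model, odds ratios of this kind are usually read through the standard multinomial logit formulation sketched below. The notation is ours: $Y$ is the phenotype, $r$ a reference category, $\mathbf{x}$ the covariates (sex, geographical origin, socioprofessional activity) and $\beta_{kj}$ the coefficient of covariate $j$ for phenotype $k$. How the comparison group was coded and whether Wald intervals were used are not stated in the text, so both are assumptions here.

```latex
\[
\log\frac{P(Y = k \mid \mathbf{x})}{P(Y = r \mid \mathbf{x})}
  = \beta_{k0} + \sum_{j} \beta_{kj}\, x_{j},
  \qquad k \neq r,
\]
\[
\mathrm{OR}_{kj} = \exp\bigl(\hat\beta_{kj}\bigr),
\qquad
95\%\ \mathrm{CI} = \exp\bigl(\hat\beta_{kj} \pm 1.96\,\mathrm{SE}(\hat\beta_{kj})\bigr).
\]
```

An odds ratio above 1 indicates that the characteristic is more frequent in phenotype $k$ than in the comparison group after adjustment, and an odds ratio below 1 that it is less frequent.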
Discussion
The EpiSarc study identified five phenotypes of sarcoidosis in a large French multicentre and multi-ethnic cohort of 1237 well-characterised patients. These phenotypes may be useful for better management of the disease [11]. The five phenotypes significantly differed by sex, geographical origin and occupational categories. The five phenotypes were 1) erythema nodosum, joint involvement and hilar lymph nodes, mainly European/Caucasian female; 2) neurological, digestive and/or kidney involvement; 3) parenchymal lung involvement and fibrosis, cardiac and skin involvement, mainly in non-European/Caucasian patients; 4) lupus pernio and severe involvement; and 5) parenchymal pulmonary involvement, peripheral nodes, and hepatic, splenic and bone involvement, mainly in non-European/Caucasian patients. In addition to clustering phenotypes, we identified a preferential association of organ involvements, characterised by the association of erythema nodosum with joint involvement; hepatic, splenic, bone and peripheral node involvement; pulmonary with cardiac involvements; kidney with digestive involvement; lupus pernio with ENT involvement; and uveitis with neurological involvement.
In the GenPhenReSa study, which included 2163 patients, the phenotypes differed slightly from those of the EpiSarc study. They included 1) abdominal organ involvement, 2) ocular-cardiac-cutaneous-central nervous system (CNS) disease involvement, 3) musculoskeletal-cutaneous involvement, 4) pulmonary and intrathoracic lymph node involvement and 5) extrapulmonary involvement. However, the GenPhenReSa study only included Caucasian patients. The frequency of organ involvement also differed from that in the GenPhenReSa study: we observed more skin (32% versus 16%), eye (24% versus 8%), CNS (22% versus 3%), cardiac (10% versus 3%) and gastrointestinal involvement (3% versus 0.6%). Several explanations may be proposed. First, our cohort is multi-ethnic, and African American patients usually have more organ involvement [7]. Moreover, our patients were mostly enrolled from internal medicine centres, whereas the GenPhenReSa study participants were mostly enrolled from pulmonology centres. Our French centres are highly specialised, and it is likely that we recruited patients with more severe disease, particularly those with cardiac, eye or CNS involvement. In addition, we only recruited hospitalised patients. Finally, we excluded patients with isolated intrathoracic sarcoidosis from our cohort, in order to highlight the extrapulmonary localisations in the clustering analysis. Thus, we think that our results are complementary to those of the GenPhenReSa study. In a Spanish study of 1230 patients, 91.4% of whom were white, those with pulmonary involvement had a lower frequency of skin and salivary gland involvement, and a higher frequency of liver involvement [16]. That study used multinomial logistic regression analyses to test the associations, whereas we used unsupervised hierarchical clustering. Our approach was therefore better suited to detecting phenotypes than that of the Spanish study, which examined associations.
In another study of 195 patients from pulmonology centres, the authors used 18F-FDG PET to identify four phenotypes based on not only organ involvement, but also disease activity [10]. However, the phenotypes were not compared for severity and socioeconomic factors. The impact of this study is also limited because not all patients with sarcoidosis need 18F-FDG PET, which is an expensive examination with radiation exposure.
In the EpiSarc study, unsupervised hierarchical clustering allowed the aggregation of organ involvement corresponding to Löfgren syndrome [17]. We also demonstrated female and European/Caucasian predominance among these patients. Löfgren syndrome was first described in 1946 and is recognised as a distinct phenotype of sarcoidosis. Patients typically have acute disease onset together with bilateral hilar lymphadenopathy, erythema nodosum and/or bilateral ankle arthritis or arthralgia. Patients with Löfgren syndrome typically have favourable outcomes, especially HLA-DRB1*03+ individuals [18]. We found that patients with phenotype 1, which corresponds to Löfgren syndrome, received less treatment, with a mean number of treatments of 1.1±1.3 and lower rates of use for each drug. This phenotype was also associated with better outcomes, with no deaths. Specific features of extrapulmonary sarcoidosis were recently reported in an American cohort [9]. In that multicentre study, the skin was the most frequently involved extrathoracic organ in patients with or without lung involvement. Our study focused on hospitalised patients, which could explain why organ involvement frequencies differ from those of other cohorts.
As previously reported, in the EpiSarc study, lupus pernio was strongly associated with a severe phenotype [19, 20]. With this phenotype, TNF-α antagonists were more frequently used (41%), and patients received more treatments than with other phenotypes. Lupus pernio was associated with pulmonary fibrosis and ENT, CNS, bone and joint involvement and was more frequently seen in Afro-Caribbean women.
This study has several strengths. The patients were recruited from 11 centres highly specialised in sarcoidosis. The cohort was well characterised: in addition to the verification of inclusion criteria, organ involvement was assessed from individual medical records using the WASOG instrument, which supports the validity of the recorded organ involvement. The main limitations of the study were the retrospective design and a possible selection bias towards patients with particularly severe, complicated, multi-organ disease, because we selected only hospitalised patients. We had only one cohort, so we could not obtain external validation. Additionally, the analysis did not take into account the time between disease onset and the appearance of organ involvement. Both newly diagnosed cases and cases with chronic disease were included in the cohort: in incident cases, additional organ involvement could have occurred later in the disease course and accumulated, even though organ involvement in sarcoidosis is most often present at diagnosis. Our clustering was based only on organ involvement, not on disease activity. This study does not provide a complete understanding of why sarcoidosis phenotypes exist: genetic, environmental, epigenetic and immunologic factors are increasingly being evaluated as possible mechanisms for sarcoidosis aetiology, and they are probably also major determinants of disease heterogeneity [21]. This should be evaluated in further studies.
In conclusion, our multicentre study highlights and confirms the existence of distinct phenotypes of sarcoidosis, with a non-random distribution of organ involvement. These phenotypes are associated with sex, geographical origin and socioprofessional category in a multi-ethnic cohort. These phenotypes should be used to understand sarcoidosis not as a single disease but as various syndromes. The genetic and environmental determinants of such phenotypes have to be elucidated in future studies.
Footnotes
Author contributions: I. Annesi-Maesano, H. Nunes, D. Valeyre, Z. Amoura and F. Cohen Aubart designed the study. R. Lhote, R. Borie, K. Sacré, N. Schleinitz, D. Launay, H. Devilliers, P. Bonniaud, M. Hamidou, M. Mahevas, F. Lhote, J. Haroche and F. Cohen Aubart collected the data. R. Lhote, I. Annesi-Maesano and F. Cohen Aubart conducted the statistical analysis. R. Lhote, I. Annesi-Maesano, H. Nunes, D. Valeyre and F. Cohen Aubart analysed and interpreted the data. R. Lhote, I. Annesi-Maesano and F. Cohen Aubart wrote the manuscript. All authors critically reviewed and approved the final version of the manuscript.
Conflict of interest: R. Lhote has nothing to disclose.
Conflict of interest: I. Annesi-Maesano has nothing to disclose.
Conflict of interest: H. Nunes has nothing to disclose.
Conflict of interest: D. Launay has nothing to disclose.
Conflict of interest: R. Borie has nothing to disclose.
Conflict of interest: K. Sacré has nothing to disclose.
Conflict of interest: N. Schleinitz has nothing to disclose.
Conflict of interest: M. Hamidou has nothing to disclose.
Conflict of interest: M. Mahevas has nothing to disclose.
Conflict of interest: H. Devilliers has nothing to disclose.
Conflict of interest: P. Bonniaud has nothing to disclose.
Conflict of interest: F. Lhote has nothing to disclose.
Conflict of interest: J. Haroche has nothing to disclose.
Conflict of interest: P. Rufat has nothing to disclose.
Conflict of interest: Z. Amoura has nothing to disclose.
Conflict of interest: D. Valeyre has nothing to disclose.
Conflict of interest: F. Cohen Aubart has a patent pending for IL-6 antagonists as a treatment of sarcoidosis.
Support statement: R. Lhote was supported by Master and PhD grants of the Open Health Institute, the Fondation pour la Recherche Médicale (FRM) and Gérard Trouillet S.A. Funding information for this article has been deposited with the Crossref Funder Registry.
- Received April 13, 2020.
- Accepted August 10, 2020.
- Copyright ©ERS 2021