Longitudinal analysis of sarcoidosis blood transcriptomic signatures and disease outcomes
- Robert Su1,
- Michael M. Li2,
- Nirav R. Bhakta2,
- Owen D. Solberg2,
- Eli P.B. Darnell2,
- Joris Ramstein2,
- Suresh Garudadri2,
- Melissa Ho2,
- Prescott G. Woodruff2,3 and
- Laura L. Koth2
- 1Division of Rheumatology, University of California, San Francisco, CA, USA
- 2Division of Pulmonary, Critical Care, Sleep and Allergy, Dept of Medicine, University of California, San Francisco, CA, USA
- 3Cardiovascular Research Institute, University of California, San Francisco, CA, USA
- Laura L. Koth, Division of Pulmonary, Critical Care, Sleep and Allergy, Dept of Medicine, University of California, San Francisco, Box 0111, 505 Parnassus Avenue, San Francisco, CA 94143, USA. E-mail: laura.koth@ucsf.edu
Abstract
Previously, we demonstrated concordance in differentially expressed genes in sarcoidosis blood and lung, implicating shared dysfunction of specific immune pathways. In the present study, we hypothesised that expression levels of candidate genes in sarcoidosis blood could predict and track with disease outcomes longitudinally.
We applied Ingenuity Pathway Analysis to a cross-sectional derivation microarray dataset (n=38) to identify canonical pathways and candidate genes associated with sarcoidosis. In a separate longitudinal sarcoidosis cohort (n=103), we serially measured 48 candidate gene transcripts, and assessed their relation to disease chronicity and severity.
In the cross-sectional derivation study, pathway analysis showed upregulation of genes related to interferon signalling and the role of pattern recognition receptors, and downregulation of T-cell receptor (TCR) signalling pathways in sarcoidosis. In the longitudinal cohort, factor analysis confirmed coregulation of genes marking these pathways and identified CXCL9 as an additional candidate pathway. CXCL9 and TCR factors discriminated between chronic versus nonprogressive disease, and CXCL9 predicted disease outcomes longitudinally. Interferon factor was similarly increased in both disease phenotypes. Factors associated with lung function decline included decreased TCR factor and increased CXCL9.
These findings demonstrate blood transcriptomic signatures reflecting TCR signalling and CXCL9 predict sarcoidosis chronicity and correlate with disease severity longitudinally.
Blood gene transcript measurements predict sarcoidosis chronicity and severity longitudinally http://ow.ly/zFGnP
Introduction
Sarcoidosis is a heterogeneous inflammatory disease characterised by noncaseating granulomas that can affect multiple organ systems, most commonly the lung. Although many affected individuals have a self-limited course, approximately one-third of patients develop chronic, progressive disease of the lungs that can result in fibrosis. The pathogenesis of sarcoidosis remains to be elucidated although inflammatory cytokines have been implicated, including interferon (IFN)-γ and tumour necrosis factor (TNF), which are strongly associated with localised granulomatous inflammation [1–3]. When treatment is indicated, corticosteroids remain the mainstay of therapy; however, TNF inhibitors may have benefit in refractory disease [4–6].
The development of noninvasive markers that predict clinical course or correlate with sarcoidosis disease activity remains an important unmet clinical need. Previously, in a cross-sectional study [7], we applied a genome-wide approach and identified whole-blood gene expression markers of sarcoidosis that implicated dysfunction of a set of immune pathways, including IFN signalling. Using available gene microarray expression datasets, we showed concordance of these sarcoidosis-associated gene expression markers in both blood and lung. However, our prior study, and most other studies of candidate gene expression markers in sarcoidosis, lacked longitudinal serial measurements and prospective determination of disease course [8].
Based on our prior cross-sectional study [7], we hypothesised that expression levels of specific sets of genes in sarcoidosis blood would track with disease outcomes longitudinally. To test this hypothesis, we performed serial blood quantitative (q)PCR measurements of candidate gene transcripts in a large longitudinal sarcoidosis cohort, and evaluated their relation to disease chronicity and pulmonary function.
Material and methods
Study subjects
Sarcoidosis subjects and healthy controls were recruited from the community at large, either self-referred via online recruitment or referred by the University of California San Francisco (UCSF) (San Francisco, CA, USA) and community providers, to participate in the UCSF Sarcoidosis Research Program. All subjects provided informed consent to participate in this study and all sarcoidosis subjects fulfilled diagnostic criteria based upon American Thoracic Society guidelines [9]. Retrospective medical records for sarcoidosis subjects were collected at enrolment. Blood samples and clinical data (e.g. changes in medications and clinical notes from the subject’s healthcare provider) were then collected prospectively at the baseline study visit and at follow-up intervals of 6 months. Among sarcoidosis subjects, pulmonary function testing (PFT) results were used if performed within 6 months of study visits; otherwise, spirometry was performed at each visit (HDpft 1000; nSpire, Longmont, CO, USA) [10].
Sarcoidosis subjects were classified as nonprogressive, chronic or uncertain based on the schema depicted in figure 1. In prior studies, the course of sarcoidosis disease has been characterised as chronic if disease activity persisted beyond 2 years [11–16]. As such, the 2-year threshold was incorporated into our classification schema. Use of this classification schema required the use of both retrospective and prospective clinical data to categorise patients. Nonprogressive subjects had stable PFT and no history of flare beyond 2 years of diagnosis. In contrast, chronic patients had either progressive decline in PFT during the study period or a history of disease flare beyond 2 years of diagnosis. Progressive declines in PFT were defined as a decrease in forced vital capacity (FVC) ≥10% predicted or diffusing capacity of the lung for carbon monoxide (DLCO) ≥15% predicted during the study period as per published thresholds [17]. A flare was defined as a recurrence of extrapulmonary disease or pulmonary symptoms severe enough to warrant escalation of treatment to ≥20 mg of corticosteroids daily [18] or initiation of a TNF inhibitor.
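The classification rules described above can be sketched in code. This is a minimal illustration only: figure 1 is not reproduced here, so the `SubjectHistory` fields, the function name `classify` and, in particular, the rule that assigns "uncertain" to subjects with less than 2 years of observation since diagnosis are assumptions made for the example rather than the study's actual schema.

```python
from dataclasses import dataclass

# Thresholds taken from the text (points of % predicted lost during the study).
FVC_DECLINE_THRESHOLD = 10.0
DLCO_DECLINE_THRESHOLD = 15.0
CHRONIC_YEARS = 2.0


@dataclass
class SubjectHistory:
    """Hypothetical per-subject summary; field names are illustrative."""
    fvc_decline: float            # drop in FVC % predicted during the study
    dlco_decline: float           # drop in DLCO % predicted during the study
    flare_beyond_2y: bool         # flare occurring >2 years after diagnosis
    years_since_diagnosis: float  # at last available follow-up


def classify(subject: SubjectHistory) -> str:
    """Return 'chronic', 'nonprogressive' or 'uncertain'."""
    progressive_pft = (
        subject.fvc_decline >= FVC_DECLINE_THRESHOLD
        or subject.dlco_decline >= DLCO_DECLINE_THRESHOLD
    )
    if progressive_pft or subject.flare_beyond_2y:
        return "chronic"
    # Assumption: stable subjects need >2 years of observation since
    # diagnosis to be called nonprogressive; otherwise they stay uncertain.
    if subject.years_since_diagnosis > CHRONIC_YEARS:
        return "nonprogressive"
    return "uncertain"
```

For example, a subject with a 12-point fall in FVC % predicted would be labelled chronic under this sketch regardless of flare history.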
Selection of genes for longitudinal study
Using Ingenuity canonical pathway analyses (Ingenuity Systems, Redwood City, CA, USA), we identified differentially expressed genes belonging to canonical pathways within a blood gene expression microarray dataset from our previously published study of 38 sarcoidosis subjects (Gene Expression Omnibus (GEO) accession number GSE19314) [7]. Based on the results of the pathway analysis and, in part, on manual curation of genes implicated in sarcoidosis or granulomatous inflammation, 48 genes of interest were selected for Fluidigm qPCR analysis (Fluidigm, San Francisco, CA, USA) (table 1). Primer and probe sequences are listed in online supplementary table E1.
RNA processing and qPCR
Blood samples were collected with PAXgene Blood RNA tubes (PreAnalytiX, Hombrechtikon, Switzerland) and RNA extracted with the automated instrument Qiacube (Qiagen, Valencia, CA, USA). RNA concentration and RNA integrity number were calculated with Agilent RNA 6000 Nano Kit and Agilent 2100 Bioanalyzer (Agilent Technologies, Waldbronn, Germany). High-throughput qPCR was performed using Fluidigm 96.96 Dynamic Arrays IFC with each gene run in duplicate. Sarcoidosis expression values are expressed as log2 cycle threshold (Ct) values relative to mean healthy control Ct values. See the online supplementary material for details of expression data processing and normalisation.
Statistical analysis
Within Ingenuity, Fisher’s exact test was used to determine the probability that the association between the genes in the sarcoidosis derivation cross-sectional microarray data set and the canonical pathway was explained by chance alone. Other statistical analyses were performed using Stata version 12 (StataCorp, College Station, TX, USA). Exploratory factor analysis was performed on sarcoidosis qPCR results. Exploratory factor analysis identifies clusters (i.e. factors) of intercorrelated variables (i.e. gene transcripts) without making any prior assumption on the potential relationships between measured variables.
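To make the pathway enrichment test concrete, the following sketch computes a right-tailed Fisher's exact p-value for the overlap between a list of differentially expressed genes and a pathway. The details of Ingenuity's implementation (choice of background gene set, one- versus two-sided testing) are not given in the text, so the right-tailed form and the function name `enrichment_p_value` are assumptions for illustration.

```python
from math import comb


def enrichment_p_value(overlap: int, pathway_size: int,
                       de_genes: int, background: int) -> float:
    """Right-tailed Fisher's exact test (hypergeometric tail).

    Probability of seeing at least `overlap` pathway genes among
    `de_genes` differentially expressed genes drawn from a background
    of `background` genes, of which `pathway_size` are in the pathway.
    """
    upper = min(pathway_size, de_genes)
    total = comb(background, de_genes)
    tail = sum(
        comb(pathway_size, i) * comb(background - pathway_size, de_genes - i)
        for i in range(overlap, upper + 1)
    )
    return tail / total


# Toy numbers (not study data): 3 of 4 selected genes fall in a
# 3-gene pathway drawn from a 10-gene background.
p = enrichment_p_value(overlap=3, pathway_size=3, de_genes=4, background=10)
```

A small p-value indicates that the observed overlap would be unlikely if the differentially expressed genes were a random draw from the background.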
Unpaired two-tailed t-tests were used for comparisons of continuous data and Fisher’s exact test for categorical data. Logistic regression modelling was used to generate predicted probabilities of chronic versus nonprogressive disease. Receiver operating characteristic (ROC) curve analyses of the predicted probabilities and a time-to-event analysis were performed to assess the predictive utility of the factors. Mixed-effects modelling was performed with the longitudinal qPCR data using unstructured covariance for the random effects (i.e. subject identification number and visit number or time from diagnosis). Age, sex, lymphocyte count, time from diagnosis and visit numbers were included in mixed-effects models. Models also controlled for immunosuppression use when feasible (i.e. when not collinear with other covariates). p-values less than 0.05 were considered significant.
Results
Patient characteristics
Sex, race and ethnicity did not differ between the sarcoidosis and healthy groups (table 2). The mean age of the sarcoidosis group was older than that of the healthy group and, therefore, all analyses presented in this study were adjusted for age.
At baseline, there were no differences in age, sex, race or Scadding stage in the nonprogressive, as compared to the chronic, sarcoidosis subjects (table 3). A higher proportion of chronic sarcoidosis subjects were on systemic immunosuppression, and had significantly lower FVC % predicted, forced expiratory volume in 1 s (FEV1) % predicted, and DLCO % predicted compared with nonprogressive subjects.
Pathway analysis of sarcoidosis blood gene expression implicates dysregulation of canonical immune pathways
We used a publicly available microarray dataset that we previously generated using blood samples from an unrelated cohort of 38 sarcoidosis subjects (GEO accession number GSE19314) [7] to perform Ingenuity Pathway Analysis. This identified upregulation of genes related to IFN signalling pathways (p=0.0024) (fig. E1) and the role of pattern recognition receptors in recognition of bacteria and viruses (p=0.0006) (fig. E2), and downregulation of genes related to the T-cell receptor (TCR) signalling pathway (p=0.019) (fig. E3) in sarcoidosis blood, compared with healthy controls. Primers were selected for genes contained within these canonical pathways and, additionally, by manual curation as mentioned earlier (table 1).
Factor analysis demonstrates distinct clusters of intercorrelating genes
Factor analysis using the qPCR data derived from the longitudinal sarcoidosis cohort consisting of 103 subjects yielded eight factors, of which three (factors 1, 2, and 3; supplemental table E2) corresponded to the same three canonical biological pathways identified by Ingenuity analysis as described earlier. Factor 1 consisted of several highly intercorrelated genes associated with the IFN signalling pathway, including STAT1, STAT2, IRF1, IRF7 and IRF9. Factor 1 also included interferon-γ-inducible genes such as TAP1 and GBP1. Factor 2 genes were associated with the role of pattern recognition receptors, including TLR2, TLR4 and TLR8. Factor 3 consisted of genes involved in TCR signalling: CD28, CTLA4, ITK and LEF1. Although LEF1 was not a member of the Ingenuity-defined canonical TCR signalling pathway, LEF1 is an integral mediator of TCR signalling as it encodes a protein that binds the TCR-α enhancer site and confers maximal TCR-α enhancer activity. Unexpectedly, the IFN-inducible chemokine CXCL9 gene (factor 7) was not intercorrelated with the other IFN-related genes in factor 1. We had particular interest in CXCL9 (factor 7), as CXCL9 is upregulated in gene network analyses of sarcoidosis organ tissues [19, 20], and we and others have previously shown that serum protein levels of this chemokine are upregulated in sarcoidosis [15, 21]. In summary, factor analysis of the qPCR results provided confirmation of the IFN, pattern recognition and TCR pathways in a separate replicate cohort, and additionally demonstrated a separate factor for CXCL9.
To facilitate data reduction for factors weighted by more than one gene, a three-gene mean was calculated and applied to each subject sample to represent the mean expression values of positively intercorrelating genes within factors 1, 2 and 3. This three-gene approach has been previously applied in our genomic studies of respiratory diseases such as asthma [22–24]. Within these factors, we chose the three genes with the greatest factor loadings (i.e. higher loading denotes greater contribution by that gene to the factor). We calculated a three-gene mean, herein called the “IFN factor” score, which represented the mean of standardised (i.e. mean-centered and scaled) expression values for STAT1, STAT2 and GBP1. Similarly, for each sample, a “pattern recognition factor” score was calculated using the mean of standardised expression values for TLR2, TLR4 and TLR8. The mean of CD28, ITK and LEF1 standardised expression values was taken together as the “TCR signalling factor” score.
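The three-gene mean described above amounts to standardising each gene across samples and then averaging the three standardised values within each sample. A minimal sketch follows; the helper names and the toy expression values are hypothetical, and the choice to standardise with the sample standard deviation across all supplied samples is an assumption, as the text does not specify the reference set.

```python
from statistics import mean, stdev


def standardise(values):
    """Mean-centre and scale one gene's expression values across samples."""
    m = mean(values)
    s = stdev(values)  # assumption: sample standard deviation
    return [(v - m) / s for v in values]


def factor_scores(expression, genes):
    """Per-sample mean of standardised expression over the given genes.

    `expression` maps gene symbol -> list of values, one per sample,
    with samples in the same order for every gene.
    """
    z = {g: standardise(expression[g]) for g in genes}
    n_samples = len(expression[genes[0]])
    return [mean(z[g][i] for g in genes) for i in range(n_samples)]


# Toy log2 relative expression values for three samples (not study data).
toy = {
    "STAT1": [1.0, 2.0, 3.0],
    "STAT2": [2.0, 4.0, 6.0],
    "GBP1": [10.0, 20.0, 30.0],
}
ifn_factor = factor_scores(toy, ["STAT1", "STAT2", "GBP1"])
```

Because each gene is standardised before averaging, genes with large raw variance do not dominate the score.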
Factors distinguish chronic from nonprogressive disease
Using blood collected at enrolment, logistic regression modelling of IFN, TCR and CXCL9 factors identified CXCL9 (p<0.001) and TCR (p=0.011) as significant predictors of disease course (i.e. chronic versus nonprogressive). We generated predicted probabilities of chronic disease from these logistic regression models. Lymphopenia (measured as 1 – lymphocyte count) was used for comparison as a reference predictor of chronic disease. Comparisons of the area under the ROC curve (AUC) demonstrated that the predicted probabilities based on gene factors at enrolment significantly outperformed lymphopenia as a predictor of chronic disease (AUC 0.80 versus 0.59, respectively; p=0.001) (fig. 2). The performance of CXCL9, TCR and IFN factors as independent measures of predicting chronic disease is shown in figure E4, and confirms that IFN, in itself, is a poor predictor of chronic disease.
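The AUC used to compare predictors has a simple interpretation: it is the probability that a randomly chosen chronic subject receives a higher predicted probability than a randomly chosen nonprogressive subject, with ties counted as one half. The sketch below computes it by direct pairwise comparison; the function name and the toy probabilities are illustrative and are not study data, and the statistical comparison of two AUCs reported above is not reproduced.

```python
def roc_auc(scores_chronic, scores_nonprogressive):
    """AUC as the fraction of (chronic, nonprogressive) pairs ranked correctly.

    Ties contribute 0.5, which matches the area under the empirical
    ROC curve (Mann-Whitney formulation).
    """
    wins = 0.0
    for c in scores_chronic:
        for n in scores_nonprogressive:
            if c > n:
                wins += 1.0
            elif c == n:
                wins += 0.5
    return wins / (len(scores_chronic) * len(scores_nonprogressive))


# Toy predicted probabilities of chronic disease (not study data).
auc = roc_auc([0.9, 0.8, 0.4], [0.5, 0.3, 0.2])
```

An AUC of 0.5 corresponds to a predictor no better than chance, and 1.0 to perfect separation of the two groups.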
To determine whether these factors demonstrated good predictive value among subjects early in their disease course, we repeated these analyses using only sarcoidosis subjects who had enrolled in the study within 2 years of their diagnosis. Again, we found that CXCL9 and TCR factor-based predicted probabilities identified chronic disease among subjects with early disease (AUC 0.84) (fig. E5). In sensitivity analysis, we also examined the predictive value of CXCL9 and TCR factors in sarcoidosis subjects not on any immunosuppression at enrolment, and found that the results were unchanged (AUC 0.84) (fig. E6).
High CXCL9 factor predicts disease course
All sarcoidosis subjects were classified as CXCL9 factor high or low at enrolment. The CXCL9 factor threshold was set at 0.4 (log2 relative expression value) based on ROC curve analysis of CXCL9 described above. We performed a time-to-event analysis, wherein an event was defined as: 1) a decrease in FVC ≥10% predicted or DLCO ≥15% predicted during the follow-up period; or 2) a sarcoidosis flare where treatment is escalated to ≥20 mg of corticosteroids daily [18] or there is an initiation of a disease modifying antirheumatic agent or TNF inhibitor after enrolment. As shown in figure 3, CXCL9-high sarcoidosis subjects developed a significant decline in pulmonary function testing or flare during the follow-up period more rapidly than CXCL9-low sarcoidosis subjects (p=0.009 by log-rank test).
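The time-to-event comparison can be illustrated with a Kaplan-Meier estimate of event-free probability within each CXCL9 group. This is a sketch under stated assumptions: the text does not say whether a score exactly equal to 0.4 counts as high (here it does), the function names are hypothetical, the follow-up times are toy values, and the log-rank test itself is not implemented.

```python
CXCL9_THRESHOLD = 0.4  # log2 relative expression value, from the text


def cxcl9_group(score, threshold=CXCL9_THRESHOLD):
    """Assign 'high' or 'low'; treating score == threshold as high is an assumption."""
    return "high" if score >= threshold else "low"


def kaplan_meier(times, events):
    """Return [(time, event-free probability)] at each observed event time.

    `events[i]` is truthy if subject i had an event (PFT decline or flare)
    at `times[i]`, and falsy if the subject was censored at that time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Group all subjects sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            n_events += 1 if data[i][1] else 0
            n_leaving += 1
            i += 1
        if n_events:
            survival *= 1 - n_events / at_risk
            curve.append((t, survival))
        at_risk -= n_leaving
    return curve


# Toy follow-up in months for one group (not study data).
curve = kaplan_meier([6, 12, 12, 18], [1, 0, 1, 1])
```

Computing one such curve per group and comparing them with a log-rank test corresponds to the analysis summarised in figure 3.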
Comparisons of IFN, TCR and CXCL9 factors longitudinally
Using mixed-effects modelling of longitudinal qPCR data from healthy and sarcoidosis subjects, both nonprogressive (p=0.001) and chronic sarcoidosis groups (p<0.001) demonstrated a higher IFN factor score compared with healthy subjects (fig. 4a). There was no significant difference in IFN factor scores between chronic and nonprogressive sarcoidosis (p=0.167). Both nonprogressive and chronic sarcoidosis groups showed lower TCR factor scores than healthy controls (p<0.001) (fig. 4b). Additionally, TCR factor was significantly lower in chronic sarcoidosis compared with nonprogressive sarcoidosis (p=0.005). CXCL9 was persistently and significantly higher in chronic compared with nonprogressive sarcoidosis and healthy subjects (p<0.001) (fig. 4c). There was no difference in CXCL9 between healthy and nonprogressive sarcoidosis subjects (p=0.826).
In subset analysis, subjects who met criteria for chronic disease based on declining PFT did not differ from those who met criteria for chronic disease based on flares with regard to IFN (p=0.31), TCR (p=0.11) and CXCL9 (p=0.95) factor scores longitudinally. This subset analysis suggests that our phenotypic group definitions yielded groups with relatively consistent patterns of gene expression.
TCR and CXCL9 factor scores correlate with pulmonary disease severity
In models that incorporate all time points, lower DLCO % predicted was associated with decreased TCR factor (p=0.011, coefficient 6.54), increased CXCL9 (p=0.036, coefficient −2.90) and longer disease duration (p=0.006, coefficient −0.002). Lower FVC % predicted was also associated with lower TCR factor (p=0.016, coefficient 2.49). Furthermore, higher Scadding stage was associated with lower TCR factor (p=0.011) by ordinal logistic regression.
Effect of immunosuppression on factor scores
In mixed-effects modelling incorporating data from all sarcoidosis subjects, immunosuppression had no significant effect on IFN (p=0.12) or CXCL9 (p=0.13) factors. In secondary subset analyses, immunosuppression use was associated with lower IFN factor score among patients with nonprogressive disease (p=0.04). However, in the chronic disease group, immunosuppression use had no significant effect on IFN factor score (p=0.23). These subset analyses of the effects of immunosuppression are limited by multiple hypothesis testing. Pattern recognition factor scores were strongly positively correlated with immunosuppression use in both nonprogressive and chronic groups (p<0.001 for both groups). Because we could not distinguish, with the available data, whether pattern recognition factor scores were related to disease activity or to immunosuppression use [25], we could not pursue further analysis of the pattern recognition factor at this time.
Discussion
Previously, in a cross-sectional study of 38 subjects with sarcoidosis, we identified gene expression markers of specific inflammatory pathways that were associated with sarcoidosis, and were concordant in blood and affected lung tissue biopsies. In the current study of a separate and larger, longitudinal cohort, we demonstrate the robustness of these gene expression markers and show that they predict disease severity and course. Whereas the majority of published sarcoidosis studies are cross-sectional, our study design allowed us to investigate longitudinal measurements of biological signals over time in relation to clinical course. We identified differential expression patterns relating to three distinct gene signatures, of which TCR and CXCL9 corresponded with disease course and severity. The results support our hypothesis that measurement of whole-blood gene transcripts in sarcoidosis may afford minimally invasive means of assessing sarcoidosis disease course and could be relevant to the underlying immunopathogenesis.
We demonstrated that peripheral-blood gene transcripts for factors TCR and CXCL9 have predictive value in discriminating between chronic and nonprogressive sarcoidosis subjects. Although there are currently no reliable means of predicting sarcoidosis disease course, some studies have suggested that lymphopenia may correlate with chronically active sarcoidosis [26–28]. In ROC curve analyses, a logistic-based prediction variable incorporating TCR and CXCL9 factors significantly outperformed lymphopenia in identifying chronic disease. This result was reproduced when the analysis was restricted to sarcoidosis subjects who enrolled early in their disease course or were not on immunosuppression. In our longitudinal cohort, sarcoidosis subjects high in CXCL9 factor at enrolment experienced a shorter time to developing a significant decline in PFT or a clinical flare during the follow-up period. Our findings suggest that increased CXCL9 factor may identify those at risk for chronic disease, and warrant further study in validation cohorts.
It is notable that our IFN signature was predominantly driven by STAT1 and STAT2 expression in sarcoidosis peripheral blood, as this finding supports prior studies showing that STAT1 may play an important role in disease pathogenesis [19, 20]. Using microarray gene network analyses and immunohistochemical staining of granulomas, Rosenbaum et al. [20] previously demonstrated that STAT1 signalling is strongly upregulated in peripheral blood (n=12), and lung and lymph node tissues. Crouser et al. [19] similarly confirmed an increase in STAT1-related networks in sarcoidosis lung biopsies. Our study lends evidence that augmentation of STAT1 signalling is not limited to the affected granulomatous tissue but, importantly, is robustly upregulated in circulating immune cells. The finding of shared transcriptional signature of IFN pathways in the blood and diseased tissues identifies molecular targets that are not completely “compartmentalised” in the diseased organ [7, 29]. Furthermore, whereas the aforementioned studies focused on subjects at a cross-sectional level, our longitudinal study demonstrates that even nonprogressing sarcoidosis subjects, whose disease course is often considered to be benign, may exhibit persistent elevations in IFN signalling.
Our finding that both nonprogressive and chronic progressive sarcoidosis subjects share upregulation of IFN factor confirms that IFN plays an important role in the aetiopathogenesis of sarcoidosis. However, our results suggest that additional biological signals aside from IFN contribute to disease severity and progression. Some studies have suggested higher TNF signalling predicts disease progression [30, 31].
Notably, we found that peripheral CXCL9 transcript levels, which are known to be synergistically enhanced by TNF [32], predicted disease progression in terms of pulmonary decline and relapse. CXCL9 is an IFN-γ-inducible chemokine that, upon binding to its receptor CXCR3, facilitates recruitment of T-cells to inflamed tissues in human autoimmune diseases. We and others have demonstrated elevations in CXCL9 protein levels in sarcoidosis sera [15, 21], bronchoalveolar lavage (BAL) fluid [15, 21, 33] and granulomas [34]. We further show that CXCL9 gene expression inversely correlated with DLCO % predicted, independent of time from diagnosis. These data suggest that CXCL9 is associated with chronic sarcoidosis.
Additionally, we demonstrate that expression of genes directly involved in the TCR signalling pathway, including CD28, ITK and LEF1, is suppressed in sarcoidosis compared with healthy controls. Our analyses were adjusted for lymphocyte counts, as lymphopenia may be seen in sarcoidosis [2]. In future studies, we propose to perform cell sorting analyses of T-cell subsets to ascertain whether these findings stemmed from peripheral changes in differential subsets of lymphocytes in sarcoidosis; however, a recent study by Oswald-Richter et al. [35] showed that sarcoidosis peripheral and BAL sorted CD4+ T-cells exhibited decreases in the TCR-responsive genes LCK and PKCQ, both of which were also shown to be downregulated in our Ingenuity TCR signalling pathway analyses. Our results add to their study by demonstrating that the degree of TCR downregulation has important clinical correlations longitudinally. Specifically, decreased TCR scores were associated with chronic disease and lower FVC and DLCO % predicted. Taken together, these findings demonstrate that the TCR signalling complex and its downstream mediators are downregulated in sarcoidosis, which may, in part, explain suppressed responses to TCR polyclonal stimulation and peripheral anergy in sarcoidosis patients.
There are limitations to our study. Although we recruited from a diverse, multiethnic population, our cohort largely consisted of Caucasian subjects. Subjects within the cohort were followed longitudinally but were not necessarily enrolled at the time of diagnosis. Consequently, we cannot be certain that, at the time of diagnosis, nonprogressive patients did not exhibit high levels of CXCL9 initially; however, in a subset analysis of all sarcoidosis patients whose baseline visits were within 2 years of diagnosis, the chronic group still exhibited significantly higher CXCL9 expression levels compared with those with nonprogressive disease. Finally, we could not apply the clinical categorisations of acute and nonacute sarcoidosis disease onset reported by Prasse et al. [36], as our cohort consisted of only nonacute disease. As many published studies in sarcoidosis have used a 2-year threshold to define chronic sarcoidosis, we chose this schema as well.
In summary, we provide evidence for longitudinal measurements of gene expression patterns in sarcoidosis peripheral blood that are associated with dysregulated immune signalling pathways, and predict disease chronicity and correlate with pulmonary severity. These findings have important implications for investigations into noninvasive markers of disease activity that may inform clinical management of sarcoidosis.
Acknowledgments
We thank all the participants who volunteered their time for this study.
Footnotes
This article has supplementary material available from erj.ersjournals.com
Support statement: This project was supported by grants from the US National Institutes of Health (NIH)/National Heart, Lung and Blood Institute (U01HL112696), NIH/National Institute of Allergy and Infectious Diseases (1R56AI087652) and the Nina Ireland Lung Disease Program. R. Su received funding from the Rheumatology Research Foundation Scientist Development Award.
Conflict of interest: None declared.
- Received February 27, 2014.
- Accepted July 10, 2014.
- ©ERS 2014