Abstract
Despite its high prevalence and mortality, little is known about the pathogenesis of rheumatoid arthritis-associated interstitial lung disease (RA-ILD). Given that familial pulmonary fibrosis (FPF) and RA-ILD frequently share the usual interstitial pneumonia pattern and common environmental risk factors, we hypothesised that the two diseases might share additional risk factors, including FPF-linked genes. Our aim was to identify coding mutations of FPF-risk genes associated with RA-ILD.
We used whole exome sequencing (WES), followed by restricted analysis of a discrete number of FPF-linked genes and performed a burden test to assess the excess number of mutations in RA-ILD patients compared to controls.
Among the 101 RA-ILD patients included, 12 (11.9%) had 13 WES-identified heterozygous mutations in the TERT, RTEL1, PARN or SFTPC coding regions. The burden test, based on 81 RA-ILD patients and 1010 controls of European ancestry, revealed an excess of TERT, RTEL1, PARN or SFTPC mutations in RA-ILD patients (OR 3.17, 95% CI 1.53–6.12; p=9.45×10−4). Telomeres were shorter in RA-ILD patients with a TERT, RTEL1 or PARN mutation than in controls (p=2.87×10−2).
Our results support the contribution of FPF-linked genes to RA-ILD susceptibility.
Contribution of TERT, RTEL1, PARN and SFTPC mutations to rheumatoid interstitial lung disease susceptibility http://ow.ly/SXEm30a98Ic
Introduction
Rheumatoid arthritis (RA) is a destructive, systemic inflammatory and autoimmune disorder that affects up to 1% of the general adult population worldwide. Extra-articular disease occurs in nearly 50% of all RA patients, the lung being frequently involved [1]. Indeed, lung disease causes 10%–20% of all deaths in RA patients [2–4]. Specifically, interstitial lung disease (ILD) is the leading cause of mortality, accounting for a mortality rate that is approximately 13% higher in RA patients as compared to the general population [2, 4, 5], with a three-fold higher risk of death for those with ILD than those without [5]. In addition, whereas overall RA mortality rates are decreasing, RA-ILD deaths are increasing [6]. Despite its frequency and prognostic impact, RA-ILD has not been given much attention and we are far from understanding its pathogenesis [7].
In comparison to ILD occurring in other connective tissue diseases, patients with RA-ILD frequently present the usual interstitial pneumonia (UIP) pattern, which is characteristic of pulmonary fibrosis [8]. This pattern might explain the poor outcomes of RA-ILD patients, with survival rates similar to those of pulmonary fibrosis patients [9]. Familial pulmonary fibrosis (FPF), which might display histological patterns other than UIP, has been linked to mutations in telomere maintenance-associated [10–15] and surfactant protein genes [16–18]. Most importantly, FPF and RA-ILD share common risk factors, such as cigarette smoking and male sex [19, 20].
Given the above-cited similarities of RA-ILD and FPF, we hypothesised that RA-ILD and FPF share genetic risk factors. Therefore, we performed whole exome sequencing (WES) in RA-ILD patients to determine the contribution of mutations in genes previously linked to FPF.
Methods
Study participants
Consecutive RA patients with high-resolution computed tomography (HRCT) chest scans showing ILD were recruited by a French network of ILD-expert pulmonologists and RA-expert rheumatologists from 10 university hospitals during the period 2013–2015. All medical records were centrally reviewed by multidisciplinary discussion that included a pulmonologist (B. Crestani), rheumatologist (P. Dieudé) and radiologist. Medical records were independently reviewed to confirm whether subjects met the American College of Rheumatology criteria for RA [21]. The HRCT chest scans of the subjects were analysed by an experienced reader, blinded to clinical, biologic and genetic data, who scored the scans to ensure that the criteria for ILD were met [9]. In-house subjects (n=1010) without known autoimmune/inflammatory and/or pulmonary diseases served as healthy controls (supplementary material). The relevant ethics committees approved all procedures, and written informed consent was obtained from all participants in agreement with French bioethics laws.
WES followed by restricted analysis of FPF-linked genes in RA-ILD patients
FPF-risk genes implicated in telomere maintenance include telomerase reverse transcriptase (TERT) [10], telomerase RNA component (TERC) [10, 11], dyskerin (DKC1) [12], telomere-interacting factor-2 (TINF2) [13], regulator of telomere-elongation helicase-1 (RTEL1) [14, 15] and polyadenylation-specific ribonuclease deadenylation nuclease (PARN) [15]. FPF mutations are also found in genes that encode the following surfactant proteins: surfactant protein C (SFTPC) [16], ATP-binding cassette, subfamily A, member 3 (ABCA3) [17] and surfactant protein A2 (SFTPA2) [18]. We used WES, followed by an analysis restricted to these nine FPF-linked genes to assess excess mutations in RA-ILD patients. Sanger sequencing independently confirmed the WES-identified candidate disease-associated mutations.
TERT and RTEL1 molecular modelling and three-dimensional structure visualisation
Models of the three-dimensional structure of TERT and RTEL1 were built and analysed to assess the mutation effects.
Genotype–phenotype association analyses
Clinical, demographic, biological, HRCT chest scan and pulmonary function test results were assessed at RA-ILD patient inclusion. All HRCT scans were centrally reviewed and scored by a senior radiologist (M-P. Debray) (supplementary material). A telomeric restriction fragment length (TRFL) assay was used to measure telomere length in RA-ILD patients with mutations in telomere-maintenance candidate genes.
Statistical analyses
Power calculation
The 101 cases and 1010 controls provided a power higher than 70% to detect an overall association with an odds ratio of 3.0 (supplementary material).
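As a rough illustration of how such a power figure depends on its inputs, the following sketch uses a normal approximation for comparing carrier proportions in cases and controls. It is not the calculation used in the study, which is described in the supplementary material. The function names, the one-sided 5% significance level and the control carrier frequencies in the loop are all assumptions made for illustration.

```python
import math


def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * math.erfc(-x / math.sqrt(2))


def carrier_freq_in_cases(p0, odds_ratio):
    """Carrier frequency in cases implied by control frequency p0 and an odds ratio."""
    odds = odds_ratio * p0 / (1 - p0)
    return odds / (1 + odds)


def approx_power(n_cases, n_controls, p0, odds_ratio, z_alpha=1.6449):
    """Approximate power of a one-sided two-proportion z-test.

    p0 is a hypothetical carrier frequency in controls; z_alpha=1.6449
    corresponds to an assumed one-sided 5% significance level.
    """
    p1 = carrier_freq_in_cases(p0, odds_ratio)
    pooled = (n_cases * p1 + n_controls * p0) / (n_cases + n_controls)
    se_null = math.sqrt(pooled * (1 - pooled) * (1 / n_cases + 1 / n_controls))
    se_alt = math.sqrt(p1 * (1 - p1) / n_cases + p0 * (1 - p0) / n_controls)
    return normal_cdf((p1 - p0 - z_alpha * se_null) / se_alt)


# Hypothetical control carrier frequencies, chosen only to show the trend.
for p0 in (0.02, 0.03, 0.05):
    print(p0, round(approx_power(101, 1010, p0, 3.0), 3))
```

Under this approximation, power for a fixed odds ratio rises with the carrier frequency in controls, so any stated power depends on the frequency assumed.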
Ancestry-inference analysis
Ancestry of all RA-ILD patients and controls was verified by principal component analysis, based on the individuals of the 1000 Genomes Project. To avoid population stratification bias, all outlier patient (i.e. those not of European ancestry according to the 1000 Genomes Project) data were excluded from the association analyses (burden test).
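The legend of supplementary figure S5b describes one concrete rule for the more stringent ancestry filter: an individual is considered European if their distance to the barycenter of the non-Finnish European reference individuals is shorter than that of the furthest reference individual. A minimal sketch of that rule follows, assuming principal component coordinates have already been computed. The function names and toy coordinates are illustrative only and are not taken from the study.

```python
import math


def barycenter(points):
    """Mean position of a set of points in principal component space."""
    n = len(points)
    dims = len(points[0])
    return tuple(sum(p[d] for p in points) / n for d in range(dims))


def european_mask(reference_eur, study):
    """Flag study individuals lying closer to the reference barycenter
    than the furthest reference individual does."""
    center = barycenter(reference_eur)
    radius = max(math.dist(p, center) for p in reference_eur)
    return [math.dist(p, center) < radius for p in study]


# Toy coordinates (PC1, PC2), purely illustrative.
reference = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0)]
study = [(1.0, 1.5), (5.0, 5.0)]
print(european_mask(reference, study))
```

Individuals flagged False would be treated as outliers and left out of the association analysis.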
Burden test
A classical burden test was used to assess excess-risk mutations in RA-ILD. Significance was assessed using a one-sided Wald test.
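To make the logic of such a test concrete, the sketch below computes a crude odds ratio from a 2×2 carrier table, with a Wald 95% confidence interval and a one-sided Wald p-value. It is a simplified illustration, not the study's implementation. The carrier counts are back-calculated from the percentages reported in the Results (12.35% of 81 patients, 4.46% of 1010 controls) and are therefore assumptions. The crude estimate need not match the reported odds ratio exactly, since the published model may differ.

```python
import math


def wald_burden_test(case_carriers, case_total, control_carriers, control_total):
    """Crude odds ratio for carrying at least one risk mutation, with a
    Wald 95% CI and a one-sided Wald p-value (alternative: OR > 1)."""
    a = case_carriers
    b = case_total - case_carriers
    c = control_carriers
    d = control_total - control_carriers
    odds_ratio = (a * d) / (b * c)
    log_or = math.log(odds_ratio)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)  # SE of the log odds ratio
    z = log_or / se
    p_one_sided = 0.5 * math.erfc(z / math.sqrt(2))  # upper-tail normal probability
    ci = (math.exp(log_or - 1.96 * se), math.exp(log_or + 1.96 * se))
    return odds_ratio, ci, z, p_one_sided


# Counts back-calculated from reported percentages (assumed, not quoted).
odds_ratio, ci, z, p_value = wald_burden_test(10, 81, 45, 1010)
print(round(odds_ratio, 2), tuple(round(x, 2) for x in ci), p_value)
```

Collapsing all qualifying variants into a single carrier indicator per individual is what makes this a burden test: the comparison is between carrier proportions, not individual variants.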
Genotype–phenotype association analysis
Continuous variables, expressed as median (range), were compared using the t-test; categorical variables, expressed as n (%), were compared using Fisher's exact test. Comparisons of RA-ILD patients with mutations were drawn using non-parametric tests because of the small sample size. Generalised additive models were used to evaluate the linearity of the relationship between continuous variables and mutation probability. Telomere lengths for TERT/RTEL1/PARN mutation carriers (n=11) and 15 healthy age-matched controls (TRFL having been previously assessed for 13 of them [22]) were compared by logistic regression adjusted for age, using the R 3.1.2 glm function; the corresponding figures were created with GraphPad Prism 6.0b. All statistical analyses were performed using R 3.1.2. Statistical significance was defined as p<0.05.
Methods and corresponding analyses are detailed in the supplementary material.
Results
Phenotype of RA-ILD patients
We included 101 consecutive independent RA-ILD patients. Mean±sd age at RA onset was 53.54±15.40 years; 82.1% were anti-citrullinated peptide antibody-positive, 84.8% were rheumatoid factor-positive and 71.1% had erosive disease. Mean age at ILD onset was 61.42±11.81 years and mean RA duration before ILD detection was 7.93±10.83 years. Overall, 54.5% of all patients were ever-smokers and 65.4% showed the UIP pattern on HRCT. The demographic information and clinical characteristics are summarised in table 1.
Exome sequencing of FPF-linked genes in RA-ILD patients
WES combined with restricted analysis of the nine FPF-linked genes, followed by Sanger sequencing confirmation revealed that 12/101 RA-ILD patients (11.9%) carried 13 heterozygous mutations in the TERT, RTEL1, PARN or SFTPC coding regions (table 2, figure S2).
For telomere-maintenance genes, six RA-ILD patients carried six heterozygous TERT mutations: c.2383-2A>G, affecting intron splicing, not reported in the Exome Aggregation Consortium (ExAC) database, and c.3323C>T, p.Pro1108Leu, with ExAC minor allele frequency (MAF) of 5.55×10−5. In addition, four RA-ILD patients carried the previously reported FPF recurrent mutation [23]: c.1234C>T, p.His412Tyr. The TERT p.His412Tyr MAF is 1.5% in the European population ExAC database, which could suggest that this variant is a common polymorphism. However, taking into account 1) a MAF of 0.6% in the overall ExAC database, 2) evidence of linkage of p.His412Tyr to familial pulmonary fibrosis [24] and 3) the functional consequences including shortened telomere length [24], in addition to decreased catalytic activity in vitro [23, 25], we considered p.His412Tyr a low penetrant mutation and included it in our genetic association test. RTEL1 sequencing revealed four patients with four heterozygous mutations: three new mutations (c.2695T>C, p.Phe899Leu; c.2824G>A, p.Asp942Asn; and c.2875C>T, p.His959Tyr) and the previously reported pathogenic mutation: c.2890T>C, p.Phe964Leu [22]. The p.Phe899Leu, p.His959Tyr and p.Phe964Leu mutations were not listed in the ExAC database, but p.Asp942Asn had an ExAC MAF of 2.06×10−4. One RA-ILD patient carried a PARN heterozygous frameshift mutation (c.1749_1750delAG, p.Ser585fs*5), not reported in the ExAC database. We found no mutations in TERC, DKC1 or TINF2 genes.
For genes encoding surfactant-related proteins, one RA-ILD patient carried a previously reported heterozygous SFTPC mutation (c.218T>C, p.Ile73Thr) and another carried an unreported SFTPC heterozygous mutation (c.180G>A, p.Met60Ile). Both mutations were located at highly conserved positions in the pro-SP-C-linker domain. One RA-ILD patient was a double heterozygote carrying both a heterozygous mutation in SFTPC (c.218T>C, p.Ile73Thr) and a heterozygous mutation in TERT (c.1234C>T, p.His412Tyr). No mutations were detected in ABCA3 or SFTPA2 genes. Details of the identified mutations are in supplementary table S1.
Predicted structural impact of the TERT and RTEL1 mutations
TERT
His412, located in the telomere RNA-binding domain (TRBD), is predicted to be in a helix involved in the binding of the TERT template-boundary element (TBE), which acts as a molecular guide to position the template in the active site. His412 does not make direct contact with the TBE, but is located on a positively charged TRBD surface (supplementary figure S3), which suggests that the mutation affects binding to structural elements located in the p3 helix and/or pseudoknot. Position 1108 is located on the C-terminal extension (CTE) thumb domain, in the loop that sharply turns the protein chain before the last 24 residues. Amino-acid Pro1108 is located in a local hydrophobic core; leucine substitution at this position preserves the residue's hydrophobic nature and is not predicted to engender major unfolding. However, because proline residues induce a kink in the main chain of the protein, this mutation could destabilise the structure of the terminal residues that interact with the CTE and RNA-template regions, and perhaps the TRBD near His412.
RTEL1
RTEL1 encodes an essential iron-sulfur (FeS)-containing DNA helicase that is critical for telomere maintenance and DNA repair. All mutations affect the first RTEL1 harmonin-like domain (supplementary figure S4a and c). Amino acids Phe899 and His959 occupy highly conserved positions. Leucine substitution at position 899 is predicted to disturb interaction with a yet uncharacterised partner of the harmonin-like domain, whereas tyrosine at position 959, although not directly involved in the putative binding groove, might also alter harmonin-like domain interactions. The effect of the asparagine at position 942, located in the α-3–α-4 loop, remains undetermined. Furthermore, leucine replacement of phenylalanine at the highly conserved position 964, located in helix α-5, is predicted to disturb domain folding [22].
Burden test
Principal component analysis of WES genotype data showed that 81 of the 101 RA-ILD patients clustered with the 1000 Genomes Project (1000G) subjects of European ancestry; the remaining 20 were outliers (supplementary figure S5a). Excess mutations in FPF-linked genes were evaluated in these 81 patients and compared to our 1010 controls of European ancestry. The retained patients had an excess number of risk mutations compared to controls: 12.35% (with at least one candidate disease-associated mutation) versus 4.46% (burden test, OR 3.17, 95% CI 1.53–6.12; p=9.45×10−4) (figure 1, table 3; supplementary figure S5a). The association remained significant with more stringent clustering on 1000G individuals of European ancestry: 14.71% of 68 cases carried mutations versus 4.76% of 903 controls (OR 3.60, 95% CI 1.72–7.04; p=3.14×10−4) (supplementary figure S5b and supplementary table S5).
Genotype–phenotype-association analyses
Clinical phenotype
RA-ILD patients carrying a TERT, RTEL1 or PARN mutation showed no other clinical manifestation related to a telomere syndrome, such as skin abnormalities, typical haematological abnormalities (i.e. macrocytosis, anaemia and thrombocytopenia), bone marrow failure or liver disease. Mean age at ILD onset was significantly lower for patients with mutations than those without mutations: 53.27±10.21 versus 62.51±11.63 years, respectively; p=0.015 (table 1). Plots based on smoothing splines supported a nonlinear association between age at ILD onset and TERT/RTEL1/PARN/SFTPC mutations, with higher mutation probability for patients 36–41 years old than those younger or older (p=0.040) (figure 2a). Results remained significant after removal of the patient with a SFTPC mutation (p=0.042). No other phenotypic differences were detected; notably, pulmonary function and HRCT chest scan pattern at inclusion were similar for both subgroups (table 1).
Telomere lengths in RA-ILD patients with PARN, TERT or RTEL1 mutations
Consistent with previously reported telomere lengths of similar mutation carriers [14, 26, 27], telomere lengths in genomic DNA isolated from the circulating leukocytes of 11 RA-ILD patients with TERT, RTEL1 or PARN mutations were shorter than those from 15 controls (p=0.0114) (figure 2b; supplementary table S2), as confirmed by logistic regression adjusted for age (p=2.87×10−2).
Familial history of ILD and short telomere syndrome in patients with mutations
Among the 12 RA-ILD patients carrying a FPF-linked mutation, three had a family history of interstitial lung disease: case #5 (RTEL1 p.Phe964Leu) had a brother with idiopathic pulmonary fibrosis (IPF) (deceased); case #6 (TERT c.2383-2A>G) had a sister with IPF (deceased); and case #11 (TERT p.His412Tyr and SFTPC p.Ile73Thr) had a daughter with IPF (deceased). The father of case #8 (TERT p.His412Tyr) died of cirrhosis that was compatible with a short telomere syndrome (table 2) [28].
Discussion
To date and to our knowledge, this is the first exome sequencing study of RA-ILD patients. Our findings, from a candidate gene approach, provide evidence of an association between RA-ILD and mutations in FPF-linked genes (TERT, RTEL1, PARN or SFTPC). The burden of these mutations was significantly greater in patients than in controls. Moreover, the association was robust after adjustment for more stringent clustering of 1000G subjects of European ancestry. Our results show that RA-ILD and FPF share genetic risk factors, suggesting common pathogenetic mechanisms. The familial aggregation detected in 25% of RA-ILD patients carrying at least one mutation in FPF-linked genes supports this hypothesis.
For telomere-maintenance genes, we detected the TERT p.His412Tyr mutation that has been previously linked to FPF and dyskeratosis congenita [23, 25]. Our findings support the theory of TERT p.His412Tyr as a low penetrance mutation: two of the four RA-ILD patients carrying the p.His412Tyr mutation had evidently shortened telomere length (supplementary table S2), and one had a father who died of cirrhosis compatible with a familial short telomere syndrome [28]. Unfortunately, the affected father was not sequenced for TERT, which would have provided additional linkage evidence for TERT p.His412Tyr; this represents one limitation of the present study. We also identified p.Pro1108Leu in a highly conserved residue in the functional domain of the protein and a splice mutation that abolishes the acceptor splice site. Among the three new singletons in RTEL1, p.Phe899Leu, p.His959Tyr and p.Asp942Asn were predicted to be deleterious by at least two of the three prediction tools used. We previously reported the p.Phe964Leu mutation as a possibly deleterious mutation [22]. Furthermore, the PARN mutation leads to a frameshift and premature stop codon. Consistent with previously reported telomere lengths of TERT, RTEL1 or PARN mutation carriers [14, 26, 27], we found shorter lengths in the RA-ILD TERT, RTEL1 or PARN mutation carriers of the present study than in the controls, which confirms the deleterious effects of these mutations on telomere maintenance. The mechanism linking PARN mutations to telomere shortening was recently elucidated: PARN is required for TERC 3′-end maturation [29]. Our RA-ILD patient with the PARN frameshift mutation had the shortest telomeres, which further supports PARN participation in telomere maintenance.
We also investigated genes encoding surfactant-related proteins and detected two SFTPC mutations. The p.Ile73Thr mutation, located within the proSP-C (surfactant protein C) linker domain, accounts for more than 30% of all SFTPC mutations associated with diffuse parenchymal lung disease in patients with sporadic and inherited (autosomal-dominant) disease [16, 30]. Moreover, p.Met60Ile, a new mutation also located in the non-BRICHOS SP-C domain, was identified. Non-BRICHOS mutations within the proximal COOH propeptide (e.g. p.Ile73Thr) induce aberrant intracellular trafficking of proSP-C, which eludes cleavage and accumulates in the endosomal system, thereby causing cellular dysfunction [31]. Although we could not examine the functional consequences of the SP-C p.Met60Ile mutation, several lines of evidence favour its pathogenicity: 1) p.Met60Ile is located in a highly conserved region; 2) it has one of the three highest Combined Annotation Dependent Depletion scores; and 3) it is not in the ExAC database.
Our findings demonstrate the usefulness of WES combined with restricted candidate gene analysis in identifying RA-ILD-associated mutations, despite complexities, such as locus heterogeneity and late-onset disease. Because of the small number of available patients, our association study had neither sufficient power nor an appropriate design for gene discovery (e.g. no a priori hypothesis) [32]. Consequently, WES of larger RA-ILD and control populations, probably with international collaboration, is required to identify new RA-ILD risk genes and to refine the exact contribution of FPF-linked genes to the development of RA-ILD.
In the present genetic case–control association study, we provide evidence for an association between a panel of candidate genes (FPF-linked genes) and the “RA-ILD” phenotype, i.e. susceptibility to RA-ILD (RA-ILD versus controls). Our results do not provide information about the putative roles of these genes in 1) susceptibility to overall RA (RA versus controls) and 2) the risk of ILD in the RA population. These issues suggest that a genetic association study should be performed in RA-ILD cases compared to RA cases without ILD. To date, these issues remain unsolved and therefore support the need for an appropriately designed study facilitated by international collaborations, to test whether FPF-linked genes are also RA modifier genes, thereby increasing the risk of ILD in RA.
From a clinical perspective, the relatively high prevalence of male patients compared to that observed in a recent report of a large multiethnic RA population [33] and the rate of ever-smokers are consistent with those previously reported in RA-ILD [19, 34, 35]. Furthermore, consistent with previous reports for ILD patients with RTEL1 or TERT mutations, ILD occurred earlier in RA-ILD patients with mutations than in those without mutations in telomere-maintenance genes, which might illustrate genetic anticipation, as has been reported in telomere-mediated disorders [36]. Nonetheless, the relatively small sample of RA-ILD patients carrying a mutation limits a genotype–phenotype association analysis, which emphasises the importance of future international collaborative studies on the genetics of RA-ILD.
FPF-risk genes involved in telomere maintenance might be linked to ILD associated with autoimmune diseases, because PARN or RTEL1 mutations have been identified in ILD patients with RA, autoimmune hepatitis, Sjögren's syndrome and more recently systemic sclerosis [22, 26, 37]. This hypothesis is reinforced by diminished telomerase activity and shortened telomere lengths that are apparently connected to premature immunosenescence in various systemic immune-mediated diseases, and more recently by the identification of TERT as a risk gene for systemic lupus erythematosus [38, 39]. In addition, we detected two SFTPC mutations in RA-ILD patients. To our knowledge, SFTPC mutations have only been associated with or linked to interstitial pneumonia, thereby contributing to ILD pathogenesis via endoplasmic reticulum stress in alveolar epithelial cells [16]. For the first time, our results provide evidence of an association between SFTPC mutations and RA-ILD that might contribute to the hypothesis of a pivotal role of the lung in the pathogenesis of RA [40]. Furthermore, our results were observed in European Caucasian patients and would require replication in other populations.
In conclusion, our findings establish, for the first time, shared genetic risk factors between the RA-ILD phenotype and familial pulmonary fibrosis.
Supplementary material
Figure S1. Filtering strategy to identify rheumatoid arthritis-interstitial lung disease (RA-ILD)-associated mutations.
Figure S2. TERT-, RTEL1-, PARN- and SFTPC-detected mutations confirmed by Sanger sequencing.
Figure S3. Molecular modelling of TERT mutations. Panel A. Model of full-length human telomerase reverse transcriptase (hTERT) complexed with the CR4/5 domain, template boundary element (TBE) RNA and RNA template/DNA telomeric repeat duplex. The N-terminal “anchor” (TEN) domain is red, telomere RNA-binding domain (TRBD) orange, catalytic reverse-transcriptase domain (RT) violet and C-terminal extension (CTE) pink. The surface covered by electrostatic potential around His412 is shown as an inset. Panel B. Secondary TERT structure with location of modelled RNA segments within the pseudoknot and Box/H/ACA domains.
Figure S4. Molecular modelling of RTEL1 mutations. Panel A. Schematic representation of the RTEL1 protein, indicating domain architecture and mutation positions. PIP-box denotes PCNA-interacting protein box, HD helicase domain. Panel B. Positions of the mutated amino acids in multiple alignments of harmonin-like family sequences. The alignment, from Faure et al. [18], includes the sequences (with their UniProt entry name) of the RTEL1-protein harmonin-like domains from different species and those of human malcavernin (cerebral cavernous malformation-2 protein [CCM2]) and malcavernin-like (CCM2L), harmonin (USH1C), whirlin (WHRN) and delphilin (GRD2I). Positions with conserved hydrophobicity are depicted as grey squares, green for the general case and orange for the mostly aromatic character. The positions of the three mutated amino acids are indicated with stars above the alignment, and amino acids involved in cadherin-23 binding in harmonin (pdb 3K1R) are highlighted with yellow triangles. The secondary structure positions, as observed in the experimental 3D-structure of human harmonin (USH1C, pdb 3K1R), are reported above the alignment. Panel C. 3D-structural model of the first harmonin-like domain of human RTEL1 indicating the positions of the mutated amino acids. The position of the cadherin-23 peptide (in grey) in the harmonin-cadherin complex (USH1C, pdb 3K1R) marks the hydrophobic groove, which may be used by these domains to interact with partners.
Figure S5. Principal component analysis (PCA) of patients and controls. Panel A. PCA of genotypes from exome data for 101 RA-ILD patients demonstrated clustering with 1000G subjects of European (n=81) and non-European (n=20) ancestry. AFR denotes African, EUR European, SAS South Asian, EAS East Asian.
Figure S5. Principal component analysis (PCA) of patients and controls. Panel B. Projection of patients and controls on 1000G data with stringent clustering on EUR region. The EUR-region barycenter was computed with non-Finnish EUR individuals. Each studied individual was considered European Caucasian if the distance to the EUR barycenter was shorter than that for the non-Finnish EUR furthest from the EUR barycenter.
Footnotes
This article has supplementary material available from erj.ersjournals.com
Support statement: This work was supported by research grants from the Société Française de Rhumatologie, Club Rhumatismes Inflammation, la Chancellerie des Universités de Paris (legs Poix), Sorbonne Paris Cité (FPI-SPC Program), Agence Nationale de la Recherche (grants ANR-10-LABX-46, ANR-10-EQPX-07-01, ANR-14-CE10-0006 and ANR-10-INBS-09), France Génomique National Infrastructure, unrestricted grants from Pfizer, Roche and Chugaï, and the Centre de Resources Biologiques Hôpital Bichat, Paris, France. Additional acknowledgements can be found in the supplementary material. Funding information for this article has been deposited with the Crossref Funder Registry.
Conflict of interest: Disclosures can be found alongside this article at erj.ersjournals.com
- Received November 28, 2016.
- Accepted February 11, 2017.
- Copyright ©ERS 2017