Abstract
Lack of reproducibility of findings has been a criticism of genetic association studies on complex diseases, such as chronic obstructive pulmonary disease (COPD).
We selected 257 polymorphisms of 16 genes with reported or potential relationships to COPD and genotyped these variants in a case–control study that included 953 COPD cases and 956 control subjects. We explored the association of these polymorphisms to three COPD phenotypes: a COPD binary phenotype and two quantitative traits (post-bronchodilator forced expiratory volume in 1 s (FEV1) % predicted and FEV1/forced vital capacity (FVC)). The polymorphisms significantly associated to these phenotypes in this first study were tested in a second, family-based study that included 635 pedigrees with 1,910 individuals.
Significant associations to the binary COPD phenotype in both populations were seen for STAT1 (rs13010343) and NFKBIB/SIRT2 (rs2241704) (p<0.05). Single-nucleotide polymorphisms rs17467825 and rs1155563 of the GC gene were significantly associated with FEV1 % predicted and FEV1/FVC, respectively, in both populations (p<0.05).
This study has replicated associations to COPD phenotypes in the STAT1, NFKBIB/SIRT2 and GC genes in two independent populations, the associations of the former two genes representing novel findings.
The prevalence of chronic obstructive pulmonary disease (COPD) in Western Europe is ∼10% 1, and COPD is expected to be the third most significant cause of death worldwide by 2020 2. The most important risk factor for COPD is smoking and there is a dose–response relationship between smoking exposure and reduced lung function, although there is a substantial heterogeneity in the degree of lung function impairment 3. Only a subset of smokers develops an accelerated rate of decline in lung function that leads to COPD. In addition, there appears to be familial clustering of both impaired lung function and COPD 4. These insights suggest that susceptibility to COPD may be influenced by genetic factors. The only well-established genetic cause of COPD, α1-antitrypsin deficiency, is present in only 1–2% of individuals with COPD 5. A number of studies have been performed to find other genetic susceptibility factors for COPD. So far, hundreds of candidate genes have been tested. To date, it has been difficult to replicate genetic findings from one COPD study to another 6. There may be several explanations for this lack of reproducibility, including small sample sizes, lack of Hardy–Weinberg equilibrium (HWE), poor phenotype characterisation of the COPD cases and genetic heterogeneity 7. It is now recommended that the findings of a genetic association should be replicated in another sample before being published 7; nevertheless, only a minority of genetic COPD studies meets this requirement.
To obtain further insight into the genetic basis of COPD, we replicated the relationships of a number of potential COPD candidate genes in two large, independent and well-characterised populations. We selected 257 single-nucleotide polymorphisms (SNPs) in 16 genes based on reported or potential relationships to COPD. They were analysed in a case–control sample from Bergen, Norway, including 953 COPD cases and 956 controls 8. SNPs with significant associations to COPD were then tested using family-based association analysis in 635 pedigrees with 1,910 individuals from the International COPD Genetics Network (ICGN), which is the largest family-based COPD collection reported to date 8.
METHODS
Study subjects
The Norwegian case–control study initially included 953 cases; 189 were recruited from two community studies 9, while the rest were recruited from a registry at Haukeland University Hospital, Bergen. The study also included 956 controls; 735 were recruited from the two community studies, while 221 were volunteers. The inclusion criteria for COPD cases were a post-bronchodilator forced expiratory volume in 1 s (FEV1) <80% predicted and FEV1/forced vital capacity (FVC) <0.7. The controls were selected based on post-bronchodilator FEV1 >80% pred and FEV1/FVC >0.7. Both cases and controls were Caucasians with ≥2.5 pack-yrs smoking history (current or ex-smokers).
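The spirometric criteria above can be expressed as a small classification rule. The following is a minimal sketch only, not the study's recruitment procedure: the function name `classify_bergen` and its argument names are hypothetical, and the ethnicity criterion is left out because it is not a numeric threshold.

```python
from typing import Optional


def classify_bergen(fev1_pct_pred: float, fev1_fvc: float,
                    pack_years: float) -> Optional[str]:
    """Return 'case', 'control' or None (not eligible).

    Thresholds are those stated in the text for the case-control study:
    cases    : FEV1 < 80% pred and FEV1/FVC < 0.7
    controls : FEV1 > 80% pred and FEV1/FVC > 0.7
    both     : >= 2.5 pack-yrs of smoking
    All values are assumed to be post-bronchodilator measurements.
    """
    if pack_years < 2.5:
        return None  # below the minimum smoking history
    if fev1_pct_pred < 80 and fev1_fvc < 0.7:
        return "case"
    if fev1_pct_pred > 80 and fev1_fvc > 0.7:
        return "control"
    return None  # mixed spirometry: fits neither definition
```

Subjects who meet only one of the two spirometric thresholds fall into neither group under this reading of the criteria.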
In the multicentre ICGN study, subjects with known COPD were recruited as probands, and siblings and available parents were ascertained through the probands 8, 10. The probands were recruited from pulmonary and medical clinics, and hospital admissions. Inclusion criteria for probands were airflow limitation (post-bronchodilator FEV1 <60% pred and FEV1/vital capacity (VC) <90% pred) at a relatively early age (45–65 yrs), a ≥5 pack-yrs smoking history and at least one eligible sibling (with ≥5 pack-yrs smoking history). COPD in siblings was defined by a post-bronchodilator FEV1 <80% pred and FEV1/VC <90% pred. In total, 1,910 Caucasian individuals from 635 pedigrees were included in the family-based association analysis.
Phenotyping
Three phenotypes were defined: 1) the binary COPD phenotype defined according to the criteria described previously; 2) post-bronchodilator FEV1 % pred; and 3) FEV1/(F)VC.
Candidate gene selection and genotyping
Sixteen candidate genes were selected for analysis based on their potential biological relevance to pathways that may cause COPD or their proximity to genes with a known relationship to COPD (see supplementary table E1 for details of each gene).
Seven out of the 16 genes had previously been shown to be associated with COPD (GC, GSTP1, HDAC2, HDAC5, HMOX1, IL11 and JAK3). Genotyping in the two cohorts was performed with the Illumina (San Diego, CA, USA) array-based custom SNP genotyping platform. The selection of the candidate genes and the analyses were performed prior to the genome-wide association study (GWAS) that has recently been published using a subset of these subjects 11. That GWAS used an Illumina HumanHap550 genotyping BeadChip for the Bergen cohort and a Sequenom (Hamburg, Germany) iPLEX SNP genotyping protocol developed for measurement with the MassARRAY mass spectrometer for the ICGN study. The six SNPs rs13010343, rs1609181, rs802372, rs10278590, rs8065686 and rs4802898 were also present on the Illumina HumanHap550 chip from the GWAS 11.
For the selection of SNPs, linkage disequilibrium (LD) bins were established using an r2 threshold of 0.8. The tagging SNP selection was based on HapMap data for European-Americans with a minor-allele frequency (MAF) >5% from the public database 12. Nonsynonymous SNPs with any MAF were included. SNPs in these genes were selected and genotyped in both the ICGN family population and the Norwegian case–control population using the Illumina array-based custom SNP genotyping platform. HWE was tested for all SNPs in the control data using the Chi-squared goodness-of-fit test with SAS software version 8.2 (SAS Institute, Cary, NC, USA); HWE for all SNPs was also tested in the family data using PBAT version 3.5 (Golden Helix Inc., Bozeman, MT, USA) 9. All SNPs were in HWE (p-values >0.05) in both the family data and the case–control data. COPD family data were evaluated for inconsistent Mendelian inheritance using the PedCheck program 13. A complete list of the genes and SNPs tested in each gene is given in supplementary table E2.
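For readers unfamiliar with the HWE check, the Chi-squared goodness-of-fit test on genotype counts can be sketched as follows. This is an illustration only and not the SAS or PBAT implementation used in the study; the function name `hwe_chi2` is hypothetical, and the test is assumed to have one degree of freedom (three genotype classes, one estimated allele frequency).

```python
import math


def hwe_chi2(n_aa: int, n_ab: int, n_bb: int):
    """Chi-squared goodness-of-fit test for Hardy-Weinberg equilibrium.

    Takes observed counts of the three genotypes at a biallelic SNP and
    returns (chi2, p_value) with 1 degree of freedom.
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no genotypes observed")
    p = (2 * n_aa + n_ab) / (2 * n)   # frequency of allele A
    q = 1.0 - p                       # frequency of allele B
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    # Classes with zero expected count (monomorphic SNP) are skipped.
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of the chi-squared distribution with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value
```

Under the criterion stated in the text, a SNP would be regarded as consistent with HWE when the returned p-value is >0.05.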
Statistical methods
In the case–control population, two models were used in the association analysis: a logistic regression model for the COPD binary phenotype and a linear regression model for the quantitative phenotypes (FEV1 and FEV1/FVC), with covariates including age, sex and pack-yrs of smoking. For the quantitative trait analysis, only COPD cases were included. The analyses were done using SAS software version 8.2 with an additive genetic model. FBAT version 1.7.3 14 was used for the family-based single-SNP association analysis of the COPD binary phenotype in the ICGN family study. The analyses of quantitative traits (FEV1 and FEV1/VC) were performed with covariates including centre, age, sex, height and pack-yrs of cigarette smoking using PBAT version 3.5 14. Biallelic tests were conducted for SNPs using an additive genetic model. The risk allele was determined from the FBAT Z statistic. Haplotype analyses were conducted using the HBAT function of the FBAT program with the use of Monte Carlo sampling for COPD, FEV1 and FEV1/VC 15 in the family data. In the case–control data, haplotype analysis was performed using the expectation-maximisation algorithm and score tests, implemented in the Haplo.stats program 16. LD structure was examined with the Haploview program 17, 18. We used a p-value <0.05 in both COPD populations to define statistical significance.
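To make the additive genetic model concrete, the sketch below codes each genotype as the number of copies of a chosen allele (0, 1 or 2) and fits a logistic regression of case status on that count by Newton-Raphson. It is a simplified illustration, not the SAS analysis used in the study: the covariates (age, sex and pack-yrs) are omitted, the function names are hypothetical, and the genotypes in the usage example are invented toy data.

```python
import math


def additive_code(genotype: str, allele: str) -> int:
    """Additive coding: number of copies of `allele` in a genotype such as 'AG'."""
    return genotype.count(allele)


def fit_logistic(x, y, iterations=25):
    """Fit logit P(y=1) = b0 + b1*x by Newton-Raphson; returns (b0, b1).

    x: allele counts (0, 1 or 2); y: 1 for case, 0 for control.
    Assumes x is not constant and the data are not perfectly separated.
    """
    b0 = b1 = 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            w = p * (1.0 - p)
            g0 += yi - p            # gradient of the log-likelihood
            g1 += (yi - p) * xi
            h00 += w                # information matrix X'WX
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


if __name__ == "__main__":
    # Invented toy data: genotypes and case (1) / control (0) status.
    genotypes = ["GG", "GG", "GG", "GG", "AG", "AG", "AG", "AG",
                 "AA", "AA", "AA", "AA"]
    status = [0, 0, 0, 1, 0, 0, 1, 1, 0, 1, 1, 1]
    counts = [additive_code(g, "A") for g in genotypes]
    b0, b1 = fit_logistic(counts, status)
    print("per-allele odds ratio:", math.exp(b1))
```

Under the additive model, exp(b1) is interpreted as the odds ratio per additional copy of the coded allele.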
We assessed the power to detect significant associations between the genes and the phenotypes based on the following assumptions: the allele frequencies of the disease gene and the marker were both 0.1, and the penetrances for the three genotypes of the marker were 0.7, 0.4 and 0.1. At a significance level of 0.05, our study had 99.52% and 85.43% power to detect an association in the COPD case–control and family data, respectively.
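The stated assumptions imply specific expected genotype frequencies in cases and controls, which drive the power of the comparison. The sketch below works these out under labelled assumptions that are not given in the text: genotypes are in HWE, the marker coincides with the disease locus, and the penetrances 0.7, 0.4 and 0.1 apply to carriers of two, one and zero copies of the 0.1-frequency allele, respectively. It does not reproduce the power calculation itself, whose method is not described.

```python
def expected_frequencies(risk_af, penetrances):
    """Expected genotype frequencies in cases and controls.

    risk_af     : frequency of the risk allele (0.1 in the text)
    penetrances : P(disease | genotype) for two, one and zero risk-allele
                  copies (ordering is an assumption, see lead-in)
    Returns (prevalence, case_freqs, control_freqs), each frequency tuple
    ordered as (two copies, one copy, zero copies).
    """
    q = risk_af
    p = 1.0 - q
    genotype = (q * q, 2 * p * q, p * p)  # HWE proportions
    prevalence = sum(g * f for g, f in zip(genotype, penetrances))
    cases = tuple(g * f / prevalence for g, f in zip(genotype, penetrances))
    controls = tuple(g * (1.0 - f) / (1.0 - prevalence)
                     for g, f in zip(genotype, penetrances))
    return prevalence, cases, controls


def allele_frequency(genotype_freqs):
    """Risk-allele frequency from (two copies, one copy, zero copies)."""
    two, one, _ = genotype_freqs
    return two + one / 2.0


if __name__ == "__main__":
    k, cases, controls = expected_frequencies(0.1, (0.7, 0.4, 0.1))
    print("implied prevalence:", k)
    print("risk-allele frequency in cases:", allele_frequency(cases))
    print("risk-allele frequency in controls:", allele_frequency(controls))
```

A large difference between the two allele frequencies, as these penetrances imply, is what makes high power plausible with roughly 950 cases and 950 controls.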
RESULTS
Study participants
Characteristics of the participants of the two studies are given in table 1. In the case–control study, the cases comprised more males, were older and reported higher smoking exposure than the controls. The cases had, on average, a moderate-to-severe airflow limitation (mean post-bronchodilator FEV1 50.3% pred). In the ICGN study, the probands were predominantly male, of the same mean age, but with a greater smoking history than the siblings. The probands had, on average, a severe airflow limitation (mean post-bronchodilator FEV1 36.2% pred).
Single-marker and haplotype analysis
The single-SNP associations with COPD, post-bronchodilator FEV1 % pred and FEV1/FVC (FEV1/VC in the ICGN study) are shown in tables 2, 3 and 4, respectively. Only genes with significant associations in the case–control study are shown.
The following genes included SNPs that were significantly associated with the binary COPD phenotype in either study: STAT1, GC, MAP3K5, KCND2, RARRES2, HDAC5, SIRT2 and NFKBIB. Only one SNP in STAT1 (rs13010343) and one SNP in NFKBIB/SIRT2 (rs2241704) were associated with COPD in both studies (table 2). The risk allele was the same in both datasets for rs13010343, as well as for rs2241704.
The genes significantly associated with post-bronchodilator FEV1 % pred in either study were STAT1, GC, AGER, MAP3K5, KCND2, JAK3, SIRT2, NFKBIB/SIRT2, IL11 and HMOX1. Only GC (rs17467825) was replicated in both studies (table 3), with negative regression coefficients indicating a risk for impaired FEV1.
Regarding the associations with FEV1/FVC or FEV1/VC, significant associations were noted for STAT1, GC, AGER, HDAC2, MAP3K5, KCND2 and JAK3 in either study. Only GC (rs1155563) was observed in both samples (table 4), with negative regression coefficients indicating a risk for impaired ratio.
No SNP had replicated associations across the two studies for all the COPD phenotypes.
We also performed haplotype analyses. The results supported the findings of the SNP analyses, but no additional significantly associated genes were identified (data not shown).
LD analysis
Figures 1–3 show the pair-wise LD (r2) values for the SNPs in the genes STAT1, NFKBIB/SIRT2 and GC. The LD structure appeared generally quite similar across the two study populations. Three haplotype blocks were identified in the LD map of the STAT1 gene in both populations, with significant SNPs given in bold text. Two haplotype blocks and one haplotype block were revealed for the NFKBIB/SIRT2 gene in the ICGN and case–control samples, respectively, while three haplotype blocks were observed in the two data sets for the GC gene. The six SNPs that were also present on the Illumina chip from the GWAS were not in significant LD (r2≥0.8) with any other SNPs tested in this study.
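For reference, the pair-wise r2 measure can be computed from allele and haplotype frequencies as sketched below. This is the standard definition rather than the Haploview implementation; the function names are hypothetical, and haplotype frequencies are assumed to be known (in practice they are estimated from genotype data).

```python
def ld_r2(p_ab: float, p_a: float, p_b: float) -> float:
    """Pair-wise LD r2 between two biallelic SNPs.

    p_ab : frequency of the haplotype carrying allele A at SNP 1 and B at SNP 2
    p_a  : frequency of allele A at SNP 1
    p_b  : frequency of allele B at SNP 2
    """
    d = p_ab - p_a * p_b                          # disequilibrium coefficient D
    denom = p_a * (1.0 - p_a) * p_b * (1.0 - p_b)
    if denom == 0:
        raise ValueError("r2 is undefined for a monomorphic SNP")
    return d * d / denom


def in_strong_ld(p_ab: float, p_a: float, p_b: float,
                 threshold: float = 0.8) -> bool:
    """True if r2 meets the 0.8 threshold used in the text for LD bins."""
    return ld_r2(p_ab, p_a, p_b) >= threshold
```

Two SNPs whose alleles always occur together give r2 = 1, whereas independent SNPs give r2 = 0; the figures report these values multiplied by 100.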
Linkage disequilibrium map in STAT1 gene region in a) the International COPD Genetics Network family-based population and b) case–control population. Values of r2 (×100) are shown. ▪: r2 = 1; □: r2 = 0. Shades of grey represent 0<r2<1 (the intensity of the grey is proportional to r2). Haplotype block structure was estimated with the Haploview program. COPD: chronic obstructive pulmonary disease.
Linkage disequilibrium map in NFKBIB/SIRT2 gene region in a) the International COPD Genetics Network family-based population and b) case–control population. Values of r2 (×100) are shown. ▪: r2 = 1; □: r2 = 0. Shades of grey represent 0<r2<1 (the intensity of the grey is proportional to r2). Haplotype block structure was estimated with the Haploview program. COPD: chronic obstructive pulmonary disease.
Linkage disequilibrium map in GC gene region in a) the International COPD Genetics Network family-based population and b) case–control population. Values of r2 (×100) are shown. ▪: r2 = 1; □: r2 = 0. Shades of grey represent 0<r2<1 (the intensity of the grey is proportional to r2). Haplotype block structure was estimated with the Haploview program. COPD: chronic obstructive pulmonary disease.
DISCUSSION
In the present study, we selected 257 polymorphisms in 16 genes with reported or potential relationships to COPD and genotyped the variants in a case–control COPD study. Significant associations were tested in a family-based COPD study. We found that the STAT1, NFKBIB/SIRT2 and GC genes were associated with COPD-related phenotypes in the two data sets. To our knowledge, this is the first study to replicate associations of the same SNPs of the STAT1 and NFKBIB/SIRT2 genes with COPD-related phenotypes in two populations. These genes have multiple potential relationships with established disease mechanisms in COPD.
The mechanisms by which STAT1 exhibits its functions are unclear. Progression of COPD is associated with increased numbers of CD8+ T-cells and B-cells in the airways, supporting a role for these cells in the pathophysiology of COPD 19. It is known that CD8+ T-lymphocytes express the chemokines CXCL10 and interferon (IFN)-γ 19. The chemokines act as ligands at the CXCR3 receptor, with expression peaking 8–12 h after stimulation with IFN-γ 20. This pathway is thought to be dependent on STAT1 21. It is worth noting that Tudhope et al. 21 observed that this pathway was dexamethasone-resistant, a finding that is consistent with the limited impact of corticosteroids on the inflammatory profile in COPD 22.
The two genes NFKBIB and SIRT2 are closely located (according to HapMap). The SNP rs2241704 is located in the flanking regions of the genes and we cannot distinguish them based on the current data. Hence, the observed association to COPD may work through either gene.
Increasing evidence supports a key role for the transcription factor nuclear factor (NF)-κB in the host response to Streptococcus pneumoniae infection 23. Control of NF-κB activity is achieved through interactions with the I-κB family of inhibitors, encoded by the genes NFKBIA, NFKBIB and NFKBIE. COPD patients frequently have lower respiratory tract colonisation with S. pneumoniae and this chronic infection may play a role in the pathogenesis of COPD. Functional polymorphisms in the NFKBIB gene may affect the host response to the S. pneumoniae infection and, thereby, cause chronic airway inflammation.
The SIRT2 gene may be related to degenerative processes through the deacetylation of α-tubulin 24. The SIRT genes are believed to be involved in the ageing process 25, 26. Emphysema may be regarded as a premature ageing of the lung with loss of elastic fibres 27.
GC encodes the vitamin D binding protein (DBP), which may have several functions. The major function is binding, solubilisation and transport of vitamin D and its metabolites 28. It is also reported to augment the chemotactic effect of complement-derived molecules on neutrophils 29. Neutrophils play an important role in parenchymal destruction and airway inflammation in COPD. Another important function of DBP is its deglycosylation to DBP macrophage-activating factor. The absence of a glycosylated residue at position 420 in GC2 inhibits this conversion. This may be a partial explanation of its protective effect 30. Unlike the GC2 allele, the homozygous GC1F phenotype is a significant risk factor for the development of COPD 31, 32. Although the GC1F allele has no effect on the age of onset of COPD, the annual decline in FEV1 has been reported to be significantly higher in patients with this allele. High-resolution computed tomography showed that GC1F allele carriers suffer from more severe emphysema 29. Black and Scragg 33 recently analysed serum 25-hydroxyvitamin D3 concentration, FEV1 and FVC of 14,091 subjects (aged ≥20 yrs) and demonstrated a significant correlation between these parameters.
Several genes were only significantly associated with the COPD-related traits in one of the two studies. This does not necessarily imply that there is no true relationship between these genes and COPD. This applies especially to genes whose SNPs reached borderline significance in the other data set, for instance rs2282680 and rs4588 of the GC gene (table 3). Borderline significance in the second data set makes it less likely that these associations are false positives. Although the three phenotypes of COPD used in the analyses were identical across the samples (except for VC versus FVC), they are still based solely on spirometry. It is well acknowledged that FEV1 and FVC are far from reflecting the whole picture of COPD 34. Hence, the COPD cases from the two populations might differ with respect to other important characteristics of the disease, such as degree of emphysema and chronic bronchitis, systemic inflammation, body mass index, respiratory failure and rate of exacerbations. More specific COPD-related phenotypes than those applied in the present study might have revealed stronger associations to the genes and enhanced the reproducibility of the relationships across the samples.
In both samples, the participants were examined only once. Hence, we have no data regarding disease progression or rate of lung function decline. Such data might have further strengthened the associations and reproducibility of the findings.
The major exposure in COPD, active smoking, was taken into account in the analyses, based on questionnaire responses. However, distribution of other environmental risk factors, such as occupational airborne exposure, as well as indoor and outdoor air pollution, may differ between the samples and affect the observed associations between the genes and COPD. Passive smoking in utero or in early life may cause airway disease in adults 35. The degree of passive smoking and at what time in life it occurs may vary across the samples and potentially influence any gene–environment interaction effect on COPD.
A common cause of lack of reproducible findings is small sample size 36. In our study, limited statistical power is not likely to explain lack of replication. The size of the population in these studies is appropriate to identify even modest effects. Random genotyping errors will cause a nondifferential misclassification and bias any association towards the null 37, 38. However, systematic errors may cause a false-positive association. Deviation from HWE in the control group may be a sign of genotyping error 36. However, in the current study, all the SNPs tested were in HWE in both data sets.
Obviously, the possibility exists that there are true differences in genetic determinants for COPD in various populations. Several genetic COPD association studies have seen inconsistent results between Caucasian and Asian populations 39. However, it is worth noting that in both the ICGN and the Bergen data sets of the current study, all of the participants were self-reported Caucasians.
The potential weaknesses of the study should be acknowledged. First, the case–control population is from a single centre in Norway, and the age and pack-yrs of smoking were different between the cases and controls. Therefore, we used a logistic regression model for the association analysis, at least partly correcting for age and pack-yrs of smoking. The fact that the case–control population is from a single centre can also be considered an advantage, because the population is more homogeneous and the possibility of false-positive results through population stratification is minimal. The analysis of population stratification using a set of 257 unlinked SNPs showed no evidence for population stratification. Secondly, no SNPs were significantly associated with COPD after correction for multiple comparisons. However, the replication in two independent populations and validation by haplotype analysis suggest that the association results are valid. Although various procedures have been studied for correction of multiple testing, including Bonferroni correction and permutation testing, there has not been an ideal statistical framework to deal with raw p-values from SNP association analyses 40, especially for replication of a previously reported association result 41. Here, we used the replication of the results in an independent cohort to validate our primary findings.
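The contrast between a Bonferroni correction and the replication criterion used here can be illustrated with a short sketch. The function names are hypothetical, and taking the number of tests as the 257 SNPs (ignoring the three phenotypes) is an assumption made only for illustration.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests


def is_replicated(p_first: float, p_second: float,
                  same_risk_allele: bool, alpha: float = 0.05) -> bool:
    """Replication criterion as described in the text: p < alpha in both
    populations, with the same risk allele in both data sets."""
    return p_first < alpha and p_second < alpha and same_risk_allele


if __name__ == "__main__":
    # Assumed number of tests: the 257 SNPs, one phenotype.
    print("Bonferroni threshold:", bonferroni_threshold(0.05, 257))
```

With 257 tests, the Bonferroni threshold is roughly 0.0002 per SNP, far stricter than the nominal p<0.05 required in each of the two populations.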
In conclusion, we conducted a robust genetic association study and found that variants in the STAT1, NFKBIB/SIRT2 and GC genes are likely to contribute to the susceptibility to COPD. Functional tests need to be performed to find the molecular mechanism that drives the genetic association between COPD phenotypes and these three genes.
Acknowledgments
The authors’ affiliations are: P.S. Bakke, Dept of Thoracic Medicine, Haukeland University Hospital and Institute of Medicine, University of Bergen, Bergen, Norway; G. Zhu, GlaxoSmithKline R&D, Research Triangle Park, NC, USA; A. Gulsvik, Dept of Thoracic Medicine, Haukeland University Hospital and Institute of Medicine, University of Bergen; X. Kong, GlaxoSmithKline R&D, Research Triangle Park, NC, USA; A.G.N. Agusti, Hospital Universitari Son Dureta, Fundación Caubet-Cimera and Ciber Enfermedades Respiratorias, Palma, Spain; P.M.A. Calverley, University of Liverpool, Liverpool, UK; C.F. Donner, Division of Pulmonary Disease, S. Maugeri Foundation, Veruno, Italy; R.D. Levy, Division of Respiratory Medicine and James Hogg iCAPTURE Centre for Cardiovascular and Pulmonary Research, University of British Columbia, St. Paul's Hospital, Vancouver, BC, Canada; B.J. Make and P.D. Paré, both National Jewish Medical and Research Centre, Denver, CO, USA; S.I. Rennard, University of Nebraska, Omaha, NE, USA; J. Vestbo, Dept of Cardiology and Respiratory Medicine, Hvidovre Hospital, Copenhagen, Denmark and University of Manchester, Manchester, UK; E.F.M. Wouters, Dept of Respiratory Medicine, University Hospital Maastricht, the Netherlands; W. Anderson, GlaxoSmithKline R&D, Research Triangle Park, NC, USA; D.A. Lomas, Dept of Medicine, University of Cambridge, Cambridge, UK; E.K. Silverman, The Channing Laboratory and Pulmonary and Critical Care Division, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA, USA; and S.G. Pillai, GlaxoSmithKline R&D, Research Triangle Park, NC, USA.
Footnotes
This article has supplementary material available from www.erj.ersjournals.com
Statement of Interest
Statements of interest for P.S. Bakke, G. Zhu, X. Kong, A.G.N. Agusti, P.M.A. Calverley, R.D. Levy, S.I. Rennard, W.H. Anderson, D.A. Lomas and E.K. Silverman can be found at www.erj.ersjournals.com/site/misc/statements.xhtml
- Received June 10, 2009.
- Accepted June 4, 2010.
- ©ERS 2011