Abstract
Granulocyte-macrophage colony-stimulating factor (CSF), also known as CSF2, and granulocyte CSF, also known as CSF3, are important survival and proliferation factors for neutrophils and macrophages. The objective of the present study was to determine whether single nucleotide polymorphisms (SNPs) of CSF2 and CSF3 are associated with lung function in smoking-induced chronic obstructive pulmonary disease.
In total, five SNPs of CSF2 and CSF3 were studied in 587 non-Hispanic white subjects with the fastest (n = 281) or the slowest (n = 306) decline of lung function selected from among continuous smokers in the National Heart, Lung, and Blood Institute Lung Health Study (LHS). These SNPs were also studied in 1,074 non-Hispanic white subjects with the lowest (n = 536) or the highest (n = 538) baseline lung function at the beginning of the LHS.
An increase in the number of CSF3 -1719T alleles was significantly associated with protection against low lung function (odds ratio 0.73, 95% confidence interval 0.56–0.95), and was still significant after adjustment for multiple comparisons. There was also a significant association of a CSF3 haplotype with baseline levels of forced expiratory volume in one second. No association was found for CSF2 SNPs and lung function, nor was there evidence of epistasis.
In conclusion, genetic variation in colony-stimulating factor 3 is associated with cross-sectionally measured lung function in smokers.
- Chronic obstructive pulmonary disease
- forced expiratory volume in one second
- genetic polymorphism
- granulocyte colony-stimulating factor
- granulocyte-macrophage colony-stimulating factor
- lung function
Chronic obstructive pulmonary disease (COPD) is a complex genetic/environmental disorder, which is characterised by airflow obstruction that is not fully reversible and by a chronic persistent inflammatory process. The degree of airflow obstruction defines disease severity, which is quantified by post-bronchodilator forced expiratory volume in one second (FEV1) calculated as a percentage of a predicted value. Genetic factors contribute to both the level and decline of lung function. There is evidence to suggest that genetic factors account for 28.0–51.5% of the variability in cross-sectional FEV1 1–3 and for 18% of the variability of longitudinal change in lung function in smokers 4. The inflammatory process is a complex interaction between many inflammatory cells. Among these cells, neutrophils and macrophages play important roles by releasing proteinases that break down connective tissue in the lung parenchyma, resulting in emphysema.
Granulocyte-macrophage colony-stimulating factor (CSF), also known as CSF2, is an important survival, proliferation and differentiation factor of the progenitor cells for neutrophils and macrophages. Granulocyte CSF, also known as CSF3, is specific for granulocytes. The CSF2 and CSF3 genes (located at 5q31.1 and 17q11.2–q12, respectively) were selected as candidates for studies of decline and cross-sectional level of lung function in COPD patients for the following reasons. First, CSF2 and CSF3 can induce the expression of pro-inflammatory cytokines and thereby enhance the inflammatory response. It was shown that levels of CSF2 in serum and bronchoalveolar lavage fluid (BALF), along with numbers of total cells and polymorphonuclear cells in the BALF, were increased in bronchitic patients during exacerbations 5. It was also reported that CSF3 expression in the lung correlated with severity of pulmonary neutrophilia in acute respiratory distress syndrome 6. Secondly, it has been shown that polymorphisms and haplotypes of the CSF2 gene are associated with the prevalence of asthma and other atopic diseases 7–9. COPD and asthma share a common diathesis according to the “Dutch hypothesis” 10, 11, and atopy is a risk factor for COPD 12. The association of a single nucleotide polymorphism (SNP) in CSF3 with a significant increase in granulocytes among workers exposed to benzene was also reported 13. Thirdly, a recent study directly linked the CSF2/CSF3 ratio with lung function in cystic fibrosis patients, which suggested that the interaction between CSF2 and CSF3 contributes to lung function in those patients 14.
The current authors hypothesised that CSF2 and CSF3 polymorphisms and their interactions would influence the decline of FEV1 and/or the cross-sectional level of FEV1 in smokers with mild-to-moderate airflow obstruction from the Lung Health Study (LHS) cohort. The LHS, sponsored by the National Heart, Lung, and Blood Institute (NHLBI; Bethesda, MD, USA), was a clinical trial of smoking intervention and bronchodilator treatment on the progression of COPD 15. The dataset provides an excellent opportunity to explore the impact of genetic polymorphisms and their interaction on longitudinal decline and/or the cross-sectional level of FEV1 % predicted, as previously described 16–22.
METHODS
Study subjects
The LHS recruited a total of 5,887 smokers aged 35–60 yrs with spirometric evidence of mild-to-moderate lung function impairment from 10 North American medical centres. From the LHS cohort, two nested case–control studies were designed in order to study genetic determinants of rate of FEV1 decline and cross-sectional level of FEV1. Based on the rate of decline of FEV1 during 5 yrs of follow-up study, using arbitrary cut-off points of FEV1 % pred decrease of ≥3.0% per yr and increase of ≥0.4% per yr for rapid decliners and nondecliners, respectively, the 287 non-Hispanic white subjects with the highest rate of decline of lung function (fast decline group) and the 308 non-Hispanic white subjects with the slowest rate of decline of lung function (nondecline group) were selected from 3,216 continuous smokers during the first 5 yrs of follow-up. The rationale for selecting approximately the 300 highest and 300 lowest phenotypic subjects was as follows: 1) this approach has the advantage of reducing cost while keeping satisfactory statistical efficiency when compared with the full-cohort approach 23, 24; 2) the common disease–common variant hypothesis was suggested in the late 1990s, and states that disease-susceptibility alleles of common diseases will be present at high frequencies 25–27; and 3) this sample size has adequate power to detect common genetic risk variants, as shown previously 28. From all remaining LHS subjects, non-Hispanic white subjects with the highest post-bronchodilator FEV1 % pred (high function group, n = 484) and the lowest post-bronchodilator FEV1 % pred (low function group, n = 468) at the beginning of the LHS were selected. Arbitrary cut-off points of FEV1 % pred ≥88.9% and ≤67.0% were used for the high and low lung function groups, respectively. 
Since 144 subjects from the rate of decline study groups had baseline lung function within one of the limits that defined the cross-sectional groups (58 in the high function group and 86 in the low lung function group), they were also analysed in the study of cross-sectional FEV1. Thus, there were 542 and 554 subjects in the high and low lung function groups, respectively. Informed consent was obtained from all participants and the investigation received the approval of the Providence Health Care Research Ethics Board (Vancouver, BC, Canada).
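The cut-off points and group sizes described above can be checked with a brief sketch. The function name and constants are hypothetical and used only for illustration; the thresholds (≥88.9% and ≤67.0% of predicted) and the subject counts are taken from the text.

```python
from typing import Optional

HIGH_CUTOFF = 88.9  # FEV1 % pred threshold for the high function group (from the text)
LOW_CUTOFF = 67.0   # FEV1 % pred threshold for the low function group (from the text)


def baseline_group(fev1_pct_pred: float) -> Optional[str]:
    """Assign a subject to a cross-sectional group, or None if between the cut-offs."""
    if fev1_pct_pred >= HIGH_CUTOFF:
        return "high"
    if fev1_pct_pred <= LOW_CUTOFF:
        return "low"
    return None


# Group sizes reported in the text: the initial selection plus subjects from the
# rate-of-decline study whose baseline FEV1 fell within the same limits.
high_total = 484 + 58  # 542 subjects
low_total = 468 + 86   # 554 subjects
```

After removal of the 22 subjects without DNA, the two groups together give the 1,074 subjects analysed in the cross-sectional study.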
Tagging SNP selection
The CSF2 and CSF3 SNP discovery data were downloaded from SeattleSNPs NHLBI Program for Genomic Applications (PGA), University of Washington and Fred Hutchinson Cancer Research Center (Seattle, WA, USA) 29. From all SNPs identified in the 23 unrelated European-American samples from the Centre d’Étude du Polymorphisme Humain (CEPH) family panel, a set of tagging (tag) SNPs was chosen for each gene using the LDSelect program developed by Carlson et al. 30. A linkage disequilibrium (LD) threshold of r² > 0.64 and a minor allele frequency threshold of 5% were used. Initially, two SNPs located at -1440A/G and 1944T/C (I117T) in the CSF2 gene were selected; however, a TaqMan® assay (Applied Biosystems, Foster City, CA, USA) could not be established for the 1944T/C SNP, and a restriction fragment length polymorphism (RFLP) PCR assay for the same SNP showed that PCR amplification failed for some samples. Therefore, 1944T/C was replaced with an alternative SNP, 1622C/T. In the CSF3 gene, three SNPs located at -1719C/T, -882G/A and 2176T/C were selected and genotyped. TagSNP selection and the nomenclature of the SNPs are presented in table 1.
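The idea behind LD-based tagSNP selection can be illustrated with a simplified greedy binning sketch. This is not the LDSelect program itself: the function name, the data layout and the example r² values are assumptions made for illustration, and the SNPs passed in are assumed to have already met the minor allele frequency threshold.

```python
from typing import Dict, FrozenSet, List, Set


def greedy_ld_bins(
    snps: List[str],
    r2: Dict[FrozenSet[str], float],
    threshold: float = 0.64,
) -> List[Dict[str, Set[str]]]:
    """Greedy LD binning: repeatedly pick the SNP that exceeds the r^2
    threshold with the most remaining SNPs and bin it with its partners."""

    def linked(a: str, b: str) -> bool:
        # Missing pairs are treated as unlinked (r^2 = 0).
        return a == b or r2.get(frozenset((a, b)), 0.0) > threshold

    remaining = list(snps)
    bins = []
    while remaining:
        # SNP with the most partners above the threshold (ties: first in list).
        seed = max(remaining, key=lambda s: sum(linked(s, t) for t in remaining))
        members = {t for t in remaining if linked(seed, t)}
        # A tag SNP must exceed the threshold with every other SNP in the bin.
        tags = {s for s in members if all(linked(s, t) for t in members)}
        bins.append({"members": members, "tags": tags})
        remaining = [s for s in remaining if s not in members]
    return bins


# Hypothetical pairwise r^2 values for four SNPs labelled A-D.
example_r2 = {
    frozenset(("A", "B")): 0.9,
    frozenset(("B", "C")): 0.7,
    frozenset(("A", "C")): 0.3,
}
example_bins = greedy_ld_bins(["A", "B", "C", "D"], example_r2)
```

In this toy example, B tags the bin containing A, B and C, while D forms a bin of its own; genotyping one tag SNP per bin captures the common variation at the chosen r² threshold.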
Genotyping
All SNPs except CSF2 1622C/T were genotyped in 384-well plates with a total volume of 5 μL by the TaqMan® 5′ exonuclease assay using primers and probes supplied by Applied Biosystems on an ABI Prism 7900HT Sequence Detection System (Applied Biosystems). Probe and primer sequences for each assay are listed in table 2. Major and minor probes were labelled with 5′ FAM or 5′ VIC fluorophores as reporters (Applied Biosystems). For each SNP assay, up to 47 DNA samples of the CEPH panel with sequencing information available from the SeattleSNPs PGA were included as quality controls. All genotype results from the TaqMan® assay were consistent with sequencing results for all CEPH DNA samples that have sequencing information available in the SeattleSNPs database. No discrepancies were detected in the 10% of the randomly selected samples that were genotyped in duplicate.
The CSF2 1622C/T polymorphism was detected by an RFLP-PCR method using the following primers flanking the polymorphic region: 5′-AAGGAAGGGAGGCTACTTGG-3′ (sense) and 5′-GTTCCCCAAGGAGTGCATAG-3′ (antisense). Amplification products were digested by the BlpI restriction enzyme. BlpI produced 116-bp and 133-bp fragments when 1622T was present, but did not digest the 249-bp PCR product when CSF2 1622C was present. The genotyping method was confirmed by sequencing 10 samples with three different genotypes. Sequencing was performed on an ABI 3100 16-capillary automated genetic analyser (Applied Biosystems) using the same primers as in the PCR reaction that produced the sequence template.
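The digestion pattern described above maps onto genotypes as in the following minimal sketch. It assumes idealised, exactly sized bands (real gel readings are approximate), and the function name is hypothetical.

```python
from typing import Iterable

UNCUT = 249       # undigested PCR product (1622C allele), in bp
CUT = {116, 133}  # BlpI digestion fragments (1622T allele), in bp


def call_genotype(bands: Iterable[int]) -> str:
    """Call the CSF2 1622C/T genotype from the set of observed band sizes."""
    observed = set(bands)
    if observed - CUT - {UNCUT}:
        raise ValueError(f"unexpected band sizes: {sorted(observed)}")
    cut_seen = observed & CUT
    if cut_seen and cut_seen != CUT:
        raise ValueError("incomplete digestion pattern")
    has_c = UNCUT in observed  # uncut product present
    has_t = cut_seen == CUT    # both digestion fragments present
    if has_c and has_t:
        return "CT"
    if has_c:
        return "CC"
    if has_t:
        return "TT"
    raise ValueError("no interpretable bands")
```

For example, `call_genotype([249, 133, 116])` returns `"CT"`, since a heterozygote carries both the uncut product and the two fragments; the fragment sizes sum to the size of the uncut product (116 + 133 = 249).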
Statistical analysis
Hardy–Weinberg equilibrium tests and LD estimation were performed using the genetics package for R 31. All single-locus association tests were performed in R. The codominant and additive models were tested first and, if there was a significant association, the dominant and recessive models were tested additionally, to see if those models fitted better. If the cell counts were low, significance was assessed by permutation tests. In a codominant model, a heterozygote shows the phenotypic effects of both alleles fully and equally. The three genotypic categories of each SNP in the case and control groups constitute a 2×3 contingency table and the analysis does not provide any sense of ordering across the three genotypes. This type of analysis is also called a general genetic model 32. In a dominant model, one copy of the minor allele increases disease risk. The homozygotes and heterozygotes for the minor allele are compared as a group with homozygotes for the major allele 32. In a recessive model, two copies of the minor allele are required to increase disease risk. The homozygotes for the minor allele are compared with heterozygotes and homozygotes for the major allele as a group. In an additive model, there is r-fold increased disease risk for heterozygotes compared with the homozygotes for the major allele, and 2r-fold increased disease risk for the homozygotes for the minor allele compared with the homozygotes for the major allele 32. The Armitage trend test 33 was used to test an additive effect of the allele. In both the FEV1-decline study and the cross-sectional FEV1 study, in addition to crude analysis by Chi-squared tests using 2×3 contingency tables, multivariate logistic regression analyses were also used to control for potential confounders that might influence the rate of decline of lung function or the cross-sectional FEV1 level. 
In the FEV1-decline study, multivariate logistic regression was used to adjust for confounding factors such as age, sex, smoking pack-yrs and research centre. In the cross-sectional FEV1 study, multivariate logistic regression was used to adjust for the same confounding factors plus the rate of decline of FEV1. Although other phenotypes, such as forced vital capacity (FVC) % pred and FEV1/FVC ratio, were not the primary phenotypes due to study design, associations of those phenotypes with single SNPs were also analysed, using one-way ANOVA if the data were normally distributed or a Spearman’s rank test if the data were not normally distributed in the study groups.
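The genetic models described above can be made concrete with a short sketch of how a 2×3 genotype table is collapsed for the dominant and recessive models and how a trend statistic is computed for the additive model. The function names and the example counts are hypothetical; the trend statistic follows the standard Cochran–Armitage form with genotype scores 0, 1 and 2, and covariate adjustment (which requires logistic regression) is not shown.

```python
import math
from typing import Sequence, Tuple


def collapse(counts: Sequence[int], model: str) -> Tuple[int, int]:
    """Collapse (major homozygote, heterozygote, minor homozygote) counts."""
    aa, ab, bb = counts
    if model == "dominant":   # one minor allele is enough
        return (aa, ab + bb)
    if model == "recessive":  # two minor alleles are required
        return (aa + ab, bb)
    raise ValueError(f"unknown model: {model}")


def trend_test(cases: Sequence[int], controls: Sequence[int],
               scores: Sequence[float] = (0, 1, 2)) -> Tuple[float, float]:
    """Cochran-Armitage trend test on a 2x3 genotype table.

    Returns (chi-squared statistic with 1 df, two-sided p-value)."""
    n = [a + b for a, b in zip(cases, controls)]  # genotype column totals
    big_r, big_s = sum(cases), sum(controls)
    total = big_r + big_s
    sum_tr = sum(t * r for t, r in zip(scores, cases))
    sum_tn = sum(t * k for t, k in zip(scores, n))
    sum_t2n = sum(t * t * k for t, k in zip(scores, n))
    num = total * (total * sum_tr - big_r * sum_tn) ** 2
    den = big_r * big_s * (total * sum_t2n - sum_tn ** 2)
    chi2 = num / den
    # Survival function of chi-squared with 1 df: P(X > x) = erfc(sqrt(x / 2)).
    return chi2, math.erfc(math.sqrt(chi2 / 2))


# Hypothetical genotype counts, ordered major homozygote, heterozygote, minor homozygote.
example_cases = (30, 50, 20)
example_controls = (50, 40, 10)
example_chi2, example_p = trend_test(example_cases, example_controls)
```

The codominant (general) model would instead apply a Chi-squared test with two degrees of freedom to the uncollapsed 2×3 table, which imposes no ordering on the genotypes.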
The effective number (ne) of haplotypes from SNPs with minor allele frequency ≥5% was calculated by the following equation, where pi is the frequency of haplotype i 30:

$$n_e = \frac{1}{\sum_i p_i^2}$$
The ne of haplotypes weights the number of haplotypes by frequency, with common haplotypes more heavily weighted.
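As a minimal illustration, assuming the usual definition of the effective number as the reciprocal of the sum of squared haplotype frequencies, the calculation can be sketched as follows. The function name and the example frequencies are hypothetical.

```python
import math
from typing import Sequence


def effective_haplotype_number(freqs: Sequence[float]) -> float:
    """Effective number of haplotypes: 1 / sum(p_i ** 2)."""
    if not math.isclose(sum(freqs), 1.0, abs_tol=1e-6):
        raise ValueError("haplotype frequencies must sum to 1")
    return 1.0 / sum(p * p for p in freqs)


# Four equally common haplotypes count as four effective haplotypes,
# whereas one dominant haplotype pulls the effective number towards one.
ne_even = effective_haplotype_number([0.25, 0.25, 0.25, 0.25])
ne_skewed = effective_haplotype_number([0.9, 0.1])
```

With equal frequencies the effective number equals the actual number of haplotypes; rare haplotypes contribute little, which is why the effective number is never larger than the actual count.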
Correction for multiple tests of SNPs in LD in each gene was carried out on the basis of the spectral decomposition (SpD) of matrices of pairwise LD between SNPs using SNPSpD 34, 35. This method provides a useful alternative to the very conservative Bonferroni correction. Haplotype association was tested using the hapassoc package for R 31. This software performs likelihood inference of trait associations with haplotypes and other covariates for generalised linear models, including logistic regression, and does not assume that haplotype phase is known 36. An additive effect of haplotype on the log-odds of disease was assumed. To calculate haplotype frequencies, an Expectation Maximization algorithm from the haplo.stats package for R was used 31.
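The spectral decomposition approach can be sketched as follows, assuming the published form of the estimator, Meff = 1 + (M − 1)(1 − Var(λ)/M), where λ are the eigenvalues of the M×M matrix of pairwise LD correlations, followed by a Šidák-type threshold. The function names and the input matrices are illustrative assumptions; this is not the SNPSpD software, and the LD matrix of the present study is not reproduced here.

```python
from typing import Sequence


def effective_tests(ld: Sequence[Sequence[float]]) -> float:
    """Effective number of independent tests from a pairwise LD correlation matrix."""
    m = len(ld)
    if m == 1:
        return 1.0
    # For a symmetric correlation matrix the eigenvalues sum to M (the trace)
    # and their squares sum to the sum of squared entries, so the eigenvalue
    # variance can be obtained without an explicit eigen-decomposition.
    sum_sq = sum(x * x for row in ld for x in row)
    var_lambda = (sum_sq - m) / (m - 1)
    return 1 + (m - 1) * (1 - var_lambda / m)


def sidak_threshold(alpha: float, m_eff: float) -> float:
    """Per-test threshold that keeps the overall type I error rate at alpha."""
    return 1 - (1 - alpha) ** (1 / m_eff)


independent = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
complete_ld = [[1.0, 1.0, 1.0], [1.0, 1.0, 1.0], [1.0, 1.0, 1.0]]
m_eff_independent = effective_tests(independent)  # three independent tests
m_eff_complete = effective_tests(complete_ld)     # a single effective test
```

Three SNPs in partial LD give an effective number of tests between one and three, and therefore a threshold between the Bonferroni-like value for three tests and the nominal 0.05.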
The focused interaction testing framework (FITF) was used to identify gene–gene interactions 37.
RESULTS
Characteristics of the study groups
The characteristics of study participants are shown in tables 3 and 4. Since there was no DNA available for eight subjects in the rate of decline of FEV1 study or for 22 subjects in the cross-sectional level of FEV1 study, the numbers of participants in the two studies were 587 and 1,074, respectively.
Among nondecliners from the rate of decline of FEV1 study and among the high lung function group of the cross-sectional FEV1 study, the allele frequencies of all five SNPs did not significantly deviate from Hardy–Weinberg equilibrium (results not shown).
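A Hardy–Weinberg check of this kind can be illustrated with a plain Chi-squared goodness-of-fit test on genotype counts. This is a sketch with hypothetical counts and a hypothetical function name; the R genetics package used in the study may apply a different (e.g. exact) test.

```python
import math
from typing import Tuple


def hwe_chi2(n_aa: int, n_ab: int, n_bb: int) -> Tuple[float, float]:
    """Chi-squared test (1 df) of Hardy-Weinberg proportions.

    Returns (chi-squared statistic, p-value)."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1 - p
    if p in (0.0, 1.0):
        raise ValueError("the SNP is monomorphic in this sample")
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of chi-squared with 1 df: P(X > x) = erfc(sqrt(x / 2)).
    return chi2, math.erfc(math.sqrt(chi2 / 2))


# Hypothetical counts: the first sample matches Hardy-Weinberg proportions
# exactly, the second shows a marked deficit of heterozygotes.
chi2_ok, p_ok = hwe_chi2(25, 50, 25)
chi2_bad, p_bad = hwe_chi2(40, 20, 40)
```

Testing in the control-like groups (nondecliners and the high lung function group) is the conventional choice, since a genuine association can itself distort genotype proportions among cases.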
Haplotypes resolved with the genotyped tagSNPs
Haplotypes from SNPs with minor allele frequency ≥5% in 23 CEPH samples were inferred by use of PHylogenetics And Sequence Evolution (PHASE) 2.0 39, 40. The LD-selected CSF2 tagSNPs could resolve 60% of the actual number of haplotypes (three out of five; table 5) and 87.1% (2.7 out of 3.1) of the ne of haplotypes from SNPs with minor allele frequency ≥5%. For CSF3, 35.7% of actual haplotypes and 48.1% of effective haplotypes from SNPs with a minor allele frequency ≥5% were resolved by the three selected tagSNPs.
Individual SNP association analysis
In the FEV1-decline study, none of the five SNPs were associated with decline of FEV1 in codominant or additive models either before or after adjustment for confounding factors (table 6).
In the study of the cross-sectional level of FEV1, there was a borderline association of CSF3 -1719T with high FEV1 levels in an additive model (p = 0.054) before adjustment for confounding factors; after adjustment for confounding factors, the association was more significant (p = 0.018; table 6). The odds ratios (ORs) of having one -1719T allele compared with no -1719T allele and of having two -1719T alleles compared with one -1719T allele were both 0.73 (95% confidence interval 0.56–0.95). The association of CSF3 -1719 with FEV1 level was adjusted for multiple testing on the basis of the SNPSpD approach 34, 35. The significance threshold required to keep the type I error rate at 5% for CSF3 in the present study was 0.019, based on the LD of the three SNPs studied. Therefore, the association of CSF3 -1719 with FEV1 level remained significant after correction for multiple comparisons.
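The arithmetic implied by the per-allele OR can be sketched as follows. Because each additional -1719T allele multiplies the odds by the same factor, the OR for two alleles compared with none is the square of the per-allele OR. The back-calculated standard error and p-value are rough approximations from the rounded confidence limits, assuming a Wald-type interval on the log-odds scale; they are not a re-analysis of the study data.

```python
import math
from statistics import NormalDist

or_per_allele = 0.73           # reported per-allele odds ratio
ci_low, ci_high = 0.56, 0.95   # reported 95% confidence interval

# Two alleles versus none under the additive (per-allele) logistic model.
or_two_vs_none = or_per_allele ** 2

# Approximate standard error of the log OR from the width of the interval.
z_975 = NormalDist().inv_cdf(0.975)
se_log_or = (math.log(ci_high) - math.log(ci_low)) / (2 * z_975)

# Approximate two-sided Wald p-value; rounding of the OR and limits means
# this will only be close to, not identical with, the reported p-value.
z_stat = math.log(or_per_allele) / se_log_or
p_approx = 2 * NormalDist().cdf(-abs(z_stat))
```

The approximate p-value obtained in this way falls close to the reported adjusted value and to the SNPSpD threshold of 0.019, which illustrates how narrowly the association passes the correction.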
In addition, two SNPs showed borderline associations with FEV1 levels before adjustment for confounding factors: CSF2 1622 in a codominant model (comparison of the distribution of the three genotypic groups, CC, CT and TT, in the case and control groups gave p = 0.092); and CSF3 -882 in an additive model (the OR of one A allele compared with no A allele was equal to the OR of two A alleles compared with one A allele; p = 0.059). However, after adjustment for confounding factors the p-values were less significant for both SNPs (table 6).
Although FVC % pred and the FEV1/FVC ratio at the beginning of the LHS were not the primary phenotypes due to the case–control study design, exploratory analyses of single SNP associations were performed with those phenotypes. In the rate of decline group, the FVC % pred phenotype was normally distributed; therefore, a one-way ANOVA was used to investigate whether FVC % pred and the FEV1/FVC ratio were the same among the three genotypic groups. A significant association of FVC % pred with CSF3 2176 was found (p = 0.033; table 7): those individuals with the 2176TT genotype had a lower FVC % pred. No other significant associations were found (data not shown).
Haplotype association analysis
Haplotypes from CSF2 or CSF3 were not associated with decline of FEV1 in the analysis either with or without adjustment for confounding factors (data not shown).
The results of haplotype association in the cross-sectional level of FEV1 study are shown in table 8. The haplotypes from CSF2 were not associated with levels of FEV1 in the analysis either with or without adjustment for confounding factors. The three-locus CSF3 haplotypes were associated with levels of FEV1 in a Wald global test before adjustment for confounding factors (an overall test of haplotype distribution between cases and controls gave p = 0.004), although after adjustment for confounding factors, the association became less significant (p = 0.027). The frequency of the haplotype -1719T/-882G/2176C was marginally higher in the high than the low function FEV1 group (16.9 versus 14.0%) when compared with the haplotype -1719C/-882G/2176T as a reference (adjusted p-value 0.047). Analysis of two-locus haplotypes (table 8) demonstrated that this marginal association was probably driven by both the -1719T allele and the 2176C allele. The frequency of the haplotype -1719C/-882A/2176C was lower in the high than the low function FEV1 group (34.2 versus 38.7%) when compared with the haplotype -1719C/-882G/2176T as a reference, but the significance became borderline when adjusting for confounding factors (unadjusted p-value 0.007, adjusted p-value 0.089).
Gene–gene interactions
Using the FITF method, interactions of CSF2 and CSF3 were explored for all possible two- to four-locus models. There was no evidence of epistasis (gene–gene interaction; detailed results not shown).
Power analysis
The power of the present study was first calculated for a codominant mode of inheritance. A Chi-squared test with two degrees of freedom was used to calculate the associated power. Effect size, a measure of the magnitude of the Chi-squared value that is to be detected and a parameter needed for the power calculations, was calculated using the PASS program for each SNP and was used in the calculations. It was found that there was >80% power to detect an OR of 1.75 for both FEV1 decline and cross-sectional level studies. The power of the dominant and recessive models was tested with a 2×2 table; the proportions in the control group were set to be close to those observed with the five SNPs in the “low” outcome groups (i.e. nondecline of FEV1 group and high lung function group). Figure 1 shows the curves of power versus OR value for the five studied SNPs in the cross-sectional study of baseline level of FEV1 for dominant and recessive models. For the FEV1-decline study, the power was slightly less than that of the baseline FEV1 study, due to smaller sample size (data not shown).
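The 2×2 power calculation for the dominant and recessive models can be approximated with a normal-approximation sketch for comparing two proportions. This is not the PASS calculation used in the study: the function names are hypothetical, the control-group carrier proportion of 0.30 is an assumed value chosen for illustration, and only the group sizes (536 and 538) come from the text.

```python
import math
from statistics import NormalDist


def carrier_fraction_in_cases(p_control: float, odds_ratio: float) -> float:
    """Carrier proportion in cases implied by the control proportion and the OR."""
    odds = odds_ratio * p_control / (1 - p_control)
    return odds / (1 + odds)


def power_two_proportions(p_control: float, odds_ratio: float,
                          n_cases: int, n_controls: int,
                          alpha: float = 0.05) -> float:
    """Approximate power of a two-sided test comparing two proportions."""
    p_case = carrier_fraction_in_cases(p_control, odds_ratio)
    se = math.sqrt(p_case * (1 - p_case) / n_cases
                   + p_control * (1 - p_control) / n_controls)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    # Probability of exceeding the critical value in the direction of the
    # true effect (the negligible opposite tail is ignored).
    return NormalDist().cdf(abs(p_case - p_control) / se - z_alpha)


# Cross-sectional study sizes from the text; 0.30 is an assumed control proportion.
power_at_1_75 = power_two_proportions(0.30, 1.75, n_cases=536, n_controls=538)
power_at_1_25 = power_two_proportions(0.30, 1.25, n_cases=536, n_controls=538)
```

Evaluating the function over a range of ORs for each observed control proportion yields power curves of the kind plotted in figure 1; power falls for smaller ORs and for carrier proportions far from 0.5, as expected for recessive models of uncommon alleles.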
DISCUSSION
CSF3 is a logical candidate gene for the present study due to its biological function. In a rat model, neutrophil stimulation by CSF3 aggravates ventilator-induced lung injury manifested by increased lung neutrophils and interleukin-6 expression, increased alveolar oedema on histology and reduced lung compliance 41. In patients with acute respiratory distress syndrome, CSF3 expression level in the lung correlated with severity of pulmonary neutrophilia 6. Recently, it was shown that the CSF3 2176 SNP (named exon 4-165C>T) was associated with peripheral blood granulocyte count among workers exposed to benzene 13. Subjects with homozygous TT genotypes had significantly increased blood granulocytes compared with homozygous CC subjects (p = 0.00002) 13.
The functional significance of the CSF3 SNPs is unknown. Although no association of CSF3 2176 with the primary phenotypes of baseline and decline of FEV1 was found, a different SNP, CSF3 -1719, was found to be associated with the baseline level of FEV1. Interestingly, in an exploratory analysis of individual SNPs with other phenotypes, such as FVC % pred and the FEV1/FVC ratio, a significant association of CSF3 2176 with FVC % pred was found (without correction for multiple comparisons). The association of the CSF3 2176TT genotype with lower FVC % pred is consistent with the previous report that the TT genotype was associated with higher blood granulocytes 13, since neutrophils in the lung and in the blood are important effector cells in COPD 42.
There are several explanations for these differences from previous studies, including genetic heterogeneity between different populations and differences in phenotypes studied and in tagSNP choice. It was reported previously that tagSNPs selected using the criteria of r² = 0.64 and minor allele frequency of 5% could resolve 76% of actual and 85% of effective haplotypes in an analysis of 100 genes 30. However, using the same criteria, the CSF3 tagSNPs only resolved 35.7% of the actual haplotypes and 48.1% of the effective haplotypes. If CSF3 2176 is not the causal SNP and there are different LD patterns in the present study population compared with that of the workers exposed to benzene 13, the functional SNP might have been missed in the current study. The fact that the present results showed that CSF3 -1719 was associated with the baseline level of FEV1 and CSF3 2176 was associated with FVC % pred suggests that neither SNP is causal but may be in LD with a causal SNP that is yet to be identified.
The observation that SNPs from CSF3 but not CSF2 were associated with lung function has several possible explanations. First, animal studies have documented that CSF3 plays a more important role than CSF2 in regulation of neutrophil homeostasis. Dogs depleted of CSF3 by a neutralising antibody developed profound and selective neutropenia 43, whereas mice depleted of CSF2 did not show impairment of haematopoiesis 44. In addition, CSF3 but not CSF2 knockout mice display chronic neutropenia 45, 46. Secondly, in patients with acute respiratory distress syndrome, CSF3 but not CSF2 expression in the lung correlated with severity of pulmonary neutrophilia 6, which demonstrated that CSF3 also plays a more important role than CSF2 in regulation of neutrophils in human subjects. Thirdly, it was reported that dexamethasone inhibits human airway smooth muscle cell release of CSF2 but not CSF3 47, suggesting that CSF3 and CSF2 are released through different mechanisms and, thus, may play different roles in the development of COPD.
It has been suggested that, apart from mobilising granulocytes from the bone marrow, CSF2 and CSF3 are decisive in influencing the subsequent T-helper cell (Th) type 1 or Th2 dominance of the immune response by selecting subsets of dendritic cells 14. A recent study demonstrated that a high CSF2/CSF3 ratio was correlated with good lung function in cystic fibrosis patients with chronic Pseudomonas aeruginosa lung infection 14, which prompted the current analysis of gene–gene interaction. However, no significant interaction of CSF2 and CSF3 was found in the present study. There are several potential reasons for this. First, the current study might not have had enough power to detect gene–gene interaction due to the sample size and minor allele frequencies used. Secondly, cystic fibrosis with chronic P. aeruginosa lung infection is a Th2-dominated response 48 while COPD is a Th1-dominated response 49. Therefore, the determinants of lung function in cystic fibrosis patients with chronic P. aeruginosa lung infection and in smoking-induced COPD patients are likely to be different.
There are several limitations to the present study. First, population stratification could have led to false-positive results. However, it has been reported that in the non-Hispanic white population, significant false-positive associations are unlikely to arise from population stratification, especially in well-designed, moderately-sized, case–control studies 50, 51. Secondly, false-positive results might have arisen from multiple comparisons. Although the results of association of CSF3 -1719 with lung function were corrected for multiple comparisons, only multiple SNPs in a single gene were taken into account. No correction for multiple genes and phenotypes was performed. Thirdly, no replication was performed by analysing a second cohort. Fourthly, no available function data support the associations that were found. Finally, the method of nested case–control study (i.e. using individuals from each extreme of the distribution of the phenotype of interest) has the advantages of cost reduction combined with satisfactory statistical efficiency when compared with the full cohort approach 23, 24. However, this study design prevented analysis of baseline and decline in FEV1 as continuous variables. Therefore, the results from the present study should be regarded only as hypothesis generating, and it will be necessary to replicate them in different studies, especially in those with a cohort design.
In conclusion, an association of the colony-stimulating factor 3 -1719C/T single nucleotide polymorphism with the baseline level of forced expiratory volume in one second was found. However, this association needs to be replicated in further studies. Moreover, additional functional study of this single nucleotide polymorphism, or other single nucleotide polymorphisms in linkage disequilibrium with it, is warranted.
Support statement
This work was supported by grants from the Canadian Institutes of Health Research and National Institutes of Health Grant 5R01HL064068-04. The Lung Health Study was supported by contract N01-HR-46002 from the Division of Lung Diseases of the National Heart, Lung, and Blood Institute (Bethesda, MD, USA).
Statement of interest
None declared.
- Received April 5, 2007.
- Accepted February 28, 2008.
- © ERS Journals Ltd