Abstract
Genetic variants in the tumour necrosis factor (TNF) gene have been investigated in chronic obstructive pulmonary disease (COPD). However, there are many instances of nonreplication of these associations due to insufficient power or other factors. In this study, a large number of subjects were examined to elucidate whether genetic variations of TNF and/or lymphotoxin A (LTA), which is clustered with TNF, are associated with variations in lung function among smokers.
The present authors designed two nested case–control studies in the National Heart, Lung, and Blood Institute Lung Health Study (LHS), which enrolled 5,887 smokers. The first design included continuous smokers who had the fastest (n = 279) and the slowest (n = 304) decline of lung function during the 5-yr follow-up period, and the second included the subjects who had the lowest (n = 533) and the highest (n = 532) post-bronchodilator % predicted forced expiratory volume in one second at the start of the LHS. Within the TNF and LTA region, 10 tagging single-nucleotide polymorphisms were selected and genotyped.
Unlike the previous associations between TNF-308 and COPD in Asians, the current study found no association between either of the two phenotypes and the LTA and TNF polymorphisms.
In conclusion, these results support the findings of previous studies in late-onset chronic obstructive pulmonary disease in Caucasian populations.
The pathological characteristics of chronic obstructive pulmonary disease (COPD) include chronic inflammation of the airways, parenchyma and pulmonary vasculature. Polymorphonuclear leukocytes, macrophages and T-lymphocytes found at the site of disease are believed to release mediators that promote and maintain inflammation, which leads to tissue damage and remodelling. Tumour necrosis factor (TNF), which is released primarily from macrophages, is thought to play a critical part in the progression of COPD by increasing the expression of various pro-inflammatory mediators such as interleukin 8 1. Huang et al. 2 found that the -308A allele in the TNF promoter was associated with an increased risk of bronchitis in the Taiwanese population. Subsequently, many investigators have sought to implicate polymorphisms in TNF in the pathogenesis of COPD. Some of these reports have found an association between polymorphisms of TNF and subgroups of COPD 3–5. However, there are many instances of nonreplication of these associations 6–9. This inconsistency might result from false-positive and false-negative studies, owing to small sample size, insufficiently defined phenotypes, lack of adjustment for important covariates and/or genetic heterogeneity of populations. A similar inconsistency of results has been observed in functional studies of the -308 polymorphism and other TNF alleles 10.
Another closely related candidate gene for COPD is lymphotoxin A (LTA; previously designated TNF-β). LTA is clustered with TNF within a 6.1-kb region on chromosome 6p21.3, and polymorphisms of LTA and TNF have been reported to be in strong linkage disequilibrium (LD) in many populations 11. LTA is involved in the normal development of lymphoid tissue and also acts as an inducer of the inflammatory response 12. A polymorphism of LTA at position +252, which is thought to be involved in the modulation of gene expression 13, 14, has been reported as a susceptibility variant in asthma and other diseases such as myocardial infarction 14, 15. In a recent detailed histological analysis of the small airways in patients with COPD, Hogg et al. 16 found that the percentage of airways containing lymphoid follicles was strongly associated with the late stages of airway obstruction.
In this study, a large number of subjects were examined in two nested case–control studies, designed to elucidate whether genetic variations of TNF and/or LTA are associated with lung-function decline or lung-function level in smokers with mild-to-moderate airway obstruction.
MATERIALS AND METHODS
Study subjects
All subjects were of European descent selected from the participants in the National Heart, Lung, and Blood Institute Lung Health Study (LHS). The LHS enrolled 5,887 smokers who had spirometric evidence of mild-to-moderate lung-function impairment, from 10 North American medical centres 17.
Two nested case–control studies were conducted as previously described 18. In the first design, individuals who had the most and least rapid rate of decline of lung function were selected from among those who continued to smoke for the duration of the 5-yr follow-up. Those whose forced expiratory volume in one second (FEV1) % predicted decreased by ≥3.0%·yr⁻¹ during the 5-yr period (fast decline group; n = 279) were compared with those whose FEV1 % pred increased by ≥0.4%·yr⁻¹ (nondecline group; n = 304). In the second design, subjects who had the highest post-bronchodilator FEV1 % pred (≥88.9%; high function group; n = 532) and subjects who had the lowest post-bronchodilator FEV1 % pred (≤67.0%; low function group; n = 533) at the start of the LHS were compared. A total of 140 subjects included in the fast or nondecline groups had baseline lung function within the criteria described above, and therefore they were also included in the baseline lung-function study.
TagSNP selection and genotyping
For the selection of the TNF and LTA single-nucleotide polymorphisms (SNPs), information from SeattleSNPs was used 19. Tag SNPs were chosen using the LDSelect program (version 1.0) 20. An LD threshold of r²>0.8 and a minor allele frequency (MAF) of >10% were set in the program. Seven SNPs in the LTA gene sequence (GenBank accession number AY070490) and three SNPs in the TNF gene sequence (GenBank accession number AY066019) were chosen (fig. 1). Because no assay could be established to genotype the SNP located at 559 in the LTA gene sequence, and there was no alternative SNP within the same bin (a group of SNPs whose alleles are highly associated, i.e. in strong LD), this SNP was excluded from the study. The SNP located at 352 in the TNF gene sequence (TNF-238), whose allele frequency was <10%, was included in the study because it had been reported to be a susceptibility locus in many diseases 10. Ultimately, 10 SNPs in the region were analysed using the TaqMan 5′ exonuclease assay 21.
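The binning idea behind tag SNP selection can be illustrated with a small sketch. This is a simplified greedy grouping on a pairwise r² table, written for illustration only; it is not the LDSelect program, it assumes that SNPs below the MAF cut-off have already been removed, and the SNP labels and r² values are invented.

```python
def greedy_ld_bins(r2, threshold=0.8):
    """Greedily group SNPs into bins; return a list of (tag_snp, members)."""
    remaining = set(r2)
    bins = []
    while remaining:
        def captured(snp):
            # SNPs still unassigned that exceed the r2 threshold with `snp`.
            return {other for other in remaining
                    if other == snp or r2[snp].get(other, 0.0) > threshold}
        # Pick the SNP capturing the most others; ties are broken by name.
        best = min(remaining, key=lambda s: (-len(captured(s)), s))
        members = captured(best)
        bins.append((best, sorted(members)))
        remaining -= members
    return bins

# Invented pairwise r2 values for five hypothetical SNPs (symmetric table).
toy_r2 = {
    "snpA": {"snpB": 0.95, "snpC": 0.10, "snpD": 0.05, "snpE": 0.01},
    "snpB": {"snpA": 0.95, "snpC": 0.12, "snpD": 0.02, "snpE": 0.03},
    "snpC": {"snpA": 0.10, "snpB": 0.12, "snpD": 0.85, "snpE": 0.04},
    "snpD": {"snpA": 0.05, "snpB": 0.02, "snpC": 0.85, "snpE": 0.02},
    "snpE": {"snpA": 0.01, "snpB": 0.03, "snpC": 0.04, "snpD": 0.02},
}
bins = greedy_ld_bins(toy_r2)
```

With these toy values, snpA and snpB fall into one bin, snpC and snpD into a second, and snpE forms a bin of its own, so three tag SNPs would be genotyped instead of five.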
Statistical analysis
Hardy–Weinberg equilibrium tests and LD estimation were performed using the genetics package for R 22. For descriptive purposes, haplotype frequencies were estimated with an expectation–maximisation algorithm using the R haplo.stats package 22.
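As an illustration of what a Hardy–Weinberg equilibrium test checks, the following sketch compares observed genotype counts with those expected from the allele frequencies. It is a plain Pearson chi-squared version in Python, not necessarily the variant implemented in the R genetics package, and the genotype counts in the usage note are invented.

```python
import math

def hwe_chi2(n_aa, n_ab, n_bb):
    """Return (chi2, p_value) for a 1-df Pearson test of Hardy-Weinberg proportions."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)   # frequency of the first allele
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Upper-tail probability of a chi-squared variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value
```

For example, `hwe_chi2(360, 480, 160)` describes a sample in exact Hardy–Weinberg proportions (allele frequency 0.6) and gives a statistic of essentially zero, whereas `hwe_chi2(400, 400, 200)` has too few heterozygotes and gives a very small p-value.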
For the 2×2 contingency tables of the dominant and recessive models, Fisher's exact test was performed. For the 3×2 codominant model, the Chi-squared statistic and asymptotic p-values were calculated. Since some cell counts were low, significance was also assessed by permutation testing. The asymptotic and permutation p-values were similar and therefore the asymptotic p-values are given. For the additive model, the Armitage trend test was performed. All single-locus association tests were performed using R.
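The genetic models named above differ only in how the three genotype counts are grouped before testing. The sketch below, in Python rather than the R used in the study, shows the dominant and recessive collapsing, a two-sided Fisher's exact test and the Armitage trend test with scores 0, 1 and 2. The function names and the counts in the usage note are hypothetical, and the trend test assumes that both groups are non-empty and that more than one genotype is observed.

```python
import math

def collapse(counts, model):
    """Collapse (n_AA, n_Aa, n_aa) to (carriers, non-carriers) of minor allele a."""
    n_AA, n_Aa, n_aa = counts
    if model == "dominant":      # one or two copies versus none
        return (n_Aa + n_aa, n_AA)
    if model == "recessive":     # two copies versus one or none
        return (n_aa, n_AA + n_Aa)
    raise ValueError("model must be 'dominant' or 'recessive'")

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the table [[a, b], [c, d]]."""
    row1, col1, n = a + b, a + c, a + b + c + d

    def prob(x):  # hypergeometric probability of x in the top-left cell
        return math.comb(col1, x) * math.comb(n - col1, row1 - x) / math.comb(n, row1)

    p_obs = prob(a)
    lo, hi = max(0, row1 + col1 - n), min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one.
    total = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, total)

def armitage_trend(cases, controls, scores=(0, 1, 2)):
    """Cochran-Armitage trend test; returns (chi2 with 1 df, p_value)."""
    totals = [r + s for r, s in zip(cases, controls)]
    n, r_total = sum(totals), sum(cases)
    s_wr = sum(w * r for w, r in zip(scores, cases))
    s_wn = sum(w * t for w, t in zip(scores, totals))
    s_w2n = sum(w * w * t for w, t in zip(scores, totals))
    num = n * (n * s_wr - r_total * s_wn) ** 2
    den = r_total * (n - r_total) * (n * s_w2n - s_wn ** 2)
    chi2 = num / den
    return chi2, math.erfc(math.sqrt(chi2 / 2.0))
```

With invented counts `cases = (10, 20, 30)` and `controls = (30, 20, 10)`, the dominant model compares `collapse(cases, "dominant")` with `collapse(controls, "dominant")` in a 2×2 table passed to `fisher_exact_two_sided`, while `armitage_trend(cases, controls)` tests for a dose-response across the three genotypes.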
For both the rate of decline and baseline samples, multivariate logistic regression was used to adjust for potential risk and confounding factors. Age, sex, pack-yrs of smoking and research centre were examined as potential factors. Generalised additive models were first used to examine the relationships between the log odds of having poor lung function (i.e. low baseline lung function or a fast rate of decline) and the continuous covariates of age and pack-yrs. In the case of pack-yrs, a linear relationship was not appropriate. A quadratic term for pack-yrs was added to the model to account for the nonlinearity.
Haplotype association was tested using hapassoc 23, a contributed R package available at www.r-project.org. Haplotypes of the six loci in LTA and the four loci in TNF were considered. Since the LTA and TNF SNPs are in LD with each other, haplotypes of three SNPs at a time were also tested for all sets of three consecutive SNPs across the two genes (haplotype windows). All covariates previously considered in the single SNP association regression models were included in the logistic regression models for haplotype association.
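The haplotype windows described above are simply every run of three neighbouring SNPs along the region. A minimal sketch, using hypothetical placeholder labels in place of the real SNP identifiers:

```python
def haplotype_windows(snps, size=3):
    """Return every run of `size` consecutive SNPs, in map order."""
    return [tuple(snps[i:i + size]) for i in range(len(snps) - size + 1)]

# Hypothetical placeholder labels: six LTA SNPs followed by four TNF SNPs.
ordered_snps = [f"LTA_{i}" for i in range(1, 7)] + [f"TNF_{i}" for i in range(1, 5)]
windows = haplotype_windows(ordered_snps)
```

Ten ordered SNPs give eight windows, two of which span the boundary between the two genes; each window would then be tested in its own haplotype association model.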
Power calculations
Unadjusted analyses
The power of the two study designs was estimated using the two independent proportions and many proportions functions in PASS 2005 (Number Cruncher Statistical Systems, Kaysville, UT, USA). The sample sizes used were those of the baseline (high 532, low 533) and rate of decline (fast 279, non 304) groups.
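To make the two-proportion calculation concrete, the sketch below approximates power with a normal-approximation z-test in Python. It is an illustration under stated assumptions, not the PASS routine, which may use a different method: the function names are hypothetical, the test is taken to be two-sided at α = 0.05, and the "control" group is taken to be the nondecline or high function group.

```python
from statistics import NormalDist

def exposed_proportion_in_cases(p_control, odds_ratio):
    """Proportion of cases with the genotype, given the control proportion and the OR."""
    odds = odds_ratio * p_control / (1.0 - p_control)
    return odds / (1.0 + odds)

def power_two_proportions(p_control, odds_ratio, n_control, n_case, alpha=0.05):
    """Approximate power of a two-sided z-test comparing two independent proportions."""
    norm = NormalDist()
    p_case = exposed_proportion_in_cases(p_control, odds_ratio)
    pooled = (n_control * p_control + n_case * p_case) / (n_control + n_case)
    se_null = (pooled * (1 - pooled) * (1 / n_control + 1 / n_case)) ** 0.5
    se_alt = (p_control * (1 - p_control) / n_control
              + p_case * (1 - p_case) / n_case) ** 0.5
    z_crit = norm.inv_cdf(1 - alpha / 2)
    # Probability of rejecting in the direction of the true effect;
    # the contribution of the opposite tail is ignored as negligible.
    return norm.cdf((abs(p_case - p_control) - z_crit * se_null) / se_alt)

baseline_power = power_two_proportions(0.10, 1.75, n_control=532, n_case=533)
decline_power = power_two_proportions(0.10, 2.0, n_control=304, n_case=279)
```

Under these assumptions both example values come out above 0.8, which is in line with the power reported for the same scenarios in the Results.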
To estimate power for the codominant models, the odds ratio (OR) for two copies of the variant was set to be the square of the OR for one copy. Power was calculated for the two degrees of freedom Chi-squared test. These ORs and the observed proportions in the control groups were used to determine likely values of W, the measure of effect size 24, for calculating power.
Adjusted analyses
PASS was used to calculate sample size for multiple logistic regression. The method assumes that the effect of one dichotomous covariate is of interest, while controlling for the effect of other covariates. To account for the other covariates, the sample size found when no covariates are included was corrected by a factor involving the correlation coefficient (R²) between the covariate of interest and the other covariates. To estimate reasonable values for the correlation coefficient, the genotypes for the recessive and dominant models for all genes were regressed on age, sex, centre, pack-yrs and (pack-yrs)². For the baseline models, the maximum R² was 0.02, and for the rate of decline models the maximum R² was 0.03.
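The text does not state the exact form of the correction factor. A commonly used one is the variance-inflation factor 1/(1 − R²); the sketch below assumes that form, and its function names are hypothetical.

```python
def covariate_adjusted_n(n_unadjusted, r_squared):
    """Sample size needed once covariates are included (assumed 1/(1 - R^2) inflation)."""
    if not 0.0 <= r_squared < 1.0:
        raise ValueError("r_squared must be in [0, 1)")
    return n_unadjusted / (1.0 - r_squared)

def effective_n(n_available, r_squared):
    """Equivalent covariate-free sample size for a fixed number of subjects."""
    if not 0.0 <= r_squared < 1.0:
        raise ValueError("r_squared must be in [0, 1)")
    return n_available * (1.0 - r_squared)
```

Under this assumption, an R² of 0.03 turns the 583 subjects of the rate of decline study into the equivalent of about 566 subjects in an unadjusted analysis, so covariate adjustment would be expected to change the power only slightly.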
RESULTS
Characteristics of subjects
The characteristics of the study subjects are shown in tables 1 and 2. Some potentially confounding factors were significantly different between the groups, as has been reported in previous LHS studies 18.
Single SNP analysis
Each of the selected SNPs was analysed for association with the rate of lung-function decline (table 3). A multivariate logistic regression model was used to adjust for potential risk and confounding factors, including age, sex, pack-yrs of smoking and research centre. There were no differences between the fast decline group and the nondecline group in the distribution of genotype frequencies for these SNPs. Further analysis using dominant, recessive and additive models confirmed the lack of any association between the rate of lung-function decline and LTA or TNF polymorphisms (data not shown).
Similarly, in the baseline lung-function study, no significant differences in genotype frequencies between the high and low baseline groups were found (table 4). The analysis was repeated, excluding the subjects who were involved in the rate of lung-function decline study. This second analysis confirmed the lack of association (data not shown). All the SNPs were in Hardy–Weinberg equilibrium, except for rs3093543 in the high function group of the baseline lung-function study (p = 0.018).
Haplotype analysis
The distribution of haplotypes formed by the six LTA polymorphisms and by the four TNF polymorphisms was analysed in the two study designs (tables 5 and 6). As in the single SNP analyses, potential risk and confounding factors were taken into account. There was no significant association between the LTA or TNF haplotypes and the two phenotypes. Furthermore, a three-SNP haplotype analysis was performed using a sliding window, but no significant associations were found (data not shown).
Power calculations
Unadjusted analyses
Figure 2a and b give the power curves for the rate of decline and baseline data sets, respectively. For the baseline group sample sizes, there would be >80% power to detect an OR of ≥1.75 assuming an MAF of 0.10 or an OR of 1.5 assuming an MAF of 0.20. Since the proportions of individuals in the dominant genotype category range from 0.1 to 0.6, there is reasonable power to detect an OR of 1.5 for most SNPs genotyped. However, for a recessive model, the OR would have to be >2 to have greater than 80% power for the 0.03 recessive genotype category proportion. Since the rate of decline groups are smaller than the baseline groups, the ORs would have to be higher still to have reasonable power. For a proportion of 0.10 nondecliners in a genotype category, the OR would have to be 2 to have >80% power to detect the difference in proportions.
For a codominant model, at an effect size of at least W = 0.10, there is ≥80% power to detect the difference in genotypic proportions in the baseline data set of 1,096 individuals. Under most distributions of TNF–LTA genotypes, this value corresponds to an OR of 1.4 per allele. For the rate of decline data set of 595 individuals, the necessary effect size is W≥0.13. For instance, this effect size could correspond to an MAF of 0.3 and an OR of 1.4 per allele, or an MAF of 0.25 and an OR of 1.6 per allele. For all genotype distributions observed for TNF–LTA, there would be adequate power to detect an OR of 2 per allele in the rate of decline data set.
Adjusted analyses
The power for the rate of decline models was calculated using R² values of 0 and 0.03, with an OR of 2 and a variety of percentages of the sample with the risk genotype that corresponded to the values in the unadjusted power analyses. The power computed for an R² of 0.03 was very close to that of an R² of 0 and that calculated using the difference in proportion functions given in the previous section and in figure 2a and b.
DISCUSSION
In the present study, 10 polymorphisms within the TNF and LTA region were selected and genotyped in two nested case–control studies: one using the rate of decline of lung function as the phenotype; and the other using the level of lung function as the phenotype. The relationship between these two COPD phenotypes and polymorphisms in the LTA and TNF genes was analysed and no evidence for association was found.
Polymorphisms of TNF, especially at the TNF-308 locus, have been reported as susceptibility variants in many infectious and autoimmune diseases 10. An association between COPD and TNF polymorphisms was first reported by Huang et al. 2. These investigators genotyped 42 adult males with chronic bronchitis and 42 sex-, age- and smoking-matched control subjects as well as 99 randomly sampled schoolchildren. They found that the frequency of the TNF-308A allele was significantly higher in the cases than in the controls (p<0.001; OR 11.1; 95% confidence interval 2.9–42.6). Subsequently, Sakao et al. 4 reported a significant association of the TNF-308A allele in a Japanese population of smokers with an FEV1 <80% pred. However, no association between TNF SNPs and COPD phenotypes was found by another Japanese group 6 or in a study of a Thai population 7.
Conversely, in a Caucasian population, Keatings et al. 3 showed that COPD patients who were AA homozygotes for TNF-308 demonstrated less reversibility of air flow obstruction following administration of a β2-agonist (p<0.05). However, in the same study, no difference was found in the distribution of the TNF-308 genotypes between the COPD patients and controls. This lack of association was consistent with the results from studies of other Caucasian populations 8, 9.
An important challenge in case–control studies is the possibility of false-positive or -negative associations caused by small sample sizes, population stratification, multiple testing, and differences in phenotypic definition 25. The inconsistent results of previous studies of TNF polymorphisms in COPD might be due to some or all of these factors. Most previous studies have had small sample sizes and therefore the negative results may be due to low power. Since power is such an important consideration in the interpretation of studies which report negative results, the present authors performed a detailed analysis of the power to detect associations in the current study. The results of this analysis show that for most of the allele frequencies and genetic models studied, the present study should have >80% power to detect a genotype/haplotype relative risk of ∼1.7 or greater. Therefore, there was sufficient power to detect the difference of genotypic proportions shown in the previous studies in which positive association between phenotypes of COPD and TNF-308 had been reported 2, 4. However, the possibility that the true odds ratio for the polymorphism is <1.7 cannot be excluded. Meta-analysis may be an effective tool to detect such variants, although true sources of variability that may exist between populations and publication bias make this problematic.
Another explanation of inconsistent results in association studies is differences in phenotypic definition. Most previous studies used different phenotypes such as diagnosis of chronic bronchitis, COPD or emphysema. In the present study, two well-defined phenotypes based on lung function were used, which are closely linked to the diagnosis and evaluation of COPD, but other phenotypes such as emphysema were not investigated. TNF was shown to be crucial to the development of emphysema in a study of TNF receptor knockout mice exposed to smoke 26. Therefore, there is a possibility that the specific sub-phenotype of emphysema is associated with TNF polymorphisms.
Furthermore, the inconsistent results in association studies may come from the possibility that the reported variants did not contribute to the diseases but are in LD with causal variants. In addition, compared with a SNP-level approach, the gene-based approach for replication has the advantage of being less susceptible to erroneous findings due to genetic differences between populations 27. This is the major reason that the present authors have expanded a previous study of the TNF-308 and LTA+252 SNPs 28 to cover 10 SNPs in TNF and LTA. Although TNF polymorphisms, particularly TNF-308, have been most studied in COPD, whether these polymorphisms are functional is a controversial issue as a result of both ex vivo and in vitro studies 10. Recently, Knight et al. 29 developed a method called haploChIP, employing chromatin immunoprecipitation and mass spectrometry to detect the amount of phosphorylated RNA polymerase II, the quantity of which is related to the transcriptional activity, bound to two different alleles of a gene. Using this technique, they showed that functionally specific haplotypes of the TNF/LTA locus did not correlate with allele-specific transcription of TNF but did correlate with evidence of transcription of LTA. They speculated that polymorphisms of TNF were in LD with causal variants of LTA. To avoid overlooking potential causal variants in the two candidate genes, a stringent LD threshold was used to select a set of highly informative markers that covered from 2,000 bp upstream of the 5′ end of LTA to 1,500 bp downstream of the 3′ end of TNF, including the whole of the TNF promoter site.
Most recently, an association has been found between the TNF-308 polymorphism and both quantitative and qualitative phenotypes related to COPD in the Boston Early-Onset COPD study, which included 949 individuals from 127 families 9. However, the same authors were unable to replicate this association in a case–control study of usual later-onset COPD, which included 304 patients and 441 controls 9. The authors suggested that genetic factors related to severe early-onset COPD might be different from usual later-onset COPD, as a possible explanation of their inconsistent results. Furthermore, they emphasised the necessity of replication in an independent cohort because of the possibility that positive associations might be the result of multiple testing. The current results, however, are not a direct replication of the results from the case–control study 9. The study designs employed in the current study are not true case–controls but rather comparison of phenotypic extremes within a cohort of smokers selected for evidence of mild-to-moderate airflow obstruction. Therefore, the current study is an investigation of disease severity genes. Conversely, the selection of phenotypic extremes is an accepted strategy for identifying risk alleles for complex genetic disease 30. Although it is difficult to estimate how this design would influence the ability to identify genetic risk, it might be expected to increase rather than decrease power, since it is likely that individuals at phenotypic extremes harbour susceptibility alleles.
The present study was designed to be relatively robust to type II error, but there are still limitations, which relate to phenotypic and ethnic differences, compared with previous studies. In particular, emphysema was not used as a phenotype. Additionally, the current study was limited to Caucasian individuals and many of the previous associations between TNF-308 and COPD were reported in Asian populations.
In conclusion, a set of highly informative markers was selected based on the r² linkage disequilibrium statistic within the tumour necrosis factor and lymphotoxin A region. To reduce the type II error in the association study, a relatively large number of samples was analysed for two well-defined phenotypes, the rate of decline and baseline level of the lung function, and the analysis was adjusted for factors which had an effect on the phenotypes. In contrast with previous associations between the tumour necrosis factor-308 polymorphism and chronic obstructive pulmonary disease in Asians, no association was found between either of the two phenotypes and polymorphisms of lymphotoxin A and tumour necrosis factor. The current results support the findings of other previous studies in late-onset chronic obstructive pulmonary disease in Caucasian populations.
Figure 1. Single-nucleotide polymorphisms (SNPs) in the region of the tumour necrosis factor (TNF) and lymphotoxin A (LTA) genes whose allele frequency was >10% in European-American Coriell samples. Bin: a group of SNPs where the alleles are highly associated; D′ and r²: measures of linkage disequilibrium. #: tag SNPs; ¶: rs915654 was excluded from the study; +: rs361525 (TNF-238), whose allele frequency was <10%, was included in the study.
Figure 2. a) Power curves for the rate of decline data set sample sizes of 279 fast decliners and 304 nondecliners over a range of odds ratios (ORs) and a range of proportion of controls having susceptibility genotype. b) Power curves for the baseline data set sample sizes of 532 high baseline and 533 low baseline subjects over a range of ORs and a range of proportion of controls having susceptibility genotype. ○: OR 1.4; ▵: OR 1.5; □: OR 1.75; ▿: OR 2; ⋄: OR 2.5.
Table 1. Characteristics of subjects in the rate of lung-function decline study
Table 2. Characteristics of subjects in the baseline lung-function study
Table 3. Genotypic distribution of the single-nucleotide polymorphisms (SNPs) of the lymphotoxin A (LTA) and tumour necrosis factor (TNF) genes in the rate of lung-function decline study
Table 4. Genotypic distribution of the single-nucleotide polymorphisms (SNPs) of the lymphotoxin A (LTA) and tumour necrosis factor (TNF) genes in the baseline lung-function study
Table 5. Haplotype analysis of the lymphotoxin A (LTA) and tumour necrosis factor (TNF) genes in the rate of lung-function decline study
Table 6. Haplotype analysis of the lymphotoxin A (LTA) and tumour necrosis factor (TNF) genes in the baseline lung-function study
Acknowledgments
The authors gratefully acknowledge the National Heart, Lung and Blood Institute for the recruitment and characterisation of the subjects in the current study. The authors wish to thank P. Lindgren for her insightful review of the manuscript.
Footnotes
- For editorial comments see page 8.
- Received March 31, 2006.
- Accepted August 15, 2006.
- © ERS Journals Ltd