Introduction

Studies of twins, families, and birth cohorts have all shown that genetics plays an important role in the development of asthma, but it has been difficult to clearly identify the exact genetic variants involved (Ober and Hoffjan 2006). There is no shortage of positive findings and candidate genes, but replication has proved difficult, and even the most replicated associations have one or more negative reports (Ober and Hoffjan 2006). There is a clear need for well designed and executed replication studies.

The aim of this study was to identify by rigorous replication the most likely candidate genes for asthma and related phenotypes, based on a common-disease, common-variant hypothesis. According to this hypothesis, genetic variants that impart risk for common diseases will have common alleles that are present in all populations and be associated with a relatively modest effect size (Blakey et al. 2005; Chanock et al. 2007). Large sample sizes consisting of thousands of subjects are needed to clearly define the contribution of these variants to asthma and allergic phenotypes (Blakey et al. 2005; Chanock et al. 2007; Maier et al. 2006).

The present study design involved testing genetic variants in 93 genes, previously reported (in one or more studies) to be associated with asthma and/or related traits (Supplementary Table S1), in four panels that include three family based studies and a population-based, case-control sample (Table 1). These combined resources have sufficient power (≥50%) to detect a spectrum of alleles with heterozygote relative risk (RR) > 1.5 (Fig. 1).

Table 1 Panel sizes by study, phenotype and ethnic background
Fig. 1

Power of the combined samples, on the y-axes, to detect an association with asthma status at P < 0.00001, for a susceptibility allele found in frequency q in the population, having a genetic effect represented by the relative risk on the x-axis. Each curve is associated with a different risk allele frequency q, as indicated on the graphs. In the left panel, the minor allele of a SNP is the risk allele; in the right panel, the major allele of a SNP is the risk allele. Each point on the power curves is estimated from 1,000 simulated replicates, assuming a disease prevalence of 7.5%, an additive model for the risk and homogeneity across samples

Four discrete phenotypes (asthma, atopy, atopic asthma, and airway hyperresponsiveness (AHR)) were selected a priori and tested for association using a general allelic likelihood ratio test. We have used physician diagnosis (asthma), skin prick tests (atopy), and results of methacholine challenge (AHR) to define both affected and unaffected individuals (see “Methods”). We harmonized (asthma, atopic asthma, and AHR) and standardized (atopy) phenotype definitions between panels.

We selected 154 SNPs with prior evidence for association (positive and negative) with one or more asthma-related phenotypes and we supplemented these with 26 coding non-synonymous SNPs in previously associated genes. Selected SNPs had a minor allele frequency (MAF) ≥ 0.05 in at least one HapMap population. We complemented the study with 719 tagSNPs, which were within gene-intervals that include 10 kb sequences, both upstream and downstream, as determined by using NCBI Build #34 of the human genome reference sequence (see “Methods”).

Despite the relatively large sample size, few (13%) of the genes showed any evidence for association with one or more phenotypes. This study, which represents the most comprehensive attempt to date at replication of genetic associations in asthma and allergy, was conducted at the dawn of the era of genome-wide association studies, which are sure to identify additional candidates. It illustrates the need to carefully re-evaluate positive associations in large and heterogeneous populations.

Methods

All DNA samples were collected with informed consent, in compliance with the requirements of each recruiting center’s Research Ethics Board.

Study samples (panels)

The individual panels include (1) the Canadian Asthma Primary Prevention Study (CAPPS) cohort (Becker et al. 1999, 2004; Chan-Yeung et al. 1999, 2000, 2005; He et al. 2003; Hegele et al. 2001; Kaan et al. 2000), which is comprised of 549 children from 545 families at high risk for developing asthma and their parents and is predominantly Caucasian (79.2%); (2) the Study of Asthma Genes and Environment (SAGE) cohort (Kozyrskyj et al. 2008; Mai et al. 2007) composed of 723 children and their parents from Manitoba, Canada, who are primarily Caucasian (74.2%); (3) the Saguenay-Lac-Saint-Jean and Quebec City Familial Asthma Collection (SLSJ) (Begin et al. 2007; Laitinen et al. 2004; Laprise et al. 2004; Poon et al. 2004, 2005; Raby et al. 2002; Tremblay et al. 2006) consisting of a French-Canadian founder population panel of 306 multigenerational families with at least one asthmatic proband; and (4) The Busselton Health Study (Busselton) (James et al. 2005a, b), a population-based, nested, case-control panel of 1,549 individuals of European Caucasian descent from Australia. Of note is that at the time of phenotyping, subjects in the CAPPS and SAGE panels were between the ages of 6 and 8 (birth cohorts), whereas subjects in the SLSJ and Busselton panels were primarily teenagers or adults. Detailed information regarding each study panel is provided in the online supplement.

Phenotype definitions

Four dichotomous phenotypes were defined a priori for the analysis:

  1. asthma, defined as doctor-diagnosed asthma in CAPPS, SAGE, and SLSJ, and as a positive response to the question “has a doctor ever told you that you had asthma, or bronchial asthma?” at either survey (1981 and 1994) in the Busselton panel. Detailed information regarding the asthma phenotype is contained in the online supplement (Supplementary Table S2);

  2. atopy, defined as any skin-prick test wheal at least 3 mm greater than the negative control for any allergen tested (see Supplementary Table S2 for specific allergens);

  3. atopic asthma, defined as individuals diagnosed with both asthma and atopy; and

  4. airway hyperresponsiveness, as assessed by methacholine-challenge testing in all four samples. A positive response was defined as a 20% decrease in FEV1. For the SLSJ sample, AHR was defined as PC20 < 8 mg/ml; for Busselton, PD20 ≤ 3.9 μmol. Because AHR is more prevalent in children (Godfrey 2000; Godfrey et al. 1999), for the childhood samples (CAPPS and SAGE) we used PC20 < 3.2 mg/ml, which yields the greatest sum of sensitivity and specificity in children (Godfrey 2000).
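The thresholds in the four definitions above can be written down as a small set of classification rules. The following is a minimal sketch only: the function names and argument conventions are hypothetical, and it assumes that the wheal sizes, PC20 values (mg/ml) and PD20 values (μmol) have already been measured as described.

```python
def is_atopic(wheals_mm, negative_control_mm):
    """Atopy: any allergen wheal at least 3 mm larger than the negative control."""
    return any(w - negative_control_mm >= 3.0 for w in wheals_mm)


def is_atopic_asthma(has_asthma, atopic):
    """Atopic asthma: both asthma and atopy are present."""
    return bool(has_asthma and atopic)


def is_ahr_positive(panel, dose):
    """Airway hyperresponsiveness, using the panel-specific cut-offs in the text.

    dose is PC20 in mg/ml for CAPPS, SAGE and SLSJ, and PD20 in umol for
    Busselton (the argument convention is an assumption of this sketch).
    """
    if panel in ("CAPPS", "SAGE"):
        return dose < 3.2   # childhood samples
    if panel == "SLSJ":
        return dose < 8.0
    if panel == "Busselton":
        return dose <= 3.9
    raise ValueError(f"unknown panel: {panel}")
```

Keeping the cut-offs in one place makes the difference between the childhood and adult definitions of AHR explicit when the panels are pooled.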

Gene and SNP selection

For this study, we selected genes that were reported to be associated with asthma or related phenotypes in at least one prior study. We selected 154 SNPs that had been reported in these studies and supplemented them with 26 coding non-synonymous SNPs in these candidates that were reported in dbSNP (build 34) and had frequency data and a MAF > 0.05 in any population. We also added 719 tagging SNPs that were selected from HapMap (2005), SeattleSNPs (see web resources in Appendix), or Innate Immunity (see web resources in Appendix) of the National Heart Lung and Blood Institute’s Programs for Genomic Applications by using linkage disequilibrium data derived from the HapMap CEU dataset.

Genotyping and data cleaning

We genotyped samples with the Illumina Bead Array System in accordance with the manufacturer’s protocol (Shen et al. 2005) and according to Lincoln et al. (2005). Fourteen SNPs were genotyped using TaqMan assays (Applied Biosystems) (see Supplementary Methods). For 669 samples from SAGE and CAPPS, we used DNA templates generated by whole genome amplification (WGA) with the RepliG Midi kit (Qiagen Cat# 150045), following the manufacturer’s protocol (Qiagen, Hilden, Germany). We retained markers for analysis if they had a minimum call rate of 90%, a maximum of four Mendelian errors, a maximum of one reproducibility error, and showed consistency with Hardy–Weinberg equilibrium (P > 0.001). In total, we genotyped 898 SNPs, of which 861 passed all quality control checks and entered the analysis.
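The marker retention rules above amount to a simple per-SNP filter. The sketch below is illustrative only: the `MarkerQC` record and function names are hypothetical, and the Hardy–Weinberg test shown is a one-degree-of-freedom chi-square goodness-of-fit test, which is an assumption since the text does not state which test was used.

```python
import math
from dataclasses import dataclass


def hwe_p_value(n_aa, n_ab, n_bb):
    """Chi-square (1 df) goodness-of-fit P value for Hardy-Weinberg proportions."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)          # frequency of allele A
    q = 1.0 - p
    observed = (n_aa, n_ab, n_bb)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Upper tail of a chi-square with 1 df equals erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(chi2 / 2.0))


@dataclass
class MarkerQC:
    call_rate: float              # fraction of samples with a genotype call
    mendelian_errors: int         # Mendelian inconsistencies in families
    reproducibility_errors: int   # discordant calls among replicate samples
    hwe_p: float                  # Hardy-Weinberg P value


def passes_qc(m):
    """Apply the four retention thresholds quoted in the text."""
    return (
        m.call_rate >= 0.90
        and m.mendelian_errors <= 4
        and m.reproducibility_errors <= 1
        and m.hwe_p > 0.001
    )
```

For example, a marker with genotype counts 25/50/25 is in exact Hardy–Weinberg proportions, whereas a marker with no heterozygotes at all (50/0/50) fails the filter.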

Statistical analyses

We tested for association between SNPs and the dichotomous outcomes (asthma, atopy, atopic asthma, and airway hyperresponsiveness) using a general allelic likelihood ratio χ² test (one degree of freedom) as implemented in UNPHASED (Dudbridge 2003, 2008) (see online supplement for details). Our strategy for correction for multiple testing was influenced by the study design. If we were to apply a Bonferroni correction across all tests, no SNP would survive this stringent level of correction. However, it is recognized that a Bonferroni correction does not take into account the correlation between tagSNPs, and would result in a significant overcorrection and subsequent loss of power (Nyholt 2004). This is a large-scale replication study, with one of the aims being to rank order the significance of genes to the etiology of asthma. Some genes needed more SNPs than others to be adequately covered. Applying a global multiple-testing correction factor to each individual SNP result in this study would have an undesirable effect: densely typed genes would show greater trends of association merely because they use a greater proportion of the total SNP resources. Rather than correcting for all genes and SNPs tested, we employed a gene-based approach, applying a correction only with respect to the number of SNPs in that gene and its neighborhood and the number of phenotypes tested (Nyholt 2004). For each gene investigated, an effective number of independent SNPs was calculated by using the definition of Li and Ji (2005), as implemented in SNPSpD (Nyholt 2004). By using a similar procedure, the Matrix Spectral Decomposition (matSpD) approach, we estimated the effective number of independent phenotypes.
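The Li and Ji (2005) definition of the effective number of independent tests can be illustrated with a short calculation. This is a sketch under stated assumptions rather than the SNPSpD code: it takes as input the eigenvalues of the SNP-by-SNP correlation (LD) matrix for one gene, which would be obtained separately with any linear algebra routine, and the function name is hypothetical.

```python
import math


def effective_number_li_ji(eigenvalues):
    """Effective number of independent tests following Li and Ji (2005).

    Each eigenvalue of the correlation matrix contributes
    I(|lambda| >= 1) + (|lambda| - floor(|lambda|)).
    """
    m_eff = 0.0
    for lam in eigenvalues:
        x = abs(lam)
        m_eff += (1.0 if x >= 1.0 else 0.0) + (x - math.floor(x))
    return m_eff


# Three uncorrelated SNPs: the correlation matrix is the identity.
independent = effective_number_li_ji([1.0, 1.0, 1.0])
# Three SNPs in complete LD: a single eigenvalue carries all the variance.
complete_ld = effective_number_li_ji([3.0, 0.0, 0.0])
```

The two limiting cases show the intended behavior: uncorrelated SNPs count as three tests, while three SNPs in complete LD count as one.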

We recognize that some readers may not be satisfied that this computationally fast approach adequately approximates the permutation distribution. Therefore, when a SNP was deemed significant using this fast approach, the corrected significance levels (cP) were corroborated using the computationally intensive permutation procedure implemented in UNPHASED (Permutation cP), using 50,000 random permutations for each gene.
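The idea behind a permutation-corrected P value can be sketched for the simplest case of unrelated cases and controls. This is not the UNPHASED implementation, which also handles family data: the code below assumes genotypes coded as minor-allele counts (0, 1 or 2), uses a likelihood ratio statistic on the 2 × 2 table of allele counts, and shuffles case-control labels to obtain the null distribution of the largest statistic across the SNPs in a gene. All function names are hypothetical.

```python
import math
import random


def allelic_lrt(case_counts, control_counts):
    """Likelihood ratio (G) statistic, 1 df, for a 2x2 table of allele counts."""
    rows = [case_counts, control_counts]
    total = sum(case_counts) + sum(control_counts)
    cols = [case_counts[0] + control_counts[0], case_counts[1] + control_counts[1]]
    g = 0.0
    for row in rows:
        row_total = sum(row)
        for j, obs in enumerate(row):
            if obs > 0:  # empty cells contribute nothing
                g += 2.0 * obs * math.log(obs * total / (row_total * cols[j]))
    return max(g, 0.0)


def allele_counts(genotypes, status, snp, group):
    """(minor, major) allele counts at one SNP among individuals in one group."""
    minor = sum(g[snp] for g, s in zip(genotypes, status) if s == group)
    n = sum(1 for s in status if s == group)
    return (minor, 2 * n - minor)


def max_statistic(genotypes, status):
    """Largest allelic statistic over all SNPs in the gene."""
    return max(
        allelic_lrt(allele_counts(genotypes, status, k, 1),
                    allele_counts(genotypes, status, k, 0))
        for k in range(len(genotypes[0]))
    )


def permutation_p(genotypes, status, n_perm=1000, seed=0):
    """Gene-wide P value for the best SNP, by shuffling case-control labels."""
    rng = random.Random(seed)
    observed = max_statistic(genotypes, status)
    shuffled = list(status)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if max_statistic(genotypes, shuffled) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)
```

Because the labels are shuffled jointly for all SNPs, the correlation between markers is preserved in every replicate, which is what makes the permutation P value a useful check on the faster spectral-decomposition correction.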

This correction strategy was deemed to be appropriate for tagSNPs, which were selected for indirect tests of association, the hypothesis being that the marker SNP is only in LD with an unknown disease variant. The prior probability for a SNP to be associated with an asthma-related phenotype differs between a tagSNP and a SNP previously reported to be associated with our phenotypes of interest. Thus, there is a natural stratification of the prior probability for association in our dataset. We chose to account for this difference in priors by using a simple stratified approach. We grouped our SNPs into two strata: one with SNPs identified in the literature with at least one prior positive association with an asthma-related phenotype (N = 104), and a second stratum containing the remaining SNPs. For SNPs in the former stratum, we conducted a direct test of association, whereby we corrected the nominal P value only for the number of phenotypes evaluated in the study.
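The two-stratum correction can be summarized in a few lines. The sketch below assumes a Bonferroni-style multiplication of the nominal P value by the effective number of tests; whether the original analysis used this form or a Šidák-type adjustment is not stated here, so the formula and the function name should be read as illustrative assumptions.

```python
def corrected_p(nominal_p, n_eff_phenotypes, n_eff_snps=None):
    """Corrected P value (cP) under a Bonferroni-style adjustment.

    Direct test (SNP with a prior positive report): leave n_eff_snps as None,
    so the correction covers only the effective number of phenotypes.
    Indirect test (tagSNP): also pass the effective number of independent
    SNPs in the gene and its neighborhood.
    """
    factor = n_eff_phenotypes * (1 if n_eff_snps is None else n_eff_snps)
    return min(1.0, nominal_p * factor)


# A previously reported SNP, corrected for phenotypes only.
direct = corrected_p(0.01, n_eff_phenotypes=3)
# A tagSNP in a gene with five effectively independent SNPs.
indirect = corrected_p(0.01, n_eff_phenotypes=3, n_eff_snps=5)
```

The same nominal P value therefore yields a smaller corrected value for a SNP with prior evidence than for a tagSNP, which is the intended effect of stratifying on the prior.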

We present the nominal P value, the corrected P value (cP), and the permutation P value where appropriate (Tables 2 and 3). We also supply the published references for SNPs previously reported in the literature, allowing the reader to fully evaluate the evidence for statistical significance and association. Additionally, for the 104 SNPs that were not subjected to a full correction, we provide the P values for both the modified and the standard correction strategy. This allows readers to select the level of significance they feel is appropriate in light of the prior evidence for association.

Table 2 Significant associations between asthma, atopy, atopic asthma, and AHR in the combined analysis
Table 3 Significant associations between asthma, atopy, atopic asthma, and AHR in the individual samples

Website and online database (http://genapha.icapture.ubc.ca/)

To support further identification and research into the most robust candidate genes for asthma and related phenotypes, we have compiled a comprehensive database that is being made available to the scientific community on our website (http://genapha.icapture.ubc.ca/). This database contains information on all genes and SNPs studied in the four panels and includes detailed information on allele frequencies (with comparison to HapMap frequencies), association results, corrected and uncorrected P values, and linkage disequilibrium (LD) plots for each gene by sample. Search tools have been created and incorporated into the database to allow users to search and identify information of interest to them. Users can search by gene or by SNP. For each SNP there is a comprehensive summary page that provides the following information: SNP function (intron, exon, non-synonymous and synonymous coding); detailed association results for all panels and phenotypes for that SNP; and links to the dbSNP, OMIM, NCBI, The Genetic Association Database, PharmGKB, UCSC Genome Browser, SeattleSNPs, Innate Immunity and KEGG databases. A pathway search interfaces our results with the KEGG database; users can search by gene, SNP, or genetic pathway. For each search, a list of all genetic pathways identified in the KEGG database is displayed. Once a pathway has been selected, a pathway graphic is displayed, and the genes that have been genotyped in our datasets are highlighted and appear in a box on the left-hand side of the screen. Users can then view association results for genes in specific genetic pathways of interest. Exhaustive literature searches have been conducted to identify previous genetic associations with allergic disease phenotypes. If a SNP has been previously reported in the literature, this is identified on the SNP summary page (* Previous Associations), and clicking on the link provides users with a reference list.
A full tutorial (http://genapha.icapture.ubc.ca/research-tutorial.do) is provided to facilitate the access and use of this unique and valuable resource by the asthma and allergy community.

Results

Genotyping and data cleaning

After applying standard quality filters to the set of 898 SNPs (see “Methods”), we retained 861 SNPs for analysis. From a set of 372 replicate samples, we observed only 20 genotype differences, resulting in a discrepancy rate of 0.005%. In the family-based samples, relationships between samples were confirmed by comparing the pairwise average number of alleles identical by state to what is expected. A total of 24 families (13 CAPPS, 8 SAGE, and 3 SLSJ) were excluded due to either mispaternity or unresolved DNA switches. At most one Mendelian inconsistency was observed for 99.1% of the SNPs in the remaining families. Four sets of twins were identified in the CAPPS cohort (2 monozygotic and 2 dizygotic). The dizygotic twins were retained, and a single sib was chosen from each monozygotic pair for inclusion in the analysis. In the Busselton case-control sample, we identified 73 parent-offspring relationships and 52 sib pairs. To address these relationships we eliminated 125 samples. We also identified two samples that were likely to be of Asian descent, inclusion of which (in a case-control design) could contribute to spurious results because of population differences in allele frequency. Finally, we identified two duplicate samples (identical at all loci); because the two samples differed in phenotype (one case and one control), both were removed. Removal of these samples resulted in a total of 644 asthma cases and 751 controls available for analysis.

Combined results

We examined the evidence for association with 93 candidate genes by using direct and indirect tests of association in a joint analysis that utilized all samples, and also within individual panels to evaluate both heterogeneity and individual-panel contributions to the joint analysis. Detailed results are provided for the 26 SNPs that showed trends for association in the combined analysis, after allowing for the number of SNPs within their respective genes (Table 2). Results for all panels, genes, and SNPs by phenotype are available in the Supplement (Supplementary Table S3). P values have been adjusted for both the effective number of independent SNPs tested in a gene and the effective number of independent phenotypes tested, which was estimated to be 3 (see “Methods”), and were confirmed by permutation tests where deemed significant. None of the results survive a Bonferroni correction for the total number of genes tested.

Because this set of genes has been associated with asthma or asthma-related traits, a global examination of our evidence for and against association that includes all genes and SNPs may be more informative than an examination of individual associations. When quantile–quantile plots for the χ² test statistics are compared to the expected distribution, we observe a deviation in the shoulder of the distribution, but overall no gross inflation in the tail of the test statistics (Fig. 2). These observations are consistent with a candidate gene study, and indicate an excess of SNPs with modest evidence for association. These results are consistent with, and to be expected from, a set of candidate genes of relatively small effect size (OR < 1.4), although the possibility remains that the inflation of the χ² test statistic is the result of unresolved population stratification in the Busselton sample or artifacts such as non-random absence of genotype data. Our results indicate that even larger sample sizes than the current study (N = 5,565) are needed to clearly delineate the causal variants for these candidate genes.

Fig. 2

For each of the four phenotypes, a qq plot of the χ² test statistics is constructed by ranking the values of the test statistic from smallest to largest (order statistic) and plotting them against the expected distribution. Values for each quantile are shown in black. The expected distribution is based upon 861 independent χ² tests. Lines indicate the 95% confidence bands of the expected distribution
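The construction of such a quantile–quantile plot can be sketched as follows. This is an illustration under stated assumptions: the plotting position (i − 0.5)/n is one common convention and not necessarily the one used for Fig. 2, the function names are hypothetical, and the confidence bands are not computed. The expected quantiles use the fact that a χ² variable with one degree of freedom is the square of a standard normal variable.

```python
from statistics import NormalDist


def chi2_1df_quantile(p):
    """Quantile of a chi-square distribution with 1 df at probability p (0 < p < 1)."""
    z = NormalDist().inv_cdf((1.0 + p) / 2.0)
    return z * z


def qq_points(test_stats):
    """Return (expected, observed) pairs ordered from smallest to largest."""
    observed = sorted(test_stats)
    n = len(observed)
    expected = [chi2_1df_quantile((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(expected, observed))
```

Plotting the pairs returned by `qq_points` against the line of equality shows where the observed statistics depart from the null expectation, for example in the shoulder rather than the extreme tail of the distribution.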

The genes for which we found trends for replication with asthma and allergic phenotypes after correcting for the number of tests at each locus (see “Methods”) are among those most consistently replicated in the literature; these include IL13, STAT6, and TBXA2R (Ober and Hoffjan 2006). IL13 and VDR have been previously examined in the CAPPS (He et al. 2003) (IL13) and SLSJ (Poon et al. 2004) (VDR) panels. A sensitivity analysis was conducted by excluding the CAPPS cohort from the IL13 combined analysis, which produced a small change in the adjusted P values for asthma: rs2243204 (P = 0.0126 to P = 0.0328), rs20541 (P = 0.0342 to P = 0.08016), and rs1295686 (P = 0.0424 to P = 0.11406). For the VDR analysis we excluded the 227 families recruited from the Saguenay-Lac-Saint-Jean region of Quebec, as these families may have been used in the previous analysis. Families recruited from Quebec City (N = 79) were retained, as they were not utilized in the prior study. The adjusted P values, before and after exclusion, are as follows: rs1540339 (asthma, P = 0.0267 to P = 0.7377; atopy, P = 0.0066 to P = 0.0894; atopic asthma, P = 0.0096 to P = 0.1670), rs731236 (atopy, P = 0.0327 to P = 0.2119; AHR, P = 0.0265 to P = 0.05814), rs2239185 (atopic asthma, P = 0.0354 to P = 0.4101; AHR, P = 0.0377 to P = 0.15), rs3782905 (atopic asthma, P = 0.0285 to P = 0.1700), and rs11608702 (atopic asthma, P = 0.0306 to P = 0.0408).

With related phenotypes, we expected to see concordance of the associations with more than one phenotype. IFNGR2, TBXA2R, STAT6, and VDR are all associated with more than one phenotype. Genes showing trends for association with one or more phenotypes were examined for nominal associations or trends with other phenotypes. The following genes have an unadjusted P value < 0.05 for the additional phenotype: TLR10 (asthma), TLR9 (asthma), and IL13 (atopy and AHR). VDR demonstrates evidence for association with all four phenotypes; IL13 and IFNGR2 with three; and TLR10, TBXA2R, and TLR9 with two. None of these candidate genes are in regions recently identified in the first genome-wide association study for asthma (Moffatt et al. 2007).

Individual panel results

We recognize that there may be important differences in phenotype expression, prevalence, gender distribution, and risk factors between childhood and adult asthma. Recent studies have suggested that there may be gender-specific responses to environmental exposures (Arshad and Hide 1992; Arshad et al. 2001; Gilliland et al. 2003; Li et al. 2000) and that the timing of environmental exposures during early childhood may be critically important (Melen et al. 2001; Salam et al. 2004). These differences have led researchers to speculate whether adult and childhood asthma are, in fact, the same disease (Martinez 2007).

In recognition of these potentially important differences, we present individual point estimates (OR and P values) for each of the individual samples and the joint analysis. This allows researchers to evaluate the evidence for association in both childhood (CAPPS and SAGE) and adult samples (Table 3 and Supplementary Table S4), and facilitates future meta-analyses.

Caucasian only results

To further evaluate heterogeneity and sensitivity, we stratified the samples by ethnicity and examined the evidence for association in the Caucasian only sample (Supplementary Table S3). In general, the results were similar to the analysis where non-Caucasian samples were included in the combined analysis.

SNPs previously reported in the literature

We report on the associations for 154 SNPs previously reported in the literature that passed genotyping and entered the analysis (Supplementary Tables S1 and S4). Of these 154 SNPs, 104 have been previously reported to be associated with an asthma-related phenotype. For those 104 SNPs, we present the results correcting only for the number of independent phenotypes tested (direct test, Supplementary Table S4). For the 16 SNPs that showed trends for association in the combined analysis, we present corrections using only the direct test, as well as corrections for both the number of SNPs and the number of phenotypes tested in the analysis.

Discussion

The study was initiated at a time when the literature contained close to 100 published reports of association between genetic variants at candidate loci and asthma or atopy, and prior to the recent advent of genome-wide scans (Weidinger et al. 2008). The main message of the current study is that single alleles associated with asthma and atopy appear to have very modest effects (with ORs < 1.4) and that many published associations for asthma and atopy may be false positives. The low rate of replication may also be due to small effect size, differences in phenotypic definition, differential environmental effects and/or genetic heterogeneity. A study such as this one provides a more realistic picture of genetic predisposition to asthma and atopy than hundreds of independent reports. Although we conclude that four studies with over 5,000 subjects are not enough to distinguish false from true positive associations having very small effects, we believe that prioritization of loci to be validated can be enabled by reporting the combined and single-study results to the asthma and allergy community. We have also provided an extensive web resource (http://genapha.icapture.ubc.ca/), which includes access to online tools that interface our data with additional online resources, databases, and references.

We accepted evidence for replication if any single SNP in the candidate genes survived correction for multiple comparisons at P < 0.05 in the combined analysis. We note that the two genes with the strongest associations in the combined analysis, STAT6 and VDR, are on chromosome 12, a chromosome that has often been implicated in linkage studies of asthma (Ober and Hoffjan 2006). STAT6, which was associated with atopic asthma and AHR in this study, was recently associated with IgE levels in a genome-wide association study (Weidinger et al. 2008). STAT6, a member of the Signal Transducer and Activator of Transcription (STAT) family, acts in signaling pathways linked to IL4 and IL13 and as a transcription factor for genes that are involved in TH2-associated processes (Hebenstreit et al. 2006). The Vitamin D Receptor (VDR) and Interleukin-13 have been implicated in many genetic association studies (Hebenstreit et al. 2006; Ober and Hoffjan 2006; Poon et al. 2004) and are also involved in TH1–TH2 processes. We found that VDR was associated with all four phenotypes; Poon et al. found significant association with asthma and atopy, whereas Raby et al. found association with asthma and IgE levels in women (Raby et al. 2004). We found two IL13 SNPs (rs2243204 and rs1295686) to be associated with asthma; previous associations for this gene include asthma, atopic asthma, atopy, IgE levels, and AHR (Ober and Hoffjan 2006; Maier et al. 2006), and it is among the most replicated of asthma and allergy genes.

TBXA2R (Thromboxane A2 Receptor) has been extensively studied in the Japanese and Korean populations, but it has not been studied outside of these populations. In these populations, polymorphisms in TBXA2R have been associated with asthma, atopy, aspirin sensitivity, and blood levels of specific IgE (Ober and Hoffjan 2006; Kim et al. 2008). We provide evidence demonstrating that TBXA2R is associated with both atopy and AHR in Caucasians (Table 2). The Thromboxane A2 receptor, which is involved in prostaglandin and leukotriene pathways, has diverse physiological and pathophysiological actions related to allergies, modulation of acquired immunity, atherogenesis, and neovascularization (Nakahata 2008).

Studying genetic susceptibility variants across diverse studies and populations may increase statistical power when the variants have direct effects on the phenotype that are independent of environment, age, ethnicity, etc. If the genetic variants act in conjunction with other risk factors via interactions (genetic and environmental), they may not consistently show association across studies. However, providing the results from many studies may be a step toward identification of the environmental and other factors that modulate the expression of the genetic variant.

None of the candidate genes we found to be associated are in regions recently identified in the first genome-wide association study for asthma (Moffatt et al. 2007). Additionally, the current study did not include SNPs in ORMDL3, as this association was reported after the SNPs were selected for this study. We note that the ORMDL3 association has been replicated in the SLSJ cohort (Madore et al. 2008), and numerous other studies (Galanter et al. 2008; Hirota et al. 2008; Tavendale et al. 2008).

This study was limited to only one form of genetic variation, single nucleotide polymorphisms. Although this is likely the major source of variation in the human genome, it is becoming increasingly clear that structural alterations, including copy number variants and inversions, can make a substantial contribution to susceptibility to common diseases (Sharp 2008).

Our study focused on the primary genetic effects. We did not consider the combined additive or interactive (epistatic) effects of these SNPs, or gene–environment interactions, as these were outside the scope of the current work.

In conclusion, this study provides important insight into the interpretation of the literature related to the genetics of asthma and allergy, since many of the genes included have not been extensively studied (Ober and Hoffjan 2006) or have only been studied with small sample sizes (<300 subjects, Supplementary Table S1). The present work is an important step in elucidating the genetic etiology of these complex traits. The electronic resource (http://genapha.icapture.ubc.ca/) provides a mechanism for researchers to access and query a large dataset of genetic variation at close to 100 loci in over 5,000 subjects phenotyped for asthma and allergic traits.