Abstract
Significant advances have been made in our understanding of the role of genetic variation in determining complex human phenotypes such as asthma. It is now well established that there is no single “gene for asthma”, in the way that the cystic fibrosis transmembrane conductance regulator is the “gene for cystic fibrosis”. It is also clear that among all genetic variants eventually found to be associated with asthma, only a few will be replicated, and in the same direction, in the majority of well-performed studies. Current evidence suggests that most asthma-related polymorphisms determine risk for the disease in a context-dependent manner, i.e. they interact with environmental factors, with polymorphisms in other genes and with the specific developmental phase of the disease in which the association is tested. Elucidating these complex interactions will allow us to understand better the heterogeneity of the disease and thus to develop therapeutic tools tailored to the specific form of the disease in each patient.
SERIES “GENETICS OF ASTHMA AND COPD IN THE POSTGENOME ERA”
Edited by E. von Mutius, M. Kabesch and F. Kauffmann
Number 2 in this Series
The progress made since the 1990s in our understanding of the structure and function of the human genome has been nothing short of stunning. As part of these advances, catalogues of polymorphisms covering both the whole genome and single genes are now available. Genotyping technologies have become much cheaper and readily available to researchers with access to DNA samples from patients with diseases whose genetic basis has either been confirmed or is suspected. For asthma and allergic diseases, tens of genome-wide linkage screens and hundreds of association studies have been published.
Studies of the genetics of complex diseases have been accompanied by the hope, both among researchers and the general public, that the results would help in the rapid development of new diagnostic and therapeutic approaches for these diseases. Many researchers (including the present author) believed that the clinical use of arrays of genetic markers that would define with a reasonable degree of precision the risk for asthma at the beginning of life (and, therefore, regardless of life history) was only a matter of time. A realistic assessment of the current status of the field would lead to a different conclusion: the hoped-for outcome has not materialised. This is not to say that no significant advances have been made in this field. On the contrary, genes associated with asthma and allergies have been identified, both through searches using anonymous markers in the whole genome 1 and by direct studies of association between single nucleotide polymorphisms (SNPs) and allergy, asthma and related traits 2. The present author’s research group has participated actively in this quest, having contributed to the identification of SNPs associated with allergies in the genes for interleukin (IL)-13, CD14, and the T-cell immunoglobulin domain and mucin domain–IL-2-inducible T-cell kinase cluster, all on chromosome 5q31–33 3–6. However, two consistent patterns have emerged. First, most SNPs show a low attributable risk proportion: in only a small proportion of subjects can the presence of the phenotype be attributed to the SNP or SNPs under study. Secondly, in almost all instances, when replication of genetic associations has been attempted in different populations and in different locales, replication has been very uneven; some researchers have corroborated the findings, while others have been unable to do so or have even found associations in the opposite direction to those of the original reports 7, 8.
As a result, there is no test available that would reliably predict risk for asthma and allergies in the population and that could be used with practical results in a clinical setting.
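The idea of a low attributable risk proportion can be made concrete with Levin's formula for the population attributable fraction. The following is a minimal sketch only: the carrier frequency and relative risk are hypothetical values chosen for illustration, not estimates from any of the studies cited here.

```python
def population_attributable_fraction(carrier_freq: float, relative_risk: float) -> float:
    """Levin's formula: share of cases in the population attributable to the variant."""
    excess = carrier_freq * (relative_risk - 1.0)
    return excess / (1.0 + excess)


# Hypothetical common variant: carried by 30% of people, relative risk of 1.2.
paf = population_attributable_fraction(carrier_freq=0.30, relative_risk=1.2)
print(f"Attributable fraction: {paf:.1%}")
```

Under these assumed values, fewer than 6% of cases would be attributable to the variant, even though it is common and its association with the disease is real.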
WHY ARE GENETIC STUDIES OF ASTHMA-RELATED TRAITS DIFFICULT TO REPLICATE?
The most frequently proposed explanations for the discrepancies are technical 9. Many studies are too small to have enough power to detect linkage or association, or are based on a large number of comparisons that are not taken into account adequately in deciding which results are truly significant from a statistical point of view. Direct replication is often made more difficult by the tendency of investigators to stress positive results that are often inflated by random variations in allele distribution when a large number of markers are tested with respect to a single phenotype. This tendency is made even more obvious when many related phenotypes are tested for association with respect to the same markers. It certainly cannot be denied that these factors can play a dramatic role in amplifying the heterogeneity of results of genetic studies of asthma and other complex phenotypes.
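The effect of testing many markers against one phenotype can be sketched with a short calculation. This is an illustration under simplifying assumptions (independent tests, no true associations, a nominal threshold of 0.05); the function names are hypothetical and the Bonferroni correction is shown only as one familiar way of accounting for multiple comparisons.

```python
def prob_any_false_positive(n_tests: int, alpha: float = 0.05) -> float:
    """Chance of at least one nominally significant result when no marker
    is truly associated, assuming the tests are independent."""
    return 1.0 - (1.0 - alpha) ** n_tests


def bonferroni_threshold(n_tests: int, alpha: float = 0.05) -> float:
    """Per-test threshold that keeps the family-wise error rate at or below alpha."""
    return alpha / n_tests


for n_markers in (1, 10, 100):
    print(n_markers,
          round(prob_any_false_positive(n_markers), 3),
          bonferroni_threshold(n_markers))
```

With 10 independent markers the chance of at least one spurious "positive" result is already about 40%, and with 100 markers it is close to certainty; testing several related phenotypes multiplies the number of comparisons further.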
However, reducing the issue to a simple technical matter oversimplifies the problem. It is important to understand what is implicit in the reasoning behind technical explanations for the difficulties encountered in replicating association studies of complex diseases. Surprisingly, these implicit assumptions have never been seriously evaluated or debated. In order to do so, it is necessary first to define the three possible roles that genetic variations may have in determining risk for a disease or condition or, more generally, in participating in the expression of a certain phenotype. This same analysis also applies to environmental factors that determine susceptibility to any phenotype.
General genetic switches
Variations present in specific genes may be a necessary condition for the development of a phenotype. Much like the main power switch of a home needs to be in the “on” position for there to be electricity available for any room or appliance, general genetic switches are sine qua non determinants of a hereditary condition. Although the variation may not always be expressed as a phenotype, conditions that are caused by general switches require the presence of variations in one particular gene; otherwise the condition will not be expressed. If the condition is a disease that invariably decreases the likelihood of reproduction, the genetic variations associated with the disease are generally rare, unless very specific contexts (e.g. heterozygous advantage for recessive diseases) favour the persistence of the variation in the population. Examples are all so-called monogenic diseases, including cystic fibrosis, Huntington's disease and the many haemoglobinopathies.
Most monogenic diseases are said to be “caused” by the genes involved (each of which is usually called the “gene for” the disease), and seldom is the potential role of environmental factors considered as part of the causative process involved in the disease. If this concept is illustrated in the form of a diagram in which an objective or arbitrary scale of the phenotype is expressed on the y-axis and any environmental exposures are expressed on the x-axis (the so-called “norm of reaction”), the different genotypes for monogenic diseases are usually represented as straight lines parallel to the x-axis (fig. 1). In other words, all of the phenotypic variation is attributable to the genotype. This pattern of expression is observed empirically in most monogenic diseases, and this has facilitated mapping of these diseases to the human genome, because samples of subjects with well-defined disease phenotypes from many different locales can be pooled and studied without regard to environmental context.
It is interesting to note here that phenotypic variation between siblings affected with the same monogenic disease is usually attributed to so-called modifier genes 10, with very little attention paid to environmental factors. This conception, however, runs contrary to well-established observations made in monozygotic twins with cystic fibrosis, who often show remarkable differences in course and prognosis (personal communication; D. Schidlow, Dept of Pediatrics, Drexel University College of Medicine, Philadelphia, PA, USA). It is thus possible that hidden gene–environment or gene–development interactions (see below) may indeed also be present in many monogenic diseases, and these interactions may play a crucial role in the development of these diseases.
Perhaps the most illustrative example of the conceptual simplification involved in the implicit attribution to genotype of all phenotypic variation in monogenic diseases is the case of phenylketonuria (PKU). This condition is caused by an accumulation of unmetabolised phenylalanine in many tissues, including the brain. Two factors concurrently contribute to this accumulation: the amount of phenylalanine in the diet; and the absence of the active metabolising enzyme, phenylalanine hydroxylase. This absence is almost invariably the result of mutations in the phenylalanine hydroxylase gene, on chromosome 12q24. Figure 2 shows the phenotypic expression of PKU as a norm of reaction. It can clearly be seen that both the environmental exposure and the genetic variant are necessary for the expression of the disease phenotype. This is why, almost paradoxically, PKU, which is classified as a genetic disease in all textbooks, is treated with an environmental intervention, namely the elimination of phenylalanine from the diet.
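The PKU norm of reaction described above can be written as a toy function of genotype and exposure. This is a deliberately simplified sketch: the linear relation and the arbitrary units are assumptions made for illustration, and the function name is hypothetical.

```python
def pku_phenotype(has_functional_enzyme: bool, dietary_phe: float) -> float:
    """Toy norm of reaction: accumulated phenylalanine in arbitrary units.

    Assumed model: with a working enzyme, dietary phenylalanine is
    metabolised and nothing accumulates; without it, accumulation
    scales with intake.
    """
    if has_functional_enzyme:
        return 0.0
    return dietary_phe


# Columns: dietary intake, phenotype with enzyme, phenotype without enzyme.
for diet in (0.0, 0.5, 1.0):
    print(diet, pku_phenotype(True, diet), pku_phenotype(False, diet))
```

In this sketch the line for the unaffected genotype is flat, whereas the line for the affected genotype rises with exposure and meets the other line when the exposure is removed, which is why neither the genotype nor the diet alone accounts for the phenotype.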
There is obvious consensus among all researchers interested in asthma that the disease as a whole is not caused by one specific general switch: there is no “asthma gene”, in the same way that the cystic fibrosis transmembrane conductance regulator is the gene for cystic fibrosis. The same can be said for allergies and for all other complex human phenotypes.
Local genetic switches
It is possible, however, that very well-defined and specific forms of a disease or phenotype are caused by variations in one or more genes, which will be associated with the development of that specific form of the disease. These can be called local genetic switches, to follow the analogy with electrical controllers proposed above. The implicit assumption made by most researchers working in the genetics of asthma is that the disease is caused by either several (oligogenic model) or a large number (polygenic model) of such local genetic switches. These local genetic switches should work much like most electrical switches for specific appliances or rooms in a home: they have pre-set “on” and “off” positions. In other words, variations in local genetic switches are expected to produce an increase or decrease in the activity or quantity of the protein or proteins encoded by the “affected” gene, which, if it has a genetic effect, will always influence the risk for the disease in the same direction, independent of the context in which the gene is expressed.
It is well known that the expression of these local genetic switches is a function of their penetrance (as is that of general genetic switches). Usually, the factors that determine variation in penetrance are not stated and the only important assumption is the one mentioned above, namely that specific alleles will always be associated with the phenotype under study in the same direction in all contexts. In support of this premise, it is usually argued that studies of family resemblance in asthma have almost invariably shown that both monozygotic twins and first-degree relatives are more concordant for asthma and related traits than dizygotic twins or unrelated individuals, respectively. Since these phenotypes are known to be multigenic, concordances are explained by additive effects of alleles in different polymorphisms, with one allele having similar effects (increase, decrease or no influence) between families, regardless of the context. However, nothing in the design of studies of family resemblance in which no assessment is made of environmental or genetic context justifies the premise that an allele will always be indifferent, protective or deleterious. In fact, it is theoretically possible that the contrary may be true: that for some pairs, the allele shared by both members of the pair increases the likelihood of having the phenotype, while for other pairs, the same allele may decrease the likelihood of having the phenotype in both pair members. It is important to stress here that both these situations will increase concordance of illness between pair members. Even in the case of studies that compare the phenotypic concordance of monozygotic twins raised together or apart since birth, it is dangerous to ignore the potential influence of exposures occurring in utero that may programme the foetuses through epigenetic changes that are common to both twins. 
These in utero influences may explain the increased concordance among monozygotic twins in ways that are independent of their identical genetic background. Thus, increased pairwise concordances among relatives (including twins) are not necessarily explained by the presence of genetic variants acting in the same direction in both pair members.
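The argument that shared risk raises concordance whatever its direction can be checked with a small calculation. This is a sketch under stated assumptions: both members of a pair share the same risk, they are affected independently given that risk, and all numbers are hypothetical.

```python
def probandwise_concordance(pair_risks):
    """Prevalence and concordance when both members of a pair share one risk.

    pair_risks: list of (share_of_pairs, risk) tuples whose shares sum to 1.
    Returns (prevalence, P(second member affected | first member affected)).
    """
    prevalence = sum(share * risk for share, risk in pair_risks)
    both_affected = sum(share * risk ** 2 for share, risk in pair_risks)
    return prevalence, both_affected / prevalence


# Hypothetical pairs who all share the same allele: in half of them the
# context makes it deleterious (risk 0.25), in half protective (risk 0.05).
prevalence, concordance = probandwise_concordance([(0.5, 0.25), (0.5, 0.05)])
print(f"prevalence {prevalence:.3f}, concordance {concordance:.3f}")
```

Here the allele has no net effect on prevalence, yet the relative of an affected subject is more likely to be affected than a randomly chosen individual, simply because pair members share whatever risk they have.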
Nevertheless, and for experimental purposes, what matters is that, if the local switch paradigm is accepted, the decrease in power caused by the penetrance function will eventually be overcome if sufficient numbers of unrelated subjects are studied. This is represented as a norm of reaction in figure 3. It can be clearly seen that researchers studying this genetic variant in such environmental (or genetic) contexts as those represented in the box on the left side of the graph will report that the variant has a very low penetrance and a very low heritability, whereas those studying it in the context represented on the right side of the graph will surmise that it has a higher penetrance and heritability. They are obviously both right. It is likely that, in many instances, situations such as those represented in figure 3 may explain the inability of well-designed genetic studies to replicate the findings of researchers studying the same variant in different contexts. However, if all well-designed association studies of a genetic variant of the “local switch” type represented in figure 3 are pooled together in a massive meta-analysis, a statistically significant association will eventually be found.
The local genetic switch theory has garnered widespread implicit consensus among researchers interested in the genetics of asthma. This is not surprising. From an empirical point of view, the theory is based firmly on the extraordinary success obtained by “gene-hunters” searching for general genetic switches in monogenic diseases. In addition, from a practical point of view, the theory offers an approach to the study of the genetics of asthma that is irresistibly attractive: any well-trained clinician with a PCR machine, who masters some basic statistical concepts and who has a reasonable knowledge of the pathogenesis of asthma, could find an “asthma gene”. The literature is awash with examples of the consequences of such an approach: genetic variations in tens of genes encoding proteins known to be associated with asthma have been genotyped in subjects classified as having asthma or asthma-related phenotypes and in controls, with numbers ranging from a few dozen to thousands.
The main obstacle that these approaches have to face is that, by considering the other determinants of the phenotype as background “noise” for the expression of the gene, essential aspects of the causal pathways in which the gene is implicated are left unexplored. This may not be of great significance for genes (such as IL-13 or IL-4, for instance) for which it is difficult (albeit not impossible) to imagine that a variant that is associated with increased expression of the gene in any context will be protective for the development of asthma. However, for genes and their products whose nature and direction of influence in the determination of asthma is less clear, knowing the contexts in which the gene is expressed or not expressed may offer crucial clues as to how exactly that gene and its variants determine disease susceptibility.
Context-dependent genetic determinants of disease
A series of recent findings, from both experimental and observational studies, suggests that, although it is possible that some genetic determinants of complex diseases do indeed act as local switches, many may influence disease development in a much more intricate way. These new findings suggest that the nature and direction of the association between many common SNPs and complex disease phenotypes are not linear and depend on the genetic, environmental and developmental context in which the SNP genotypes express themselves. Moreover, in vitro functional studies of SNPs, especially those in regulatory regions, paint a picture of very heterogeneous effects of these SNPs in different cell types and, within a cell type, at different times during the maturation of that specific cell type. It is thus not at all evident or predictable, even from well-performed functional studies of genetic variations in regulatory regions, in which direction with respect to a specific phenotype an SNP will exert its influence.
Below are specific examples of these interactive effects.
Epistatic effects
The fact that the influence of one genetic variation on a given phenotype could be influenced by that of other genetic variations in the same gene or in other genes has been well known since the beginnings of genetics as a science. In the specific field of asthma, for instance, elegant studies by Howard et al. 11 have shown that individuals with risk genotypes for SNPs in the genes for both IL-4 receptor alpha (IL-4R) and IL-13 were five times as likely to have asthma as carriers of both nonrisk genotypes, and the latter were as likely to have asthma as carriers of risk genotypes for one gene but not for the other. In the case of IL-4R and IL-13, it is plausible to surmise that these types of interactions may occur, because IL-4R is one of the components of the receptor system for IL-13. Moreover, as explained earlier, it is unlikely that asthma is associated with anything but increased expression or activation of IL-13 and IL-4R, given the known function of these two molecules in the pathogenesis of the disease.
However, epistatic effects do not only occur in a way that allows for the straightforward identification of “risk genotypes” within a specific locus and, therefore, they cannot always be explained in additive terms. The best recent example has been presented in plants. Kroymann and Mitchell-Olds 12 recently mapped a quantitative trait locus (a chromosomal segment containing one or more genes that influence a phenotype) that they had detected by linkage analyses of growth rate in Arabidopsis thaliana. They found that the linkage signal was due to two genes in this region. The fact that two genes, and not one as researchers often assume, were responsible for the linkage signal is not what is remarkable about their finding. The two loci each had a very small effect and are tightly linked within a 210-kb genomic segment. However, they exhibit what has been called antagonistic epistatic interactions 13, that is, an allele at one locus could either increase or decrease growth rate (the phenotype under study) depending on which allele was present at the other locus. Indeed, as has been pointed out 13, had the researchers not carefully controlled genetic background and not explicitly considered epistasis, these loci would not have been detected in a straightforward linkage analysis.
If similar patterns of antagonistic epistatic interaction exist in humans (and there is no reason to believe that this type of interaction is exclusive to plants), the consequences are sobering. Genotypes for an SNP may increase the risk for asthma in one person and decrease the risk for asthma in another person, depending on the genotype that each of these persons carries at a different SNP, which could be in the same gene, in the same region where the first gene is located or in a completely different region of the genome. From the point of view of the population, the same allele for a genetic variant may be protective in one population and a risk factor for asthma in a different population, depending on the allele frequency of a different SNP that is in antagonistic epistasis with the first SNP.
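The population-level consequence of antagonistic epistasis can be illustrated with a short calculation. This is a minimal sketch with hypothetical effect sizes and frequencies; it assumes, for simplicity, that each individual can be classified as carrying either allele B or allele b at the second SNP and that the two SNPs are independently distributed.

```python
# Hypothetical change in asthma risk produced by allele A at the first SNP,
# depending on the allele carried at a second, interacting SNP.
EFFECT_OF_A = {"B": +0.04, "b": -0.04}


def marginal_effect_of_a(freq_b_upper: float) -> float:
    """Average effect of allele A in a population in which a fraction
    freq_b_upper of individuals carry B and the remainder carry b."""
    return (freq_b_upper * EFFECT_OF_A["B"]
            + (1.0 - freq_b_upper) * EFFECT_OF_A["b"])


for population, freq in (("population 1", 0.8),
                         ("population 2", 0.2),
                         ("population 3", 0.5)):
    print(population, round(marginal_effect_of_a(freq), 3))
```

Under these assumptions allele A appears as a risk factor where B is common, as protective where B is rare, and as unrelated to the disease where the two alleles at the second SNP are equally frequent, although its biological effect in any given individual is the same in all three populations.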
The above discussion was limited to potential interactions between two loci. The reader can imagine the complexity that these interactions may acquire if three or more loci are involved in mutually antagonistic epistatic effects. The final result would certainly be a convoluted series of causal pathways that will depend on the specific genotypes carried by each subject. The same can be said for populations, if the frequencies of the different interacting variants vary markedly between them. The point is thus obvious: if the way in which a genotype influences a phenotype is modified by several other loci, it is not possible to determine the effect of the genotype by simply typing it in isolation.
Gene–environment interactions
A second level of intricacy is introduced by interactions between expression of genes and environmental influences as determinants of complex phenotypes. That environmental exposure determines to a great extent the development of asthma is undisputed. What is still not clear is the manner in which these exposures influence the phenotype, and specifically the way in which their influence interacts with that of the genetic background of the individual. As was the case in genetic studies in which environmental influences were considered unmeasured “noise”, in many (if not most) studies of the association between exposures and asthma the genetic background of the individuals under study is what is considered to be the “noise”.
A different approach to gene–environment interactions has been discussed in detail elsewhere 14, and will not be repeated here. It is possible that, although the expression of a certain genotype may differ in different environments, its influence on the phenotype will always be in the same direction (as represented in fig. 3). As explained earlier, if sufficient well-designed studies are pooled, the final conclusion will invariably be that the genotype does influence the risk or magnitude of the phenotype in the same direction. In technical terms, this is called the marginal effect of the genotype. However, as for epistatic effects, there is now evidence that genotypes and environmental influences can have antagonistic effects on the phenotype. Zambelli-Weiner et al. 15 and Eder et al. 16 have recently shown that a functional SNP in the promoter region of CD14 (CD14/-159, also called CD14/-260) has opposite effects on asthma-related phenotypes depending on exposure to the ligand for CD14, endotoxin. When homozygotes for the T allele for CD14/-159 were exposed to low levels of endotoxin in house dust, they were less likely than carriers of the other two genotypes to have asthma or high serum immunoglobulin (Ig)E levels. However, when homozygotes for the T allele lived in homes with high levels of house dust endotoxin, they were more likely to have asthma or high levels of circulating IgE. Although a clear explanation of the molecular mechanisms involved is not available, this type of antagonistic interaction had been reported in plants and invertebrates many years previously 17. It is thus possible that other examples of this type of interaction may exist, which could explain the apparently contradictory results of studies of the genetics of asthma and allergies in different locales.
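An antagonistic gene–environment interaction of this kind corresponds to a norm of reaction in which the lines for different genotypes cross. The sketch below is purely illustrative: the risks, the linear form and the 0-to-1 exposure scale are hypothetical and are not taken from the studies cited above, and "CC/CT" is used here only as shorthand for the other two genotypes.

```python
def asthma_risk(genotype: str, endotoxin: float) -> float:
    """Toy norm of reaction with crossing lines (all numbers hypothetical).

    endotoxin is scaled from 0 (lowest exposure) to 1 (highest exposure).
    TT homozygotes: risk rises with exposure; other genotypes: flat.
    """
    if genotype == "TT":
        return 0.05 + 0.20 * endotoxin
    return 0.15


def risk_difference(endotoxin: float) -> float:
    """Risk in TT homozygotes minus risk in carriers of the other genotypes."""
    return asthma_risk("TT", endotoxin) - asthma_risk("CC/CT", endotoxin)


low, high = risk_difference(0.1), risk_difference(0.9)
pooled = (low + high) / 2  # two equal-sized studies, one from each setting
print(round(low, 3), round(high, 3), round(pooled, 3))
```

In this toy model a study in a low-exposure setting finds the TT genotype protective, a study in a high-exposure setting finds it deleterious, and pooling the two without regard to exposure yields no marginal effect at all, in contrast to the local switch scenario in which pooling eventually reveals the association.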
Gene–development interactions
A third type of interactive effect that is not often addressed is what could be called, for lack of a better term, gene–development interactions. Once again, many studies of the genetics of asthma and allergies group together patients of different ages and at different stages in the development of these diseases. The assumptions made in these studies are that the way in which genetic variants affect the phenotype does not vary markedly with age and that if it varies, it will always affect the phenotype in the same direction. Evidence derived from studies of asthma and allergies now suggests that both these assumptions may not always hold true.
A good example of age–genotype interaction has been provided by O'Donnell et al. 18, who studied the association between asthma-related phenotypes and the same SNP in CD14 (CD14/-159) that was mentioned earlier. They found that, early during the school years, the T allele for CD14/-159 was clearly protective, but that later during childhood, this protection disappeared. Although these findings require replication, the present author’s research group has found a similar pattern of gene–age interaction in the relation between SNPs in the A disintegrin and metalloproteinase 33 gene and asthma-related phenotypes (unpublished data). These studies suggest that the same SNP may have different effects on the expression of heterogeneous phenotypes such as asthma, perhaps because different proportions of sub-phenotypes of asthma are represented among subjects who have asthma symptoms at different ages 19, and genetic variants may have unequal influences on these different sub-phenotypes.
An extreme example of these potential unequal influences has been recently suggested by studies performed in the UK and Germany. Many children who have asthma-like symptoms during the first 2–3 yrs of life have lower respiratory illnesses during respiratory syncytial virus (RSV) infections 20. Hull et al. 21 first reported that carriers of the A allele for an SNP in the promoter region of the IL-8 gene (IL-8/-251) were at increased risk of severe RSV in early life. Functional studies using in vitro stimulation of whole blood with endotoxin suggested that the A allele is associated with increased production of IL-8 22. However, when Heinzmann et al. 23 studied this same SNP in older, school-age children, the A allele was found to be protective against asthma. Longitudinal studies in the same population will be needed to confirm these apparently contradictory effects, but it is not unreasonable to suggest that IL-8 may have opposite effects on RSV-related airway obstruction with respect to those present in older children, whose wheezing episodes are more likely to be caused by rhinovirus or by allergen exposure than by RSV.
CONCLUSIONS
Several lines of evidence suggest that, although there may be genes and genetic variants that have linear marginal effects on the expression of asthma and asthma-related traits, the association between many genetic variants and complex traits may not be as straightforward and linear as many researchers have assumed. The current state of knowledge does not allow measurement of the proportion of the genetic determination of these complex phenotypes that is of the simple, additive type and the proportion that is highly dependent on nonlinear interactions with developmental, epistatic and environmental influences. Researchers involved in the study of the genetics of asthma, but also those interested in the genetics of other complex diseases, are having difficulty replicating each other's results when studying different populations in different locales, even when those studies are well designed and the definitions of the phenotypes are well standardised. It is suggested that these discrepancies are only explained partially by technical limitations of the different studies and that, more probably, the mechanisms by which genetic variants influence these phenotypes are more complex than was originally assumed. Elucidating these mechanisms, therefore, will require a more comprehensive, multidisciplinary approach, and will shed new light on why some subjects have asthma and allergies and others do not.
Footnotes
- Previous articles in this series: No. 1: Le Souëf PN, Candelaria P, Goldblatt J. Evolution and respiratory genetics. Eur Respir J 2006; 28: 1258–1263.
- Received July 3, 2006.
- Accepted October 27, 2006.
- © ERS Journals Ltd