Abstract
Comments on a FinnGen sleep apnoea genetics study: need to disambiguate causal and pleiotropic associations with obesity/cardiometabolic diseases and opportunities for biobanks (and deeper phenotyping) to discover new treatments and disease subtypes https://bit.ly/3dntk0I
Obstructive sleep apnoea (OSA), characterised by recurrent upper airway obstruction, intermittent hypoxia and fragmented sleep, affects >15% of the adult population, with prevalence increasing markedly with advancing age and age-related cardiometabolic and pulmonary disorders [1]. Causal associations between OSA and hypertension, coronary heart disease, diabetes, heart failure, atrial fibrillation, stroke and mortality are suggested by numerous large prospective studies [2], raising the possibility that effective treatment of OSA might provide a novel strategy for primary and secondary prevention of cardiometabolic disease. However, randomised controlled trials have not yet produced evidence that use of a device for splinting the airway (i.e. continuous positive airway pressure) improves clinical cardiovascular disease. Multiple potential reasons for these disappointing results have been suggested, ranging from inadequate use of continuous positive airway pressure due to device intolerance, to exclusion of individuals from the trials who are most susceptible to OSA-related adverse outcomes. It is increasingly evident that there is a need to better identify OSA phenotypes that predict individuals at highest risk for developing chronic disease, as well as the underlying OSA disease mechanisms most amenable to targeted interventions.
Advanced signal analysis of polysomnography signals has begun to provide tools for dissecting OSA heterogeneity, revealing marked physiological differences (or endotypes) in OSA between men and women [3] and across background groups that may inform treatments that target physiological mechanisms, such as neuromuscular compensation or elevated loop gain [4]. However, there has been limited progress in identifying the underlying molecular mechanisms for OSA that could serve as druggable targets. One promising approach is to harness the power of hypothesis-free genome-wide association studies (GWAS) to discover novel genetic and molecular pathways; an approach applied successfully to developing novel drugs for dyslipidaemia and other disorders [5]. Genetics tools also are increasingly used to tease apart complex causal and bidirectional associations among disorders that aggregate, such as OSA and obesity, thus clarifying disease mechanisms and helping to identify appropriate intervention targets.
The promise of genetic studies to reveal underlying disease mechanisms for OSA is supported by family and twin studies. These studies demonstrated that a number of OSA traits including the apnoea–hypopnoea index (AHI; the traditional metric for OSA diagnosis), snoring, excessive daytime sleepiness, overnight hypoxaemia and respiratory event duration are heritable [6]. Studies of research cohorts undergoing standardised overnight sleep studies have uncovered several genetic variants linked to the AHI and overnight hypoxaemia, including those that implicate inflammatory and neurological pathways [7, 8]. However, the availability of both genome-wide data and physiological sleep assessments across research cohorts is limited. Even the largest studies to date that have aggregated data across multiple research cohorts (e.g. the Trans-Omics In Precision Medicine programme (TOPMed)) generally reported on <25 000 individuals, an order of magnitude smaller than GWAS emerging from large consortia studying more simply defined traits (e.g. diabetes [9], myocardial infarction [10], etc.), and prior studies have only replicated a few of the reported signals. Given that the proportion of variance explained by individual common variants is low (small effects), there is a clear need for larger studies with improved power to detect more variants [5]. For example, the first GWAS of type 2 diabetes in 2007 identified five loci in 1363 individuals [11]. By 2018, this number increased to 243 using a sample of ∼900 000 individuals [9]. Moreover, there is a need for large independent data sources to assess whether newly discovered findings replicate and generalise across populations.
Large biobanks (assembled using available clinical data supplemented with genetic assays) provide data sources to overcome sample size limitations common in prospective research cohorts. In this issue of the European Respiratory Journal, Strausz et al. [12] performed a comprehensive series of analyses using the unique biobank, FinnGen. In an analysis of 217 955 individuals (16 761 OSA cases), five genome-wide significant associations with OSA were identified. Moreover, the authors explored causal associations with body mass index (BMI) and pleiotropic associations with cardiometabolic disorders. The study strengths include the large number of cases and analysis of a broad range of clinical and medical data, providing the field with the first published biobank study of OSA. The study also attempted to replicate findings in three European biobanks: UK Biobank, Estonia Biobank, and All New Diabetics in Scania.
A major concern in the use of administrative data from biobanks is the accuracy and granularity of phenotype data. In FinnGen, OSA was identified based on ICD codes obtained from hospital discharge and death registries. The authors reported a very high positive predictive value for their classification, providing assurance that the cases in their discovery sample likely were true cases. However, the sensitivity and negative predictive value were not reported. Prevalence of OSA was 7.7%, which was nearly eight-fold higher than found in the UK Biobank (where the very low rate of diagnosis suggests substantial under-recognition of OSA in the UK compared with the prevalence estimated in epidemiological studies), raising concerns about comparability of data across centres due to local differences in OSA recognition and coding practices. In addition to misclassification, administrative OSA codes do not characterise disease severity or its association with key OSA features associated with cardiometabolic disease such as hypoxaemia, and do not indicate rapid eye movement or non-rapid eye movement dominance, which likely are partly determined by unique pathophysiological mechanisms and are subtypes associated with genetic associations in previously published GWAS [13–16]. So, while there are notable advantages to the analysis of a large number of cases, as reported in FinnGen, the power to detect genetic associations was likely reduced by the heterogeneity resulting from using an ICD code that may reflect divergent OSA disease mechanisms (and thus, likely, genetic factors). Nonetheless, associations that were found for the OSA coded phenotype may reflect mechanisms common across OSA subtypes, or associations sufficiently strong that they are robust to misclassification.
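The point that a high positive predictive value says little about sensitivity or negative predictive value can be illustrated with a small calculation. The sketch below uses purely hypothetical counts (they are not FinnGen data, and the names `ValidationCounts` and `classification_metrics` are introduced here for illustration only), chosen so that roughly half of true cases carry no diagnostic code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ValidationCounts:
    """Hypothetical 2x2 table: ICD-based label versus a reference sleep study."""
    true_pos: int   # ICD code present, OSA confirmed
    false_pos: int  # ICD code present, OSA not confirmed
    false_neg: int  # no ICD code, OSA present (undiagnosed case among "controls")
    true_neg: int   # no ICD code, no OSA


def classification_metrics(c: ValidationCounts) -> dict:
    """Return the standard accuracy measures derived from a 2x2 table."""
    return {
        "ppv": c.true_pos / (c.true_pos + c.false_pos),
        "sensitivity": c.true_pos / (c.true_pos + c.false_neg),
        "npv": c.true_neg / (c.true_neg + c.false_neg),
        "specificity": c.true_neg / (c.true_neg + c.false_pos),
    }


# Illustrative numbers only: 10 000 people, 15% with true OSA, about half of them coded.
example = ValidationCounts(true_pos=770, false_pos=10, false_neg=730, true_neg=8490)
metrics = classification_metrics(example)

for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Under these assumed counts, almost every coded case is a true case, yet nearly half of all true cases sit in the control group, which is the scenario that would be expected to dilute case-control contrasts.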
A major focus of the study by Strausz et al. [12] was to attempt to tease apart the shared versus unique genetic risk factors for obesity and OSA. Effects of obesity on OSA susceptibility relate to both the mechanical and biochemical effects of obesity, including the influences of obesity on airway narrowing, reduced lung size (reducing oxygen stores and decreasing the traction on upper airway structures), and possibly through the release of pro-inflammatory factors that influence airway neuromuscular function and/or ventilatory control. Multiple genes implicated in adiposity are involved in hypothalamic and other nervous system processes; therefore, variants in adiposity genes may influence OSA through pleiotropic nervous system effects [17]. To gain insight into the biological mechanisms of OSA through or independent of obesity pathways, Strausz et al. [12] analysed associations without and with adjustment for BMI. Of the five significant associations reported, four were only observed in analyses prior to adjustment for BMI, reflecting the influence of obesity as a risk factor for the ICD-based phenotype; possibly because of the consistency and strength of obesity as a risk factor across multiple OSA subtypes not differentiated with use of one ICD code, or due to selective recognition and diagnosis of OSA in obese individuals. The four BMI-dependent loci were: FTO and GAPVD1, known obesity loci; CXCR4, previously associated with lung function; and CAMK1D, without a clear mechanistic association. Of these, only the variants in FTO and GAPVD1 replicated in independent samples, and thus findings need to be cautiously interpreted.
While obesity is an important OSA risk factor, ∼60% of the genetic variation of the AHI is independent of adiposity traits [18]. BMI-adjusted associations may reveal novel mechanisms that operate independently of obesity, and thus their identification may be particularly informative for developing pharmacological interventions. In the study by Strausz et al. [12], only one locus (RMST/NEDD1) was significant after BMI adjustment. Using partitioned heritability analysis (disease variance explained by genetic components) across different tissues, enrichment of BMI-adjusted OSA signals was found in tissues in the central nervous system. While this finding may relate to central ventilatory or sleep–wake control mechanisms that influence OSA susceptibility, examining phenome-wide associations via PhenoScanner (www.phenoscanner.medschl.cam.ac.uk) shows that the lead variant in RMST/NEDD1 is also suggestively associated with waist-to-hip ratio in nonsmokers in a prior genome-wide meta-analysis of obesity traits [19]. It is possible that BMI adjustment did not adequately account for differences in body fat distribution, which is a risk factor for both OSA and cardiometabolic disease. In particular, neck circumference and tongue fat have been associated with OSA in causal/mediation pathways after adjustment for BMI [20–22]. Future studies separating obesity-dependent from obesity-independent pathways should further adjust for more detailed measures of regional adiposity. In addition, since the RMST/NEDD1 locus was three times more common in Finnish than in other European populations, and was not replicated, further evidence is needed regarding its role in OSA.
Genetics can be powerfully leveraged to reveal shared genetic risk factors and dissect causal associations for complex disorders such as OSA using modern statistical genetic methods including genetic correlation, Polygenic Risk Score (PRS) and Mendelian randomisation (MR) analyses. Using these tools, Strausz et al. [12] demonstrated a shared genetic background and causal relationship between obesity and increased risk of OSA. Their study estimated a 70% genetic correlation (the correlation between additive genetic effects) between OSA and BMI, somewhat higher than the approximately 60% genetic correlation estimated in a prior US based family study [18]. A PRS for BMI (derived from a prior GWAS as a weighted summation of genome-wide BMI associated alleles that predicts obesity risk) also predicted OSA risk in Finnish individuals. The MR analysis used genetic variants strongly associated with BMI (exposure) as an instrument to infer the causal relationship between BMI and OSA (outcome). Their results further supported a causal relationship between obesity and increased risk of OSA, consistent with previous evidence that weight loss improves OSA [18, 23]. However, MR assumes no association between the genetic instrument and outcome independent of exposure (no horizontal pleiotropy), which was not thoroughly addressed in this study. Specifically, some of the BMI-associated single nucleotide polymorphism instruments were previously shown to be involved in neuronal processes [17], and may be directly associated with OSA, in which case the assumption of no horizontal pleiotropy would be violated.
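To make the MR logic and the horizontal pleiotropy concern more concrete, the following is a minimal sketch of a fixed-effect inverse-variance weighted (IVW) estimate built from per-variant Wald ratios. IVW is a commonly used MR estimator, but it is shown here as an assumption for illustration and is not necessarily the method used by Strausz et al. [12]. All summary statistics below are hypothetical, as are the function names.

```python
import math
from typing import List, Sequence, Tuple


def wald_ratios(beta_exposure: Sequence[float], beta_outcome: Sequence[float]) -> List[float]:
    """Per-variant causal estimates: SNP-outcome effect divided by SNP-exposure effect."""
    return [by / bx for bx, by in zip(beta_exposure, beta_outcome)]


def ivw_estimate(
    beta_exposure: Sequence[float],
    beta_outcome: Sequence[float],
    se_outcome: Sequence[float],
) -> Tuple[float, float]:
    """Fixed-effect IVW estimate and standard error from summary statistics."""
    # First-order weights: inverse variance of each Wald ratio, (bx / se_y)^2.
    weights = [(bx / se) ** 2 for bx, se in zip(beta_exposure, se_outcome)]
    ratios = wald_ratios(beta_exposure, beta_outcome)
    estimate = sum(w * r for w, r in zip(weights, ratios)) / sum(weights)
    std_error = math.sqrt(1.0 / sum(weights))
    return estimate, std_error


# Hypothetical BMI instruments: SNP-BMI effects, SNP-OSA effects (log-odds), and SEs.
bx = [0.10, 0.05, 0.08]
by = [0.05, 0.025, 0.04]   # every variant implies the same causal effect (ratio 0.5)
se = [0.01, 0.01, 0.01]
valid_estimate, valid_se = ivw_estimate(bx, by, se)

# Add one hypothetical variant with a direct effect on OSA that bypasses BMI.
biased_estimate, _ = ivw_estimate(bx + [0.06], by + [0.09], se + [0.01])

print(f"valid instruments: {valid_estimate:.2f} (SE {valid_se:.3f})")
print(f"with one pleiotropic variant: {biased_estimate:.2f}")
```

In this toy example, a single instrument that acts on the outcome through a path other than BMI pulls the pooled estimate away from the value implied by the valid instruments, which is why sensitivity analyses that are robust to pleiotropy are generally recommended alongside IVW.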
Strausz et al. [12] also made important steps in addressing the underlying complexity of OSA and its many comorbidities, which may have causal, bi-directional or pleiotropic associations. Their study identified high genetic correlations between OSA and type 2 diabetes, hypertension, coronary artery disease (CAD), stroke, depression, hypothyroidism, asthma and rheumatic diseases, consistent with previous epidemiological and clinical observations [2, 24–27], and suggested underlying genetic overlap (or pleiotropy). Comparing the attenuation of genetic correlations between OSA and these disorders, with and without BMI adjustment, suggests a differential role for BMI in the shared genetic regulation of OSA and these comorbidities. For example, the 0.38 genetic correlation of CAD and OSA is attenuated to 0.24 with BMI-adjusted OSA, suggesting the shared genetic background between CAD and OSA is partially attributable to obesity, while the genetic correlation of stroke and OSA (0.33) is not attenuated with BMI-adjusted OSA (0.32). Interestingly, a significant genetic correlation between OSA and depression was observed (0.43), which was only modestly attenuated after BMI adjustment (0.33), consistent with underlying neurological bases for both disorders. However, when interpreting the reported pleiotropic associations, it is important to note that the OSA definitions used included comorbidities such as hypertension and cardiac disease for cases when the AHI was 5 to 15, potentially biasing the estimated genetic associations for the same disorders used in the definition of OSA. Given the potential to target disease mechanisms that overlap multiple disorders, future work at dissecting the shared genetic architecture of common causes of OSA-related morbidity may identify novel intervention strategies while further elucidating which morbidities reflect common risk factors versus modifiable consequences of OSA.
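The differing degrees of attenuation quoted above can be summarised with simple arithmetic. The sketch below computes the relative attenuation of each genetic correlation using only the point estimates cited in this editorial; the relative attenuation is a descriptive quantity introduced here for illustration, not a statistic reported by Strausz et al. [12], and it ignores the standard errors of the estimates.

```python
def relative_attenuation(rg_unadjusted: float, rg_adjusted: float) -> float:
    """Fraction of the unadjusted genetic correlation lost after BMI adjustment."""
    return (rg_unadjusted - rg_adjusted) / rg_unadjusted


# (unadjusted, BMI-adjusted) genetic correlations with OSA, as quoted in the text.
reported = {
    "coronary artery disease": (0.38, 0.24),
    "stroke": (0.33, 0.32),
    "depression": (0.43, 0.33),
}

attenuation = {
    disorder: relative_attenuation(unadjusted, adjusted)
    for disorder, (unadjusted, adjusted) in reported.items()
}

for disorder, fraction in attenuation.items():
    print(f"{disorder}: {fraction:.0%} attenuation")
```

On these point estimates, roughly a third of the CAD correlation and about a quarter of the depression correlation are lost after BMI adjustment, compared with only a few per cent for stroke.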
In their study, a PRS of five genome-wide significant loci of OSA was calculated in replication analysis [12]. The top quintile of their PRS was associated with 1.24-fold increased odds of OSA in an independent cohort, providing replication evidence for the combined effects of the top OSA signals from the BMI-unadjusted results. Since the PRS is increasingly used to predict individual risk, constructing robust PRS for OSA (and/or its subtypes) using larger samples may provide powerful epidemiological and clinical tools for screening patients at risk or predicting those likely to benefit from specific interventions.
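As a rough illustration of how a small PRS of this kind is built and evaluated, the sketch below computes a weighted sum of risk-allele dosages and then the odds ratio for the top quintile of the score. The per-allele weights, genotypes, scores and case labels are all hypothetical, and comparing the top quintile against everyone else is an assumption made here, since the reference group is not specified above.

```python
from typing import List, Sequence


def polygenic_risk_score(dosages: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted sum of risk-allele dosages (0, 1 or 2 copies per variant)."""
    if len(dosages) != len(weights):
        raise ValueError("dosages and weights must have the same length")
    return sum(d * w for d, w in zip(dosages, weights))


def top_quintile_flags(scores: Sequence[float]) -> List[bool]:
    """Flag individuals at or above the 80th-percentile cut-off (ties are all included)."""
    ranked = sorted(scores)
    cutoff = ranked[(4 * len(ranked)) // 5]
    return [s >= cutoff for s in scores]


def odds_ratio(flags: Sequence[bool], is_case: Sequence[bool]) -> float:
    """Odds of being a case in the flagged group relative to the unflagged group."""
    pairs = list(zip(flags, is_case))
    flagged_cases = sum(1 for f, c in pairs if f and c)
    flagged_controls = sum(1 for f, c in pairs if f and not c)
    other_cases = sum(1 for f, c in pairs if not f and c)
    other_controls = sum(1 for f, c in pairs if not f and not c)
    # Raises ZeroDivisionError if a required cell is empty; fine for a sketch.
    return (flagged_cases * other_controls) / (flagged_controls * other_cases)


# Hypothetical per-allele log-odds weights for five lead variants.
HYPOTHETICAL_WEIGHTS = [0.05, 0.04, 0.06, 0.03, 0.07]
example_score = polygenic_risk_score([2, 1, 0, 1, 2], HYPOTHETICAL_WEIGHTS)

# Toy cohort of ten people with precomputed scores and case status.
toy_scores = [0.10, 0.20, 0.30, 0.40, 0.50, 0.60, 0.70, 0.80, 0.90, 1.00]
toy_cases = [False, True, False, False, True, False, False, False, True, False]
toy_flags = top_quintile_flags(toy_scores)
toy_or = odds_ratio(toy_flags, toy_cases)

print(f"example PRS: {example_score:.2f}")
print(f"top-quintile odds ratio in toy cohort: {toy_or:.1f}")
```

In practice the weights would come from an independent discovery GWAS, and the odds ratio would be estimated with a regression model that adjusts for covariates such as age, sex and ancestry principal components.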
The loci reported by Strausz et al. [12] are novel for OSA and their study did not replicate genetic loci identified in previous smaller GWAS that analysed quantitative OSA phenotypes [13–15]. One reason is that individuals in clinic-based samples may be more symptomatic and more likely to be diagnosed with OSA due to underlying comorbidities, while those in research cohorts with undiagnosed disease may be less symptomatic and generally healthier. A second reason is that the traits analysed are different. The research cohorts have focused on quantitative traits (e.g. AHI during non-rapid eye movement sleep in males, severity of overnight hypoxaemia, and average respiratory event duration) that likely reflect physiological properties of OSA that are not well captured by a simple binary OSA outcome defined using hospitalised cases. There are also ancestry differences between samples used in prior studies (admixed Hispanic/Latinos and multi-ancestry population) and the Finnish (a more isolated population [28]), limiting generalisability of genetic findings. Given the high prevalence of OSA in non-white individuals, large studies with diverse and well-phenotyped populations are needed to ensure findings are applicable to populations with high disease burden.
Numerous international efforts at expanding biobanks include the “All of Us” and Million Veteran programmes in the USA, the Austria Biobank Graz, and the China Shanghai Zhangjiang Biobank. These growing resources will provide additional opportunities for large scale GWAS and meta-analyses of OSA, derivations of PRS across more diverse populations, and interrogation of rare and functional variants and molecular pathways, as well as further evaluations of pleiotropy and causal associations, as nicely demonstrated in the current study [12]. A key challenge to such biobank studies, however, centres on the availability of unbiased and informative phenotype data. The overall under-recognition of OSA across medical systems may most affect the integrity of control samples, with many labelled controls representing cases of undiagnosed OSA. The differential likelihood of diagnosis among certain groups, such as in males, whites, patients reporting certain symptoms or with comorbidities, or those with better health insurance, could bias study findings (e.g. leading to spuriously high estimates of pleiotropy between OSA and cardiac disease) and exacerbate health disparities (e.g. if female or minority cases are not included in analyses used to inform research and clinical decision making). Moreover, the vastly different prevalence rates of OSA across biobanks underscore the challenges in combining such data. A critical need is to improve the recognition of OSA across patient populations (such as implementing clinical screening algorithms that mitigate the effects of biased under-reporting), as well as to improve the visibility and accessibility of well-annotated sleep data within the electronic medical record to sleep researchers.
While the current study used the best available measures of OSA in their biobank, future improvements in the ability to analyse quantitative sleep data embedded within health records are essential for enhancing sleep phenotyping and for setting the foundation for sleep precision medicine. The lack of quantitative phenotype data to characterise disease severity and to subtype OSA using physiological traits reduces the power within biobanks to identify genetic variants that influence OSA subtypes that vary markedly across patients, including those that differentially respond to treatment [4]. Several promising approaches may help overcome the current limitations in biobanks. Application of natural language processing that interrogates sleep study reports and sleep clinic records to derive and extract more detailed phenotypes, and advanced electronic medical record phenotyping tools [29] that improve diagnostic phenotyping accuracy, may enrich the available data and reduce bias. Variants discovered to associate with crude measures of OSA in large datasets can be followed up with carefully designed animal and human laboratory studies and with studies of molecular functions using metabolites and protein expression. Another approach is to triangulate different sources of data that vary in size and granularity. For example, PRS for cruder phenotypes can be developed in large datasets and then tested in smaller datasets of deeply phenotyped data to refine genetic estimates and mechanistic inferences. It will be particularly interesting to systematically compare the genetic similarity (using genetic correlations and PRS) of different OSA traits (simple composites, specific endotypes) by cohort, sex and ancestry to identify which genetic risk factors are shared across divergent OSA traits and populations. Use of multi-trait meta-analysis (e.g. combining data using multiple measures of OSA) may further enhance statistical power by improving the reliability of the phenotypes and thus identify additional novel genetic loci and pathways [30]. Given the sexual dimorphism of OSA, GWAS stratified by sex or gene-sex interaction analyses are important to include in all new research.
As the results of Strausz et al. [12] highlight, a major opportunity and challenge in sleep apnoea genetics is to thoughtfully consider the role of adiposity in genetic risk for OSA. The genetics of obesity itself is highly complex. Some adiposity genes more specifically influence BMI and others more strongly influence specific fat depots, many associations demonstrate sexual dimorphism, and numerous adiposity genes are expressed in neurological tissues and influence central brain processes that may inter-relate with sleep–wake and possibly ventilatory phenotypes [17]. As reported by Strausz et al. [12], underlying pleiotropic associations vary in the extent to which they are explained by obesity. Future research is needed that carefully disambiguates adiposity-dependent from adiposity-independent OSA pathways, recognising that use of BMI as a covariate may inadequately adjust for differences in body fat distribution. When identifying a genetic signal associated with both OSA and obesity, it will be important to determine whether the inferred mechanism reflects the mechanical effects of obesity on airway size and patency, or rather pleiotropic effects of obesity-related genes on ventilatory drive, neuromuscular compensation, or other intermediate pathways. Genetic associations that become stronger after adjusting for obesity are of great interest as they suggest underlying physiological mechanisms that may be targets for pharmacological intervention (ventilatory control), or possibly ones that influence craniofacial anatomy rather than obesity. Adiposity and OSA are tightly intertwined and teasing out the shared versus unique mechanisms linking these disorders presents important challenges and opportunities.
Finally, essential to harnessing genetics to advance sleep apnoea is the recognition of the critical need for very large data resources that include data on diverse well-phenotyped populations. The UK Biobank, TOPMed, and other programmes have successfully promoted collaborative research across large multi-disciplinary teams, recently bringing together expertise, data and tools that have accelerated levels of scientific rigour and productivity in an unprecedented manner. Collaboration across the sleep community at this scale has not yet been realised. International professional organisations should further support the sleep research community in efforts to improve standards for clinical sleep data collection, archival and representation in the medical record, as well as help support collaboration and data sharing needed to advance the field. The National Sleep Research Resource (www.sleepdata.org) contains a large easily accessible resource of over 35 000 sleep studies (sharing 2 terabytes of sleep data per week to the international community), as well as links to a suite of open-source advanced signal processing tools (http://zzz.bwh.harvard.edu/luna) and will be integrated into the National Heart, Lung, and Blood Institute's cloud-based BioData Catalyst, which is planned to facilitate linkage of sleep studies with genotype data. A tantalising next step is to leverage the approaches and tools from this (and other resources) to enrich medical record-based biobanks, thus achieving the goals of generating data of sufficient size and granularity, broadly accessible to the sleep community.
Footnotes
Conflict of interest: H. Wang has nothing to disclose.
Conflict of interest: M.O. Goodman has nothing to disclose.
Conflict of interest: T. Sofer has nothing to disclose.
Conflict of interest: S. Redline reports grants from NIH, during the conduct of the study; and grants and personal fees from Jazz Pharma. In addition, S. Redline is the first incumbent of an endowed professorship donated to the Harvard Medical School by Dr. Peter Farrell, the founder and Board Chairman of ResMed, through a charitable remainder trust instrument, with annual support equivalent to the endowment payout provided to the Harvard Medical School during Dr. Farrell's lifetime by the ResMed Company through an irrevocable gift agreement.
Support statement: Supported by National Heart, Lung, and Blood Institute (grant: R35 135818). Funding information for this article has been deposited with the Crossref Funder Registry.
- Received December 28, 2020.
- Accepted February 10, 2021.
- Copyright ©The authors 2021. For reproduction rights and permissions contact permissions{at}ersnet.org