Abstract
Background Elevated exhaled nitric oxide fraction at a flow rate of 50 mL·s−1 (FENO50) is an important indicator of T-helper 2-driven airway inflammation and may aid clinicians in the diagnosis and monitoring of asthma. This study aimed to derive Global Lung Function Initiative reference equations and the upper limit of normal for FENO50.
Methods Available individual FENO50 data were collated and harmonised using consensus-derived variables and definitions. Data collected from individuals who met the harmonised definition of “healthy” were analysed using the generalised additive models of location, scale and shape (GAMLSS) technique.
Results Data were retrospectively collated from 34 782 individuals from 34 sites in 15 countries, of whom 8022 met the definition of healthy (19 sites, 11 countries). Overall, height, age and sex only explained 12% of the between-subject variability of FENO50 (R2=0.12). FENO device was necessary as a predictor of FENO50, such that the healthy range of values and the upper limit of normal varied depending on which device was used. The range of FENO50 values observed in healthy individuals was also very wide, and the heterogeneity was partially explained by the device used. When analysing a subset of data in which FENO50 was measured using the same device and a stricter definition of health (n=1027), between-site heterogeneity remained.
Conclusion Available FENO50 data collected from different sites using different protocols and devices were too variable to develop a single all-age reference equation. Further standardisation of FENO devices and measurement is required before population reference values can be derived.
Tweetable abstract
The GLI network has collated FENO50 values from healthy individuals. Due to heterogeneity between sites and FENO devices, it was not possible to develop a single all-age reference equation. Further standardisation is required. https://bit.ly/3srBeA6
Introduction
Nitric oxide (NO) is a ubiquitous intra- and inter-cellular messenger whose synthesis may largely vary due to the complexity of the underlying biological mechanisms regulating the NO synthases [1]. Acute or chronic inflammatory diseases, including asthma, increase NO synthesis via transcription of the inducible synthases [2]. Elevated concentrations of the fraction of exhaled nitric oxide (FENO) are associated with airway inflammation, especially eosinophilic T-helper 2 (Th2)-driven inflammation, and may be useful in diagnosing and monitoring asthma [3, 4]. Within clinical guidelines, it is recommended that FENO at a flow rate of 50 mL·s−1 (FENO50) [5] is used to detect Th2-driven inflammation, predict inhaled corticosteroid response, assess treatment compliance, select patients with severe asthma for biological treatment and monitor people with a diagnosis of asthma [6].
Unlike other pulmonary function tests, for which results are related to population norms and expressed as % predicted or z-scores, FENO50 is usually interpreted against fixed high cut-off values [6–9]. Cut-offs are used because population-based studies of “healthy” individuals consistently show that the distribution of FENO50 values is right-skewed and overlaps substantially with the distribution in people with stable or controlled asthma. The cut-off values are derived in studies of children and adults with a confirmed diagnosis of asthma and anchored to clinically relevant end-points such as sputum eosinophil count or response to inhaled corticosteroids. However, several factors influence FENO50 values, including age, height, sex, smoking, allergen exposure, rhinovirus infections and nitrate intake [6, 10–13]. Therefore, using fixed cut-offs that do not consider these non-asthmatic factors may misclassify individuals.
Previous studies have developed reference equations for FENO50 in single populations and found that the upper limit of normal varies with age, height and biological sex [14]. Comparing these reference equations demonstrates considerable differences between the upper limit of normal defined within the published literature. Employing the same methodology that has proven successful for the standardisation of spirometry by the Global Lung Function Initiative (GLI) [15–18], we aimed to develop reference equations for FENO50 using data from many populations and validate the discriminative ability of the upper limit of normal to differentiate individuals with a confirmed or suspected diagnosis of asthma.
Methods
An application was approved for a European Respiratory Society (ERS) Task Force to develop all-age reference equations for FENO50. The Task Force comprised scientists and healthcare professionals with expertise in developing international guidelines, lung physiology, lung function testing and biostatistics.
A pragmatic review of the literature in MEDLINE, Embase, Web of Science, Scopus and Cochrane Library (supplementary tables S1–S5) was conducted to identify published studies that included measurement of FENO50 in healthy individuals and those with confirmed or suspected asthma, COPD or primary ciliary dyskinesia (PCD). The authors of studies with at least 50 participants were contacted and invited to share their data with the Task Force. Invitations were also circulated through international and local respiratory societies to solicit unpublished data.
An online secure data portal (REDCap) [19] was used to capture individual data. In addition to providing NO data, the following mandatory variables were requested: sex, age, height, weight, atopy status and cigarette smoking status (in the last 12 months). Individuals with missing mandatory values were excluded. All data were pseudo-anonymised before submission and entered into a standard data template; initial data cleaning was performed and contributors were contacted directly to clarify discrepancies. If centres contributed more than one data point per individual, one measurement was randomly selected. Individual-level data were collected from healthy individuals to define the reference range. Data from individuals with a confirmed or suspected diagnosis of asthma, COPD or PCD were collected to investigate the discriminative ability of the upper limit of normal to differentiate between health and disease. Meta-data describing the study population, FENO device and methodology were also collected. A series of questions (supplementary tables S6 and S7) were asked to verify that submitted data met all acceptability and repeatability criteria outlined in the 2005 American Thoracic Society (ATS)/ERS recommendations [20]. Data collected from sites where we could not confirm expiratory flow rates were excluded. A summary of the included sites is presented in supplementary table S6.
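The rule that one measurement was randomly selected when a centre contributed several data points per individual can be illustrated with a minimal sketch. The record layout (identifier, FENO50 value), the function name and the fixed seed are assumptions made for illustration; the source does not describe how the random selection was implemented.

```python
import random
from collections import defaultdict


def select_one_per_individual(records, seed=2023):
    """Keep one randomly chosen measurement per individual.

    `records` is a list of (individual_id, feno50_ppb) tuples. The layout
    and the fixed seed are illustrative assumptions, not the Task Force's
    actual procedure.
    """
    rng = random.Random(seed)  # local generator so the choice is reproducible
    by_person = defaultdict(list)
    for person_id, feno in records:
        by_person[person_id].append(feno)
    # Draw exactly one value for each individual
    return {pid: rng.choice(values) for pid, values in by_person.items()}


records = [("A", 12.0), ("A", 15.0), ("B", 22.0),
           ("C", 9.0), ("C", 11.0), ("C", 10.0)]
selected = select_one_per_individual(records)
```

Selecting a single value per person keeps the observations independent, which the between-subject variability estimates rely on.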
“Healthy” individuals were defined as nonsmokers within the last year, with no history of self-reported or physician-diagnosed atopy (including eczema, rhinitis or positive skin prick test/total IgE >110 kU·L−1) or respiratory disease (e.g. asthma, COPD). Obese individuals were excluded. In all but one included study, atopy was confirmed using positive skin prick test/IgE levels; this study was excluded from a “strictly healthy” definition which also excluded overweight and obese individuals and those who had ever smoked. We assumed all individuals under 12 years old were never-smokers and were not diagnosed with COPD or PCD. Rhinitis, eczema, sinusitis, chronic bronchitis and nasal polyps were not mandatory variables; nonetheless, healthy participants with confirmation of any of these were excluded. We assumed that these individuals were healthy if these variables were not reported. Individuals under 4 years old and over 80 years old were also excluded. A sensitivity analysis was conducted using the “strictly healthy” definition; strictly healthy individuals fulfilled all criteria for healthy plus the additional criteria that no assumptions were made for any of the mandatory variables, meaning that subjects with unknown smoking status, with an unknown history of ever smoking or with an unknown history of asthma, COPD or atopy were not considered as strictly healthy.
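The two definitions above can be read as a pair of filters, sketched below. This is one possible interpretation rather than the Task Force's code: the field names are hypothetical, and treating an unreported value as "absent" under the healthy definition (while rejecting it under the strictly healthy definition) is an assumption drawn from the wording of this paragraph.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Participant:
    # All field names are hypothetical; None means "not reported".
    age_years: float
    bmi_category: str                    # "normal", "overweight" or "obese"
    smoked_last_year: Optional[bool]
    ever_smoked: Optional[bool]
    atopy: Optional[bool]
    respiratory_disease: Optional[bool]  # e.g. asthma, COPD


def is_healthy(p: Participant, strict: bool = False) -> bool:
    """Sketch of the 'healthy' and 'strictly healthy' filters."""
    if not 4 <= p.age_years <= 80:
        return False
    if p.bmi_category == "obese":
        return False
    if strict and p.bmi_category == "overweight":
        return False
    flags = [p.smoked_last_year, p.atopy, p.respiratory_disease]
    if strict:
        flags.append(p.ever_smoked)
        # Strict definition: no assumptions, so an unreported flag disqualifies
        if any(f is None for f in flags):
            return False
    # Healthy definition: an unreported flag is assumed absent, which also
    # covers the assumption that children under 12 are never-smokers
    return not any(f is True for f in flags)


child = Participant(8, "normal", None, None, False, False)
adult = Participant(40, "overweight", False, False, False, False)
```

Here `child` counts as healthy but not strictly healthy (smoking status unreported), and `adult` counts as healthy but not strictly healthy (overweight).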
Statistical analysis
The reported FENO50 values were visualised by plotting them against height, age or body mass index (BMI), stratified by sex; suspected outliers were confirmed with study sites or against established international cut-offs (e.g. obese individuals were excluded from the healthy population if they had a BMI >30 kg·m−2 in adults, or if BMI centile for age was ≥85th for children) [21]. In addition, children with height-for-age or weight-for-age z-scores <−5 or >5 were considered outliers and removed (figure 1). FENO50 values <2 ppb were excluded as not biologically plausible across the 4–80-year age range. Differences between sites and FENO devices were first explored using the observed FENO50 values.
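The numerical exclusion rules in this paragraph can be written as a single check. The sketch below assumes hypothetical argument names, treats individuals under 18 years as children, and simply skips a rule when the corresponding value is unavailable; none of these choices is specified in the text.

```python
def passes_plausibility_checks(feno50_ppb, age_years, bmi=None,
                               bmi_centile=None, height_z=None, weight_z=None):
    """Return True if a record survives the exclusion rules described above.

    Argument names are illustrative; a rule is skipped when its input is None.
    """
    if feno50_ppb < 2:
        return False  # not biologically plausible
    if age_years >= 18:
        if bmi is not None and bmi > 30:
            return False  # adult BMI cut-off, kg per square metre
    else:
        if bmi_centile is not None and bmi_centile >= 85:
            return False  # BMI centile for age cut-off in children
        for z in (height_z, weight_z):
            if z is not None and abs(z) > 5:
                return False  # implausible height-for-age or weight-for-age
    return True
```

For example, `passes_plausibility_checks(1.5, 30, bmi=24)` is rejected on the FENO50 value alone, whereas `passes_plausibility_checks(18, 30, bmi=24)` is retained.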
The generalised additive models of location, scale and shape (GAMLSS) technique [22], previously used for other GLI Task Forces, was used to define the reference range of FENO50 values. Briefly, the GAMLSS technique allows the median value (mu) to be modelled as a function of multiple explanatory variables (e.g. height, age, sex), the spread of values around the median (sigma) to be constant or to vary as a function of an explanatory variable, and any departure from a normal distribution (skewness, lambda) to be transformed to normal using a Box-Cox transformation. Thus, the resulting model residuals will be normally distributed. Previous GLI reference equations have relied on the Box-Cox Cole and Green family; however, the distribution of the FENO50 data has a heavy right skew even after the log transformation of FENO50 values, requiring a more complex model. For FENO50 values, we used the Box-Cox-t distribution to allow a fourth parameter (tau) for extreme values. The goodness of fit was assessed by the Schwarz Bayesian criterion, Q-Q plots and worm plots. Analysis was done using the GAMLSS package in the statistical programme R (version 4.2.1, www.r-project.org).
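To make the role of the parameters concrete, the sketch below implements the standard Box-Cox Cole and Green (LMS) z-score and its inverse, which is the form used by earlier GLI equations. It is not the fitted FENO50 model: the parameter values are invented for illustration, the 95th percentile (z=1.645) is assumed as the upper limit of normal, and under the Box-Cox-t family the transformed value would be referred to a t distribution with tau degrees of freedom rather than a standard normal.

```python
import math


def bccg_z(y, mu, sigma, lam):
    """z-score of observation y under the Box-Cox Cole and Green (LMS) form."""
    if lam == 0:
        return math.log(y / mu) / sigma
    return ((y / mu) ** lam - 1) / (lam * sigma)


def bccg_quantile(z, mu, sigma, lam):
    """Inverse of bccg_z: the value of y at a given z-score."""
    if lam == 0:
        return mu * math.exp(sigma * z)
    return mu * (1 + lam * sigma * z) ** (1 / lam)


# Hypothetical parameter values, NOT estimates from this study
mu, sigma, lam = 15.0, 0.5, -0.2
uln = bccg_quantile(1.645, mu, sigma, lam)  # assumed 95th-percentile ULN
```

In a fitted model, mu, sigma and lambda would each be predicted from the explanatory variables for a given individual before the z-score is computed.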
The explanatory variables sex, age, height, weight and BMI were evaluated one at a time and then together for each of the four model parameters (mu, sigma, lambda, tau). The variables significantly associated with FENO50 were kept in the final model. We did not investigate race and ethnicity as predictors of FENO50, because race and ethnicity are social constructs without a consistent definition globally, and recent statements endorsed by both the ATS and ERS have recommended against their continued use in reference equations [17]. We also investigated whether there were differences in the median or upper limit of normal based on the analysing method (chemiluminescence or electrochemical cell). To meet our a priori criteria to combine data from multiple sites, the difference between each site (or device) and the average of all sites combined had to be <10 ppb. Similarly, the difference between the upper limit of normal from the combined data and that of each site (or device) had to be <10 ppb.
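The a priori pooling criteria can be sketched as a simple check on observed values. This simplification differs from the analysis described here in ways that should be kept in mind: it uses the empirical median as the "average" and the empirical 95th percentile as the upper limit of normal (both assumptions), and it does not adjust for sex, age or height. The function names are hypothetical.

```python
from statistics import median, quantiles


def p95(values):
    """Empirical 95th percentile (assumed stand-in for the upper limit of normal)."""
    return quantiles(values, n=20, method="inclusive")[-1]


def can_pool(site_values, max_diff_ppb=10.0):
    """Check each site against the pooled data using the <10 ppb criteria.

    `site_values` maps a site (or device) name to its FENO50 values in ppb.
    Returns (all_sites_ok, per_site_result).
    """
    pooled = [v for values in site_values.values() for v in values]
    pooled_median, pooled_uln = median(pooled), p95(pooled)
    report = {}
    for site, values in site_values.items():
        median_ok = abs(median(values) - pooled_median) < max_diff_ppb
        uln_ok = abs(p95(values) - pooled_uln) < max_diff_ppb
        report[site] = median_ok and uln_ok
    return all(report.values()), report


similar = {"A": [10, 12, 14, 16, 18, 20], "B": [11, 13, 15, 17, 19, 21]}
mixed = dict(similar, C=[40, 45, 50, 55, 60, 65])
ok_similar, _ = can_pool(similar)
ok_mixed, report_mixed = can_pool(mixed)
```

With the toy data, the two similar sites satisfy both criteria, while adding a site with much higher values causes the check to fail.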
Results
FENO50 measurements from 34 782 individuals were provided by 34 sites in 15 countries (figure 1). After exclusions, 8022 healthy participants (49% female) across 19 sites and 11 countries were used to define the reference range (table 1). Overall, data were collated across the 4–80-year age range, with relatively fewer observations for individuals aged 25–30 years and 65–80 years (figure 2a). The distribution of FENO50 values was right-skewed (figure 2b). In healthy subjects, 3.9% had values >50 ppb (table 1), with 3.7% of adults and 7.2% of children (i.e. <18 years) having values >35 ppb. The median FENO50 varied between sites within the subset of “healthy” data (figure 3); in many cases, the average difference in FENO50 between sites was >10 ppb. We further investigated whether the site differences persisted after accounting for the differences in sex, height and age between the sites.
Although height, sex and age were statistically significant predictors of average FENO50, the rate of change in FENO50 with height and age was small (median FENO50 increased 0.07 ppb with each year increase in age when holding height and sex constant, figure 4). In addition to being predictors of average FENO50, height and sex were statistically significant predictors of the between-subject variability of FENO50 (i.e. the spread of values around the median predicted value varied by height and sex). Overall, height, age and sex only explained 12% of the between-subject variability (R2=0.120). FENO device was an additional predictor of both the median FENO50 and the between-subject variability, such that the healthy range of values and the upper limit of normal varied depending on which device was used (figure 5). Adding device to the model explained an additional 4% of the variability (R2=0.164). Including site in the model instead of device explained an additional 7% of the variability (R2=0.191). For some devices, the between-subject variability was small (e.g. the coefficient of variation (CV) for FENO50 with the Sievers 280 was 0.51), and there were no observations with FENO50 values outside the upper limit of normal (e.g. NIOX VERO, NIOX FLEX). For other devices, the between-subject variability was twice as large (e.g. the CV for FENO50 with Medisoft was 1.05), such that a larger proportion of healthy individuals would fall outside the upper limit of normal (figure 5). Consequently, it was not possible to define a single reference equation for FENO50 that can be used across all devices.
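The increments quoted above follow directly from the reported R2 values, and the "twice as large" comparison from the two reported CVs. The short worked calculation below uses only numbers stated in this paragraph; the subscript labels are introduced here for readability.

```latex
\begin{aligned}
\Delta R^2_{\text{device}} &= 0.164 - 0.120 = 0.044 \approx 4\% \\
\Delta R^2_{\text{site}}   &= 0.191 - 0.120 = 0.071 \approx 7\% \\
\frac{\mathrm{CV}_{\text{Medisoft}}}{\mathrm{CV}_{\text{Sievers 280}}} &= \frac{1.05}{0.51} \approx 2.06
\end{aligned}
```

Even with site included, roughly 81% of the between-subject variability (1 − 0.191) remains unexplained by the model.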
We further explored differences between sites in a subset of data (n=4254 from seven sites) that used the same device (NIOX MINO). Within this subset, we observed heterogeneity between the sites in terms of the FENO50 and the between-subject variability (figure 6), even after adjusting for differences in height, sex and age between participants in each site.
We further analysed a subset of data meeting our strictly healthy definition (n=1027), such that individuals were included only if no assumptions were made about the inclusion criteria. This excluded one of the largest datasets where atopy status was self-reported and not confirmed with skin prick test or IgE levels. The spread of residuals was still wide (figure 7).
Discussion
Measured values of FENO50 in healthy individuals from different devices across 19 sites varied between individuals. The variability between sites and devices precludes the meaningful collation of data to define a reference range. Even when limiting the analysis to sites that used the same device, heterogeneity in the observed data remained such that it was not appropriate to develop a reference range. Standardisation of FENO50 measurements made using different devices and at different sites is required before robust population reference values can be derived.
This study applied an established methodology, as recommended in a systematic review and supported by the ERS, to determine population reference values for FENO50. Data from 8022 healthy individuals were obtained from nations around the world, using numerous devices, across the age range of 4–80 years. We believe this work has collected FENO50 measurements from the largest number of individuals to date. Therefore, these findings have important implications for ongoing and future research. The heterogeneity of FENO50 measurements between devices and centres is large, and existing reference equations or cut-offs derived from a single study or single device [14, 23, 24] should be applied cautiously in other populations and with other devices.
In the collated dataset, we observed that the distribution of FENO50 in healthy individuals is skewed to the right. Although it was methodologically possible to apply the GAMLSS technique to derive reference equations for this type of data, the heterogeneity of FENO50 data between centres and device types meant it was not methodologically useful to develop a single reference range and upper limit of normal. Forcing a single reference equation would result in some centres under-identifying elevated FENO50 in individuals, while other centres would over-identify elevated FENO50 and would not improve existing site-specific equations. Even within the strictly healthy definition (n=1027), the differences between centres and devices persisted, suggesting that factors other than an individual's health status contribute to differences in FENO50 values between sites. These findings suggest that unmeasured factors, e.g. measurement protocols, population characteristics and even individual-level factors, influence the NO measurement. It is also possible that the smaller sample size used in the strictly healthy definition introduced sampling variability.
Although it is possible to address differences between devices using device-specific reference equations, substantial heterogeneity remains between centres measuring FENO50 using the same device, meaning that adjustment for the device would not provide sufficiently accurate normative data. Further, some devices are no longer commercially available, and in many cases, the number of observations was too small to derive specific equations for all devices.
Establishing reference equations for FENO50 may help clinicians to diagnose and manage chronic respiratory conditions. Unlike other pulmonary function parameters with lower and upper limits of normal [25], low levels of FENO50 do not necessarily imply underlying respiratory disorders, because background synthesis of NO is required for optimal bronchial and pulmonary vascular tone [26, 27]. Elevated FENO50 is associated with conditions such as asthma and COPD but also atopy without respiratory symptoms. As a result, determining the upper limits of normal for FENO50 and other exhaled NO parameters has always been challenging [4, 24, 28], especially for respiratory specialists interested in chronic inflammatory airway diseases [29–31]. The fact that biological pathways resulting in NO synthesis cross-link with those of many key molecules of Th2 inflammation in asthma [2] has made FENO50, together with eosinophils, two major biomarkers in asthma and other Th2-related inflammatory diseases [32–34]. Interestingly, many international guidelines, including the one published by the ATS in 2011 [4] and the recent ERS guidelines for the diagnosis of asthma in adults [35], have set 50 ppb as the optimal cut-off supportive of a diagnosis of asthma. Results from the present study show that <4% of healthy subjects worldwide have a FENO50 >50 ppb (table 1), and 7% of children >35 ppb. Results from figure 2c are in line with current major international asthma guidelines.
It is well established that FENO50 measurements are not interchangeable between different devices [36]. Further, differences in measurement protocol (e.g. single exhalation versus three exhalations and lack of flow registrations) may contribute to the observed differences in the GLI dataset. Until these differences are mitigated through standardised FENO devices and measurement protocols, it is unlikely that a reference equation can be derived for clinical applications applicable across different centres, whether they use the same device or not.
Limitations
The analysis reported here is limited to datasets shared with the GLI Task Force and may not be fully representative of all populations and all devices. Although a literature search was conducted and all corresponding authors were contacted, some centres declined, were unable to gain appropriate approvals to share data or had not collected the mandatory variables. Further, during the conduct of this Task Force, stricter General Data Protection Regulation rules were established, which further limited the sharing of data from some regions of the world. We do not believe that the differences between devices and sites would have been reduced by including data from more sites.
We could not verify the specific methodology for FENO50 measurement used by each site, only what was reported in the meta-data (supplementary material). Therefore, we cannot be sure how much of the inter-site difference between FENO values was attributable to methodological differences. A further limitation is that one dataset contributed the largest proportion of data (approximately one-third of the dataset) and only included self-reported atopy. Therefore, our findings may be influenced by a single study.
Conclusions
Owing to heterogeneity in FENO50 values between sites and FENO devices, it was not possible to develop a single all-age reference equation for FENO50 by collating data collected in healthy individuals. Further standardisation of FENO50 measurement and FENO devices is required before population reference values can be derived.
Footnotes
This document was endorsed by the ERS Executive Committee on 16 October 2023.
The contributing GLI FENO Task Force members (in alphabetical order) include: Rital Amaral, Uppsala University; Vibeke Backer, Rigs Hospitalet; Paolo Cameli, Azienda Ospedaliera Universitaria Senese; Anh Tuan Dinh-Xuan, Cochin Hospital; Sy Duong-Quy, Lam Dong Medical College; Ting Fan Leung, The Chinese University of Hong Kong; Ulrike Gehring, Utrecht University; Graham Hall, Telethon Kids, Australia; Marieann Högman, Uppsala University; Soo-Jong Hong, Asan Medical Center; Thong Hua-Huy, Cochin Hospital; Tiago Jacinto, CINTESIS; Mohamed Jeebhay, University of Cape Town; Fanny Wai San Ko, The Chinese University of Hong Kong; Carla Martins, University of Porto; Charles McSharry, Glasgow University; Anna-Carin Olin, University of Gothenburg; Mario Olivieri, University of Verona; Domingo Pérez Bejarano, Neumolab; Romy Rodriguez, InselBern; Lidwien A.M. Smit, Utrecht University; Woo-Jung Song, Asan Medical Center; Steve Turner, University of Aberdeen; Denis Vinnikov, al-Farabi Kazakh National University.
Conflict of interest: C. Bowerman reports personal payments for work on this manuscript through the Global Lung Function Initiative Clinical Research Collaboration subsidised by the ERS, and a leadership role as junior executive for the Global Lung Function Initiative. S. Stanojevic reports consulting fees from Chiesi Pharmaceuticals, lecture honoraria from Vyaire Medical and advisory board participation with Ndd technologies, outside the submitted work. A.T. Dinh-Xuan reports lecture honoraria from Circassia, outside the submitted work. All other authors have no potential conflicts of interest to disclose.
- Received March 6, 2023.
- Accepted September 22, 2023.
- Copyright ©The authors 2024. For reproduction rights and permissions contact permissions{at}ersnet.org