Abstract
The interpretation of pulmonary function tests (PFTs) to diagnose respiratory diseases relies on expert opinion: the recognition of patterns, combined with the clinical context, to detect specific diseases. In this study, we aimed to explore the accuracy and interrater variability of pulmonologists interpreting PFTs compared with artificial intelligence (AI)-based software that was developed and validated in more than 1500 historical patient cases.
120 pulmonologists from 16 European hospitals evaluated 50 cases with PFT and clinical information, resulting in 6000 independent interpretations. The AI software examined the same data. American Thoracic Society/European Respiratory Society guidelines were used as the gold standard for PFT pattern interpretation. The gold standard for diagnosis was derived from clinical history, PFT and all additional tests.
The pattern recognition of PFTs by pulmonologists (senior 73%, junior 27%) matched the guidelines in 74.4±5.9% of the cases (range 56–88%), with an interrater agreement of κ=0.67 indicating substantial, yet imperfect, agreement. Pulmonologists made correct diagnoses in 44.6±8.7% of the cases (range 24–62%), with large interrater variability (κ=0.35). The AI-based software matched the guideline-based PFT pattern interpretations in 100% of cases and assigned a correct diagnosis in 82% of all cases (p<0.0001 for both measures).
The interpretation of PFTs by pulmonologists leads to marked variations and errors. AI-based software provides more accurate interpretations and may serve as a powerful decision support tool to improve clinical practice.
There is poor accuracy and substantial disagreement between pulmonologists when interpreting complex pulmonary function data. Automating interpretation with artificial intelligence provides a powerful decision support tool in clinical practice. http://ow.ly/Tj9h30nxw4U
Introduction
Pulmonary function tests (PFTs) are our primary tool to evaluate the function of the respiratory system [1]. In practice, interpretation is based on expert opinion and involves the recognition of a pattern (obstructive, restrictive, mixed or normal) and the grading of its severity according to international guidelines [2–4]. To arrive at the final diagnosis, the results of PFTs are combined with patient information, symptoms and, possibly, the results of other tests, such as imaging, blood analysis, biopsies and exercise tests [5, 6].
In 2005, an American Thoracic Society/European Respiratory Society (ATS/ERS) task force designed a simplified algorithm to assess lung function in clinical practice [2]. However, when these recommended guidelines were translated into software for diagnostic decision support, they yielded correct disease predictions in only 38% of cases. Adding patient characteristics into such an algorithm improved the accuracy to 68%, highlighting a vast potential for automated diagnostic labelling when combining PFTs with clinical information [7]. In fact, the Belgian Pulmonary Function Study (BPFS) demonstrated that expert panels could reach 77% accuracy when predicting the diagnosis based on PFTs and clinical history alone [8]. Although one may doubt whether a computer algorithm carries any added value for a group of experts, the question of whether it may help individual readers is yet to be answered.
The number of successful applications of artificial intelligence (AI) is rising quickly. Supported by outstanding achievements in the field and by its potential to deal with big data, high expectations are also emerging in healthcare. For instance, one study demonstrated the ability of an AI algorithm to identify and classify skin cancer with expertise similar to that of 21 board-certified dermatologists [9]. Another study reached the same level of performance when analysing retinal fundus images for the identification of diabetic retinopathy [10]. Moreover, there are multiple examples from radiology in detecting traces of breast and lung cancer [11, 12]. Notwithstanding the technical superiority of AI-based systems, translation into clinical practice with broad acceptance has been very challenging [13–15]. As PFTs are entirely standardised and used worldwide [16], they are ideally suited for the development of AI algorithms for test interpretation and diagnostics. PFTs provide an extensive series of numeric outputs, easily handled by computers, yet the patterns are not always easily perceptible or appropriately recognised by the human eye. Moreover, the example of automated interpretation of ECGs, which is widely adopted and standardised in most equipment, highlights the potential of this approach.
In this study, we hypothesised that AI can improve the clinical reading of PFTs and overcome the variable test interpretation of individual pulmonologists. We explored the accuracy and interrater variability of pulmonologists when interpreting patterns of PFTs, and when suggesting a specific category of respiratory disease diagnosis based on limited clinical information and PFTs. In addition, we compared the pulmonologists' performance with that of AI-based software developed and validated in more than 1500 historical cases.
Methods
Study design
120 pulmonologists from 16 hospitals in five European countries participated in this multicentre non-interventional study. They independently evaluated complete PFTs (pre- and/or post-bronchodilator spirometry, whole-body plethysmography for lung volumes and airway resistance, and diffusing capacity) and limited clinical information (smoking history, cough, sputum and dyspnoea) of 50 randomly selected patients, admitted to the University Hospital Leuven (Leuven, Belgium) for a respiratory problem. Evaluation sessions were performed in each hospital between August 15, 2017 and December 13, 2017. All pulmonologists independently examined different patient cases according to a pre-established protocol by providing: 1) a PFT pattern interpretation: obstructive, restrictive, mixed or normal; 2) a choice of one of nine preferred diagnostic categories: asthma, chronic obstructive pulmonary disease (COPD), other obstructive disease (OBD) (including bronchiectasis, bronchiolitis and cystic fibrosis), interstitial lung disease (including idiopathic pulmonary fibrosis, non-specific interstitial pneumonitis and sarcoidosis), pulmonary vascular disease (including pulmonary hypertension, embolism and vasculitis), neuromuscular disease (including paralysis of the diaphragm, poliomyelitis and myopathy), thoracic deformity (including pneumonectomy, lobectomy, chest wall problems and kyphoscoliosis), healthy and other diseases; and 3) their confidence in the decision on a Likert scale, from 1 point (“absolutely not sure”) to 5 points (“absolutely sure”) (an example is shown in supplementary figures S1 and S2). Finally, the same patient files were examined by in-house developed AI-based software for PFT interpretation and diagnostic suggestion.
Study population
The study included a random sample of 50 subjects prospectively collected at the outpatient clinic of University Hospital Leuven in August 2017. All enrolled subjects were Caucasians aged >18 years who had performed a complete PFT and provided clinical information. The gold standard diagnosis was derived from clinical history, PFT and all necessary additional tests, and was finally confirmed by an expert panel in Leuven. This ad hoc expert panel consisted of three experienced clinicians who reviewed all baseline and clinical follow-up data to agree on a final gold standard diagnosis out of the nine categories. Consensus was reached for all cases. Baseline characteristics are shown in table 1, covering a wide range of respiratory diseases that may present with an abnormal PFT. Other conditions (such as lung cancer, cardiovascular disease, and ear, nose and throat problems) were excluded from the test sample (n=3). The Ethics Committee of the University Hospital Leuven approved the study protocol (approval S60619; August 4, 2017). The study design can be found at ClinicalTrials.gov (identifier NCT03264417). All included patients provided informed consent for the use of their data (approval S60243; June 23, 2017).
AI software
The software for automated reading of PFTs was developed in the R language and its machine learning framework. The software used the same lung function data as input as presented to the pulmonologists (absolute values, percent predicted of normal reference values and z-scores; also shown in supplementary figure S1), combined with patient characteristics: age, pack-years, sex and body mass index. For pattern interpretations, the PFT algorithm followed the ATS/ERS strategies [2]. The engine for the more complex diagnostic categorisation, however, had to be developed from scratch, and a machine learning approach was adopted.
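To illustrate the guideline-based pattern logic, a minimal sketch of the simplified ATS/ERS decision tree is given below. The function and variable names are ours, and the sketch uses subject-specific lower limits of normal (LLN) for the FEV1/FVC ratio and total lung capacity; it illustrates the interpretative strategy, not the authors' production code.

```r
# Minimal sketch of the simplified ATS/ERS interpretative algorithm [2].
# Variable names are illustrative; inputs are the measured values and
# their subject-specific lower limits of normal (LLN).
classify_pattern <- function(fev1_fvc, fev1_fvc_lln, tlc, tlc_lln) {
  obstruction <- fev1_fvc < fev1_fvc_lln  # airway obstruction present?
  restriction <- tlc < tlc_lln            # total lung capacity reduced?
  if (obstruction && restriction) return("mixed")
  if (obstruction) return("obstructive")
  if (restriction) return("restrictive")
  "normal"
}

# Example: FEV1/FVC above its LLN but TLC below its LLN -> "restrictive"
classify_pattern(fev1_fvc = 0.82, fev1_fvc_lln = 0.70,
                 tlc = 4.1, tlc_lln = 5.0)
```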
The machine learning model was built using data from 1430 subjects used in our previous work to ensure a broad variety of data [7, 8, 17]. These data came from two cohorts: 1) the BPFS, a prospective cohort study that enrolled a clinical population-based sample (n=851) of all successive undiagnosed patients admitted for the first time to one of the 33 participating Belgian hospitals due to respiratory symptoms [8], and 2) a retrospectively collected PFT data cohort of patients followed at the outpatient clinic of the University Hospital Leuven based on predefined established diagnoses (neuromuscular disease (n=112), chest/pleural wall problems, including pneumonectomy and lobectomy (n=64), pulmonary vascular disease (n=76), OBD (n=100), COPD (n=47), asthma (n=40), healthy (n=50) and interstitial lung disease (n=90)). Briefly, all subjects were Caucasians aged between 18 and 85 years who had performed a complete PFT (including post-bronchodilator spirometry, whole-body plethysmography for lung volumes and airway resistance, and diffusing capacity). The final diagnosis was established with all additional tests deemed necessary by the responsible clinician, the patients' history and PFTs. Subsequently, it was validated by an ad hoc installed expert panel (BPFS) or by the clinical expert panel taking care of the patients in follow-up (Leuven data). The expert discussions of the BPFS were organised during the local meetings of physicians, at which all individual cases were presented to obtain a final diagnosis by consensus. When there was disagreement, voting was used to settle a final gold standard diagnosis and, if needed, a secondary diagnosis [8]. For the retrospective PFT data collection of patients followed at the University Hospital Leuven, the corresponding medical records were checked against the final diagnosis. In the few cases in which there was doubt about the diagnosis, the PFT data were not extracted and the cases were rejected. Internal 10-fold cross-validation tuned the machine learning model, with the best model reaching a diagnostic accuracy of 74%. To obtain an unbiased estimate of accuracy and validate the findings, the model was run at the Leuven pulmonary service on a randomly selected sample of 136 subjects, where it demonstrated a consistent diagnostic accuracy of 76% [17]. The probabilistic output for each of the diagnostic categories obtained by the machine learning model was summarised in a report (supplementary figure S3).
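The paper does not name which R machine learning framework was used; the sketch below shows how internal 10-fold cross-validation for model tuning might look with the caret package, using synthetic stand-in data and a placeholder learner.

```r
# Sketch of internal 10-fold cross-validation for model tuning. caret is
# an assumption, and the learner and data below are placeholders, not the
# authors' actual configuration.
library(caret)

set.seed(42)
# Synthetic stand-in for the 1430-subject development set; columns mimic
# PFT z-scores plus patient characteristics.
pft_data <- data.frame(
  diagnosis = factor(sample(c("COPD", "asthma", "ILD"), 400, replace = TRUE)),
  fev1_z    = rnorm(400), fvc_z = rnorm(400), tlc_z = rnorm(400),
  dlco_z    = rnorm(400), age = runif(400, 18, 85)
)

ctrl <- trainControl(method = "cv", number = 10,  # internal 10-fold CV
                     classProbs = TRUE)           # keep per-class probabilities

fit <- train(diagnosis ~ ., data = pft_data,
             method = "svmRadial",  # placeholder learner
             trControl = ctrl,
             tuneLength = 5)        # explore 5 hyperparameter settings

fit$results  # cross-validated accuracy per setting, cf. the reported 74%
```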
Pulmonary function tests
All PFTs were performed with standardised equipment (MasterLab; Jäger, Würzburg, Germany) by respiratory technicians, according to ATS/ERS criteria [18]. Spirometry data, as well as plethysmography and single-breath diffusing capacity data, were given as absolute values, but also expressed as percent predicted of normal reference values and as z-scores [19–21]. In the current prospective study, these data were presented to the AI software and the pulmonologists, the latter also having access to the corresponding flow–volume loops, plethysmography and diffusing capacity manoeuvres.
Statistical analysis
Statistical analysis was performed using R version 3.3.3 (R Foundation for Statistical Computing, Vienna, Austria). Figures were produced using Prism version 6 (GraphPad, La Jolla, CA, USA). Interobserver agreement was assessed using Fleiss' κ for multiple raters on categorical data. The interpretative strategies for lung function tests from the ATS/ERS task force were used as the gold standard to define a correct lung function pattern [2]. The preferred diagnostic category, by pulmonologists or software, was considered correct if it corresponded to the gold standard diagnosis made historically by the expert panel based on all data. For both measures, i.e. PFT pattern interpretation and diagnostic category suggestion, accuracy was defined as the percentage of correctly labelled cases. The t-test and the Mann–Whitney U-test were used to evaluate differences between groups for normally and non-normally distributed data, respectively. The Kruskal–Wallis test was used to determine statistical differences between multiple groups. The one-sample t-test was used to assess the difference between the AI performance and the average accuracy of the pulmonologists. Results are presented as mean with standard deviation or as median with range.
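As an illustration of the two core analyses, the sketch below computes Fleiss' κ and the one-sample t-test in R. The irr package is an assumption (the paper only states that R 3.3.3 was used), and the data are synthetic placeholders with the dimensions described in the text.

```r
# Sketch of the agreement and accuracy analyses; irr is assumed and the
# data are simulated stand-ins, not the study data.
library(irr)

set.seed(7)
# ratings: 50 patient cases (rows) x 120 raters (columns), each cell the
# pattern chosen by one pulmonologist for one case.
ratings <- matrix(sample(c("obstructive", "restrictive", "mixed", "normal"),
                         50 * 120, replace = TRUE),
                  nrow = 50, ncol = 120)
kappam.fleiss(ratings)  # Fleiss' kappa for multiple raters

# One-sample t-test of per-reader diagnostic accuracy against the fixed
# AI accuracy of 82% (reader accuracies simulated from the reported
# mean 44.6% and SD 8.7%).
reader_accuracy <- rnorm(120, mean = 0.446, sd = 0.087)
t.test(reader_accuracy, mu = 0.82)
```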
Results
Altogether, the 120 pulmonologists made 6000 evaluations of PFTs with clinical information. The group comprised more senior members (n=88, established pulmonologists) than junior members (n=32, pulmonologists in training). A minimum of five pulmonologists per centre was required for participation.
PFT pattern interpretations
Applying the ATS/ERS interpretative strategies for PFTs revealed that the population consisted of 18 patients with an obstructive pattern, 10 patients with a restrictive pattern and 22 patients with a normal lung function pattern, while there were no subjects with a mixed pattern. The interpretations of 118 pulmonologists (data were missing from two) matched the reference PFT pattern in 74.4±5.9% of the 50 cases, ranging from 56% to 88% per individual. The identification of a restrictive pattern was more difficult (positive predictive value 59% and sensitivity 75%) than that of normal and obstructive patterns (table 2). Even though a mixed pattern was not present, 376 (6%) evaluations reported one. A κ value of 0.67, although conventionally labelled substantial agreement, still left considerable variability between pulmonologists. When accuracy was compared between centres, no significant differences in correct detections were found (p=0.06) (figure 1a). There were no significant differences between university and non-university centres (p=0.06) or between senior and junior readers (p=0.49). Interestingly, of the 285 normal patterns misclassified as obstructive, 216 (76%) concerned the four cases with a forced expiratory volume in 1 s (FEV1)/forced vital capacity (FVC) ratio above the lower limit of normal but still below the fixed 0.7 cut-off.
Preferred diagnostic categories
For an individual pulmonologist, it was rather difficult to assign a correct preferred diagnostic category based on complete PFT data and clinical information. The mean accuracy of the 6000 evaluations was only 44.6±8.7%, ranging from 39% to 51% per centre and from 24% to 62% per individual pulmonologist (figure 1b). A low κ of 0.35 indicated widespread disagreement between pulmonologists. Interestingly, neither age nor clinical experience of the examiners influenced the mean accuracy (seniors 45±4.2% versus juniors 43.6±4.8%; p=0.46). Likewise, results differed neither between hospitals (p=0.44) nor by hospital type (university 44.1±9.4% versus non-university 45.2±7.8%; p=0.47) or country (p=0.26).
Healthy subjects (true positive rate 71%) and subjects with COPD (true positive rate 65%) were identified from lung function more often than any other category. Patient cases of less prevalent conditions, without a straightforward pattern (“fingerprint”) on lung function, were more difficult for the pulmonologists (thoracic deformity and neuromuscular disease, true positive rate 25%; asthma, true positive rate 20%). A detailed statistical group comparison is shown in table 3 and supplementary figure S4.
Confidence in decision making
Pulmonologists were rarely “absolutely not sure” (2.7% of cases) or “not sure” (11.5%) when suggesting the preferred diagnostic category; most commonly they were “sure” (36.5%) or “absolutely sure” (16%). Higher confidence was observed in diagnostic suggestions that were correct than in incorrect ones (p<0.0001). However, high confidence did not necessarily lead to a correct diagnosis: of all “sure” and “absolutely sure” records, only 51.8% of the diagnoses were correct (supplementary figures S5 and S6).
Comparison with the AI software
The in-house developed AI-based software matched the pattern interpretations of the ATS/ERS guidelines perfectly (100%). The software responded within 0.2 s, giving immediate and consistent interpretations. Moreover, it assigned a correct diagnostic category in 82% of the cases, greatly superior to the average 44.6% accuracy of the pulmonologists (p<0.0001) (figure 2). It also proved highly sensitive in recognising COPD, neuromuscular disease, interstitial lung disease and healthy subjects, and showed a high positive predictive value for the majority of the respiratory disease diagnoses (figure 3 and table 4). Both the sensitivity and the positive predictive value of the AI-based algorithm were superior to expert-based diagnostic category allocation in each of the eight disease groups (figure 3). The AI lacked sensitivity for the OBD group, which was offset by a very high positive predictive value.
Discussion
In this study, we explored the accuracy and consistency of pulmonologists when interpreting PFT patterns and providing a preferred diagnostic category. PFT pattern interpretations matched the ATS/ERS guidelines in 74.4% of cases with an interrater agreement of κ=0.67, demonstrating that even this fundamental task is prone to mistakes and disagreements. PFTs combined with limited clinical information proved a difficult basis for pulmonologists to reach an accurate diagnostic category (accuracy of 44.6%, with substantial variability, κ=0.35). By contrast, our AI-based software for the automated clinical reading of PFTs interpreted the PFT patterns perfectly (100%) and pointed to the correct diagnostic category in 82% of all cases. It thus outperformed the pulmonologists in both tasks, by a relative 34% and 84%, respectively, which demonstrates that individual pulmonologists do not sufficiently capture the information available in PFTs.
Facilitating clinical practice with decision support systems is not a new idea, and the majority (64%) of such systems have been shown to improve the performance of individual clinicians [22]. Nowadays, we regularly use them to interpret ECGs, to analyse mammogram irregularities or as reminders for drug prescription [23, 24]. Although automated analyses of PFTs have been evaluated previously [25, 26], none has become a clinical reality. First, there is an obvious difficulty in reaching a preferred diagnosis without knowing the clinical context [27, 28]. Second, there is a lack of clear international diagnostic guidelines to label respiratory diseases based on PFTs, with controversial and often arbitrary choices of cut-offs to label abnormality. This implies that not all pulmonologists use the same interpretative strategies in their daily routine [29, 30]. For example, a typical conflict arises in the very first interpretative step: should we use the lower limit of normal or the fixed 0.7 cut-off for the FEV1/FVC ratio [31]? Undoubtedly, this explains some of the differences between the interpretations of pulmonologists, but it also highlights a more general concern. Different recommendations on which cut-offs to use will reclassify individual patients from healthy to diseased and vice versa, while in real life the disease processes present as a continuum around pre-fixed values. The strength of complete PFTs lies in the variety and multitude of tests available to recognise disease-specific patterns, regardless of these fixed cut-off points.
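As a worked illustration of this cut-off conflict (the values below are illustrative, not taken from the study data), an older patient can fall below the fixed 0.70 ratio while remaining above their age-adjusted lower limit of normal:

```r
# Illustrative values only: an older patient whose age/sex/height-specific
# LLN for FEV1/FVC lies below the fixed 0.70 cut-off.
fev1_fvc <- 0.68  # measured ratio
lln      <- 0.65  # subject-specific lower limit of normal

fev1_fvc < 0.70   # TRUE  -> labelled "obstructive" under the fixed cut-off
fev1_fvc < lln    # FALSE -> labelled "normal" under the LLN criterion
```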
Using AI, we approached each disease as having a unique fingerprint on the PFT. As such, AI identifies subtle and defining characteristics that are challenging for humans to detect, and incorporates them into a powerful discriminating diagnostic algorithm. In our case, the AI system takes the complete input data and maps them into a high-dimensional space. Trained on a large number of known disease cases, with known magnitudes and relationships between all input data, the AI constructs optimal hyperplanes that categorise new examples. Once presented with the data of a new patient, the AI maps them into the same high-dimensional space and predicts to which category they belong. Such a multidimensional approach exceeds, in accuracy, human capabilities to assess the same data. Fundamentally, the AI algorithm no longer depends on arbitrary cut-offs, but is a purely patient data-driven knowledge system. In fact, with the increase in computing resources, modern AI algorithms have entirely moved away from rule-based systems and currently adopt a probabilistic approach. Our study confirms that a unique data-driven fingerprint of each disease often exists in the PFTs.
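The hyperplane description above is consistent with a kernel support vector machine; the sketch below illustrates that idea with the e1071 package and synthetic stand-in data. This is an assumption for illustration, not the authors' disclosed implementation.

```r
# Hyperplane-based categorisation with probabilistic output, sketched with
# a kernel SVM from e1071 -- an illustrative assumption, not the authors'
# disclosed implementation.
library(e1071)

set.seed(1)
# Synthetic stand-in for the development data; columns mimic PFT z-scores
# plus patient characteristics, 'diagnosis' the gold standard category.
train_data <- data.frame(
  diagnosis  = factor(sample(c("COPD", "asthma", "ILD"), 300, replace = TRUE)),
  fev1_z     = rnorm(300), tlc_z = rnorm(300), dlco_z = rnorm(300),
  age        = runif(300, 18, 85), pack_years = rpois(300, 10)
)

model <- svm(diagnosis ~ ., data = train_data,
             kernel = "radial",   # maps inputs into a high-dimensional space
             probability = TRUE)  # enable per-class probability estimates

new_patient <- train_data[1, -1]  # one unseen case, label removed
pred <- predict(model, new_patient, probability = TRUE)
attr(pred, "probabilities")       # probability per diagnostic category,
                                  # mirroring the report in figure S3
```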
A fascinating characteristic of AI-based software is its ability to improve over time by being exposed to new and more difficult cases. In other words, the developed software may improve (as do physicians) by learning from mistakes and gaining experience. It is too ambitious to expect the software to be correct in 100% of cases, as some respiratory diseases do not show characteristic lung function abnormalities; particularly in early disease stages or in combined, complex disease processes, disease-specific characteristics may be hidden. As the current accuracy of the AI software falls within the range that clinical expert panels reached during the BPFS [8], there is probably little room for improvement. It does indicate, however, that a computer can process all necessary information as effectively as a group of experts (not the individual), yet at a much higher speed and with 100% consistency for the same data input. The further usefulness of the AI software will be demonstrated if it decreases the time to final diagnosis, reduces the number of tests needed for a final diagnosis and, by standardising PFT interpretation, avoids a number of misdiagnoses.
Comparable with human examiners marking their confidence on a Likert scale, the AI expresses its certainty as a probability of a patient belonging to one of the disease categories. Notably, when the AI made a wrong diagnostic suggestion, it never attributed a high probability to that diagnosis: the probability barely exceeded 50% in two of the nine mislabels and was <50% in the other seven. Surprisingly, the use of the COPD Assessment Test for the quantification of symptoms in the BPFS study did not further improve the accuracy of our AI software, suggesting that most respiratory diseases present with similar non-specific symptoms such as cough and dyspnoea. It is tempting to speculate that more input, e.g. more extensive history taking, and tests such as exhaled nitric oxide fraction, forced oscillometry and/or blood/radiological markers, could enhance its future potential. In particular, for diseases such as asthma that can present with a normal PFT, the added value of such tests when integrated into our AI-based software is obvious.
A limitation of the current study is that we underestimated the accuracy of the pulmonologists by limiting the amount of clinical data available to suggest a preferred diagnosis. In reality, a diagnosis is reached through a synergy of multiple factors, including an expanded history, clinical examination, imaging and blood sampling; the real-life situation may therefore yield better outcomes. Additionally, the test sample we used may not entirely reflect the prevalence of diseases that pulmonologists confront in daily clinical practice. It is clear that we only explored the maximum output that could be reached from PFTs and clinical information, representative of the first diagnostic encounter. Furthermore, we did not formally test the level of agreement within the ad hoc expert panel that defined the final diagnosis. Although the experts relied on all available test information, one may speculate that providing the AI interpretation would have favoured their initial agreement. A final limitation is that the risk of misinterpretation and misdiagnosis increases if tests are poorly performed [32]; sufficient test quality is thus a prerequisite for both human and computer interpretation.
To conclude, our data indicate that interpretation of PFTs and the suggestion of primary respiratory disease diagnosis by pulmonologists is highly variable. The AI-based software has superior performance and may provide a powerful decision support tool for clinicians. The significance of such technology in improving clinical practice will drive real-life acceptance by the medical community.
Supplementary material
Please note: supplementary material is not edited by the Editorial Office, and is uploaded as it has been supplied by the author.
Supplementary material ERJ-01660-2018.Supplement
Acknowledgements
We thank all the pulmonologists, pulmonary function technicians, patients and hospitals who participated in the study for providing and analysing data.
Footnotes
This article has supplementary material available from erj.ersjournals.com
This study is registered at ClinicalTrials.gov with identifier number NCT03264417.
Author contributions: All authors critically revised the manuscript and approved the final version. All authors organised evaluation sessions in hospitals, examined patient files and interpreted results. M. Topalovic performed the data acquisition, analysis and interpretation, contributed to the study design and wrote the manuscript. N. Das contributed to data acquisition. W. Janssens takes responsibility for the content of the manuscript, contributed to the study design, and assisted in the data analysis, interpretation and writing of the manuscript.
The Pulmonary Function Study Investigators: R. De Pauw, C. Depuydt, C. Haenebalcke, S. Muyldermans, V. Ringoet, D. Stevens (AZ Sint-Jan Hospital, Bruges, Belgium); S. Bayat, J. Benet, E. Catho, J. Claustre, A. Fedi, M.A. Ferjani, R. Guzun, M. Isnard, S. Nicolas, T. Pierret, C. Pison, S. Rouches, B. Wuyam (CHU Grenoble Alpes, Grenoble, France); J.L. Corhay, J. Guiot, K. Ghysen, L. Renaud, A. Sibille (University Hospital, Liege, Belgium); H. De La Barriere, C. Charpentier, S. Corhut, K.A. Hamdan, M. Schlesser, G. Wirtz (Centre Hospitalier de Luxembourg, Luxembourg, Luxembourg); E. Alabadan, G. Birsen, P.R. Burgel, A. Chohra, C. Hamard, B. Lemarié, M.N. Lothe, C. Martin, A.C. Sainte-Marie, L. Sebane (Cochin Hospital, Paris, France); Y. Berk, B. de Brouwer, R. Janssen, J. Kerkhoff, A. Spaanderman, M. Stegers, A. Termeer, I. van Grimbergen, A. van Veen, L. van Ruitenbeek, L. Vermeer, R. Zaal, M. Zijlker (Canisius Wilhelmina Hospital, Nijmegen, The Netherlands); J. Aumann, K. Cuppens, D. Degraeve, K. Demuynck, B. Dieriks, K. Pat, L. Spaas, R. Van Puijenbroek, K. Weytjens, J. Wynants (Jessa Hospital, Hasselt, Belgium); V. Adam, B.J. Berendes, E. Hardeman, P. Jordens, E. Munghen, K. Tournoy, P. Vercauter (Onze-Lieve-Vrouw Hospital, Aalst, Belgium); T. Alame, M. Bruyneel, M. Gabrovska, I. Muylle, V. Ninane, D. Rozen, P. Rummens, S. Van Den Broecke (Saint-Pierre Hospital, Brussels, Belgium); A. Froidure, S. Gohy, G. Liistro, T. Pieters, C. Pilette, F. Pirson (Université Catholique de Louvain, Brussels, Belgium); H. Kerstjens, M. Van den Berge, N. Ten Hacken, M. Duiverman, D. Koster (University Medical Center Groningen, Groningen, The Netherlands); B. Vosse, L. Conemans, M. Maus, M. Bischoff, M. Rutten, D. Agterhuis, R. Sprooten (Maastricht University Medical Center, Maastricht, The Netherlands); B. Beutel, A. Jerrentrup, A. Klemmer, C. Viniol, C. Vogelmeier (University Medical Center, Marburg, Germany); H. Bode, C. Dooms, D. Gullentops, W. Janssens, K. Nackaerts, D. Rutens, E. Wauters, W. Wuyts (University Hospital Leuven, Leuven, Belgium); E. Derom, S. Dobbelaere, S. Loof, G. Serry, B. Putman, L. Van Acker, Y. Vandeweygaerde (Ghent University Hospital, Ghent, Belgium); M. Criel, M. Daenen, R. Gubbelmans, S. Klerkx, E. Michiels, M. Thomeer, A. Vanhauwaert (Hospital Oost-Limburg, Genk, Belgium).
Conflict of interest: M. Topalovic has nothing to disclose.
Conflict of interest: N. Das has nothing to disclose.
Conflict of interest: P-R. Burgel reports personal fees from AstraZeneca, Boehringer Ingelheim, Chiesi, Novartis, Teva and Vertex, outside the submitted work.
Conflict of interest: M. Daenen has nothing to disclose.
Conflict of interest: E. Derom has nothing to disclose.
Conflict of interest: C. Haenebalcke reports personal fees from Novartis, Chiesi, GSK and AstraZeneca, outside the submitted work.
Conflict of interest: R. Janssen has nothing to disclose.
Conflict of interest: H.A.M. Kerstjens has nothing to disclose.
Conflict of interest: G. Liistro has nothing to disclose.
Conflict of interest: R. Louis reports grants and personal fees from GSK and Novartis, personal fees from AstraZeneca, and grants from Chiesi, outside the submitted work.
Conflict of interest: V. Ninane has nothing to disclose.
Conflict of interest: C. Pison has nothing to disclose.
Conflict of interest: M. Schlesser has nothing to disclose.
Conflict of interest: P. Vercauter has nothing to disclose.
Conflict of interest: C.F. Vogelmeier reports personal fees from Almirall, Cipla, Berlin-Chemie/Menarini, CSL Behring and Teva, grants and personal fees from AstraZeneca, Boehringer Ingelheim, Chiesi, GSK, Grifols, Mundipharma, Novartis and Takeda, grants from German Federal Ministry of Education and Research (BMBF) Competence Network Asthma and COPD (ASCONET), Bayer Schering Pharma AG, MSD and Pfizer, outside the submitted work.
Conflict of interest: E. Wouters reports personal fees for board membership from Nycomed and Boehringer, grants from AstraZeneca and GSK, and personal fees for lectures from AstraZeneca, GSK, Novartis and Chiesi, outside the submitted work.
Conflict of interest: J. Wynants has nothing to disclose.
Conflict of interest: W. Janssens has nothing to disclose.
Support statement: This work was supported by the Vlaams Agentschap Innoveren & Ondernemen (VLAIO, government body, 2016–2018). The funder had no role in study design and conduct of the study; collection, management, analysis and interpretation of the data; preparation, review or approval of the manuscript; and decision to submit the manuscript for publication. Funding information for this article has been deposited with the Crossref Funder Registry.
- Received August 30, 2018.
- Accepted January 25, 2019.
- Copyright ©ERS 2019