Abstract
The opportunities that may arise from using AI software to interpret PFTs and to help diagnose lung diseases are evident. The idea of a complementary partnership between doctors and AI-supported systems sounds very promising. http://bit.ly/2JE60gb
From the authors:
We appreciate the attentive comments of S. Gonem and S. Siddiqui in their correspondence addressing our article [1]. They further highlight the apparent opportunities and the need to create a space for the introduction of novel technologies, such as our work in the field of pulmonary function testing (PFT).
We agree with the authors that disease prevalence within the training dataset will influence the output of the artificial intelligence (AI), which is why real-life situations should be considered when building the initial training dataset. It is well known that classes (diseases) with few cases in the training data, or unbalanced datasets more generally, may result in a poor approximation. Therefore, care should be taken to secure a large enough sample for each disease, as we did in our work. It is essential to add that final model performance also depends on the data pre-processing steps taken (such as data cleaning, normalisation, transformation, and feature extraction and selection) and on the final tuning and optimisation of model parameters [2]. All of these factors together determine how well the algorithm can recognise the specific patterns that each disease carries in the PFT data. Since the prevalence of different diseases does indeed affect the overall accuracy of the algorithm, we also studied the disease-specific sensitivities and specificities, which are prevalence-independent test statistics [3]. The results are shown in figure 3 of the original manuscript, where the positive predictive value was usually highest for the low-prevalence diseases.
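The distinction between prevalence-independent and prevalence-dependent statistics can be illustrated with a minimal sketch based on Bayes' theorem. The function name and all numerical values below are hypothetical and chosen purely for illustration; they are not results from the study. The sketch holds sensitivity and specificity fixed and shows that the positive and negative predictive values nevertheless change when prevalence changes.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied at a given disease prevalence.

    All three arguments are proportions strictly between 0 and 1.
    Illustrative helper only; not part of the software described in the text.
    """
    for name, value in (("sensitivity", sensitivity),
                        ("specificity", specificity),
                        ("prevalence", prevalence)):
        if not 0.0 < value < 1.0:
            raise ValueError(f"{name} must be strictly between 0 and 1")

    # Expected fractions of the tested population in each cell of the 2x2 table.
    true_pos = sensitivity * prevalence
    false_neg = (1.0 - sensitivity) * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    true_neg = specificity * (1.0 - prevalence)

    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


if __name__ == "__main__":
    # Hypothetical test characteristics, held fixed across prevalences.
    sens, spec = 0.90, 0.90
    for prev in (0.05, 0.10, 0.50):
        ppv, npv = predictive_values(sens, spec, prev)
        print(f"prevalence={prev:.2f}  PPV={ppv:.3f}  NPV={npv:.3f}")
```

With sensitivity and specificity fixed, the predictive values in this sketch differ between prevalences, which is why disease-specific sensitivity and specificity are the more suitable statistics for comparing performance across diseases that occur at different frequencies.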
The 50 cases in the study were selected over 5 days in a randomly chosen week: on each day, we randomly selected 10 cases with complete PFTs performed on that day. As cases were selected at a university hospital and reference centre for respiratory diseases, a broad spectrum of disease classes was obtained.
S. Gonem and S. Siddiqui noted that all studies to date focus on the knowledge that a computer can receive from humans, without any consideration of the reverse. We firmly believe in the benefits that such two-way learning may create for our community, for the education of trainees and the quality of the PFT outcome and, most importantly, for the patients themselves. We can only speculate on where the future will take us, but the idea of a complementary partnership between humans and AI-supported systems sounds very promising. It may even be illusory to believe that this will not be the case in many fields of medical diagnostics. During the development of our AI software, we anticipated the value it may bring and the need to make it available to the global community. Therefore, the AI software presented in the study is already available for everyday use at the University Hospital of Leuven via our interface. To date, more than 5000 complete PFTs from real patients visiting lung function testing facilities have been analysed by our software in daily clinical practice. This number increases daily, and we hope to accelerate this growth over the coming months by adding new lung function testing laboratories. The most immediate advantage of such software is that it can save much of the time spent on analysing PFTs in the laboratory. Moreover, the output of the software can be considered an accurate second opinion (similar to automated ECG interpretations), thereby increasing the quality of the interpretations themselves. It is evident that, with the appropriate setup, the AI software can further improve its performance with respect to accuracy, probabilities, and detection of new diseases or phenotypes.
Finally, as the authors noted, there is always some degree of scepticism towards AI applications, owing to their being perceived as a closed “black box”. To increase confidence and trust in the software, our current work focuses on unveiling the complex reasoning behind each decision the AI makes. Apart from offering probabilities for a diagnostic decision, this will additionally provide the key factors that influenced that decision (for example, the probability of COPD might be 80% because a patient has a low ratio of forced expiratory volume in 1 s to forced vital capacity, severely decreased diffusion capacity, increased airway resistance, hyperinflation and air-trapping, and is an elderly heavy smoker).
Our original study is an excellent example of the potential that novel technologies such as AI have. It is important to emphasise that this potential needs to be nurtured appropriately, with careful attention to all factors: training data, machine learning methodology and the often-forgotten domain knowledge. We will continue our work in this field, with a promise to find a mechanism for the global medical community to implement AI software for interpretation of pulmonary function tests in their clinical practices.
Footnotes
Conflict of interest: M. Topalovic is a co-founder of a spin-off company, ArtiQ.
Conflict of interest: N. Das has nothing to disclose.
Conflict of interest: W. Janssens is a co-founder of a spin-off company, ArtiQ.
- Received April 18, 2019.
- Accepted April 19, 2019.
- Copyright ©ERS 2019