Abstract
Prediction modelling is critical for improving ALS respiratory care. Assessing external validity of a model should account for discrimination, calibration, and cohort characteristics. Further work on outcomes is necessary before practice implementation. http://bit.ly/2XbfeUq
From the authors:
We appreciate the interest of D. Adler and colleagues in our recent manuscript describing a prognostic model for early respiratory insufficiency in amyotrophic lateral sclerosis (ALS) [1]. After applying our prediction model in a cohort of 50 ALS patients at their centre in Geneva, they obtained a c-statistic, sensitivity and specificity all virtually identical to our external validation in the Pooled Resource Open-Access Clinical Trials (PRO-ACT) ALS database. In addition, the Geneva cohort calibration curve resembles figure E5 from PRO-ACT in our supplementary material [1].
In this setting, perhaps the primary conclusion is that D. Adler and co-workers' single-centre results are very similar to those from our large multicentre validation cohort, despite differences in the study samples. First, 24% of patients in the Geneva group were using noninvasive ventilation (NIV) at baseline and were excluded, compared to 1% in the Penn cohort (and 4% in PRO-ACT). The European Federation of Neurological Societies guidelines propose an NIV initiation threshold of forced vital capacity (FVC) <80% [2], much higher than that of the American Academy of Neurology guidelines (FVC <50%), which may explain these differences [3]. Only 20% of the 50 patients included in the Geneva cohort developed respiratory insufficiency or died within 6 months of observation, compared to 39% and 35% in Penn and PRO-ACT, respectively, differences likely due to the exclusion of the sicker patients already using NIV from the study sample (selection bias).
Because the underlying risk of respiratory failure was lower in the Geneva sample, we calculated a lower positive predictive value (36%, 95% CI 13–65%) and a higher negative predictive value (86%, 95% CI 71–95%) than in PRO-ACT (62% and 76%, respectively). Of course, the small sample size of the Geneva cohort produced extremely wide 95% confidence intervals for the discrimination estimates and likely for the calibration curve (although they are not shown). Despite these differences and wide confidence intervals, the findings by D. Adler and co-workers closely resemble our external validation findings, testifying to the robustness and generalisability of our model.
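The dependence of predictive values on underlying risk can be illustrated with a minimal sketch based on Bayes' theorem. The sensitivity and specificity below (both 0.70) are hypothetical placeholders, not the values reported for either cohort; only the two event rates (20% and 35%) are taken from the text above. The sketch is not intended to reproduce the published predictive values, only the direction of the change.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied at a given outcome prevalence."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    true_neg = specificity * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Hypothetical operating point, held fixed across cohorts (an assumption).
SENSITIVITY = 0.70
SPECIFICITY = 0.70

# Event rates quoted in the text: 20% (Geneva) and 35% (PRO-ACT).
ppv_low, npv_low = predictive_values(SENSITIVITY, SPECIFICITY, 0.20)
ppv_high, npv_high = predictive_values(SENSITIVITY, SPECIFICITY, 0.35)

print(f"event rate 20%: PPV={ppv_low:.2f}, NPV={npv_low:.2f}")
print(f"event rate 35%: PPV={ppv_high:.2f}, NPV={npv_high:.2f}")
```

With sensitivity and specificity held constant, the lower event rate gives a lower positive predictive value and a higher negative predictive value, which is the pattern described above.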
The properties of discrimination and calibration of prediction rules support different uses. Both are key measurements for assessing the validity of prediction models. Calibration refers to the agreement between predicted and observed outcomes in a population. Discrimination refers to the model's ability to distinguish patients with versus without an outcome. A model that predicts all individuals to have a risk equal to the actual incidence of an outcome would be a model with excellent calibration but poor discrimination. Highlighting that an average predicted risk of 24% is higher than the actual incidence of 20% does not fully characterise the model's discrimination or calibration abilities. However, the Geneva cohort had a similar c-statistic, sensitivity and specificity to PRO-ACT (thus similar discrimination) and a similar calibration curve which provided reasonably accurate estimates, realising that identical and consistent calibration at all levels of risk of the outcome may not be a realistic goal [4, 5].
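The distinction between calibration and discrimination can be made concrete with a small sketch. The data are synthetic: 50 patients with 10 events (a 20% incidence, matching the Geneva event rate quoted above), and the two sets of predicted risks are hypothetical and chosen only for illustration. The helper `c_statistic` is a plain pairwise implementation written for this example, not the code used in our analyses.

```python
def c_statistic(risks, outcomes):
    """Proportion of event/non-event pairs in which the event has the
    higher predicted risk; tied pairs count as one half."""
    events = [r for r, y in zip(risks, outcomes) if y == 1]
    non_events = [r for r, y in zip(risks, outcomes) if y == 0]
    score = 0.0
    for e in events:
        for n in non_events:
            if e > n:
                score += 1.0
            elif e == n:
                score += 0.5
    return score / (len(events) * len(non_events))


def mean(values):
    return sum(values) / len(values)


# Synthetic cohort: 10 events among 50 patients (20% incidence).
outcomes = [1] * 10 + [0] * 40

# Model A (hypothetical): everyone is assigned the observed incidence.
constant_risks = [0.20] * 50

# Model B (hypothetical): ranks patients perfectly but overestimates risk.
separated_risks = [0.90] * 10 + [0.60] * 40

incidence = mean(outcomes)
c_constant = c_statistic(constant_risks, outcomes)
c_separated = c_statistic(separated_risks, outcomes)

print(f"observed incidence: {incidence:.2f}")
print(f"model A: mean risk={mean(constant_risks):.2f}, c={c_constant:.2f}")
print(f"model B: mean risk={mean(separated_risks):.2f}, c={c_separated:.2f}")
```

Model A has a mean predicted risk equal to the observed incidence but a c-statistic of 0.5 (no discrimination), whereas model B separates events from non-events perfectly while overestimating risk on average. Comparing mean predicted risk with observed incidence therefore captures only one aspect of model performance.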
We agree with the need for a useful, discriminating and calibrated prediction model for respiratory events in ALS to expedite timeliness of care, shape patient expectations and enrich clinical trial design. We also agree that further research is necessary before widespread clinical use of the prediction rule. For example, next steps may include applying the prediction rule to identify high-risk patients for inclusion in randomised clinical trials; stratification of randomised patients by the predicted risk of respiratory failure; or assessing how randomising patients/clinicians to receiving prediction results affects quality of life and respiratory outcomes. We agree with D. Adler and colleagues that more work needs to be done in early identification and treatment of ALS patients at high risk of respiratory failure.
Footnotes
Conflict of interest: J. Ackrivo has nothing to disclose.
Conflict of interest: L. Elman has nothing to disclose.
Conflict of interest: S.M. Kawut has nothing to disclose.
Support statement: This work was supported by the US Department of Health and Human Services, National Institutes of Health, National Heart, Lung, and Blood Institute (F32 HL-144145 and K24 HL-103844). Funding information for this article has been deposited with the Crossref Funder Registry.
- Received June 20, 2019.
- Accepted June 21, 2019.
- Copyright ©ERS 2019