Abstract
A predictive model for COVID-19 https://bit.ly/2WHiEkJ
To the Editor:
Over the past 3 months, coronavirus disease 2019 (COVID-19) has emerged across China and developed into a worldwide outbreak [1]. The disease has caused varying degrees of illness. On admission, 84.3% of patients with COVID-19 had non-severe illness and 15.7% had severe illness [2]. Most patients with non-severe pneumonia gradually improve and recover with treatment, while others rapidly progress to severe illness, which has a poor prognosis [3, 4]. As recently reported, the cumulative risk of the composite end-point was 3.6% in all COVID-19 patients, and the cumulative risk was 20.6% for severe illness [2].
However, it is still unknown whether early identification and intervention for non-severe patients with COVID-19 could prevent progression to severe disease. Experience with other diseases suggests that early treatment might have a substantial beneficial effect. In this paper, we aim to build a predictive model for identifying high-risk non-severe pneumonia patients at an early stage.
A total of 86 patients with COVID-19 and non-severe pneumonia on admission were recruited as the training cohort at Renmin Hospital of Wuhan University from 2 to 20 January, 2020, and another 62 patients were prospectively enrolled as the validation cohort from 28 January to 9 February, 2020. COVID-19 was confirmed by real-time PCR. Disease severity was classified as severe or non-severe pneumonia based on the criteria of the American Thoracic Society guidelines for community-acquired pneumonia [2, 5]. The exclusion criteria were: 1) degree of severity not available on admission or during follow-up; 2) diagnosed with severe illness at the time of admission; 3) confirmed with COVID-19 and treated at other hospitals; 4) medication administered within 15 days before admission; 5) received oxygen support during follow-up. Patients were divided into “progressed” and “non-progressed” groups, based on whether they progressed to severe illness during the 14-day follow-up period. Comorbidities included, among others, diabetes, hypertension, cardiovascular and cerebrovascular diseases, COPD, malignant tumour, chronic liver disease, chronic kidney disease, tuberculosis and immunodeficiency diseases.
Clinical characteristics and laboratory findings were extracted from electronic medical records. Radiological features were extracted from chest computed tomography (CT) imaging using a double-blind method [6]. To evaluate lesion size accurately, an artificial intelligence (AI)-based diagnosis system for COVID-19 was employed to measure the volume ratio of pneumonia automatically by analysing CT values [7, 8].
Logistic regression was used as the classifier to build the predictive model. The discriminative performance of the predictive model was quantified by the area under the receiver operating characteristic curve (AUC), both in cross-validation of the training dataset and in the validation dataset. A risk index, calculated from the weight of each variable in the model, was used to identify high-risk groups. All analyses were performed using R version 3.6.0.
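The two quantities named above, the AUC and a cut-off chosen by the Youden index, can be illustrated with a minimal sketch. It is written in Python rather than the R used for the study, it is not the authors' code, and the scores and labels are invented toy values rather than patient data. The function names `auc` and `youden_cutoff` are hypothetical.

```python
from typing import Sequence, Tuple


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUC as the probability that a randomly chosen positive case
    scores higher than a randomly chosen negative case (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def youden_cutoff(scores: Sequence[float],
                  labels: Sequence[int]) -> Tuple[float, float, float]:
    """Return (cutoff, sensitivity, specificity) maximising
    J = sensitivity + specificity - 1, where score > cutoff means positive.
    If several cut-offs tie, the lowest one is returned."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative case")
    best = None
    for c in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s > c)
        tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s <= c)
        sens, spec = tp / n_pos, tn / n_neg
        j = sens + spec - 1.0
        if best is None or j > best[0]:
            best = (j, c, sens, spec)
    return best[1], best[2], best[3]


# Toy illustration only: invented scores, 1 = progressed, 0 = non-progressed.
toy_scores = [-12.0, -9.0, -7.0, -5.0, -6.5, -3.0, 1.0, 4.0]
toy_labels = [0, 0, 0, 0, 1, 1, 1, 1]
print(auc(toy_scores, toy_labels))
print(youden_cutoff(toy_scores, toy_labels))
```

The pairwise definition of the AUC is quadratic in the number of patients, which is acceptable for cohorts of this size. A rank-based formula gives the same value more efficiently.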
The median age of the 148 patients was 46.5 years (interquartile range (IQR) 35.8–58.0 years), and 81 (54.7%) were female. A total of 60 (40.5%) non-severe patients progressed to severe illness, and the median time to progression was 5.0 days (IQR 2.8–9.0 days). In the training cohort, 34 of 86 (39.5%) patients progressed to severe illness, as did 26 of 62 (41.9%) in the validation cohort. The median times to progression in the two cohorts were 5.5 days (IQR 1.0–9.0 days) and 5.0 days (IQR 3.0–9.8 days), respectively. The variables are described in table 1.
To build the predictive model, we tested all the clinical, laboratory and radiological variables, except for treatment characteristics. Four variables were retained in the final model: comorbidity (β=1.234, p=0.036), dyspnoea on admission (β=1.583, p=0.095), lactate dehydrogenase (β=0.007, p=0.027) and lymphocyte count (β=−2.012, p=0.002). The Hosmer-Lemeshow test on the training dataset showed no evidence of poor fit (Chi-squared=10.451, p=0.235). The AUC was 0.819 (95% CI 0.731–0.907) in cross-validation of the training dataset and 0.759 (95% CI 0.635–0.884) in the validation dataset. The four variables were weighted according to their regression coefficients: comorbidity was assigned 12 points, dyspnoea 16 points, lactate dehydrogenase 0.07 points per unit and lymphocyte count −20 points per unit. A total score was then calculated for each patient, with different scores indicating different risks. The AUC based on the risk scores in the training dataset was 0.856 (95% CI 0.776–0.935). Patients were divided into high-risk and low-risk groups (total score >−6.0 and ≤−6.0, respectively) based on the best cut-off value determined by the Youden index; the sensitivity was 0.941 and the specificity was 0.635. More details can be found in table 1.
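The scoring rule described above can be written out as a short sketch. The weights and the −6.0 cut-off are taken from the text. Everything else is an assumption made for illustration: the function names, the choice of Python, and the measurement units (lactate dehydrogenase in U/L, lymphocyte count in ×10^9 cells/L), which this passage does not state. The example patients are hypothetical.

```python
# Weights and cut-off as reported in the letter.
WEIGHT_COMORBIDITY = 12.0   # points if any comorbidity is present
WEIGHT_DYSPNOEA = 16.0      # points if dyspnoea is present on admission
WEIGHT_LDH = 0.07           # points per unit of lactate dehydrogenase
WEIGHT_LYMPHOCYTE = -20.0   # points per unit of lymphocyte count
CUTOFF = -6.0               # score > CUTOFF is high risk, <= CUTOFF is low risk


def risk_score(comorbidity: bool, dyspnoea: bool,
               ldh: float, lymphocyte: float) -> float:
    """Total risk score from the four admission variables.

    Units are an assumption (ldh in U/L, lymphocyte in 10^9 cells/L);
    check them against table 1 before any real use.
    """
    return (WEIGHT_COMORBIDITY * int(comorbidity)
            + WEIGHT_DYSPNOEA * int(dyspnoea)
            + WEIGHT_LDH * ldh
            + WEIGHT_LYMPHOCYTE * lymphocyte)


def risk_group(score: float) -> str:
    """Classify a total score using the reported cut-off."""
    return "high" if score > CUTOFF else "low"


# Hypothetical patients, not taken from the study.
patient_a = risk_score(comorbidity=True, dyspnoea=False, ldh=250.0, lymphocyte=0.8)
patient_b = risk_score(comorbidity=False, dyspnoea=False, ldh=180.0, lymphocyte=1.5)
print(round(patient_a, 1), risk_group(patient_a))
print(round(patient_b, 1), risk_group(patient_b))
```

Under these assumed units, the first hypothetical patient scores about 13.5 and falls in the high-risk group, while the second scores about −17.4 and falls in the low-risk group. A score of exactly −6.0 is classified as low risk, matching the ≤−6.0 rule in the text.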
In our prediction model, comorbidity was associated with disease progression: patients with comorbidities were more likely to progress to severe disease than those without. Previous studies have shown a higher proportion of patients with comorbidities among those with more severe disease [9]. We further confirmed that non-severe patients with comorbidities were more likely to progress. The p-value for dyspnoea on admission was not below 0.05 in the multivariate regression, which might be because the relationship between dyspnoea and the outcome in this study was not strictly linear after adjusting for other variables. Although other models that we tried earlier performed better, we finally chose the logistic model because of its interpretability and simplicity of application. Patients whose disease progressed have been reported to be more likely to show a decrease in lymphocyte count and an increase in lactate dehydrogenase [2, 10]. Our research further confirmed that these two indicators were also related to disease progression. A decrease in lymphocyte count usually indicates a decline in immune function, and multiple organ dysfunction might lead to an increase in lactate dehydrogenase [11], which is consistent with what we have observed clinically.
Previous reports have pointed out that advanced age is one of the risk factors for poor prognosis in patients with COVID-19 [2, 3]. However, age was not included in the model. This suggests that early treatment of young patients with non-severe illness should not be neglected. We speculate that the contribution of age to disease progression was reflected in comorbidities and dyspnoea. In addition, some studies have reported correlations between radiological indicators and COVID-19 disease [12]. Although radiological features in CT images on admission were described in detail, they were not included in the model. We speculate that multiple images during treatment, rather than a single image, could indicate further progression of the disease. Although variables extracted with AI from CT imaging were not included in the model, this approach showed promise and will be the focus of our subsequent research.
There were some limitations to this study. First, patients with COVID-19 included in this study were from a single hospital, which is a potential constraint for the generalisation of our model. Second, critically ill patients were transferred to other designated hospitals according to the regulations of the local government. We were unable to track these patients’ deaths in the short term, and the association between the model and overall survival could not be evaluated, which unfortunately was a major limitation of this study.
In conclusion, the progression of non-severe patients with COVID-19 could be predicted by our model based on clinical characteristics on admission. The model was further verified in a prospective validation cohort, in which it performed well. With the help of our model, clinicians could easily identify high-risk non-severe patients on admission using a few routine clinical indicators, thereby contributing to the treatment and prevention of COVID-19.
Footnotes
Support statement: This work was supported by the National Natural Science Foundation of China (grant: 81901817), Natural Science Foundation of Hubei Province (grant: 2018CFB136), and Innovation Seed Funding of Wuhan University (grant: TFZZ2018020). Funding information for this article has been deposited with the Crossref Funder Registry.
Conflict of interest: M. Ji has nothing to disclose.
Conflict of interest: L. Yuan has nothing to disclose.
Conflict of interest: W. Shen has nothing to disclose.
Conflict of interest: J. Lv has nothing to disclose.
Conflict of interest: Y. Li has nothing to disclose.
Conflict of interest: J. Chen has nothing to disclose.
Conflict of interest: C. Zhu has nothing to disclose.
Conflict of interest: B. Liu has nothing to disclose.
Conflict of interest: Z. Liang has nothing to disclose.
Conflict of interest: Q. Lin has nothing to disclose.
Conflict of interest: W. Xie has nothing to disclose.
Conflict of interest: M. Li has nothing to disclose.
Conflict of interest: Z. Chen has nothing to disclose.
Conflict of interest: X. Lu has nothing to disclose.
Conflict of interest: Y. Ding has nothing to disclose.
Conflict of interest: P. An has nothing to disclose.
Conflict of interest: S. Zhu has nothing to disclose.
Conflict of interest: M. Gao has nothing to disclose.
Conflict of interest: H. Ni has nothing to disclose.
Conflict of interest: L. Hu has nothing to disclose.
Conflict of interest: G. Shi has nothing to disclose.
Conflict of interest: L. Shi has nothing to disclose.
Conflict of interest: W. Dong has nothing to disclose.
- Received March 16, 2020.
- Accepted May 10, 2020.
- Copyright ©ERS 2020
This version is distributed under the terms of the Creative Commons Attribution Non-Commercial Licence 4.0.