Machine learning can predict disease manifestations and outcomes in lymphangioleiomyomatosis

Saisakul Chernbumroong; Janice Johnson; Nishant Gupta; Suzanne Miller; Francis X. McCormack; Jonathan M. Garibaldi; Simon R. Johnson

doi:10.1183/13993003.03036-2020

Abstract

Background Lymphangioleiomyomatosis (LAM) is a rare multisystem disease with variable clinical manifestations and differing rates of progression that make management decisions and prognostic advice difficult. We used machine learning to identify clusters of associated features which could be used to stratify patients and predict outcomes in individuals.

Patients and methods Using unsupervised machine learning we generated patient clusters using data from 173 women with LAM from the UK and 186 replication subjects from the US National Heart, Lung, and Blood Institute (NHLBI) LAM registry. Prospective outcomes were associated with cluster results.

Results Two- and three-cluster models were developed. A three-cluster model separated a large group of subjects presenting with dyspnoea or pneumothorax from a second cluster with a high prevalence of angiomyolipoma symptoms (p=0.0001) and tuberous sclerosis complex (TSC) (p=0.041). Patients in the third cluster were older, never presented with dyspnoea or pneumothorax (p=0.0001) and had better lung function. Similar clusters were reproduced in the NHLBI cohort. Assigning patients to clusters predicted prospective outcomes: in a two-cluster model the future risk of pneumothorax was 3.3 (95% CI 1.7–5.6)-fold greater in cluster 1 than cluster 2 (p=0.0002). Using the three-cluster model, the need for intervention for angiomyolipoma was lower in clusters 2 and 3 than cluster 1 (p<0.00001). In the NHLBI cohort, the incidence of death or lung transplant was much lower in clusters 2 and 3 (p=0.0045).

Conclusions Machine learning has identified clinically relevant clusters associated with complications and outcome. Assigning individuals to clusters could improve decision making and prognostic information for patients.

Abstract

Using machine learning, simple clinical information from women with LAM can be used to group individuals into clusters. Clusters have differing clinical features, levels of complications and survival, and may improve personalised care for LAM. https://bit.ly/2UVanYV

Introduction

Lymphangioleiomyomatosis (LAM) is a rare multisystem disease that occurs both sporadically and in those with tuberous sclerosis complex (TSC) [1]. The prevalence of LAM is estimated to be less than 1 per 100 000 women [2], and the diagnosis of an orphan disease is frequently difficult for patients due to feelings of isolation and uncertainty over their prognosis and future disease manifestations [3]. This is particularly true for LAM where both the clinical manifestations and rates of disease progression vary. Although all patients have lung cysts, only 70% have pneumothorax [4, 5]. Half of women with sporadic LAM and almost all with TSC-LAM have angiomyolipomas, a proportion of which enlarge and are at risk of haemorrhage [6]. Around 20% of patients have significant lymphatic disease [7]. Prognosis can be difficult to predict as some patients have well-preserved lung function long term, while others require lung transplantation within a decade of diagnosis.

There are few predictive markers of outcome in LAM. Oestrogen is thought to contribute to disease progression [8–10] and pre-menopausal status is associated with more rapid loss of lung function [10, 11]. High levels of the lymphangiogenic growth factor, vascular endothelial growth factor type D (VEGF-D), and the presence of bronchodilator reversibility are associated with more rapid loss of forced expiratory volume in 1 s (FEV₁) in some studies [12, 13], and genetic variants in vitamin D binding protein are associated with shorter survival [14]. Smaller studies have reported other features that are associated with outcome, including mode of presentation and initial lung function, although all of these associations lack predictive power in individual subjects [15, 16]. Uncertainty around disease progression and complications can worry patients, and lead to restrictive lifestyle changes and an unselective approach to management, with many patients given unnecessarily pessimistic advice [17, 18].

We hypothesised that groups of clinical features preferentially cluster together, and that identifying these associations would improve prediction of complications and outcomes. We used machine learning to associate biological and physiological variables in two national cohorts with the aim of identifying subphenotypes within the LAM population that could be used to predict disease manifestations and improve clinical advice.

Methods

The clinical cohorts, variables and analysis are described fully in the supplementary material.

Subjects and clinical data

The discovery cohort comprised 173 women recruited at the UK National Centre for LAM (Nottingham, UK) between 2011 and 2018 (figure 1). All subjects had LAM defined by American Thoracic Society/Japanese Respiratory Society criteria [19]. A further 10 women were added after the discovery analysis until December 2019. All patients attending the Centre were invited to participate and measurements were made as part of clinical care. At their first visit, which formed the baseline assessment, subjects had computed tomography (CT) of the chest, abdomen and pelvis, screening for TSC, lung function tests, bronchodilator reversibility testing and a 6-min walk test according to European Respiratory Society/American Thoracic Society standards [20]. CT was used to screen for angiomyolipoma and lymphatic disease, the latter defined as the presence of lymphatic enlargement, chylous pleural effusion or ascites. Review appointments were scheduled according to clinical need and at least annually; complications were recorded, FEV₁ and transfer factor of the lung for carbon monoxide (T_LCO) were repeated, and angiomyolipoma size monitored according to a defined protocol [21]. The East Midlands Research Ethics Committee approved the study (REC13/EM/0264) and participants gave written informed consent.

FIGURE 1

Enrolment and data available in the cohorts studied: women with lymphangioleiomyomatosis (LAM) were recruited from the UK National Centre for LAM and the US National Heart, Lung, and Blood Institute (NHLBI) LAM registry. LFT: lung function testing. Not all data were available for all subjects for all end-points. Exact numbers are specified in the individual analyses.

The replication cohort comprised 186 subjects recruited between 1998 and 2003 to the US National Heart, Lung, and Blood Institute (NHLBI) Registry study on the natural history of LAM (figure 1) [7]. Clinical and serial lung function data were obtained from the National Disease Research Interchange (Philadelphia, PA, USA). All-cause mortality and lung transplantation data for the period until December 2010, prior to the use of rapamycin, were obtained from the US National Death Index and the United Network for Organ Sharing databases, respectively.

Cluster assignment was performed using data from the baseline visit (table 1) and outcomes assessed prospectively from this point. Survival is quoted as overall time since diagnosis. Change in lung function was calculated as the slope of all FEV₁ (ΔFEV₁) or T_LCO (ΔT_LCO) values [22].
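As an illustration of the slope calculation described above, the following is a minimal sketch that assumes an ordinary least-squares fit of lung function against time. The text does not state the fitting method, so the function name, units and example values are hypothetical.

```python
from typing import Sequence


def lung_function_slope(times_years: Sequence[float], values: Sequence[float]) -> float:
    """Ordinary least-squares slope of lung function against time (units per year)."""
    if len(times_years) != len(values) or len(times_years) < 2:
        raise ValueError("need at least two paired measurements")
    n = len(times_years)
    mean_t = sum(times_years) / n
    mean_v = sum(values) / n
    # Sum of squared deviations in time; zero means every visit shares one date.
    sxx = sum((t - mean_t) ** 2 for t in times_years)
    if sxx == 0:
        raise ValueError("all measurements share the same time point")
    sxy = sum((t - mean_t) * (v - mean_v) for t, v in zip(times_years, values))
    return sxy / sxx


# Hypothetical subject (invented values): FEV1 in mL at 0, 1, 2 and 3 years.
slope = lung_function_slope([0.0, 1.0, 2.0, 3.0], [2500.0, 2400.0, 2300.0, 2200.0])
print(f"dFEV1 = {slope:.1f} mL per year")
```

Using every available measurement, rather than only the first and last, makes the estimate less sensitive to a single atypical test.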

TABLE 1

Disease-related variables captured in the discovery and replication cohorts

Machine learning methodology

The workflow is summarised in figure 2 and described in detail in the supplementary material. Briefly, the dataset was pre-processed, cleaned and checked for validity. Imputation of missing data was performed using multiple imputation by chained equations (MICE), random forest (RF) and MICE+RF. Cluster analysis using multiple algorithms was repeated five times to ensure cluster stability and 42 internal cluster validation schemes were applied to determine the optimal number of clusters. We identified the smallest number of variables necessary to classify women with LAM into clusters based on feature selection schemes including recursive feature elimination, correlation-based feature detection, maximum relevance minimum redundancy and bivariate statistical tests. Five classification algorithms, i.e. RF, decision tree (CART, C4.5 and C5.0) and naive Bayes, were used to develop models for classifying subjects into clusters. Five-fold cross-validation repeated for 10 runs was used when identifying markers and developing classification models. The analysis was carried out using R (www.r-project.org). The clustering algorithms are available at www.github.com/nmpn/lam-stratification.
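The resampling scheme (five-fold cross-validation repeated for 10 runs) can be sketched as follows. This is a plain Python illustration, not the authors' R code: the function name is hypothetical, the folds are not stratified by cluster, and the seed is arbitrary.

```python
import random
from typing import Iterator, List, Tuple


def repeated_kfold(
    n_samples: int, k: int = 5, repeats: int = 10, seed: int = 0
) -> Iterator[Tuple[int, List[int], List[int]]]:
    """Yield (run, train_indices, test_indices) for repeated k-fold cross-validation."""
    rng = random.Random(seed)
    for run in range(repeats):
        indices = list(range(n_samples))
        rng.shuffle(indices)  # a fresh random partition for every run
        folds = [indices[i::k] for i in range(k)]  # fold sizes differ by at most one
        for held_out in range(k):
            test_idx = sorted(folds[held_out])
            train_idx = sorted(
                i for f, fold in enumerate(folds) if f != held_out for i in fold
            )
            yield run, train_idx, test_idx


# 173 subjects, as in the discovery cohort: 5 folds x 10 runs = 50 train/test splits.
splits = list(repeated_kfold(n_samples=173))
print(len(splits))
```

Within each run every subject is held out exactly once, so a performance estimate averaged over the 50 splits uses each subject for testing 10 times.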

FIGURE 2

Study workflow, data identification and separation of features into two clusters. LAM: lymphangioleiomyomatosis; MICE: multiple imputation by chained equations; RF: random forest; PCA: principal component analysis; MCA: multiple correspondence analysis. a) Summary workflow of data processing and analysis. The dataset was pre-processed, which involved data cleaning and data validity checking. Missing data were imputed using MICE, RF and MICE+RF. Data were transformed from numerical and categorical variables for clustering analysis using PCA with MCA and Gower's distance. The optimal number of clusters was identified and then internally validated using gap statistics with bootstrapping. Cluster analysis uses four algorithms and classification models were developed using recursive feature elimination followed by the classification algorithms naive Bayes, RF and nearest neighbour. Full details are given in the supplementary methods. b) Inertia gain plot measuring the degree of homogeneity between clusters using hierarchical and k-means clustering. Division of the data into two and three clusters gives good separation. c) Cluster dendrogram showing separation between the three clusters using hierarchical and k-means clustering. d) PCA showing separation of subjects into three clusters.
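Gower's distance, named in the caption above, lets numerical and categorical variables contribute to a single dissimilarity. The sketch below shows the standard unweighted form under simplifying assumptions: no missing values, and numeric ranges supplied by the caller. The variable names and values are invented for illustration.

```python
from typing import Dict, Sequence, Union

Value = Union[float, str]


def gower_distance(a: Sequence[Value], b: Sequence[Value], ranges: Dict[int, float]) -> float:
    """Unweighted Gower distance: mean of per-variable dissimilarities, each in [0, 1].

    `ranges` maps the index of every numeric variable to its observed range
    (max - min across the cohort); all other variables are treated as categorical.
    """
    if len(a) != len(b) or not a:
        raise ValueError("subjects must have the same non-zero number of variables")
    total = 0.0
    for j, (x, y) in enumerate(zip(a, b)):
        if j in ranges:
            # Numeric: absolute difference scaled by the range of the variable.
            total += abs(float(x) - float(y)) / ranges[j] if ranges[j] > 0 else 0.0
        else:
            # Categorical: 0 when the categories match, 1 otherwise.
            total += 0.0 if x == y else 1.0
    return total / len(a)


# Hypothetical subjects: (age in years, presenting symptom, angiomyolipoma present).
subject_a = (35.0, "pneumothorax", "yes")
subject_b = (55.0, "dyspnoea", "yes")
distance = gower_distance(subject_a, subject_b, ranges={0: 40.0})
print(distance)
```

Here the age difference contributes 20/40, the differing presentation contributes 1 and the shared angiomyolipoma status contributes 0, giving a mean of 0.5.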

Statistical analysis

Data were tested for normality using the Shapiro–Wilk test. Parametric data were analysed using the unpaired two-tailed t-test or one-way ANOVA and nonparametric data were analysed using Kruskal–Wallis or Mann–Whitney tests. Categorical data were analysed by the Chi-squared test or Fisher's exact test. Kolmogorov–Smirnov tests were used to determine whether two datasets had different distributions. Survival analysis was performed using Kaplan–Meier analysis and the Mantel–Cox test. Data were analysed in Excel (Microsoft, Redmond, WA, USA) and Prism version 7.03 (GraphPad, San Diego, CA, USA).
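For readers unfamiliar with the Kaplan–Meier method, the product-limit estimate can be sketched in a few lines. This is an illustrative implementation with invented follow-up data, not the Prism analysis used in the study. It follows the usual convention that subjects censored at a given time are still counted as at risk at that time.

```python
from typing import List, Sequence, Tuple


def kaplan_meier(times: Sequence[float], events: Sequence[bool]) -> List[Tuple[float, float]]:
    """Return (time, survival probability) at each distinct time with at least one event.

    `events[i]` is True when subject i had the event at `times[i]`, and False
    when follow-up ended without the event (censored).
    """
    if len(times) != len(events):
        raise ValueError("times and events must have the same length")
    at_risk = len(times)
    survival = 1.0
    curve: List[Tuple[float, float]] = []
    for t in sorted(set(times)):
        n_events = sum(1 for ti, e in zip(times, events) if ti == t and e)
        n_leaving = sum(1 for ti in times if ti == t)  # events plus censored at t
        if n_events:
            survival *= 1.0 - n_events / at_risk
            curve.append((t, survival))
        at_risk -= n_leaving
    return curve


# Hypothetical follow-up in years; False marks a censored subject.
curve = kaplan_meier([1, 2, 2, 3, 4], [True, False, True, True, False])
for t, s in curve:
    print(f"t={t}: S(t)={s:.2f}")
```

The Mantel–Cox (log-rank) comparison between clusters is not shown.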

Results

Cluster model development

Complete demographic, presentation and phenotype data were available for all discovery cohort subjects, and treatment, disease activity and oestrogen exposure for >90%. Serum VEGF-D and bronchodilator response data were available for 74% and 61% of subjects, respectively (table 1). Data distribution of missing variables imputed using MICE, RF and MICE+RF did not differ from the original distributions; data imputed from MICE was used (supplementary figure S1).

Two clusters provided optimal separation of factors between groups by majority voting (figure 2 and supplementary table S1). Three clusters also proved clinically useful. Of the five machine learning techniques using five-fold cross-validation repeated 10 times, naive Bayes delivered the strongest accuracy (0.98, 95% CI 0.9502–0.9964), sensitivity (1.0) and specificity (0.96) for cluster assignment, and was used henceforth (supplementary table S2 and figure S2). Three classification models were developed, two comprising two clusters and one comprising three clusters. The initial two-cluster model was based on multiple clustering algorithms, with variables based on feature selection techniques. The alternative two-cluster model used multiple clustering algorithms, with variables based on statistical tests. While both models produced similar groupings, the latter separated subjects using fewer terms, was more effective at predicting complications and is reported henceforth. The three-cluster model was based on hierarchical and k-means clustering, with selected variables based on statistics comparing clusters. Subjects were assigned to the cluster for which the output probability was between 0.5 and 1.0.
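How a naive Bayes classifier turns a subject's features into a posterior probability for each cluster can be sketched as below. This is a simplified illustration restricted to yes/no features; the study's model also used continuous variables such as age. The priors and likelihoods are invented and are not the fitted parameters, and every likelihood must lie strictly between 0 and 1.

```python
import math
from typing import Dict


def naive_bayes_posterior(
    features: Dict[str, bool],
    priors: Dict[str, float],
    likelihoods: Dict[str, Dict[str, float]],
) -> Dict[str, float]:
    """Posterior P(cluster | features), assuming features are independent within a cluster.

    likelihoods[cluster][feature] is P(feature is present | cluster).
    """
    log_scores: Dict[str, float] = {}
    for cluster, prior in priors.items():
        log_p = math.log(prior)
        for name, present in features.items():
            p_present = likelihoods[cluster][name]
            log_p += math.log(p_present if present else 1.0 - p_present)
        log_scores[cluster] = log_p
    # Normalise in log space to avoid underflow when many features are used.
    top = max(log_scores.values())
    unnormalised = {c: math.exp(s - top) for c, s in log_scores.items()}
    total = sum(unnormalised.values())
    return {c: v / total for c, v in unnormalised.items()}


# Hypothetical parameters, for illustration only.
PRIORS = {"cluster 1": 0.5, "cluster 2": 0.5}
LIKELIHOODS = {
    "cluster 1": {"presented_with_pneumothorax": 0.6, "presented_with_dyspnoea": 0.2},
    "cluster 2": {"presented_with_pneumothorax": 0.1, "presented_with_dyspnoea": 0.7},
}
subject = {"presented_with_pneumothorax": True, "presented_with_dyspnoea": False}

posterior = naive_bayes_posterior(subject, PRIORS, LIKELIHOODS)
assigned = max(posterior, key=posterior.get)
print(assigned, round(posterior[assigned], 3))
```

With two clusters the larger posterior is always at least 0.5, which is consistent with assignment probabilities lying between 0.5 and 1.0.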

Two-cluster model

Thirteen input variables divided subjects into clusters comprising 51% and 49% of the discovery cohort (table 2). The most informative factors discriminating clusters were age at first LAM symptom (p=7.6×10⁻⁷), age at assessment (p=4×10⁻¹⁴), presentation with dyspnoea (p=0.00001), pneumothorax (p=0.00001), angiomyolipoma (p=0.00001) or as a chance finding (p=0.00001), ever experiencing pneumothorax (p=0.00001) or angiomyolipoma (p=0.00017) and baseline T_LCO (p=0.0097) (supplementary figure S3). Cluster 1 comprised younger women with earlier onset disease, predominantly presenting with pneumothorax or angiomyolipoma that had often required intervention, whereas lymphatic manifestations were uncommon. Subjects in cluster 2 were on average 10 years older, tended to present with dyspnoea, and had more lymphatic complications and larger defects in gas transfer (lower T_LCO and post-walk arterial oxygen saturation). Pneumothorax was infrequent and although many had angiomyolipomas, these seldom required intervention (table 2, supplementary tables S3 and S4, and supplementary figure S4).

TABLE 2

Discriminating features of the two-cluster model

Three-cluster model

In the three-cluster system, cluster 1 comprised 69% of subjects, who were most likely to present with dyspnoea or pneumothorax and had moderately impaired lung function. Cluster 2 comprised 22% of subjects, who very commonly presented with angiomyolipoma-related problems rather than respiratory symptoms, and had a higher prevalence of TSC and better lung function than cluster 1. Cluster 3 comprised only 9% of subjects, who were older at presentation with more recent symptom onset; they presented either with respiratory symptoms other than breathlessness or pneumothorax, or without LAM symptoms after investigations for other issues. In cluster 3, pneumothorax was very infrequent and lung function almost normal (table 3, figure 3, supplementary figure S3, and supplementary tables S5 and S6).

TABLE 3

Discriminating factors of the three-cluster model

FIGURE 3

Features of the three-cluster model. FEV₁: forced expiratory volume in 1 s; T_LCO: transfer factor of the lung for carbon monoxide; S_aO₂: arterial oxygen saturation; CT: computed tomography. a) Distribution of age, age at first symptom, FEV₁ % pred and T_LCO % pred at baseline, and hypoxia during exertion in the three-cluster model. b) Representative CT imaging of subjects from clusters 1, 2 and 3. Cluster 1 subject presented at age 36 years with pneumothorax (grey arrow). Cluster 2 subject presented at age 35 years with ruptured angiomyolipoma requiring embolisation (black arrow). Cluster 3 subject was diagnosed at age 37 years after a lymphatic mass (white arrow) was detected during a CT scan performed for another indication.

Cluster validation

To determine if these clusters could be reproduced in other populations, we used subjects recruited in a different country and time period from the discovery cohort. The NHLBI cohort was slightly younger with better lung function than the UK cohort and angiomyolipoma was less common, although other clinical characteristics were similar. Age at diagnosis was used in place of age at first symptom. Applying the algorithm without imputation of missing data reproduced both models with a similar level of differentiation other than for angiomyolipoma (figure 4, and supplementary tables S7 and S8).

FIGURE 4

Cluster validation analyses. T_LCO: transfer factor of the lung for carbon monoxide; FEV₁: forced expiratory volume in 1 s; TSC: tuberous sclerosis complex; NHLBI: National Heart, Lung, and Blood Institute; S_aO₂: arterial oxygen saturation. a) Comparison of variable distribution in the UK and NHLBI cohorts for the three-cluster model. Clusters are represented by the percentage of positive subjects for each variable within that cluster in the two cohorts. ^#: presenting symptom; ^¶: feature ever present. b) Effect of missing data upon cluster assignment. 112 subjects from the UK cohort with complete data were assigned to clusters and then reassigned with each variable removed in turn. The heatmap is red for correctly assigned subjects (columns) and pink when omission of that variable (rows) led to mis-assignment to cluster 1, grey to cluster 2 and yellow to cluster 3. Subjects for each cluster are ranked according to strength of assignment (posterior prediction) to the cluster from 1.0 (strong) to 0.5 (weak) left to right along the x-axis.

The effect of missing data on cluster assignment was examined by running the clustering algorithm with single factors omitted. Running the three-cluster model using 112 UK subjects for whom all factors were available was compared with sequential removal of each factor. Omission of factors resulted in misclassifications in a median (range) of 0.7% (0–7.1%) of subjects in cluster 1, 5.4% (0–38%) of subjects in cluster 2 and 8.3% (0–17%) of subjects in cluster 3. The chance of misclassification was greater where the original clustering probability was closer to 0.5 than 1.0 and with omission of factors with the greatest contribution to cluster separation, such as age at first symptom (figure 4, and supplementary figures S5 and S6).
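The omission analysis above can be sketched as a loop that withholds one variable at a time and counts how many subjects change cluster. In the sketch below, `toy_classify` is a deliberately crude stand-in with invented thresholds; it is not the study's naive Bayes model and has no clinical meaning. The subjects are also invented.

```python
from statistics import median
from typing import Callable, Dict, Sequence

Subject = Dict[str, float]


def omission_misclassification(
    subjects: Sequence[Subject],
    classify: Callable[[Subject], int],
    variables: Sequence[str],
) -> Dict[str, float]:
    """Percentage of subjects whose assigned cluster changes when each variable is withheld."""
    baseline = [classify(s) for s in subjects]  # assignment with complete data
    rates: Dict[str, float] = {}
    for var in variables:
        changed = 0
        for subject, original in zip(subjects, baseline):
            reduced = {k: v for k, v in subject.items() if k != var}
            if classify(reduced) != original:
                changed += 1
        rates[var] = 100.0 * changed / len(subjects)
    return rates


def toy_classify(subject: Subject) -> int:
    """Hypothetical voting rule that tolerates missing variables (illustration only)."""
    score = 0
    if "age" in subject:
        score += 1 if subject["age"] >= 45 else -1
    if "pneumothorax" in subject:
        score += -1 if subject["pneumothorax"] else 1
    return 2 if score > 0 else 1


# Invented subjects with complete data.
cohort = [
    {"age": 30, "pneumothorax": 1},
    {"age": 55, "pneumothorax": 0},
    {"age": 50, "pneumothorax": 1},
    {"age": 40, "pneumothorax": 0},
    {"age": 60, "pneumothorax": 1},
]
rates = omission_misclassification(cohort, toy_classify, ["age", "pneumothorax"])
print(rates, "median:", median(rates.values()))
```

Summarising the per-variable percentages by their median and range gives figures in the same form as those reported for each cluster.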

Association of clusters with clinical outcomes

To determine if the models could be used to predict outcomes, we examined lung function decline and disease-related complications prospectively from the point of cluster assignment. Survival was assessed from diagnosis. As rapamycin reduces lung function decline, rapamycin-treated and untreated subjects were examined separately. Serial lung function data spanning a mean±sd of 54±36 and 38±17 months were available for 112 UK and 174 NHLBI subjects, respectively, who had not received rapamycin, and for 81 UK subjects treated with rapamycin for 45±30 months. There were no significant differences between clusters in rate of loss of FEV₁ or T_LCO using either model for untreated or rapamycin-treated subjects (figure 5a and b, and supplementary tables S9 and S10).

FIGURE 5

Prospective clinical outcomes stratified by cluster. FEV₁: forced expiratory volume in 1 s; T_LCO: transfer factor of the lung for carbon monoxide; NHLBI: National Heart, Lung, and Blood Institute. a, b) Rate of change of a) FEV₁ (ΔFEV₁) and b) T_LCO (ΔT_LCO) for subjects in the UK and NHLBI cohorts combined who were not being treated with rapamycin stratified using the two- and three-cluster models. Data are presented as mean and standard error. Values within bars are the number of subjects with lung function data available for analysis. None of the differences between clusters in the models was significant. c) Kaplan–Meier analysis of the prospective risk of pneumothorax following cluster assignment in the UK and NHLBI cohorts combined for the two-cluster model. Those in cluster 1 have a 3.3-fold higher risk of pneumothorax, independent of prior treatment for pneumothorax, compared with those in cluster 2. d) Kaplan–Meier analysis of the combined risk of death or need for lung transplantation since diagnosis in the NHLBI cohort stratified using the three-cluster model.

UK subjects were screened for angiomyolipoma at baseline and tumours were monitored using a standardised protocol [21]. Risk of angiomyolipoma intervention was examined irrespective of treatment with rapamycin. Using the two-cluster model, risk of intervention was 0.059 per patient-year after assignment to cluster 1 and 0.025 per patient-year after assignment to cluster 2 (p<0.00001). In the three-cluster model, despite a high prevalence of angiomyolipoma in clusters 2 and 3, the need for intervention in these clusters was significantly lower than in cluster 1 (p<0.00001) (supplementary table S11).
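A rate per patient-year is the number of events divided by the total follow-up time contributed by the group. The short sketch below shows the calculation and the ratio implied by the two reported rates. The event count and follow-up total in the worked example are invented, as the underlying counts are not given in the text.

```python
def incidence_rate(events: int, patient_years: float) -> float:
    """Events per patient-year of follow-up."""
    if patient_years <= 0:
        raise ValueError("patient_years must be positive")
    return events / patient_years


# Rates reported in the text for the two-cluster model (interventions per patient-year).
rate_cluster_1 = 0.059
rate_cluster_2 = 0.025
rate_ratio = rate_cluster_1 / rate_cluster_2

# Hypothetical worked example: 6 interventions over 120 patient-years of follow-up.
example_rate = incidence_rate(events=6, patient_years=120.0)

print(f"rate ratio, cluster 1 vs cluster 2: {rate_ratio:.2f}")
print(f"example rate: {example_rate:.3f} per patient-year")
```

On the reported figures, the intervention rate after assignment to cluster 1 is roughly 2.4 times that after assignment to cluster 2.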

Future risk of pneumothorax was greatest in cluster 1 using both models in both cohorts (supplementary figure S7). The two-cluster model had the best predictive power, where combining all subjects showed the risk of pneumothorax was 3.3 (95% CI 1.7–5.6)-fold greater in cluster 1 than cluster 2 (p=0.0002) (figure 5c).

Survival and transplant data were available for 166 patients in the NHLBI cohort. Over a mean follow-up of 14 years from cluster assignment and up to 33 years from diagnosis, 38 patients had required lung transplantation and 14 had died. Time to the combined end-point of death or transplant was similar between clusters in the two-cluster model (supplementary figure S8). In the three-cluster model the incidence of death or transplant was 41.7% in cluster 1, 0% in cluster 2 and 4.2% in cluster 3 (p=0.0045) (figure 5d and supplementary table S12).

Discussion

By applying machine learning to carefully characterised clinical cohorts we have identified groups of related factors which are together associated with outcomes in women with LAM. While clinicians, and indeed patients, have recognised some associations between disease-related manifestations, our data for the first time allow us to quantify the risk of complications, improve prognostic advice and work toward stratified care. Separation into three clusters identifies a large cluster tending to present with pneumothorax or dyspnoea. Women in the second cluster are on average 5 years younger with a high prevalence of angiomyolipoma symptoms and TSC. Women in cluster 3, while comprising only 9% of subjects, presented 10–15 years later than those in clusters 1 and 2 with nonclassical or no symptoms, did not experience pneumothorax and tended to have almost normal lung function. Cluster 1 represents the classic description of women with LAM, presenting in their mid-30s with dyspnoea or pneumothorax and airflow obstruction. Cluster 2, where angiomyolipoma haemorrhage or TSC is the first clue to the presence of LAM and respiratory disease, is less severe. The third cluster represents an increasingly recognised group with milder disease who present at an older age with nonclassical symptoms, including haemoptysis and cough, or without LAM symptoms. We consider our findings are widely applicable and robust as we were able to independently replicate clusters, and although accuracy was reduced somewhat by missing data, the factors required for clustering are available in routine practice. Factors less commonly measured and requiring imputation in the initial analysis, including exertional hypoxaemia, bronchodilator reversibility and VEGF-D, were not required for clustering.

The importance of our findings lies in the differences in clinical manifestations, complications and outcomes between clusters. Women with LAM present at varying ages with different symptoms, lung function and menopausal status. Current guidelines do not give guidance on risk of complications or survival and patients with markedly differing disease may receive similar clinical advice [18, 19, 23]. Applying the methodology described here could allow clinical advice and decision making to be improved. Those assigned to clusters 2 and 3 presenting in their 50s or later could be reassured that their lifespan is unlikely to be shortened by LAM. The risk of pneumothorax is a common concern [17] and applying the two-cluster model can better quantify this risk, with individuals in cluster 1 having a 10% 1-year and 43% 5-year risk of pneumothorax compared with 0% and 15%, respectively, in cluster 2. Such data could be used both to improve patient advice and to inform discussions on the need for preventative surgery. Despite a higher prevalence of angiomyolipoma in clusters 2 and 3, the risk of an intervention during follow-up is lower than in cluster 1 and the need for surveillance may be less in these groups. This reflects the differing natural history of angiomyolipoma across the clusters, with cluster 2 and to a lesser extent cluster 3 more likely to present with angiomyolipoma and need intervention than cluster 1, meaning enlarging and symptomatic tumours have already been treated. The absence of presentation with angiomyolipoma symptoms in cluster 1, despite an angiomyolipoma prevalence approaching 50%, suggests that angiomyolipoma is often overlooked in this group and makes intervention more likely in these newly identified tumours.

The use of unsupervised machine learning informs us which variables are important in phenotyping subjects and which potentially underlie or report upon disease progression. Input variables were chosen for their potential relevance to LAM based on disease manifestations and previous literature. These features included mode and age of presentation, existing clinical manifestations and their severity, oestrogen exposure, and pattern of lung physiology. The strongest factors separating clusters were age at first symptom and age at time of assessment. We are unable to say whether clusters represent discrete endotypes: clusters may reflect differences in disease activity with lead-time bias separating subjects presenting earlier due to pneumothorax or angiomyolipoma rather than later with dyspnoea. However, as rate of FEV₁ decline, the best-documented marker of disease activity [9, 10, 24], is similar in all clusters, and clusters have separate disease manifestations suggesting differences in organ involvement, it seems likely the clusters represent discrete endotypes. In either case, assigning women with LAM to these clusters may be clinically useful. The molecular and cellular processes underlying differences between clusters are not clear, and further work examining biomarkers and histological features within the clusters is required [25–27]. This initial study shows that machine learning can be applied to the relatively small datasets provided by rare lung diseases using only basic clinical data. Improvements in imaging and biomarker development mean that these variables could be factored into future models which may further improve predictive accuracy.

Our findings are based on two of the largest and best-categorised cohorts of women with LAM reported, yet despite using unbiased methodology the study has some limitations. Cluster 3 in both cohorts comprised a relatively small number of subjects that may have some inbuilt survivor bias. Some variables require further assessment. For example, pre-menopausal status has been associated with accelerated loss of lung function, yet menopausal status was not a strong differentiator between clusters, and rates of decline of FEV₁ and T_LCO were similar between clusters despite differing proportions of pre-menopausal women. Age was a strong determinant of cluster assignment and, as menopausal status and age are related, menopausal status may still contribute to some of these differences and should continue to be a factor in clinical decisions. Due to differences in data recording between the UK and USA we were unable to reproduce all data, particularly for angiomyolipoma. Since the NHLBI cohort closed, rapamycin has become the standard of care for those with progressive disease [23] and has improved outcomes. How rapamycin affects different clusters and how clustering may inform the decision to use rapamycin should be studied prospectively, including using data from the ongoing Multicenter Interventional Lymphangioleiomyomatosis Early Disease Trial (ClinicalTrials.gov: NCT03150914). Our study was not designed to predict the need for therapy; however, it could be argued that those in cluster 1 should already be considered for early treatment with mammalian target of rapamycin inhibitors to prevent further loss of lung function.

In conclusion, we have used machine learning techniques to stratify women with LAM into clusters using simple clinical data. The method has the potential to improve advice on disease trajectory, complications and screening. Further prospective studies are warranted to determine if this can be translated to improve management for women with LAM.

Supplementary material

Please note: supplementary material is not edited by the Editorial Office, and is uploaded as it has been supplied by the author.

Supplementary material ERJ-03036-2020.SUPPLEMENT

Acknowledgements

We are grateful to the original NHLBI cohort investigators, the women with LAM who contributed to the study and Anne Tattersfield (Emeritus Professor of Respiratory Medicine, University of Nottingham, Nottingham, UK) for critical reading of the manuscript.

Footnotes

This article has supplementary material available from erj.ersjournals.com
Data sharing: De-identified participant data for the US National Heart, Lung, and Blood Institute cohort are available from the National Disease Research Interchange (https://ndriresource.org) on request according to their terms. Individual-level data from the current UK cohort, even when anonymised, are potentially recognisable within the community and are not being made available. Code to run the clustering protocols is available on GitHub (www.github.com/nmpn/lam-stratification) without restriction.
Author contributions: S. Chernbumroong and J.M. Garibaldi performed the machine learning analysis. J. Johnson extracted clinical data. S. Miller performed laboratory analyses. F.X. McCormack and N. Gupta analysed and provided the NHLBI survival data. S.R. Johnson conceived the study, obtained the funding, saw the UK patients, performed data analysis and wrote the manuscript. All authors contributed to the final manuscript.
Conflict of interest: S. Chernbumroong has nothing to disclose.
Conflict of interest: J. Johnson has nothing to disclose.
Conflict of interest: N. Gupta has nothing to disclose.
Conflict of interest: S. Miller reports grants from British Lung Foundation, outside the submitted work.
Conflict of interest: F.X. McCormack has nothing to disclose.
Conflict of interest: J.M. Garibaldi has nothing to disclose.
Conflict of interest: S.R. Johnson reports grants from the National Institute for Health Research, The LAM Foundation and LAM Action, during the conduct of the study; personal fees for advisory board work from Pfizer, outside the submitted work.
Support statement: The study was funded by the Nottingham Medical Research Council/Engineering and Physical Sciences Research Council Molecular Pathology Node and the National Institute for Health Research Rare Disease Translational Research Consortium. Funding information for this article has been deposited with the Crossref Funder Registry.

Received August 5, 2020.
Accepted November 17, 2020.

https://www.ersjournals.com/user-licence

References

1. Johnson SR, Taveira-DaSilva AM, Moss J. Lymphangioleiomyomatosis. Clin Chest Med 2016; 37: 389–403. doi:10.1016/j.ccm.2016.04.002
2. Harknett EC, Chang WYC, Byrnes S, et al. Regional and national variability suggests underestimation of prevalence of lymphangioleiomyomatosis. Q J Med 2011; 104: 971–979. doi:10.1093/qjmed/hcr116
3. Carel H, Johnson S, Gamble L. Living with lymphangioleiomyomatosis. BMJ 2010; 340: c848. doi:10.1136/bmj.c848
4. Taveira-DaSilva AM, Steagall WK, Rabel A, et al. Reversible airflow obstruction in lymphangioleiomyomatosis. Chest 2009; 136: 1596–1603. doi:10.1378/chest.09-0624
5. Johnson J, Johnson SR. Cross-sectional study of reversible airway obstruction in LAM: better evidence is needed for bronchodilator and inhaled steroid use. Thorax 2019; 74: 999–1002. doi:10.1136/thoraxjnl-2019-213338
6. Yeoh Z, Navaratnam V, Bhatt R, et al. Natural history of angiomyolipoma in lymphangioleiomyomatosis: implications for screening and surveillance. Orphanet J Rare Dis 2014; 9: 151. doi:10.1186/s13023-014-0151-3
7. Ryu JH, Moss J, Beck GJ, et al. The NHLBI Lymphangioleiomyomatosis Registry: characteristics of 230 patients at enrollment. Am J Respir Crit Care Med 2006; 173: 105–111. doi:10.1164/rccm.200409-1298OC
8. Johnson SR, Tattersfield AE. Clinical experience of lymphangioleiomyomatosis in the UK. Thorax 2000; 55: 1052–1057. doi:10.1136/thorax.55.12.1052
9. Taveira-DaSilva AM, Stylianou MP, Hedin CJ, et al. Decline in lung function in patients with lymphangioleiomyomatosis treated with or without progesterone. Chest 2004; 126: 1867–1874. doi:10.1378/chest.126.6.1867
10. Johnson SR, Tattersfield AE. Decline in lung function in lymphangioleiomyomatosis: relation to menopause and progesterone treatment. Am J Respir Crit Care Med 1999; 160: 628–633. doi:10.1164/ajrccm.160.2.9901027
11. Gupta N, Lee H-S, Young LR, et al. Analysis of the MILES cohort reveals determinants of disease progression and treatment response in lymphangioleiomyomatosis. Eur Respir J 2019; 53: 1802066. doi:10.1183/13993003.02066-2018
12. Young LR, Lee H-S, Inoue Y, et al. Serum VEGF-D concentration as a biomarker of lymphangioleiomyomatosis severity and treatment response: a prospective analysis of the Multicenter International Lymphangioleiomyomatosis Efficacy of Sirolimus (MILES) trial. Lancet Respir Med 2013; 1: 445–452. doi:10.1016/S2213-2600(13)70090-0
13. Le K, Steagall WK, Stylianou M, et al. Effect of beta-agonists on LAM progression and treatment. Proc Natl Acad Sci USA 2018; 115: E944. doi:10.1073/pnas.1719960115
14. Miller S, Coveney C, Johnson J, et al. The vitamin D binding protein axis modifies disease severity in lymphangioleiomyomatosis. Eur Respir J 2018; 52: 1800951. doi:10.1183/13993003.00951-2018
15. Lazor R, Valeyre D, Lacronique J, et al. Low initial KCO predicts rapid FEV₁ decline in pulmonary lymphangioleiomyomatosis. Respir Med 2004; 98: 536–541. doi:10.1016/j.rmed.2003.11.013
16. Cohen MM, Pollock-BarZiv S, Johnson SR. Emerging clinical picture of lymphangioleiomyomatosis. Thorax 2005; 60: 875–879. doi:10.1136/thx.2004.035154
17. Young LR, Almoosa KF, Pollock-BarZiv S, et al. Patient perspectives on management of pneumothorax in lymphangioleiomyomatosis. Chest 2006; 129: 1267–1273. doi:10.1378/chest.129.5.1267
18. Johnson SR, Cordier JF, Lazor R, et al. European Respiratory Society guidelines for the diagnosis and management of lymphangioleiomyomatosis. Eur Respir J 2010; 35: 14–26. doi:10.1183/09031936.00076209
19. Gupta N, Finlay GA, Kotloff RM, et al. Lymphangioleiomyomatosis diagnosis and management: high-resolution chest computed tomography, transbronchial lung biopsy, and pleural disease management. An Official American Thoracic Society/Japanese Respiratory Society Clinical Practice Guideline. Am J Respir Crit Care Med 2017; 196: 1337–1348. doi:10.1164/rccm.201709-1965ST
20. Miller MR, Crapo R, Hankinson J, et al. General considerations for lung function testing. Eur Respir J 2005; 26: 153–161. doi:10.1183/09031936.05.00034505
21. Bee J, Bhatt R, McCafferty I, et al. A 4-year prospective evaluation of protocols to improve clinical outcomes for patients with lymphangioleiomyomatosis in a national clinical centre. Thorax 2015; 70: 1204. doi:10.1136/thoraxjnl-2015-207171
22. Bee J, Fuller S, Miller S, et al. Lung function response and side effects to rapamycin for lymphangioleiomyomatosis: a prospective national cohort study. Thorax 2018; 73: 369–375. doi:10.1136/thoraxjnl-2017-210872
23. McCormack FX, Gupta N, Finlay GR, et al. Official American Thoracic Society/Japanese Respiratory Society Clinical Practice Guidelines: lymphangioleiomyomatosis diagnosis and management. Am J Respir Crit Care Med 2016; 194: 748–761. doi:10.1164/rccm.201607-1384ST
24. McCormack FX, Inoue Y, Moss J, et al. Efficacy and safety of sirolimus in lymphangioleiomyomatosis. N Engl J Med 2011; 364: 1595–1606. doi:10.1056/NEJMoa1100391
25. Miller S, Stewart ID, Clements D, et al. Evolution of lung pathology in lymphangioleiomyomatosis: associations with disease course and treatment response. J Pathol Clin Res 2020; 6: 215–226. doi:10.1002/cjp2.162
26. Osterburg AR, Nelson RL, Yaniv BZ, et al. NK cell activating receptor ligand expression in lymphangioleiomyomatosis is associated with lung function decline. JCI Insight 2016; 1: e87270. doi:10.1172/jci.insight.87270
27. Lamattina AM, Poli S, Kidambi P, et al. Serum endostatin levels are associated with diffusion capacity and with tuberous sclerosis-associated lymphangioleiomyomatosis. Orphanet J Rare Dis 2019; 14: 72. doi:10.1186/s13023-019-1050-4

Vol 57 Issue 6 Table of Contents
