PT - JOURNAL ARTICLE
AU - Orla M. Doyle
AU - Roald van der Laan
AU - Marko Obradovic
AU - Peter McMahon
AU - Flora Daniels
AU - Ashley Pitcher
AU - Michael R. Loebinger
TI - Identification of potentially undiagnosed patients with nontuberculous mycobacterial lung disease using machine learning applied to primary care data in the UK
AID - 10.1183/13993003.00045-2020
DP - 2020 Jan 01
TA - European Respiratory Journal
PG - 2000045
4099 - http://erj.ersjournals.com/content/early/2020/04/29/13993003.00045-2020.short
4100 - http://erj.ersjournals.com/content/early/2020/04/29/13993003.00045-2020.full
AB - Nontuberculous mycobacterial lung disease (NTMLD) is a rare lung disease often missed due to a low index of suspicion and unspecific clinical presentation. This retrospective study was designed to characterise the pre-diagnosis features of NTMLD patients in primary care and to assess the feasibility of using machine learning (ML) to identify undiagnosed NTMLD patients. IQVIA Medical Research Data (IMRD; incorporating THIN, a Cegedim Database), a UK electronic medical records primary care database, was used. NTMLD patients were identified between 2003 and 2017 by diagnosis in primary or secondary care or record of NTMLD treatment regimen. Risk factors and treatments were extracted in the pre-diagnosis period, guided by literature and expert clinical opinion. The control population was enriched to have at least one of these features. A total of 741 NTMLD and 112 784 control patients were selected. Annual prevalence rates of NTMLD from 2006 to 2016 increased from 2.7 to 5.1 per 100 000. The most common pre-existing diagnoses and treatments for NTMLD patients were chronic obstructive pulmonary disease, asthma, penicillin, macrolides and inhaled corticosteroids. Compared to random testing, ML improved detection of patients with NTMLD by almost a thousand-fold with AUC of 0.94.
The total prevalence of diagnosed and undiagnosed cases of NTMLD in 2016 was estimated to range between 9 and 16 per 100 000. This study supports the feasibility of ML applied to primary care data to screen for undiagnosed NTMLD patients, with results indicating that there may be a substantial number of undiagnosed cases of NTMLD in the UK.
Footnotes
This manuscript has recently been accepted for publication in the European Respiratory Journal. It is published here in its accepted form prior to copyediting and typesetting by our production team. After these production processes are complete and the authors have approved the resulting proofs, the article will move to the latest issue of the ERJ online.
Conflict of interest: Dr. Doyle reports other from Insmed, during the conduct of the study; and is employed by IQVIA.
Conflict of interest: Dr. van der Laan reports other from Insmed, during the conduct of the study; and is employed by Insmed.
Conflict of interest: Dr. Obradovic reports other from Insmed, during the conduct of the study; and is employed by Insmed.
Conflict of interest: Dr. McMahon reports other from Insmed, during the conduct of the study; and is employed by IQVIA.
Conflict of interest: Dr. Daniels reports other from Insmed, during the conduct of the study; and is employed by IQVIA.
Conflict of interest: Dr. Pitcher reports other from Insmed, during the conduct of the study; and is employed by IQVIA.
Conflict of interest: Dr. Loebinger reports personal fees and other from Insmed, during the conduct of the study; personal fees from Savara, personal fees from Grifols, personal fees from Bayer, personal fees from AstraZeneca, personal fees from Polyphor, personal fees from Meiji, outside the submitted work.