TY  - JOUR
T1  - Identification of potentially undiagnosed patients with nontuberculous mycobacterial lung disease using machine learning applied to primary care data in the UK
JF  - European Respiratory Journal
JO  - Eur Respir J
DO  - 10.1183/13993003.00045-2020
VL  - 56
IS  - 4
SP  - 2000045
AU  - Orla M. Doyle
AU  - Roald van der Laan
AU  - Marko Obradovic
AU  - Peter McMahon
AU  - Flora Daniels
AU  - Ashley Pitcher
AU  - Michael R. Loebinger
Y1  - 2020/10/01
UR  - http://erj.ersjournals.com/content/56/4/2000045.abstract
N2  - Nontuberculous mycobacterial lung disease (NTMLD) is a rare lung disease often missed due to a low index of suspicion and unspecific clinical presentation. This retrospective study was designed to characterise the prediagnosis features of NTMLD patients in primary care and to assess the feasibility of using machine learning to identify undiagnosed NTMLD patients. IQVIA Medical Research Data (incorporating THIN, a Cegedim Database), a UK electronic medical records primary care database was used. NTMLD patients were identified between 2003 and 2017 by diagnosis in primary or secondary care or record of NTMLD treatment regimen. Risk factors and treatments were extracted in the prediagnosis period, guided by literature and expert clinical opinion. The control population was enriched to have at least one of these features. 741 NTMLD and 112 784 control patients were selected. Annual prevalence rates of NTMLD from 2006 to 2016 increased from 2.7 to 5.1 per 100 000. The most common pre-existing diagnoses and treatments for NTMLD patients were COPD and asthma and penicillin, macrolides and inhaled corticosteroids. Compared to random testing, machine learning improved detection of patients with NTMLD by almost a thousand-fold with AUC of 0.94.
The total prevalence of diagnosed and undiagnosed cases of NTMLD in 2016 was estimated to range between 9 and 16 per 100 000. This study supports the feasibility of machine learning applied to primary care data to screen for undiagnosed NTMLD patients, with results indicating that there may be a substantial number of undiagnosed cases of NTMLD in the UK. Compared to random testing, machine learning improved detection of undiagnosed patients with NTMLD by almost a thousand-fold with AUC of 0.94 supporting the feasibility of using machine learning applied to primary care data to screen for undiagnosed NTMLD patients https://bit.ly/2WmT5nZ
ER  -