%0 Journal Article
%A Marjolein Engelkes
%A Zubair Afzal
%A Hettie Janssens
%A Jan Kors
%A Martijn Schuemie
%A Katia Verhamme
%A Miriam Sturkenboom
%T Automated identification of asthma patients within an electronical medical record database using machine learning
%D 2012
%J European Respiratory Journal
%P P4655
%V 40
%N Suppl 56
%X Background: Use of electronic medical record (EMR) databases for epidemiological research on asthma/COPD is increasing. A key challenge in using these huge databases is disease validation. The conventional method is labor intensive and often non-systematic. One strategy to address this is the use of machine learning (ML) to identify cases. Aim: To investigate the performance of ML in the automated identification of children with asthma. Methods: From the IPCI database, a GP database with medical records of >1 million patients, all potential asthma patients, aged 6-18 years between 2000 and 2011, were identified with a broad automated search on asthma codes, free text and asthma drugs. First, a random sample (n=5039) of all potential cases (n=64327) was manually reviewed by 2 MDs and categorized according to a predefined algorithm. Second, based on this sample set, ML recognizes complex patterns to automatically generate decision trees for case identification. Training and testing were done by 5-fold cross-validation. Results: The sample set consisted of 6% definite, 24% probable, 2% doubtful cases and 68% non-cases. Depending on the sampling strategy, the positive predictive value (PPV) ranged from 0.11 to 0.26, sensitivity (Sn) from 0.57 to 0.94 and specificity (Sp) from 0.52 to 0.89 for definite cases (diagnosis by specialist). For probable cases (diagnosis by GP), PPV ranged from 0.49 to 0.51, Sn from 0.84 to 0.86 and Sp from 0.69 to 0.73. Conclusion: ML for automatic identification of asthma cases in a huge EMR database performs well. The optimal ML method depends on the research question, e.g. incidence/prevalence studies require a method with a high Sn, while outcome studies require a high Sp.
%U https://erj.ersjournals.com/content/erj/40/Suppl_56/P4655.full.pdf