Abstract
We aimed to estimate the population prevalence of obstructive sleep apnoea (OSA) in an urban community of German third graders (age range 7.3–12.4 yrs) and the diagnostic test accuracy of two OSA screening methods.
Using a cross-sectional study design with a multi-stage sampling strategy, 27 out of 59 primary schools within the city limits of Hanover, Germany, were selected. 1,144 third graders were screened for symptoms and signs of OSA using questionnaires and nocturnal home pulse oximetry. 183 children underwent abbreviated nocturnal home polysomnography (OSA definition: apnoea/hypopnoea index ≥1) and 22 were diagnosed to suffer from OSA.
In general, sensitivity for both screening methods was low (<0.6), while specificity was moderately high (mostly >0.7). Independent predictors for OSA were body mass index, history of allergy, a composite questionnaire score, and two oximetry-based criteria. Based on these variables and logistic regression, a prediction model (accuracy; 95% confidence interval: 0.86; 0.71–0.94) was constructed and applied to children who had not successfully undergone polysomnography. This resulted in nine additional OSA cases and an overall design-adjusted population prevalence (95% confidence interval) of 2.8% (1.5–4.1%).
Clinical and oximetry findings may be helpful for screening and predicting OSA in primary school children.
Childhood obstructive sleep apnoea (OSA) is one expression of sleep-disordered breathing (SDB) and is characterised by sleep-related episodes of partial and/or complete upper airway obstruction with or without hypoxaemia, hypercapnia and respiratory-related arousal. The episodes may be accompanied by snoring, laboured breathing, chest retraction, cyanosis and disturbed sleep 1. OSA occurs in children of all ages. It is most common in the pre-school age group, due to adenotonsillar hyperplasia. Full sleep laboratory-based polysomnography is the gold standard for diagnosing OSA in children 2. Most children with OSA will have both symptomatic and polysomnographic resolution following adenotonsillectomy 3.
Many studies attempting to estimate the prevalence of OSA in children have been undertaken 4–18. They yielded point estimates for the population prevalence of OSA ranging from 0.7% 4, 8 to 31.4% 13. However, none of these studies attempted to draw a representative sample from the population (or did not report on it) and only one 14 combined stratum-specific estimates to calculate overall population prevalence estimates. Moreover, most studies used a one-stage screening procedure with questionnaires as the only screening instrument 4, 5, 7, 8, 12, 15–18. Some studies used nonaccepted standards for diagnosing OSA 4, 12 and others used adult rather than paediatric polysomnographic criteria to diagnose OSA 6, 10.
Regarding European countries, prevalence studies on paediatric OSA have been performed in the UK 4, Iceland 5, Sweden 7, Italy 9, 12, Spain 10, Greece 15 and Turkey 18, but not yet in Germany. In 2000, the authors initiated a comprehensive community-based cross-sectional study on SDB in children (i.e. the German Study on SDB in Primary School Children) 19. Among others, the aims of this study were to obtain unbiased estimates for the population prevalence of OSA in an urban community of German third graders (age range 7.3–12.4 yrs) and to determine the diagnostic test accuracy of OSA screening methods.
METHODS
Study design, subjects and screening procedure
Details on sample size calculation, sampling strategy, comparisons for representativeness, screening methods and study procedures have been published elsewhere 19. In short, 27 of the 59 public regular primary schools located within the city limits of Hanover, Germany, were selected using a multi-stage, stratified (by socioeconomic status), probability clustered design (fig. 1). Following approval by the institutional review board and the regional directorate of education, 1,760 children attending third-grade classes were approached between February and December 2001 and 1,144 (65.0%) were enrolled. Children were included if parents gave written informed consent. Comparisons with the target population (n = 4,109) revealed good to excellent representativeness of the study sample concerning sex distribution, socioeconomic status, academic performance and doctor-diagnosed asthma 19. Children were screened twice using a widely used and partially validated parental SDB-questionnaire (SDB-Q) 20–24 and nocturnal home pulse oximetry (HPO) 25–27.
Questionnaire
The SDB-Q by Gozal 22 was adjusted to enable calculation of the OSA score according to Brouillette et al. 20 and extended with questions concerning parental education, child's demographic and anthropometric characteristics 19, daytime behaviour 28, frequent sleep problems 29 and current health status (see Appendix 28). The body mass index (BMI) was calculated using a standard formula (BMI = weight (kg)/height (m)²) and transformed into age- and sex-specific centiles using German reference values 30. Snoring was assessed with the question “Does your child snore?” and rated on a 4-point scale. Children were classified as habitual snorers if the answers were “frequently” or “always”. The OSA score according to Brouillette et al. 20, the SDB score according to Gozal 22, and an adapted SDB score according to Paditz et al. 28 were calculated. For the calculation of these scores, arbitrary numerical scores were assigned to each of the answers ranging from 0 (never), 1 (rarely) and 2 (occasionally) to 3 (frequently) and 4 (almost always). To enable calculation of scores for each single child and to achieve high sensitivity, missing answers were scored as 0 (never). This imputation method was used for the screening process and the construction of the prediction model. For estimating diagnostic test accuracy, multiple missing data imputation methods were used (see Statistical analysis). Based on questionnaires obtained between February and July 2001 (n = 671), the 95th centile for the adapted SDB score was calculated and found to be 24. Children were screened positive if they: 1) were reported to snore habitually (SDB-Q criterion 1); 2) had an OSA score ≥0 (SDB-Q criterion 2 20); or 3) had an adapted SDB score ≥24 (SDB-Q criterion 3).
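The questionnaire-based calculations described above can be sketched in Python as follows. This is only an illustrative sketch: the function names are hypothetical, and the plain unweighted sum is an assumption, since the item weights of the published scores are not reproduced in this text.

```python
from typing import Optional, Sequence


def bmi(weight_kg: float, height_m: float) -> float:
    """Body mass index: weight (kg) divided by height (m) squared."""
    return weight_kg / height_m ** 2


def summed_score(answers: Sequence[Optional[int]]) -> int:
    """Sum item answers coded 0 (never) to 4 (almost always).

    Missing answers (None) are scored as 0 (never), as in the screening
    procedure. An unweighted sum is an assumption made for illustration.
    """
    return sum(0 if a is None else a for a in answers)


def is_habitual_snorer(answer: str) -> bool:
    """Habitual snoring: the snoring item answered 'frequently' or 'always'."""
    return answer in ("frequently", "always")


def sdbq_screen_positive(habitual_snorer: bool,
                         osa_score: float,
                         adapted_sdb_score: float) -> bool:
    """Positive if any of the three SDB-Q criteria stated in the text is met."""
    return habitual_snorer or osa_score >= 0 or adapted_sdb_score >= 24
```

A child with a negative OSA score, an adapted SDB score of 10 and no habitual snoring would be screen-negative under these rules.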
Home pulse oximetry
Recordings of HPO-derived arterial haemoglobin oxygen saturation (Sp,O2) were performed overnight in the child's home using an instrument with a new generation oximeter module that was capable of storing continuous trend and episodic event data 25, 26. Data analysis software was used to determine artefact-free recording time and to calculate the mean, standard deviation, median, and 5th and 10th centiles Sp,O2, as well as the number of desaturation events of ≥4% Sp,O2. Recordings with artefact-free recording time <5 h were excluded. The nadir Sp,O2, the number of desaturation events to ≤92% and to ≤90% Sp,O2, as well as desaturation event clusters were manually determined using information on signal quality, low perfusion and pulse waveform. Desaturation event clusters were defined as ≥5 desaturation events of ≥4% Sp,O2 occurring within a 30-min period 27. In addition, the average distance from the optimum of 100% Sp,O2 and a cumulative hypoxaemia score were calculated for each recording 26. Desaturation indices, defined as events per hour of artefact-free recording, were calculated for desaturation events of ≥4% Sp,O2 (DI4), desaturation events to ≤92% (DI92) and to ≤90% Sp,O2 (DI90) as well as desaturation event clusters (DIC). Based on 100 recordings obtained between February and July 2001, the 95th centile for DI4 and DIC was calculated and found to be 3.9 and 0.4, respectively. Children were screened positive if they: 1) had ≥3 desaturation events to ≤90% Sp,O2 and ≥3 desaturation event clusters (HPO criterion 1 27); 2) had the DI90 >0.6 (HPO criterion 2 31); or 3) had the DI4 >3.9 and the DIC >0.4 (HPO criterion 3 25). To assess clinical factors that possibly influence oximetry results or result in sleep-related hypoxia, a customised questionnaire (i.e. HPO-Q) was developed and distributed together with the oximetry device 25, 32. 
The questionnaire included items on the presence of heart disease, chronic lung disease, physician-diagnosed allergy/chronic rhinitis, current upper respiratory tract infection, anaemia, preferred sleeping position, bed/wake time and sensor placement. Parents were asked to fill in this questionnaire on the evening of the oximetry recording.
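The desaturation indices and the three oximetry screening criteria can be expressed compactly in code. The sketch below is illustrative only; the function and parameter names are hypothetical, and the thresholds are those stated in the text.

```python
def desaturation_index(n_events: int, artefact_free_hours: float) -> float:
    """Events per hour of artefact-free recording time.

    Recordings with less than 5 h of artefact-free time were excluded,
    so such input is rejected here.
    """
    if artefact_free_hours < 5:
        raise ValueError("recording excluded: artefact-free time < 5 h")
    return n_events / artefact_free_hours


def hpo_screen_positive(n_desat_le90: int, n_clusters: int,
                        di90: float, di4: float, dic: float) -> bool:
    """Positive if any of the three HPO criteria is met."""
    criterion_1 = n_desat_le90 >= 3 and n_clusters >= 3
    criterion_2 = di90 > 0.6
    criterion_3 = di4 > 3.9 and dic > 0.4
    return criterion_1 or criterion_2 or criterion_3
```

For example, 16 desaturation events in 8 h of artefact-free recording correspond to an index of 2.0 events per hour.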
Home polysomnography
Home polysomnography (HPSG) was performed in all screen-positives and in a subgroup of screen-negatives (i.e. control group). To form the control group, all screen-negatives were listed by date of enrolment and every 20th child on that list contacted. For participation in this control group, a ticket for the Hanover Zoo was offered as an incentive. For the HPSG, an ambulatory polygraphic device recorded chest and abdominal wall movements, nasal pressure and linearised nasal airflow estimation, oral airflow, snoring, Sp,O2, pulse rate, pulse waveform, actigraphy, body position, and user events over one single night 33. Recordings were then manually analysed for the corrected estimated sleep time, and mixed and obstructive apnoeas, as well as hypopnoeas based on standard guidelines or published criteria 34. An apnoea was scored if: 1) the amplitude of the nasal airflow fell to ≤20% of the average amplitude of the two preceding breaths; 2) no airflow was detected at the mouth; and 3) the event comprised at least two breath cycles (i.e. ∼6 s for the age group under study). Obstructive apnoeas were scored if criteria for apnoea were fulfilled and out-of-phase movements of the chest and abdomen were present. Mixed apnoeas were defined as apnoeas with central and obstructive components, each of them lasting at least two (not necessarily consecutive) breath cycles. Hypopnoeas were scored if: 1) the amplitude of the nasal airflow fell to ≤50% of the average amplitude of the two preceding breaths; 2) a fall in Sp,O2 by ≥4% occurred within 30 s of the onset of the event; and 3) the event comprised at least two breath cycles. Recordings with a corrected estimated sleep time <4 h were excluded. An apnoea/hypopnoea index (AHI) was calculated, defined as sum of all mixed and obstructive apnoeas and obstructive hypopnoeas per hour of corrected estimated sleep time. OSA was defined as AHI ≥1 to comply with international guidelines 35.
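As a minimal sketch of the AHI definition given above (the function names are hypothetical and the event counts are assumed to come from the manual scoring):

```python
def apnoea_hypopnoea_index(n_mixed_apnoeas: int,
                           n_obstructive_apnoeas: int,
                           n_obstructive_hypopnoeas: int,
                           sleep_time_hours: float) -> float:
    """Events per hour of corrected estimated sleep time.

    Recordings with less than 4 h of corrected estimated sleep time
    were excluded, so such input is rejected here.
    """
    if sleep_time_hours < 4:
        raise ValueError("recording excluded: sleep time < 4 h")
    total = n_mixed_apnoeas + n_obstructive_apnoeas + n_obstructive_hypopnoeas
    return total / sleep_time_hours


def has_osa(ahi: float) -> bool:
    """OSA is defined as AHI >= 1."""
    return ahi >= 1.0
```

With one mixed apnoea, three obstructive apnoeas and four obstructive hypopnoeas in 8 h of sleep, the AHI is exactly 1.0, which meets the definition of OSA.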
Statistical analysis
Diagnostic test accuracy
The following parameters were evaluated for their accuracy in predicting OSA on HPSG following re-evaluation of screening results: snore score 23, OSA score 20, SDB score 22, adapted SDB score 28, nadir Sp,O2, DI4, DI90, DI92 and DIC. Accuracy was investigated using nonparametric receiver-operating characteristic (ROC) analysis with area under the ROC curve (AUC) and its 95% confidence interval (95% CI), as well as classical measures of accuracy like sensitivity, specificity, and positive and negative likelihood ratio. To enable comparability, SDB-Q scores and HPO parameters were dichotomised into “test positive” and “test negative” based on the ROC curve. Cut-off values for dichotomisation were set to achieve 0.8 specificity. For the questionnaire scores, missing answers were handled in four different ways: 1) missing answers were scored as 0 (never; this was the primary analysis and in accordance to the screening procedure); 2) missing answers were scored as the item-specific sample mean; 3) missing answers were scored as the maximal item-specific response category (mostly 4 for almost always); and 4) missing answers led to exclusion of individuals. Measures of accuracy were then calculated for all four data sets.
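The classical accuracy measures and the choice of a cut-off that reaches a target specificity can be illustrated as follows. This is a sketch under stated assumptions: the function names are hypothetical, higher scores are assumed to indicate a higher likelihood of OSA, and a score at or above the cut-off is assumed to count as test positive.

```python
from typing import Dict, Optional, Sequence


def accuracy_measures(tp: int, fp: int, fn: int, tn: int) -> Dict[str, float]:
    """Sensitivity, specificity and likelihood ratios from a 2x2 table."""
    sens = tp / (tp + fn)
    spec = tn / (tn + fp)
    lr_pos = sens / (1 - spec) if spec < 1 else float("inf")
    lr_neg = (1 - sens) / spec if spec > 0 else float("inf")
    return {"sensitivity": sens, "specificity": spec,
            "lr_pos": lr_pos, "lr_neg": lr_neg}


def cutoff_for_specificity(scores: Sequence[float],
                           has_osa: Sequence[bool],
                           target: float = 0.8) -> Optional[float]:
    """Lowest observed cut-off whose specificity reaches the target.

    Returns None if no observed value achieves the target specificity.
    """
    negatives = [s for s, d in zip(scores, has_osa) if not d]
    for cutoff in sorted(set(scores)):
        # A non-OSA child is correctly classified if the score is below the cut-off.
        spec = sum(s < cutoff for s in negatives) / len(negatives)
        if spec >= target:
            return cutoff
    return None
```

Choosing the lowest cut-off that satisfies the specificity requirement keeps sensitivity as high as the constraint allows.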
OSA prediction model
Using the subset of children who had undergone HPSG, a prediction model for OSA was elaborated using an explorative data analysis and subsequently applied to those children who were not evaluated with HPSG. To this end, children with OSA were compared with children without OSA using Pearson's Chi-squared test for categorical variables and the Mann–Whitney U-test for continuous variables. In total, 34 factors from the SDB-Q (including age, sex and SDB-Q scores), four factors from the HPO questionnaire, and 25 factors from the HPO were evaluated. Differences in distributions/ranks with a p-value <0.1 were identified. With the exception of SDB-Q scores, identified SDB-Q factors were then dichotomised into several binary dummy variables using different cut-offs. For example, a questionnaire item with three response categories (e.g. never, occasionally, frequently) was dichotomised into the dummy variables “never versus occasionally/frequently” and “never/occasionally versus frequently”. Identified HPO factors were dichotomised into dummy variables using published cut-off or reference values 20, 25, 30. Replacing categorical variables by binary dummies aimed to reduce the number of parameters in the regression model, which in turn enhanced statistical power. Finally, multiple Pearson's Chi-squared tests were performed on each factor to identify the dummy variable with the lowest p-value. Multiple binary logistic regression analysis was used to construct the prediction model 36. All SDB-Q scores and binary dummy variables selected from the explorative data analysis were potentially eligible for inclusion. To enable a complete data set, missing values within each dummy variable were replaced by the same value to form a distinct “missing” category. Variables were added to the model using the conditional step-wise forward selection method. A p-value of 0.2 was the criterion for including or excluding a variable.
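The dichotomisation of an ordered questionnaire item into all possible binary dummy variables can be sketched as follows. The function name and the string label "missing" are hypothetical choices made for illustration; the subsequent Chi-squared comparison of the dummies is not shown.

```python
from typing import Dict, List, Optional, Sequence, Union


def dichotomise(values: Sequence[Optional[str]],
                categories: Sequence[str]) -> Dict[str, List[Union[int, str]]]:
    """Build one dummy variable per cut between adjacent ordered categories.

    Each dummy is 0 below the cut and 1 at or above it. Missing answers
    (None) are kept as a distinct "missing" category.
    """
    rank = {c: i for i, c in enumerate(categories)}
    dummies: Dict[str, List[Union[int, str]]] = {}
    for cut in range(1, len(categories)):
        label = ("/".join(categories[:cut]) + " versus "
                 + "/".join(categories[cut:]))
        dummies[label] = ["missing" if v is None else int(rank[v] >= cut)
                          for v in values]
    return dummies
```

For an item with the categories never, occasionally and frequently, this yields exactly the two dummy variables named in the text.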
OSA population prevalence
After establishing the prediction model, probability values for OSA (range: 0–1) were calculated for all children using the logistic function 36. Probability values were compared between children with and without OSA using ROC curves, and AUC and its 95% CI. Using the ROC curve, a cut-off for the probability values was searched that allowed prediction of OSA on HPSG with at least 0.95 specificity. Based on the probability values and the above-mentioned cut-off value, OSA was predicted in children who had not undergone HPSG. “Predicted” OSA cases were added to the HPSG-defined OSA cases and the population prevalence of OSA estimated. To account for the complex sampling strategy and varying response proportion, stratum- and cluster-specific sampling weights were used to adjust the point estimate and the 95% CI for the population prevalence 37.
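How a fitted logistic model turns predictor values into a probability and a predicted case can be sketched as below. The coefficient values in the usage note are purely hypothetical placeholders and are not the estimates of this study; treating a probability at or above the cut-off as positive is also an assumption.

```python
import math
from typing import Dict


def osa_probability(intercept: float,
                    coefficients: Dict[str, float],
                    predictors: Dict[str, float]) -> float:
    """Logistic function applied to the linear predictor."""
    z = intercept + sum(b * predictors[name]
                        for name, b in coefficients.items())
    return 1.0 / (1.0 + math.exp(-z))


def predicted_osa(probability: float, cutoff: float) -> bool:
    """Classify as a predicted OSA case (assumed rule: probability >= cut-off)."""
    return probability >= cutoff
```

For instance, with a hypothetical intercept of -3.0 and a single hypothetical coefficient of 1.5 for a binary predictor, a child with the predictor present would receive a probability of about 0.18.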
Analysis software and algorithms
Recoding and creation of variables, descriptive statistics, group-wise comparisons, logistic regression analyses, and creation of ROC curves were performed using SPSS 15.0 (SPSS, Inc., Chicago, IL, USA). Nonparametric ROC analysis (i.e. AUC, its standard error and 95% CI) was performed using Stata 9.2 (Stata Corp., College Station, TX, USA). AUC was computed using the trapezoidal rule; the standard error for AUC was computed using the algorithm described by DeLong et al. 38; the 95% CI for AUC was determined using the bootstrap t approach with 1,000 replications 39. The design-adjusted point estimate for the population prevalence of OSA and its 95% CI were calculated using the complex survey module of Stata 9.2. No adjustment for multiple testing was performed.
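For readers unfamiliar with the nonparametric AUC, the trapezoidal rule can be illustrated in a few lines of Python. This sketch is not the Stata implementation used in the study; the function names are hypothetical, and higher values are assumed to indicate disease.

```python
from typing import List, Sequence, Tuple


def roc_points(scores: Sequence[float],
               diseased: Sequence[bool]) -> List[Tuple[float, float]]:
    """Empirical ROC points (1 - specificity, sensitivity), from (0, 0) to (1, 1)."""
    n_pos = sum(1 for d in diseased if d)
    n_neg = len(diseased) - n_pos
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, d in zip(scores, diseased) if d and s >= threshold)
        fp = sum(1 for s, d in zip(scores, diseased) if not d and s >= threshold)
        points.append((fp / n_neg, tp / n_pos))
    return points


def auc_trapezoid(points: Sequence[Tuple[float, float]]) -> float:
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))
```

An AUC of 0.5 corresponds to a test that does not discriminate, and 1.0 to perfect separation of the two groups.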
RESULTS
Screening results
Basic characteristics of the study sample and study subgroups are presented in table 1; screening results are given in figure 2. The SDB-Q was successfully obtained in all children. The amount of missing SDB-Q data ranged from 1.0 to 27.1%. A detailed description of missing SDB-Q data is given in the Appendix. In total, 114 children snored habitually, 37 had an OSA score >0, and 45 children had an adapted SDB score ≥24. Thus, 125 children were selected for HPSG based on SDB-Q results.
Acceptable HPO recordings were obtained in 995 children. Based on the pre-defined screening criteria, 24, 10 and 35 recordings fulfilled HPO criterion 1, 2 and 3, respectively. In addition, six children had typical recurrent desaturation clusters in their oximetry recording, but did not meet our pre-defined screening criteria. As these recordings were clinically suggestive for OSA, we also included these children in the HPSG follow-up. Thus, 51 children were selected for HPSG based on HPO results. Finally, 169 children (14.4% of the total study sample) met at least one out of six screening criteria or were suspected to have OSA based on their HPO recording.
Polysomnographic results
Of 169 screen-positives, 13 families could not be contacted by either phone or mail and eight families declined participation in a sleep study. Hence, 148 sleep studies were performed. Of these, 132 recordings comprised at least 4 h of corrected estimated sleep time. Children who successfully underwent HPSG were not systematically different from those eligible concerning demographic variables like age, sex and maternal education (data not shown). There was a mean (minimum–maximum) time gap between screening with the SDB-Q and performing the HPSG of 32 weeks (4–77). Of 132 children successfully evaluated by HPSG, 20 had an AHI ≥1 and were diagnosed to suffer from OSA.
Of 975 screen-negatives, 65 children were approached and 11 children or their parents declined participation. Demographic variables (age, sex, maternal education) did not differ between participants and nonparticipants (data not shown). Of 54 recordings performed, 48 comprised at least 4 h of corrected estimated sleep time and were, thus, considered acceptable for analysis. Two of the remaining recordings could be successfully repeated (one had to be repeated twice), while four children denied further participation. In one child, who had originally screened positive and underwent HPSG, re-evaluation of screening results revealed that the screening had in fact been negative. This child was assigned post hoc to the control group, thereby leading to a final sample of 51 children. Mean (minimum–maximum) time gap between screening with the SDB-Q and the performance of HPSG was 39 weeks (range 10–87). Of 51 children successfully evaluated by HPSG, two had an AHI ≥1 and were diagnosed to suffer from OSA.
Follow-up
Parents of the 22 children with OSA were informed about the HPSG result and encouraged to visit their otorhinolaryngologist for further evaluation. Six parents refused any treatment and further evaluation, four children were lost to follow-up, five children had their AHI <1 at follow-up (weight loss was recommended in two cases), and some type of surgical intervention was performed in five children.
Diagnostic test accuracy
Measures of accuracy for screening criteria used in this study are given in table 2. Measures of accuracy for SDB-Q scores and HPO parameters are given in table 3. ROC curves for SDB-Q scores are given in figure 3 and ROC curves for HPO parameters are given in figure 4. In general, sensitivity for screening criteria was low (<0.6), while specificity was moderately high (mostly >0.7; table 2). Regarding other potential screening methods, AUC for SDB-Q scores was lower throughout compared with HPO parameters (table 3). According to the prerequisite of at least 0.8 specificity, sensitivity ranged from 0.27 (snore score) to 0.65 (DI90 and nadir Sp,O2).
There were only minor variations in the SDB-Q scores with the different data imputation methods used. If missing answers were scored as the item-specific sample mean, AUC (95% CI) was 0.55 (0.39–0.70), 0.61 (0.44–0.75), 0.58 (0.43–0.72) and 0.56 (0.42–0.69), respectively, for the snore score, OSA score, SDB score and adapted SDB score. For the data set where missing answers were scored as the maximal item-specific response category, AUC (95% CI) values were 0.54 (0.38–0.69), 0.57 (0.40–0.71), 0.55 (0.41–0.68) and 0.53 (0.38–0.66), respectively, for the four scores. For the data set where questionnaires containing missing answers were excluded, corresponding AUC (95% CI) values were 0.54 (0.36–0.70), 0.60 (0.44–0.74), 0.59 (0.42–0.74), and 0.52 (0.37–0.68), respectively.
OSA prediction model
Of 63 factors investigated, four from the SDB-Q, one from the HPO questionnaire, and seven from HPO were significantly differently distributed between children with and without OSA (table 4). Stepwise forward logistic regression analysis performed seven steps and included the BMI, history of allergy, OSA score, DI90, and HPO criterion 1 (table 5). Goodness-of-fit (Nagelkerke R²) significantly improved from step 1 (R² = 0.133) to step 7 (R² = 0.383). Median (minimum–maximum) probability of OSA delivered by the prediction model was 0.033 (0.001–0.699) for the non-OSA group and 0.331 (0.008–0.938) for the OSA group. AUC (95% CI) was 0.86 (0.71–0.94) and hence higher compared with all SDB-Q scores and HPO parameters. According to the prerequisite of at least 0.95 specificity, the cut-off value for the probability values was set at 0.291. This yielded a sensitivity of 0.59, a specificity of 0.95, a positive likelihood ratio of 11.89, and a negative likelihood ratio of 0.43 in predicting OSA on HPSG.
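The likelihood ratios follow from sensitivity and specificity by the standard definitions. Inserting the rounded values reported above gives nearly the same figures; the small difference from the reported positive likelihood ratio of 11.89 is presumably due to rounding of sensitivity and specificity (an assumption, as unrounded values are not given here).

```latex
\[
\mathrm{LR}^{+} = \frac{\text{sensitivity}}{1 - \text{specificity}}
  \approx \frac{0.59}{1 - 0.95} = 11.8,
\qquad
\mathrm{LR}^{-} = \frac{1 - \text{sensitivity}}{\text{specificity}}
  \approx \frac{1 - 0.59}{0.95} \approx 0.43
\]
```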
OSA population prevalence
Applying 0.291 as cut-off to the probability values of all non-HPSG-validated children, nine additional OSA cases were predicted (four in screening-negatives, five in screening-positives; fig. 2). Adding these predicted cases to the 22 HPSG-validated cases resulted in a total number of 31 children suspected to suffer from OSA. The stratum-specific point estimates (95% CI) for the prevalence of OSA were 1.8% (0.6–3.1) for SES stratum 1, 2.8% (0.6–4.9) for SES stratum 2, and 3.9% (0.2–7.6) for SES stratum 3 (fig. 1). This yielded a design-adjusted point estimate (95% CI) for the population prevalence of OSA of 2.8% (1.5–4.1). Although not statistically significant, the risk of having OSA was higher in SES stratum 2 (odds ratio (95% CI): 1.5 (0.6–4.0)) and SES stratum 3 (2.2 (0.7–6.5)) compared with SES stratum 1, suggesting a dose-effect gradient.
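The difference between a crude and a weight-adjusted prevalence can be illustrated with a short sketch. The function names are hypothetical and the weights in the test data are invented for illustration; the stratum- and cluster-specific sampling weights actually used are not reported in this text, and the design-based confidence interval is not reproduced here.

```python
from typing import Sequence


def crude_prevalence(n_cases: int, n_total: int) -> float:
    """Unweighted proportion of cases in the sample."""
    return n_cases / n_total


def weighted_prevalence(is_case: Sequence[bool],
                        weights: Sequence[float]) -> float:
    """Weighted proportion: sum of the weights of cases over the sum of all weights."""
    case_weight = sum(w for c, w in zip(is_case, weights) if c)
    return case_weight / sum(weights)


# Crude estimate from the counts given in the text: 31 cases among 1,144 children.
crude_percent = round(100 * crude_prevalence(31, 1144), 1)
```

The crude value of about 2.7% is close to the design-adjusted estimate of 2.8%, which additionally accounts for unequal selection probabilities and response proportions.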
DISCUSSION
We found a relatively high population prevalence of OSA in our urban community of primary school children. If this is true for the total population of primary school children in Germany, OSA is one of the most frequent chronic respiratory diseases in childhood. Asthma, another chronic respiratory disease, was found to have a 12-month prevalence of 3% in the German Health Interview and Examination Survey for Children and Adolescents 40. In our school enrolment cohort from 1998, which was the sampling frame for the present study in 2001, 3.9% of children were reported to suffer from doctor-diagnosed asthma 19. These data suggest that, at least in school children, OSA is as prevalent as asthma. Given its potentially life-long consequences 41, OSA may require more attention from paediatric public health services, clinicians and researchers than currently provided.
Several methodological features probably enabled us to obtain a highly accurate estimate for the population prevalence of OSA, as follows: 1) compared with other studies, we achieved a high response proportion (65%) 19; 2) our study sample was representative of the target population 19; 3) we used a two-stage clinical screening procedure including an objective test for OSA; 4) a prediction model was used to detect individuals who had not been validated by HPSG but probably suffered from OSA; and 5) the estimate for the population prevalence of OSA was adjusted for design aspects like sampling strategy, response proportion, and clustering of individuals within schools.
In contrast to our study, four studies applied HPSG to the total sample and would have been able to yield accurate prevalence estimates 6, 10, 13, 14, 16. These studies, however, suffered from low response, lack of representativeness, and/or the use of adult criteria for diagnosing OSA. The study by Redline et al. 6 resulted in a prevalence estimate (10.3%) that was much higher than the current one. Surprisingly, this was achieved despite using an AHI ≥5 for defining HPSG-based OSA, a relatively high cut-off that is predominantly used in adults; a lower, more appropriate paediatric cut-off value would have increased their point estimate even further. Compared with our sample, their children were more obese (mean BMI, 18.5 versus 17.5 kg·m⁻²), were more likely to be of African-American ethnicity (19.1 versus <1%) and more often had doctor-diagnosed asthma (13.5 versus 4.9%). All these factors are suspected to be risk factors for OSA. Increasing the prevalence of risk factors also increases the prevalence of the disease in a population. This may explain at least partly the difference in the estimates between the study by Redline et al. 6 and the current study. In summary, there are indications to suggest that their sample was not representative of the healthy population.
Sanchez-Armengol et al. 10 investigated 101 adolescents with HPSG. However, the authors used adult instead of paediatric criteria for diagnosing OSA, the response proportion was only 31%, and the prevalence of OSA was surprisingly high (17.8%). It remains questionable whether the recruited sample was representative of healthy adolescents and whether OSA was appropriately defined.
The Tucson Children's Assessment of Sleep Apnoea study reported estimates for the prevalence of OSA in 2003 and 2005 13, 16. However, the study suffered from a low response proportion, and the reported high prevalence of OSA (31.4 and 24.0%, respectively) questions the representativeness of their sample and/or their diagnostic criteria for OSA. It is unlikely that this study provided valid estimates for the population prevalence of OSA in childhood.
A further study was performed in 2003 by Rosen et al. 14. A population-based cohort of 850 children was studied and OSA defined as AHI ≥5 or OAI ≥1. The population prevalence was derived from cohort-specific estimates with birthweights from US live births data. Using these methods, OSA was detected in 4.7% of participants and the adjusted population prevalence of OSA was estimated to be 2.2% (95% CI 1.2–3.2). The authors came up with an estimate very close to the current one and with a confidence interval that includes our point estimate. Due to the methods used, their study likely provides a largely unbiased estimate for the population prevalence of OSA in US children.
Apart from the above-mentioned studies, most prevalence studies used questionnaires 5, 7, 8, 11, 12, 15–18 and only one study used pulse oximetry for screening purposes 9. However, none of these studies included screen-negatives for gold standard evaluation. Consequently, estimation of the accuracy of screening tests used in these studies was not possible. As we used six different screening criteria and included screen-negatives for HPSG evaluation, we were able to estimate the accuracy of our screening criteria. In general, sensitivity was low (<0.6) and specificity high (>0.7) for both SDB-Q and HPO criteria. However, after a detailed investigation of screening methods and analysis of continuous test results, it turned out that AUC was generally higher for HPO parameters (mostly >0.7) compared with SDB-Q scores (mostly <0.6). This has several implications: 1) OSA prevalence studies using only questionnaires are likely to underestimate the true population prevalence; 2) in contrast to previous studies on the diagnostic test accuracy of HPO 27, sensitivity may be enhanced by using other than the published criteria 27; and 3) HPO may be used as a screening test for OSA.
Regarding the SDB-Q, we faced several problems. This questionnaire was mainly based on a questionnaire from another epidemiological study in primary school children 22; however, accuracy in a community-based study was unclear. In fact, the questionnaire was not used in its original form, and was modified as follows: 1) three items were adapted to enable calculation of Brouillette's OSA score 20; 2) six items were taken from a German questionnaire on OSA in toddlers and young children 28; and 3) five items on sleep problems were newly developed 42. Of the three SDB-Q-based screening criteria (i.e. habitual snoring, OSA score ≥0, adapted SDB score ≥24), only the OSA score had been validated 20. Initially, we were concerned about the low specificity (and, thus, many false positives) of the OSA score. To cope with this problem, we increased the cut-off value for a positive test result from -1 to 0. Conversely, we were also concerned about a low sensitivity when using the OSA score as the only screening criterion. We therefore decided to establish a second SDB-Q score (i.e. the adapted SDB score) and to evaluate all habitually snoring children with HPSG.
Sensitivity was also a matter of concern with HPO. Using the criteria suggested by Brouillette et al. 27, HPO had a sensitivity of only 0.43 in one study. The accuracy of pulse oximetry in a community setting was, in analogy to the SDB-Q, unknown. To enhance its sensitivity, we added two more screening criteria: 1) DI90 >0.6 (criterion 2 31) and 2) DI4 >3.9 and DIC >0.4 (criterion 3 25). The latter criterion, however, was introduced during the study, as reference values from a healthy subgroup finally became available 25. There were two reasons why we used HPO as a screening method. First, an objective screening test was needed, because accuracy of subjective parental observations (and reporting via the SDB-Q) of a child's breathing during sleep may depend on demographic (e.g. single-parent family), socioeconomic (e.g. number of rooms in the household) and ethnic factors (e.g. perception of sleep-related symptoms may differ between ethnic groups). Relying on only parental perception therefore most probably decreased the sensitivity of our screening procedure. Hence, an objective, easily applicable, and low-cost screening test was considered mandatory. Secondly, we were also interested in intermittent hypoxia as an intervening factor in the relationship between SDB and several outcomes, such as impaired behaviour 43 and academic achievement 24. Intermittent hypoxia is thought to cause prefrontal cortical dysfunction leading to impaired cognitive execution 44. To clarify the role of intermittent hypoxia, we decided to include a screening method that also allows the assessment of night-time intermittent hypoxia.
In our study, a prediction model was used to estimate the population prevalence of OSA. Variables for the model were selected and weighted using effect estimates from logistic regression. Using the model, probability values for OSA were calculated in children who were not investigated by HPSG, and children were classified as “predicted” OSA cases on that basis. Predicted and validated cases were added to obtain the best estimate for the population prevalence of OSA. To our knowledge, this is the first prediction model for paediatric OSA that is based on two different screening tests and is constructed from data of a community-based sample. Prediction models are often used in adults to: 1) exclude a diagnosis of OSA when the probability is low so that no further testing is required; 2) establish an a priori probability before considering the use of a diagnostic method other than polysomnography; and 3) prioritise patients needing polysomnography according to the probability that they will have a positive result 45. Four prediction models for paediatric SDB have been published 46–49. They combined several data sources (i.e. clinical history, anthropometry and radiography) and types of modelling. Silvestri et al. 46 published a prediction model showing 0.81 accuracy. However, they did not present other measures of accuracy and did not publish raw data to allow calculation of these measures. The discriminant analysis classification system by Shouldice et al. 47 showed 0.86 sensitivity and 0.82 specificity. However, the test set was small and results were not prospectively confirmed in a larger group of children. Xu et al. 48 demonstrated that radiological features of upper airway narrowing due to adenotonsillar hyperplasia were predictors for clinically relevant OSA. A combination of six predictors had a sensitivity and specificity of 0.94 and 0.42, respectively. Finally, Bitar et al. 49 presented a clinical score for obstructing adenoids. Polysomnography, however, was not performed and diagnostic accuracy for OSA was not determined.
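As a rough illustration of how predicted and validated cases combine into a prevalence estimate, the following sketch computes a crude proportion with a Wald confidence interval. It uses the counts reported in the abstract (22 validated cases, nine predicted cases, 1,144 screened children) and assumes, for illustration only, that all screened children form the denominator. It ignores the multi-stage sampling design, so it is not the design-adjusted calculation used in the study, and its interval is expected to be narrower than the reported one. The function name `crude_prevalence` is hypothetical.

```python
import math


def crude_prevalence(validated, predicted, n, z=1.96):
    """Crude (not design-adjusted) prevalence with a Wald confidence interval.

    validated: cases confirmed by the reference test
    predicted: additional cases assigned by the prediction model
    n:         denominator (assumed here: all screened children)
    """
    cases = validated + predicted
    p = cases / n
    se = math.sqrt(p * (1 - p) / n)  # simple random sampling assumed
    return p, max(0.0, p - z * se), min(1.0, p + z * se)


# Counts taken from the abstract; the denominator is an assumption.
p, lower, upper = crude_prevalence(validated=22, predicted=9, n=1144)
print(f"crude prevalence: {p:.1%} ({lower:.1%} to {upper:.1%})")
```

The crude point estimate is close to the design-adjusted figure, whereas the interval differs because clustering within schools is not accounted for in this simplified calculation.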
In contrast, the current prediction model has several advantages. First, the model is based on parameters that can be easily obtained by filling in a questionnaire, measuring height and weight, and performing an overnight oximetry recording. In our study, the required data were successfully obtained in schools. Hence, it seems possible to use this model for large-scale screening programmes as well as for primary care settings. Secondly, sensitivity and specificity can be “adjusted”. As the model delivers probability values (i.e. a continuous test result), cut-off values for a positive screening result may be adjusted according to the type of application. If necessary, sensitivity (or specificity) may be enhanced. This, however, would be at the expense of the specificity (or sensitivity). For our estimation of the population prevalence of OSA, we adjusted the cut-off to gain a specificity of >0.95 in order to decrease the false positive fraction. In other settings (e.g. screening studies with a second test or a gold standard evaluation), it could be more advisable to increase sensitivity and lower the false negative fraction. Thirdly, compared with each single SDB-Q score and HPO parameter (AUC≤0.75), accuracy of the model was superior (AUC = 0.86). It is inherent that a combination of diagnostic criteria shows higher accuracy than each single criterion. However, further studies are needed before prediction models similar to the current one may be used in clinical settings.
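The trade-off between sensitivity and specificity when moving the cut-off can be sketched as follows. This is a minimal example under stated assumptions: the probabilities and disease labels are invented toy data, not study data, and the helper names `sens_spec` and `lowest_cutoff_for_specificity` are hypothetical. A child is called screen-positive when the predicted probability is at or above the cut-off.

```python
def sens_spec(probs, labels, cutoff):
    """Sensitivity and specificity when 'prob >= cutoff' counts as positive."""
    pairs = list(zip(probs, labels))
    tp = sum(1 for p, y in pairs if y == 1 and p >= cutoff)
    fn = sum(1 for p, y in pairs if y == 1 and p < cutoff)
    tn = sum(1 for p, y in pairs if y == 0 and p < cutoff)
    fp = sum(1 for p, y in pairs if y == 0 and p >= cutoff)
    return tp / (tp + fn), tn / (tn + fp)


def lowest_cutoff_for_specificity(probs, labels, target):
    """Smallest observed cut-off whose specificity exceeds `target`.

    Specificity never decreases as the cut-off rises, so the first
    qualifying cut-off also retains the most sensitivity.
    Returns None if no observed cut-off reaches the target.
    """
    for cutoff in sorted(set(probs)):
        sens, spec = sens_spec(probs, labels, cutoff)
        if spec > target:
            return cutoff, sens, spec
    return None


# Toy data (hypothetical): model probabilities and reference-test results.
probs = [0.05, 0.10, 0.20, 0.30, 0.40, 0.60, 0.70, 0.90]
labels = [0, 0, 0, 0, 1, 0, 1, 1]

# High-specificity setting, as used for the prevalence estimate.
print(lowest_cutoff_for_specificity(probs, labels, target=0.95))
# More lenient setting that favours sensitivity.
print(lowest_cutoff_for_specificity(probs, labels, target=0.75))
```

In this toy data set, demanding a specificity above 0.95 pushes the cut-off up and loses one of three cases, whereas a target of 0.75 keeps all cases at the cost of one false positive.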
Limitations
Limitations of the current study have been discussed elsewhere 19, 24. Briefly, there might be a selection bias if subjects with symptoms were more likely to agree to participate. This would cause an overestimation of prevalence. The sample was drawn from an urban community of third graders. As the geographical variation in the prevalence of OSA is unclear, results can be extrapolated to suburban or rural communities only cautiously. Selected individuals for this study were third graders with an age range of 7–12 yrs. This is not the age span where the prevalence of OSA is thought to have its maximum. As OSA is mostly caused by adenotonsillar hyperplasia in children, and the quotient between pharyngeal diameter and adenotonsillar tissue size has its minimum in the first years of life, the age span of 3–5 yrs is suggested to have the highest prevalence. We were, however, interested in the relationship between SDB and academic achievement, which is not assessed until the third grade.
Adenotonsillectomy is the accepted first line treatment for OSA in children. Hence, the frequency of this procedure performed in a population may affect the population prevalence of OSA. In our sample, the frequency of adenotonsillectomy was 3.9%. In other populations with higher or lower rates of this procedure, the prevalence may differ substantially. However, adenoidectomy was a risk (and not a preventive) factor for habitual snoring in one study 50 and adenotonsillectomy did not decrease the risk for habitual snoring in another study 51. Moreover, adenotonsillectomy was found to be ineffective in 50% of cases on 1-yr follow-up 52, and, in the present study, neither adenoidectomy nor tonsillectomy was a preventive factor for OSA. Hence, it remains speculative if high rates of adenoidectomy and/or tonsillectomy would substantially reduce the prevalence of OSA in a population.
Some screening criteria were introduced during the study to enhance sensitivity. Consequently, some children were screened positive in 2001 and not evaluated with HPSG until the end of 2002. OSA is thought to be a rather stable disease, but the precise variation in its expression and severity is unknown. It is possible that some children who were screened positive and had OSA in 2001 were not suffering from OSA anymore when the diagnostic procedure was done in 2002. This would have led to disease misclassification and may have biased both the prevalence estimate and the estimate of accuracy.
For a final diagnosis of OSA, we used abbreviated HPSG that did not include electroencephalography, -oculography and -myography. In 2000, no device for full HPSG was commercially available and we defined OSA on the basis of the AHI without need for arousal determination or sleep staging. One concern with abbreviated HPSG is the possible loss of diagnostic accuracy because sleep cannot be distinguished from wakefulness. This is based on the assumption that if detection of rapid-eye-movement sleep (when OSA is usually present or most severe) is not possible, OSA cannot be reliably ruled out. Three validation studies on abbreviated HPSG showed conflicting results 53–55. However, as previously discussed by Morielli et al. 56 and Jacob et al. 54, rapid-eye-movement sleep is invariably present in an all-night recording, even though it may not be possible to determine which specific epochs are included. Meanwhile, abbreviated HPSG has been used in a series of other community-based studies 6, 10, 12, 14, possibly because full HPSG suffers from significant artefacts in the electroencephalographic and myographic channels 57. The convincing advantages of abbreviated HPSG are convenience for both parents and children, and cost-effectiveness 54, 58. Moreover, the omission of sensors and leads attached to the child's face and head should help to improve sleep quality and establish a regular sleep profile in the night of recording. In summary, there is no evidence that abbreviated HPSG is not a valid and reliable diagnostic test procedure for OSA in children.
HPSG was performed in only 16% of all participants. For a prevalence study, it is surely desirable that the diagnostic test is applied to the total population or to all individuals of a representative sample drawn from that population. Performing HPSG in hundreds of children, however, is very cost-intensive and this may be the reason that there are only four prevalence studies where HPSG was performed in the entire sample ranging from 101 to 850 individuals 6, 10, 13, 14. Most researchers used some kind of screening procedure to identify at-risk individuals for further diagnostic evaluation 4, 5, 7–9, 11, 12, 15–18. If the screening procedure is sufficiently sensitive, this approach is obviously more cost-effective and reduces the burden of diagnostic procedures for low-risk individuals without introducing bias and underestimating the true sample and population prevalence. In the present study, we used six different screening criteria and a prediction model to reach a high level of sensitivity. We, hence, believe that performance of HPSG in the total sample is unlikely to have led to a significantly higher prevalence estimate than reported here.
Measures of accuracy were prone to verification bias, which occurs if not all screen-positives and only a small fraction of screen-negatives undergo gold standard evaluation 59. When screen-positives are more likely to be verified for disease than screen-negatives, the bias in naïve estimates of accuracy is always to increase sensitivity and to decrease specificity from their true values. In our study, not all screen-positives and roughly 5% of screen-negatives underwent HPSG. Hence, the estimates of accuracy should be interpreted cautiously.
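The direction of verification bias can be shown with a small worked example. The counts below describe a hypothetical population (not study data) in which the screening test has a true sensitivity of 0.60 and a true specificity of 0.80. For simplicity, all screen-positives and 5% of screen-negatives are assumed to be verified. The second step applies inverse-probability weighting, a standard correction that is valid only if verification depends on the screening result alone; it is shown for illustration and is not described as having been applied in the study. All function names are hypothetical.

```python
def accuracy(tp, fp, fn, tn):
    """Sensitivity and specificity from a 2x2 table."""
    return tp / (tp + fn), tn / (tn + fp)


def verified_counts(tp, fp, fn, tn, f_pos, f_neg):
    """Expected 2x2 counts among verified children only.

    f_pos / f_neg: fraction of screen-positives / screen-negatives
    who undergo the reference test.
    """
    return tp * f_pos, fp * f_pos, fn * f_neg, tn * f_neg


def reweighted_counts(v_tp, v_fp, v_fn, v_tn, f_pos, f_neg):
    """Inverse-probability weighting: scale each verified count back up."""
    return v_tp / f_pos, v_fp / f_pos, v_fn / f_neg, v_tn / f_neg


# Hypothetical full population: 1000 children, 50 with disease.
full = dict(tp=30, fp=190, fn=20, tn=760)
f_pos, f_neg = 1.0, 0.05  # assumed verification fractions

true_sens, true_spec = accuracy(**full)
verified = verified_counts(**full, f_pos=f_pos, f_neg=f_neg)
naive_sens, naive_spec = accuracy(*verified)
corr_sens, corr_spec = accuracy(*reweighted_counts(*verified, f_pos, f_neg))

print(f"true:      sens={true_sens:.2f} spec={true_spec:.2f}")
print(f"naive:     sens={naive_sens:.2f} spec={naive_spec:.2f}")
print(f"corrected: sens={corr_sens:.2f} spec={corr_spec:.2f}")
```

In this example the naive estimates overstate sensitivity and understate specificity, which is the direction of bias described above.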
The prediction model is based on data from 7- to 12-yr-old children and should hence be applied only to this age group. In children outside this age range, factors other than those identified in this study may be more predictive for OSA, or the same factors may need to be weighted or combined in a different way. This is particularly true for infants and toddlers, where the BMI may not be predictive of SDB 60. We thus warn against the use of our prediction model outside the age range of primary school children. Moreover, the number of validated subjects (n = 183) could be insufficient for a precise estimation of the diagnostic test accuracy of the prediction model. No sample size calculation had been performed and the confidence interval for the AUC was rather wide, ranging from 0.71 to 0.94. Hence, we explicitly do not recommend its clinical use until more validation data are available. However, we believe that its use as an additional “diagnostic” procedure to detect possible OSA cases was reasonable for the current study.
Conclusions
The population prevalence of OSA in German primary school children is likely to be 2–3%. Hence, OSA may be one of the most frequent chronic respiratory diseases in this age group. There are clinical symptoms and oximetry findings that may be helpful to detect OSA in this age group. These symptoms and signs, or a combination of both in a prediction model, may be used for screening purposes. Such a model may also be used in future studies on the population prevalence of OSA in other settings.
Acknowledgments
The authors would like to thank C. Ehrhardt (Dept of Public Health, City Council, Hanover, Germany), P. Martinsen (Supervisory School Authority, Hanover), J. Hegemann (District Government, Hanover) and the headmasters and teachers of the participating schools for their support and cooperation. Our thanks also go to R. Downes (Getemed AG, Teltow, Germany) and Volker von Einem (Dept of Biomedical Engineering, Hanover Medical School, Hanover) for technical assistance, E. Eggebrecht, A. Guenther and J. Wolff (Dept of Paediatric Pulmonology and Neonatology, Hanover Medical School, Hanover) for obtaining oximetric recordings, and D. Veer, A. Noehren and S. Eitner (Dept of Paediatric Pulmonology and Neonatology, Hanover Medical School) for obtaining polygraphic recordings. We also thank the Hans Meineke Foundation (Hanover, Germany) for supporting this study and we particularly wish to thank all the children and their parents for their patience and cooperation; they made this study possible.
Footnotes
Support Statement
The study was supported by a research grant from a private, nonprofit organisation (Hans Meineke Foundation, Hanover, Germany). Oximeters were provided by GeTeMed (Teltow, Germany). Oximeter sensors were provided by Masimo (Irvine, CA, USA). Polygraphic devices and sensors were provided by ResMed Germany (Martinsried, Germany). There was no financial support by GeTeMed, Masimo, or ResMed Germany. Study sponsors were not involved in study design, data analyses, or manuscript preparation.
Statement of Interest
None declared.
- Received May 14, 2009.
- Accepted February 12, 2010.
- ©2010 ERS