Abstract
Our aim was to validate optimal action points in written action plans for early detection of asthma exacerbations.
We analysed daily symptoms and morning peak expiratory flows (PEFs) from two previous studies. Potential action points were based on analysis of symptom scores (standard deviations), percentage of personal best PEF, PEF variability in relation to a run-in period, or combinations of these measures. Sensitivity and specificity for predicting exacerbations were obtained for each action point. The numbers needed to treat to prevent one exacerbation and the time interval between reaching action point criteria and the start of the exacerbation were calculated. Based on these parameters, the optimal action points for symptoms, PEF and PEF plus symptoms were determined, and their performance compared with published guidelines’ action points.
The optimal action points were, for symptoms, statistical variability (standard deviations) and, for PEF, <60% of personal best. The combination of PEF plus symptoms performed best, with improved specificity and earlier detection. The main benefit associated with using these action points was a reduction in false positive rates for detecting exacerbations.
Early detection of asthma exacerbations can be improved using a composite action point comprising symptoms and PEF measurements over 1 week.
Exacerbations of asthma are common and, even when asthma is mild, constitute a significant health risk [1]. Assessing future risk of adverse events, including exacerbations, and educating patients to use a self-management plan is recommended [2–6].
Self-management includes developing individualised Written Asthma Action Plans (WAAPs). WAAPs specify the level of symptoms or peak expiratory flow (PEF) (called action points, APs) at which to adjust medication (usually starting oral corticosteroids) in order to either prevent or reduce the severity of exacerbations. To ensure effective intervention, an AP should detect an imminent exacerbation well before its onset.
Gibson et al. [7] and Gibson and Powell [8] have previously validated several APs using quality control analysis (QCA). However, in the Global Initiative for Asthma (GINA) guidelines and the Dutch national guidelines, thresholds for symptoms or PEF are not specified [3, 9]. Although APs in the current British Thoracic Society (BTS) and US National Heart Lung and Blood Institute (NHLBI) guidelines are more specific, these APs have not been validated [2, 5, 6]. The optimum time point at which changes in either symptoms or PEF may be detected, or at which the relevant thresholds are reached prior to an exacerbation, is largely unknown. This lack of validation means that physicians often determine APs for individual patients empirically. If APs are inaccurately selected, this potentially leads to overtreatment (false-positive APs) or missed opportunities for early intervention (false-negative APs).
In this study, our aim was to develop optimal APs based on symptoms and/or PEF threshold levels for early detection of asthma exacerbations that allow timely intervention in patients with mild-to-moderate asthma. Subsequently, we aimed to validate the performance of the optimised APs in a similar but separate study population.
METHODS
We analysed asthma symptoms, morning PEFs, the occurrence of exacerbations and the use of prednisone using data from written daily diaries from two previous studies [10, 11]. The development dataset was obtained from a randomised controlled trial designed to compare the effects of 6 months of treatment with regular inhaled salbutamol, salmeterol or placebo [10]. The validation dataset was obtained from a single-blind placebo-controlled trial that explored the use of exhaled nitric oxide fraction (FeNO) to guide treatment in chronic asthma [11]. The follow-up period in the latter trial was 1 yr.
Subjects
There were 165 patients in the development dataset and 94 in the validation dataset, all with stable mild-to-moderate chronic asthma [10, 11].
Daily diaries
In both studies, daily diary recordings included symptoms of daytime and night-time chest tightness/wheeze/dyspnoea, cough, sputum production, exercise impairment, and either appearance of or increased frequency of nocturnal awakening. All were scored on a 0–3 scale or by a yes/no response where appropriate. The best of three PEF measurements was also recorded each morning and evening. Missing data were interpolated using the mean of the recordings from the previous and following days.
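The interpolation rule for missing diary entries can be illustrated with a minimal Python sketch. The function name and the handling of edge cases are assumptions made for illustration: the text only states that the mean of the previous and following days was used, so gaps longer than one day and gaps at either end of a series are simply left unfilled here.

```python
from typing import List, Optional


def fill_single_gaps(values: List[Optional[float]]) -> List[Optional[float]]:
    """Fill a missing day with the mean of the previous and following days.

    Assumption: only isolated gaps with recorded neighbours on both sides
    are filled; longer gaps and gaps at either end stay as None.
    """
    filled = list(values)
    for i in range(1, len(values) - 1):
        prev_day, next_day = values[i - 1], values[i + 1]
        if values[i] is None and prev_day is not None and next_day is not None:
            filled[i] = (prev_day + next_day) / 2
    return filled


# Hypothetical morning PEF series (L/min) with one isolated and one 2-day gap.
morning_pef = [410.0, None, 390.0, None, None, 400.0]
print(fill_single_gaps(morning_pef))
# [410.0, 400.0, 390.0, None, None, 400.0]
```

Neighbours are read from the original series rather than from already filled values, so one interpolated value is never used to interpolate another.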
Exacerbations
Exacerbations were defined in both studies using a composite daily asthma score. The scoring criteria were similar between the two studies, but differed regarding the use of a β-agonist “reliever” and nocturnal awakening (table 1 in Taylor et al. [10] and table 2 in Smith et al. [11]). In brief, major exacerbations were defined as a visit to the emergency department, a PEF <40% personal best (pb) for ≥1 day, a PEF <60% pb for ≥2 days plus an increase in symptoms, or a PEF <60% pb for ≥1 day and PEF <75% pb for ≥2 days with an increase in symptoms.
During the study, courses of prednisone were administered in response to deteriorating symptoms and/or peak flows, or at the discretion of patients or clinicians independently of diary data. Prednisone use for ≥3 days is widely used as a definition for exacerbations [4]. Therefore, as a sensitivity analysis, we also assessed the predictive utility of APs using this alternative definition.
Figure 1. The use of action points in an 8-week peak flow chart with an exacerbation at the half-way point. The dotted lines indicate the thresholds of potential action points, on the left based on % of personal best (pb) peak expiratory flow (PEF), and on the right based on individual standard deviations for PEF. The observation period is divided into weeks before and during the exacerbation, and weeks of normal control, respectively coded as pre-exacerbation weeks and stable weeks. In this example, we have highlighted the action points PEF <70% pb and PEF < -3 sd. The action point PEF <70% pb is reached twice, once as a false positive in a stable week and once accurately 2 days before the exacerbation in the pre-exacerbation week. The action point PEF < -3 sd is never reached in this example, representing a false negative prediction for the pre-exacerbation week (marked X).
Figure 2. The number of days by which the exacerbation is predicted before its occurrence is plotted against the (potential) number needed to treat (NNT) in order to prevent one exacerbation, for a series of different action points. The lower left corner represents the optimal action point, i.e. early prediction and low NNT. a) Exacerbations are defined using the definition described in the Methods section. b) Exacerbations are defined as “use of oral prednisone”. Action point 1: symptoms (Sy) >2 sd; 2: peak expiratory flow (PEF) <70% personal best (pb); 3: PEF <60% pb; 4: Sy >2 sd + PEF <70% pb; 5: Sy >2 sd + PEF <70% pb within 1 week; 6: National Heart Lung and Blood Institute (NHLBI). Similar results are shown in a) and b), although the differences are larger in a). Action points 3, 4 and 5 perform similarly, with a slight increase in NNT for each day the exacerbation is detected earlier. The optimum depends on the trade-off between NNT and early detection. To allow sufficient time to intervene successfully, we opted for number 5. Action points 1, 2 and 6 perform considerably worse, due to their high NNTs.
Action points
A range of pre-specified APs was evaluated. For symptoms, we assessed APs used in currently recommended WAAPs: the occurrence of nocturnal awakening or the appearance of any symptoms [2, 3, 6, 9]. Additionally, we evaluated APs based on QCA of symptoms using standard deviations from the mean symptom score during run-in for each patient. To this end, we developed a composite daily symptom score (range 0–6), which combined all daily recorded individual symptoms and “reliever” β-agonist use, with higher scores representing more severe symptoms (table S5). The mean score and its standard deviation were determined per patient during the run-in period when asthma was well controlled. Subsequently, occasions characterised by deviation from the mean by more than one, two or three standard deviations were evaluated as potential APs. In patients without any symptoms during the run-in, the mean symptom score and its standard deviation were both 0. In these cases, the one, two and three standard deviation thresholds were set at 0.17, 0.34 and 0.50, respectively, representing the minimal possible changes in composite symptom score.
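The per-patient symptom thresholds described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' STATA code: the function names are hypothetical, and the use of the sample standard deviation (rather than the population form) is an assumption, since the text does not say which was used. The fallback values 0.17, 0.34 and 0.50 are taken from the text.

```python
from statistics import mean, stdev
from typing import Dict, List

# Fallback thresholds stated in the text for patients with no run-in symptoms.
FALLBACK_THRESHOLDS = {1: 0.17, 2: 0.34, 3: 0.50}


def symptom_thresholds(run_in_scores: List[float]) -> Dict[int, float]:
    """Return the 1, 2 and 3 sd symptom thresholds for one patient.

    Assumptions: sample standard deviation; at least two run-in days
    (statistics.stdev raises an error otherwise).
    """
    if all(score == 0 for score in run_in_scores):
        return dict(FALLBACK_THRESHOLDS)
    run_in_mean = mean(run_in_scores)
    run_in_sd = stdev(run_in_scores)
    return {k: run_in_mean + k * run_in_sd for k in (1, 2, 3)}


def symptom_ap_reached(score: float, thresholds: Dict[int, float], k: int) -> bool:
    """True when the daily composite score exceeds the k-sd threshold."""
    return score > thresholds[k]


# Hypothetical run-in scores for two patients.
print(symptom_thresholds([0, 0, 0, 0]))  # {1: 0.17, 2: 0.34, 3: 0.5}
thresholds = symptom_thresholds([0, 1, 0, 1])
print(symptom_ap_reached(2.0, thresholds, k=2))  # True
```

A patient with a constant non-zero run-in score has a standard deviation of 0 but is not covered by the fallback rule in the text; in this sketch all three thresholds then equal the run-in mean.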
For PEF, the APs were derived from percentages of personal best morning PEF measurement obtained during the run-in period (% pb), or QCA based on the approach outlined by Gibson et al. [7] and Gibson and Powell [8]. We also analysed whether combining PEF and symptoms as a composite AP might perform better, since using single outliers of PEF or symptoms alone might result in relatively high false positive rates for exacerbation prediction. Therefore, we assessed whether the symptom and PEF thresholds were both reached on the same day, and also whether both were reached within a 1-week time window. Finally, we assessed the performance of the APs currently recommended by the NHLBI, which are based on both symptoms and PEF (“yellow zone”) [2]. As it is not clear whether reaching the threshold for either symptoms or PEF alone is sufficient or both are required, we analysed both options.
For each patient, every week in the diary recordings was coded as either a “stable week”, when no exacerbation occurred, or a “pre-exacerbation week” for the week prior to an exacerbation. For all stable and pre-exacerbation weeks, we assessed whether the AP(s) either predicted a future exacerbation (when one or more of the daily recordings in that week fulfilled criteria for that specified AP), or predicted that a future exacerbation would not occur (when daily recording(s) did not reach the defined thresholds) (fig. 1).
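The two ways of combining symptom and PEF criteria can be sketched as below. All function names are hypothetical, and the interpretation of the 1-week window as the 7 days ending on the current day is an assumption; the text states only that both thresholds had to be reached within a 1-week time window.

```python
from typing import List


def pef_below(pef: List[float], personal_best: float, percent: float) -> List[bool]:
    """Flag days on which morning PEF is below `percent` % of personal best."""
    return [100.0 * value / personal_best < percent for value in pef]


def symptoms_above(scores: List[float], threshold: float) -> List[bool]:
    """Flag days on which the composite symptom score exceeds the threshold."""
    return [score > threshold for score in scores]


def same_day(symptom_flags: List[bool], pef_flags: List[bool]) -> List[bool]:
    """Composite AP: both criteria met on the same day."""
    return [s and p for s, p in zip(symptom_flags, pef_flags)]


def within_window(symptom_flags: List[bool], pef_flags: List[bool],
                  window_days: int = 7) -> List[bool]:
    """Composite AP: both criteria met at least once in the trailing window.

    Assumption: the window is the `window_days` days ending on the current day.
    """
    flags = []
    for i in range(len(symptom_flags)):
        start = max(0, i - window_days + 1)
        flags.append(any(symptom_flags[start:i + 1]) and any(pef_flags[start:i + 1]))
    return flags


# Hypothetical week: PEF dips on days 2 and 7, symptoms rise on day 4.
pef_flags = pef_below([450, 340, 460, 455, 470, 465, 345], personal_best=500, percent=70)
symptom_flags = symptoms_above([0.5, 0.5, 0.5, 2.0, 0.5, 0.5, 0.5], threshold=1.65)
print(same_day(symptom_flags, pef_flags))       # never reached on the same day
print(within_window(symptom_flags, pef_flags))  # reached from day 4 onwards
```

In this example the same-day rule never fires because the two signals are discordant, whereas the windowed rule fires as soon as the second criterion is met.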
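The week-level coding can be turned into contingency counts along the following lines. This is a hedged sketch: the label strings and function name are hypothetical, and the example mirrors the situation illustrated in figure 1 for the AP PEF <70% pb rather than any real diary data.

```python
from typing import Dict, List, Tuple

Week = Tuple[str, List[bool]]  # (week label, daily "AP reached" flags)


def contingency_counts(weeks: List[Week]) -> Dict[str, int]:
    """Count true/false positives and negatives over coded weeks.

    A week counts as a positive prediction when the AP is reached on
    one or more days of that week.
    """
    counts = {"tp": 0, "fp": 0, "fn": 0, "tn": 0}
    for label, daily_flags in weeks:
        predicted = any(daily_flags)
        if label == "pre-exacerbation":
            counts["tp" if predicted else "fn"] += 1
        elif label == "stable":
            counts["fp" if predicted else "tn"] += 1
        else:
            raise ValueError(f"unknown week label: {label!r}")
    return counts


quiet = [False] * 7
weeks = [
    ("stable", quiet),
    ("stable", [False, False, True, False, False, False, False]),  # false positive
    ("stable", quiet),
    ("pre-exacerbation", [False, False, False, False, False, True, True]),  # true positive
]
print(contingency_counts(weeks))
# {'tp': 1, 'fp': 1, 'fn': 0, 'tn': 2}
```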
Analysis
All analyses were performed with STATA (release 11; StataCorp, College Station, TX, USA). Contingency tables for each AP threshold were constructed to calculate performance characteristics including sensitivity, specificity, accuracy and area under the receiver operating characteristic curve (AUC) for predicting an exacerbation. In addition, for each AP threshold we assessed two further measures. The first was the (potential) number needed to treat (NNT) in order to prevent one exacerbation, given a hypothetical perfect treatment. The second was early detection, defined as the number of days before the onset of an exacerbation at which the AP was reached for the first time in a pre-exacerbation week. NNT was calculated by dividing the total number of times an AP was reached (true positives and false positives) by the number of times it accurately predicted a future exacerbation (true positives).
Using the development dataset, the optimally performing AP was identified within each of four categories: 1) symptoms solely; 2) PEF solely; 3) symptoms and PEF on the same day; and 4) symptoms and PEF within 1 week prior to an exacerbation. Optimal performance was defined as a sensitivity of ≥75% combined with the best trade-off between early detection and potential NNT. To determine this outcome, we plotted the number of days by which an exacerbation was predicted before its occurrence against the NNT for a series of different APs (fig. 2).
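The performance measures defined above follow directly from the contingency counts. The sketch below is illustrative only (the original analysis used STATA); the function names and the example counts are hypothetical, and AUC is omitted. Note that, with NNT defined as (true positives + false positives) / true positives, it equals the reciprocal of the positive predictive value.

```python
from typing import Dict, List, Optional


def performance(tp: int, fp: int, fn: int, tn: int) -> Dict[str, float]:
    """Sensitivity, specificity, accuracy and potential NNT for one AP."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        # Times the AP was reached per correctly predicted exacerbation.
        "nnt": (tp + fp) / tp if tp else float("inf"),
    }


def days_before_onset(pre_exacerbation_flags: List[bool]) -> Optional[int]:
    """Days before onset at which the AP is first reached.

    Assumption: flags are ordered oldest first and the last entry is the
    day immediately before onset. Returns None if the AP is never reached.
    """
    for index, reached in enumerate(pre_exacerbation_flags):
        if reached:
            return len(pre_exacerbation_flags) - index
    return None


# Hypothetical counts, not taken from the study tables.
result = performance(tp=8, fp=4, fn=2, tn=86)
print(result["sensitivity"], result["accuracy"], result["nnt"])  # 0.8 0.94 1.5
print(days_before_onset([False, False, False, False, False, True, True]))  # 2
```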
To assess the external validity of the optimal APs derived from the development dataset, their performance was assessed and compared with several published APs using the validation dataset [11].
RESULTS
The development dataset consisted of daily recordings from 165 patients. 87 exacerbations, defined using diary data, occurred during 18 months of follow-up. Exacerbations occurred in 39 different patients, at a mean rate of 1.8 per patient per year (range 1 to 13). 147 exacerbations, defined as the use of a course of oral prednisone, occurred during the follow-up interval.
In the validation dataset, 94 patients provided daily recordings. 22 exacerbations occurred. Exacerbations occurred in 17 patients and the mean rate was 1.5 per patient per year (range 1 to 5). Oral prednisone was used on 75 occasions.
The characteristics of patients from both studies are listed in table 1.
Action points
The performance of 25 potential APs was analysed (a complete overview of results is presented in tables S4a–d). Six APs were based on symptoms, eight on PEF, nine on combinations of symptoms and PEF on the same day, and two on combinations of symptoms and PEF within 1 week. In general, APs based on standard deviations of symptom scores performed better than pre-defined absolute levels of symptoms. This judgment was based on lower NNTs for the former approach. PEF using % pb resulted in considerably lower NNTs than using standard deviations.
The optimal symptom AP was a score that increased to more than two standard deviations above the run-in mean, and this detected exacerbations 2.9 days before occurrence with 88.5% sensitivity, 86.3% specificity and a NNT of 24. For PEF, the optimally performing AP was a PEF <60% pb, which is also currently proposed by the BTS as the threshold for commencing oral prednisone treatment [5]. It had a sensitivity of 78.2%, specificity of 98.7% and a NNT of 3. However, it detected exacerbations only 1 day before their occurrence. The optimal combination (symptoms and PEF) comprised a symptom score increase of more than two standard deviations plus PEF decrease to <70% pb. This combination detected exacerbations 1.4 days before their occurrence with 80.5% sensitivity, 98.3% specificity and a NNT of 4. Within a 1-week window, this symptom–PEF combination detected exacerbations 5.1 days (mean) before their occurrence with a sensitivity of 85.1%, specificity of 97.2% and a NNT of 6 (table 2).
The performance characteristics of optimal APs in the validation dataset are presented in table 3. In general, the sensitivities for each of the optimal APs differed somewhat from those obtained using the development dataset, whereas specificities remained similar. For each optimal AP, exacerbations were detected earlier in the validation dataset, i.e. between 0.4 and 1.0 day earlier than in the development dataset.
Of the two versions of the AP recommended by the NHLBI, the combination of “appearance of any symptoms” plus PEF <80% pb performed best (table 3). It detected exacerbations 4.9 days before onset, with a sensitivity of 100% and specificity of 86.8%. However, its NNT was 43, compared with 12 for the optimal AP derived from the development dataset (fig. 2).
The comparable data using the alternative definition of “use of oral prednisone” are also reported in tables 2 and 3, and tables S4a–d in the online supplementary material. In general, sensitivities were considerably lower, overall accuracies were similar, detection occurred slightly later, but the NNTs were better.
DISCUSSION
The present study provides the most comprehensive data to date on the performance characteristics of a range of symptom and/or PEF thresholds at which patients might intervene to abort an asthma exacerbation or to reduce its severity. For symptoms, a change of more than two standard deviations in a composite symptom score provided optimum outcomes. For PEFs, a decrease to <60% pb was optimal. However, an AP based on a combination of changes in symptom score (more than two standard deviations) and PEF (<70% pb) occurring during a 1-week period performed even better. This combination predicted exacerbations 5 days before their occurrence, thus allowing sufficient time to intervene, whilst the NNT remained low.
Previously, in a Cochrane review, Powell and Gibson [12] compared the use of WAAPs based on symptoms with those based on PEF. Results showed that these were equivalent with regard to outcomes, i.e. hospitalisations or unscheduled doctor visits. Our data indicate that combining symptoms and PEFs provides added value. Clearly, it is not practical for patients to do the necessary calculations and therefore, in practice, an AP based solely on PEF <60% pb might be optimal. Nevertheless, with the advent of internet-based applications (“Apps”), the use of seemingly complex APs is now feasible [13]. Although compliance with paper diary recordings is generally poor [14], such an approach is feasible with electronic recordings [15] and is of particular relevance in patients with difficult or brittle asthma.
The fact that a “both/and” combination of symptoms and PEF performed better than single APs is not surprising. Even with good asthma control, symptoms and PEFs may vary discordantly, and one of these parameters may change in isolation, especially in “poor perceivers”. APs with threshold levels based solely on either symptoms or PEF are susceptible to these variations. Using a more stringent threshold, such as PEF <60% pb, can solve this issue, but has the disadvantage of late detection of an imminent exacerbation. In contrast, using a 1-week window for symptoms plus PEF provided the best AP, as it detected exacerbations 5.1 days before occurrence at only a slight cost in specificity and NNT. To assess whether symptoms or PEF drive earlier detection using the AP with a 1-week time window, we performed a subgroup analysis of the 74 predicted exacerbations. There was no consistent pattern as to whether changes in symptoms preceded PEFs or vice versa. The symptom threshold was reached earlier in 25 subjects, the threshold for PEF changes was reached earlier in 23, and in 26 there was no discordance.
Previously, Gibson et al. [7] analysed nine different APs and showed that QCA of daily PEFs performs better than percentages of personal best PEF (in contrast to the present data) or percentage predicted of PEF. Gibson et al. [7] reported that the optimal QCA AP detected 91% of exacerbations and falsely predicted an exacerbation in 23% of periods of normal control. Tattersfield et al. [16] analysed the false positive rate of APs based on the median values of PEF and symptoms at 2 days before the start of an exacerbation. They found a false-positive rate of 6.4% using the advent of night-time symptoms, 26% for morning PEF and 30% for daytime symptoms. Thamrin et al. [17] analysed daily fluctuations in PEF and, by calculating conditional probabilities of future decreases in lung function, predicted the risk of exacerbations with a sensitivity of 68.8% and specificity of 67.4%. The AUC was 0.85, which is only slightly lower than AUCs of most optimal APs in this study [17].
The time course of changes in symptoms and PEF that constitute an asthma exacerbation is important in determining the optimum time for intervention. If changes can only be identified after the time at which intervention is likely to be effective, then the rationale for using WAAPs would be weak. Previous data suggest that symptoms and PEF start declining 5–10 days before exacerbations [16, 18]. The changes in PEFs and symptom scores associated with exacerbations in our patients are illustrated in figure 3a and b. Based on these findings, we systematically analysed the 7-day period preceding exacerbations. We found that changes in the optimal APs occurred between 1.7 and 5.1 days before the defined onset of an exacerbation (table S4). The onset of action of systemic corticosteroid is within 12–24 h, and so the APs would be reached in sufficient time to allow for steroids to have a modifying effect. The effectiveness of quadrupling the dose of inhaled corticosteroids was recently investigated by Oborne et al. [19], and might have resulted in greater clinical benefits if commenced at the times calculated to be optimal in our study.
Figure 3. Changes in a) peak expiratory flow (PEF) and b) symptom scores from day -14 to day +10 relative to an exacerbation, using the mean PEF and symptom score data from each exacerbation in the development dataset.
Our study has several possible limitations. First, we selected criteria for acceptable sensitivity and specificity (see Statistical analysis section), as we aimed to balance early detection of exacerbations against potential overdiagnosis. Secondly, the composite symptom score(s) used in the two studies were not externally validated. It is not certain whether applying QCA to alternative scoring systems such as the Asthma Control Questionnaire or the Asthma Control Test would give similar results [20, 21]. However, given the overall similarity between results using both of our datasets, there is reason to believe that QCA is a valid approach to optimising APs independently of the exact scoring system used. Thirdly, APs were based on parameters that were incorporated in the definition of an exacerbation. Our study was not designed to be explanatory but rather to model predictive performance, and as such is methodologically sound. Our definition of major exacerbations, i.e. either emergency room visits or changes in PEF plus symptoms for ≥2 days, is in accordance with recent criteria for severe exacerbations [4]. Furthermore, in modified forms, our definition has been used in several previous studies [10, 11, 22, 23]. However, accepting that the definition of an exacerbation is important in the interpretation of our data, we performed additional analyses using “use of oral prednisone” as the definition of an exacerbation (tables 2 and 3 and table S4 of the online supplementary material). The order of optimal APs was similar with regard to early detection and NNT (fig. 2b). Using this definition, the sensitivity to detect exacerbations was considerably lower when using PEF either solely or in combination with symptoms, whereas it was only slightly lower using symptoms alone (table S4). This implies that the decision to administer prednisone depended more on symptoms than on PEF. 
Given that the sensitivity was <75%, we concluded that the analysed APs did not perform well enough to predict exacerbations defined as “use of oral prednisone”. Such events were generally less severe than the exacerbations defined a priori using composite symptom scores and PEFs. It is therefore arguable that our APs performed well in predicting events of higher severity and in which earlier intervention is clinically desirable.
In conclusion, the optimal AP for the early detection of asthma exacerbations consists of an increase of more than two standard deviations in a composite symptom score and a fall in PEF to <70% pb, occurring within a 1-week window. With the advent of handheld computer technology, there is potential to use these criteria more readily in day-to-day practice, and thus reduce the impact of exacerbations, particularly in patients with a history of frequent exacerbations. Prospective studies or further analyses using other published datasets should be carried out to confirm the present findings, and together they should be used to revise and improve the empirical recommendations offered in current guidelines.
Acknowledgments
The development and validation datasets were provided to the principal authors (P.J. Honkoop and J.K. Sont), with full permission to undertake additional analyses, by D.R. Taylor.
Footnotes
This article has supplementary material available from www.erj.ersjournals.com
Support Statement
The study was partly funded by a short-term research fellowship awarded by the Netherlands Asthma Foundation.
Statement of Interest
Statements of interest for A.D. Smith and J.K. Sont can be found at www.erj.ersjournals.com/site/misc/statements.xhtml
- Received November 24, 2011.
- Accepted March 19, 2012.
- ©ERS 2013