Abstract
Current tools for recording chronic obstructive pulmonary disease (COPD) exacerbations are limited and often lack validity testing. We assessed the validity of an automated telephonic exacerbation assessment system (TEXAS) and compared its outcomes with existing tools.
Over 12 months, 86 COPD patients (22.1% females; mean age 66.5 yrs; mean post-bronchodilator forced expiratory volume in 1 s 53.4% predicted) were called once every 2 weeks by TEXAS to record changes in respiratory symptoms, unscheduled healthcare utilisation and use of respiratory medication. The responses to TEXAS were validated against exacerbation-related information collected by observations made by trained research assistants during home visits. No care assistance was provided in any way. Diagnostic test characteristics were estimated using commonly used definitions of exacerbation. Detection rates, compliance and patient preference were assessed, and compared with paper diary cards and medical record review.
A total of 1,824 successful calls were recorded, of which 292 were verified by home visits (median four calls per patient, interquartile range three to five calls per patient). Independent of the exacerbation definition used, validity was high, with sensitivities and specificities between 66% and 98%. Detection rates and compliance differed extensively between the different tools, but were highest with TEXAS. Patient preference did not differ.
TEXAS is a valid tool to assess COPD exacerbation rates in prospective clinical studies. Using different tools to record exacerbations strongly affects exacerbation occurrence rates.
Exacerbations of chronic obstructive pulmonary disease (COPD) are acute episodes of sustained symptom aggravation that last from several days to weeks [1], strongly impair health-related quality of life [2–4] and contribute substantially to COPD-related costs [5, 6]. The burden of exacerbations indicates a growing need to better focus on their prevention and management. Hence, the attention of researchers has shifted from lung function decline as the primary outcome of interest to occurrence of exacerbations [7].
Despite the emerging importance of exacerbation as a study outcome, there is still no generally accepted definition of exacerbation. Recently, much attention has been paid to the substantial variety in both symptom- and event-based definitions, and, in particular, to the impact of using different algorithms on exacerbation outcomes in clinical trials [8, 9]. So far, surprisingly little attention has been paid to the tools with which exacerbations are actually “measured”. Studies on exacerbation outcomes often fail to provide a detailed description of the precise tools that were used to detect exacerbations. Moreover, exacerbation measurement tools often lack validity testing. Currently, we do not know the impact of using different recording strategies on exacerbation rates. Commonly used methods are based on periodic (retrospective) questionnaires, patient diary cards and medical record review [10–12]. What these methods of data collection have in common is that they are rather time-consuming for patients and/or researchers, often at the expense of patients’ compliance. The introduction of electronic diaries is a promising development [13], although their validity should be tested before they can be recommended for use in clinical COPD research.
In the current study, we assessed the validity of a recently developed automated telephonic exacerbation assessment system (TEXAS) to record exacerbations in prospective clinical studies. We also assessed the system’s exacerbation detection rate, patient compliance and patient preference, and compared these outcomes with two conventional exacerbation recording methods, i.e. paper diary cards and medical record review. We hypothesised that using different tools to record COPD exacerbations would have an impact on exacerbation rates, even when a uniform definition of exacerbation is applied.
METHODS
Study design and population
This study was a 1-yr prospective cohort study in which 86 patients with moderate-to-severe COPD [14] were included. Our cohort size resembled the cohort size used in the East London, UK studies on exacerbation outcomes [1, 4]. With an expected exacerbation frequency of 2.5 exacerbations per patient per year [15], the number of patients would be sufficient to obtain meaningful estimates regarding the validity of TEXAS. Recruitment took place between August 2006 and October 2007 in patients who had participated in a previous COPD study [16] or regular pulmonary rehabilitation programmes at the Dept of Pulmonary Diseases (Radboud University Nijmegen Medical Centre, Nijmegen, the Netherlands).
Inclusion criteria were: chest physician-confirmed diagnosis of COPD in Global Initiative for Chronic Obstructive Lung Disease stage II or III; age ≥40 yrs; and no exacerbations in the previous 4 weeks. Exclusion criteria were: severe comorbid conditions with a reduced life expectancy; travelling time to the study centre >30 min; inability to speak Dutch; and a telephone incompatible with system requirements combined with unwillingness to switch to another telephone offered by the investigators.
The study was approved by the Medical Ethics Committee (Arnhem-Nijmegen, the Netherlands; approval number 2006/081). All participants gave written informed consent.
TEXAS
We have recently developed TEXAS to record COPD exacerbation-related items in prospective clinical trials. This system consists of questions regarding changes in respiratory symptoms, use of healthcare resources and use of respiratory medication in the 2 weeks prior to the call (see online supplementary material). The questions are based on common and recommended definitions of exacerbation, i.e. symptom- and event-based exacerbations [8]. Once every 2 weeks, a patient with COPD receives an automated telephone call with a real-life voice on the day and time of his/her own preference. If the call cannot be answered a new attempt is made up to four times in the following hour. Prior to the current study, we pre-tested TEXAS in a small group of COPD patients (n=8) and healthcare professionals (n=9) and, as a consequence, made minor adjustments to the structure and contents of the system.
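The retry rule described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the system's actual implementation: `place_call` is a hypothetical function that returns the patient's answers or `None` when the call is not answered, and the spacing of attempts within the hour is not modelled.

```python
from typing import Callable, Optional, Tuple

MAX_RETRIES = 4  # "a new attempt is made up to four times in the following hour"


def run_call_day(place_call: Callable[[], Optional[dict]]) -> Tuple[Optional[dict], int]:
    """Place the scheduled call, then retry up to MAX_RETRIES times.

    `place_call` is a hypothetical stand-in for the telephony layer: it
    returns the recorded answers, or None when the call is not answered.
    Returns (answers or None, number of attempts made).
    """
    attempts = 0
    for _ in range(1 + MAX_RETRIES):  # the first attempt plus the retries
        attempts += 1
        answers = place_call()
        if answers is not None:
            return answers, attempts
    return None, attempts  # no input on this call day
```

A call day that ends with `None` corresponds to a day without input; whether such a day triggers follow-up by the researchers is decided outside this sketch.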
Study definitions of exacerbation
TEXAS enables researchers to detect exacerbations based on various existing definitions. We used four of the most common and generally accepted definitions of exacerbation, i.e. two symptom-based and two event-based definitions (table 1). The symptom-based definitions were based on the concept of major (dyspnoea, sputum purulence and sputum amount) and minor symptoms (common cold, wheeze, sore throat and cough) [1, 17]. We used exacerbation definition A as our primary definition, as this is the definition most often used in COPD studies (it has, for instance, been used consistently in all East London Cohort study reports [1, 4, 18]). Definition B has recently been used in studies on the self-treatment of acute exacerbations [19, 20]. The two event-based definitions were modified from recent landmark COPD trials [21–23].
Procedures
Baseline assessment included mapping of demographic characteristics, respiratory symptoms, smoking history, respiratory medication use, and spirometry before and after administration of 400 μg salbutamol via a Volumatic® spacer (GlaxoSmithKline, Uxbridge, UK). All participants were instructed how to respond to the TEXAS calls, and received a laminated summary card with the precise questions and response categories for the calls. Participants were also instructed how to use a weekly paper diary card (conventional recording method) containing questions about changes in respiratory symptoms, use of respiratory medication and use of unscheduled healthcare services (see online supplementary material). After 2 weeks, participants’ experiences with and handling of the TEXAS calls were reviewed, and the formal observation period started. During the study, the results of the calls were monitored on a website that had been specifically designed for the study. This enabled us to contact the patient when two or more consecutive call days showed missing data.
Observations made during home visits by lung function technicians served as the gold standard; with the information collected, the responses of the patients to TEXAS could be verified. The technicians were employed at the Dept of Pulmonary Diseases, and were equally experienced in interviewing COPD patients and measuring their lung function. Home visits included spirometry (data not shown), and a standardised interview including questions about changes in respiratory symptoms, use of respiratory medication and unscheduled healthcare utilisation in the preceding 2 weeks. The interviews consisted of more questions than TEXAS, and the questions that were also asked in TEXAS were put in a different order. All calls that met exacerbation definition A were considered positive and were followed by a home visit. For each participant, two randomly selected negative calls were also followed by home visits to serve as negative-control episodes. These control visits were scheduled ≥4 weeks after a positive call. All home visits were scheduled within 3 days of the positive or negative call. The visiting technicians were not informed whether the call had been positive or negative for an exacerbation.
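The selection of calls for verification can be illustrated with a short sketch. It assumes one reading of the ≥4-week rule, namely that a negative call is eligible as a control only if no positive call occurred in the 4 weeks before it; the function name and data layout are hypothetical and not taken from the study.

```python
import random
from typing import Dict, List, Optional

NEGATIVE_CONTROLS_PER_PATIENT = 2  # two random negative calls per participant
MIN_WEEKS_AFTER_POSITIVE = 4       # controls scheduled >=4 weeks after a positive call


def select_calls_for_visit(calls: Dict[int, Optional[bool]], rng: random.Random) -> List[int]:
    """Return the study weeks whose calls are followed by a home visit.

    `calls` maps study week -> True (call met definition A), False (it did
    not) or None (no input). All positive calls are selected, plus up to two
    randomly chosen eligible negative calls.
    """
    positives = sorted(week for week, met in calls.items() if met is True)

    def far_from_positive(week: int) -> bool:
        # Assumed reading: no positive call less than 4 weeks before this call.
        return all(not (0 < week - p < MIN_WEEKS_AFTER_POSITIVE) for p in positives)

    eligible = sorted(
        week for week, met in calls.items() if met is False and far_from_positive(week)
    )
    k = min(NEGATIVE_CONTROLS_PER_PATIENT, len(eligible))
    controls = rng.sample(eligible, k)
    return sorted(positives + controls)
```

Passing a seeded `random.Random` instance keeps the selection reproducible, which is useful when the sampling has to be audited afterwards.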
Copies of patients’ medical records were requested from the patients’ general practitioners and chest physicians at the end of follow-up. Two investigators (E.W.M.A. Bischoff and J. Molema) independently extracted exacerbations from the combined medical records using standardised exacerbation extraction forms based on the four definitions as displayed in table 1 (interobserver variability: Cohen’s κ 0.82–0.94). The completed paper diary cards were collected on a monthly basis using pre-paid return envelopes. At the end of follow-up, all participants completed a short questionnaire to review their experiences with TEXAS (see online supplementary material).
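For readers unfamiliar with the agreement statistic reported above, Cohen's κ compares the observed agreement between two raters with the agreement expected by chance. The following is a minimal sketch with hypothetical input; it is not the software used in the study.

```python
from collections import Counter
from typing import Sequence


def cohens_kappa(rater_a: Sequence[bool], rater_b: Sequence[bool]) -> float:
    """Cohen's kappa for two raters judging the same episodes.

    kappa = (p_observed - p_expected) / (1 - p_expected)
    Raises ZeroDivisionError if chance agreement is exactly 1
    (both raters used a single, identical category throughout).
    """
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("both raters must judge the same non-empty set of episodes")
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    categories = set(freq_a) | set(freq_b)
    expected = sum(freq_a[c] * freq_b[c] for c in categories) / n**2
    return (observed - expected) / (1 - expected)
```

With hypothetical judgements in which the raters agree on 8 of 10 episodes and each labels half of them as exacerbations, observed agreement is 0.8, chance agreement is 0.5 and κ is 0.6.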
Analyses
A new exacerbation event was defined as an event that was preceded by 2 weeks in which no major symptoms had changed (symptom-based definitions), or no use of antibiotics and/or prednisolone, or unscheduled physician contacts had been recorded (event-based definitions). Exacerbation recovery was defined as a period of ≥2 weeks in which no worsening of any major symptom or use of antibiotics, prednisone or healthcare services was reported after a previous period in which either one or more major symptoms had worsened or oral medication or healthcare services were used. If an event was preceded by missing data, the event was considered as missing and excluded from further analysis.
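The counting rule above can be expressed as a small state machine over a patient's chronological series of biweekly calls. The sketch below is an illustration only: it assumes that each call has already been classified as meeting (True) or not meeting (False) the chosen definition, or as missing (None), and that the patient was exacerbation-free before the first call.

```python
from typing import Dict, List, Optional


def count_new_events(calls: List[Optional[bool]]) -> Dict[str, int]:
    """Count new exacerbation events in one patient's call series.

    calls: chronological biweekly results; True = criteria met in the
    preceding 2 weeks, False = not met, None = missing data.
    """
    new_events = 0
    excluded = 0
    previous: Optional[bool] = False  # assumption: exacerbation-free at the start
    for current in calls:
        if current is True:
            if previous is None:
                excluded += 1    # preceded by missing data: treated as missing
            elif previous is False:
                new_events += 1  # preceded by 2 weeks without criteria: new event
            # previous is True: the same exacerbation is still ongoing
        previous = current
    return {"new_events": new_events, "excluded": excluded}
```

Consecutive positive calls are counted as one event, so a long exacerbation spanning several call periods does not inflate the rate.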
Common diagnostic test characteristics (sensitivity, specificity, and positive and negative predictive values [24], with 95% confidence intervals) were calculated to establish the diagnostic validity of the TEXAS calls relative to the gold standard, i.e. the information collected during the home visits. Diagnostic odds ratios were estimated by logistic mixed models via residual pseudo-likelihood with subject as a random effect. Diagnostic test characteristics and odds ratios were calculated for all four study definitions of exacerbation (table 1).
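As an illustration of these characteristics, the sketch below derives them from a 2×2 table of call results against home-visit findings. Two simplifications should be noted: the confidence intervals use the Wilson score method, which is an assumption because the interval method is not stated here, and the diagnostic odds ratio is the crude ratio rather than the mixed-model estimate described above.

```python
import math
from typing import Dict, Tuple


def wilson_interval(successes: int, total: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for a proportion (z = 1.96 gives about 95%)."""
    p = successes / total
    denom = 1 + z**2 / total
    centre = (p + z**2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return centre - half, centre + half


def diagnostic_characteristics(tp: int, fp: int, fn: int, tn: int) -> Dict[str, tuple]:
    """Return (estimate, (lower, upper)) for each test characteristic.

    tp/fp: positive calls confirmed / not confirmed at the home visit;
    fn/tn: negative calls contradicted / confirmed at the home visit.
    """
    pairs = {
        "sensitivity": (tp, tp + fn),
        "specificity": (tn, tn + fp),
        "ppv": (tp, tp + fp),
        "npv": (tn, tn + fn),
    }
    return {name: (k / n, wilson_interval(k, n)) for name, (k, n) in pairs.items()}


def crude_diagnostic_odds_ratio(tp: int, fp: int, fn: int, tn: int) -> float:
    """Crude DOR = (tp * tn) / (fp * fn); undefined when fp or fn is zero."""
    return (tp * tn) / (fp * fn)
```

The crude odds ratio ignores the clustering of several visits within one patient, which is why a model with subject as a random effect is preferable for the actual analysis.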
We counted the number of exacerbations recorded by TEXAS, using the paper diary cards and the combined medical records for each exacerbation definition. To adjust for the effect of differences in follow-up time, we used a time-weighted statistical approach [25]. Exacerbation rates were expressed as number of exacerbations per patient per year, and were compared between TEXAS and the diary cards, and TEXAS and the medical records, using weighted rate ratios [26]. Statistical significance was tested using a negative binomial regression analysis [26]. Compliance was calculated by counting the complete, incomplete and missing TEXAS calls and paper diaries. Paired t-tests were used to compare patients’ compliance and preferences between TEXAS and the diary cards.
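A time-weighted rate can be sketched as the total number of events divided by the total follow-up time, so that patients with longer follow-up contribute proportionally more. The code below assumes this reading of the weighted approach and a 52-week year; the function names and the example data are hypothetical, and the negative binomial significance test is not reproduced.

```python
from typing import Sequence

WEEKS_PER_YEAR = 52.0  # assumption used for converting follow-up weeks to years


def weighted_rate(events: Sequence[int], followup_weeks: Sequence[float]) -> float:
    """Exacerbations per patient per year: total events / total person-years."""
    if len(events) != len(followup_weeks):
        raise ValueError("one follow-up time is needed per patient")
    person_years = sum(followup_weeks) / WEEKS_PER_YEAR
    return sum(events) / person_years


def rate_ratio(rate_tool: float, rate_reference: float) -> float:
    """Ratio of two exacerbation rates, e.g. TEXAS relative to diary cards."""
    return rate_tool / rate_reference


# Hypothetical example: three patients followed for 52, 26 and 52 weeks.
weeks = [52, 26, 52]
texas_rate = weighted_rate([2, 0, 3], weeks)  # 5 events over 2.5 person-years
diary_rate = weighted_rate([1, 0, 1], weeks)  # 2 events over 2.5 person-years
```

In this hypothetical example the rates are 2.0 and 0.8 exacerbations per patient per year, giving a rate ratio of 2.5.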
SPSS version 16.0.2 (SPSS Inc., Chicago, IL, USA) was used to calculate the diagnostic test characteristics and paired t-tests. SAS version 9.2 for Windows (SAS Institute Inc., Cary, NC, USA) was used for regression analyses. We considered p<0.05 as statistically significant.
RESULTS
Study population
Of the 86 patients enrolled in the study, five (5.8%) patients withdrew their participation during the observation period (fig. 1). The total time of follow-up was 4,226 weeks, or 49.1 weeks per patient. Table 2 shows the characteristics of the study population at baseline. The majority of the patients were male. Most patients were ex-smokers, and were using a combination of a long-acting bronchodilating agent and an inhaled corticosteroid.
Figure 1. Flow chart of participants in the study. COPD: chronic obstructive pulmonary disease. #: the training period enabled participants to become familiar with responding to the telephonic exacerbation assessment system (TEXAS) and completing paper diaries; after 2 weeks, participants’ experiences with and handling of the TEXAS calls and the paper diaries were reviewed and the formal observation period started. ¶: five patients did not report any symptom changes that matched exacerbation definition A. +: medical records of two patients were excluded due to incomplete data (both chest physician and general practitioner medical records were missing).
Process of TEXAS calls
Overall, 2,850 call attempts were made on 2,078 scheduled call days (mean±sd 24.2±3.8 call days per patient). On 1,572 (75.6%) days, a call received input from the patient at the first attempt; on 252 (12.1%) days, input was received after several attempts; and on 254 (12.2%) days, there was no input. Reasons for not providing input were hospitalisation (11 call days), not willing to be called during holidays (43 call days), not able/willing to answer the call (26 call days) or unknown (174 call days). Thus, a total of 1,824 (87.8%) call days resulted in useful data entry and were therefore considered successful. The mean±sd duration of a successful call was 192.8±45.6 s.
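The proportions reported above follow directly from the call-day counts, as the short check below shows. The counts are those given in the text; the variable names are ours.

```python
scheduled_days = 2078
first_attempt = 1572   # input received at the first attempt
later_attempt = 252    # input received after several attempts
no_input = 254         # no input on that call day

# Reasons for no input: hospitalisation, holidays, not able/willing, unknown.
reasons_no_input = [11, 43, 26, 174]


def pct(part: int, whole: int) -> float:
    """Percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


successful = first_attempt + later_attempt

print(successful)                          # 1824
print(pct(first_attempt, scheduled_days))  # 75.6
print(pct(later_attempt, scheduled_days))  # 12.1
print(pct(no_input, scheduled_days))       # 12.2
print(pct(successful, scheduled_days))     # 87.8
```

The three day categories add up to the 2,078 scheduled call days, and the four reasons add up to the 254 days without input; the rounded percentages sum to 99.9% because of rounding.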
Validity of TEXAS
In total, 81 patients received 292 home visits (median of four visits per patient, interquartile range three to five visits per patient). Five patients did not report any symptom changes that matched exacerbation definition A during their observation period. Two home visits were excluded due to incomplete interview data. Mean±sd time between date of the TEXAS call and date of the home visit was 2.3±1.4 days. Of the home visits, 190 (65.1%) were scheduled following a call that met exacerbation definition A. In 156 (82.1%) of these visits, the interview data matched the responses of the patients to TEXAS. Table 3 shows the diagnostic test characteristics of TEXAS using the various study definitions of exacerbation. Regardless of the definition used, sensitivity and specificity of TEXAS were high and varied between 66.2% and 97.8%. Sensitivity was lowest and specificity was highest when using exacerbation definition D. Diagnostic odds ratios were high, but highest when using event-based definitions.
Comparison of TEXAS with other tools
At the end of follow-up, 3,378 diary cards, 82 (95.3%) general practitioner medical records and 84 (97.7%) chest physician medical records were received. Of the exacerbations documented in the medical records, 15.9% lacked any data about changes in major or minor respiratory symptoms. Table 4 shows the exacerbation rates for the different detection methods as well as the rate ratios for TEXAS relative to the paper diary cards and medical record review. Compared with the diary card method, counting exacerbations with TEXAS resulted in statistically significantly higher occurrence rates for definitions B and C. Also, TEXAS revealed more exacerbations with two or more major symptom changes (definition B) that were not reported to healthcare professionals, compared with the diary cards (47.4% versus 37.6%, respectively). Compared with the medical record review method, TEXAS resulted in significantly higher exacerbation rates for definitions A, B and C.
Table 5 displays patients’ compliance with and preferences for TEXAS compared with the diary cards. Overall, compliance with TEXAS was higher than with the diary cards, i.e. more registration weeks and more weeks with complete data. The difference in the mean number of weeks with complete registration per patient was almost 1 month in the 12-month observation period. 76 (88.4%) patients responded to the questionnaire about patients’ experiences with TEXAS. Most patients (96.5%) found TEXAS easy to use (data not shown), but no significant differences in patients’ preferences were observed.
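The paired comparison of compliance can be illustrated with a hand-computed paired t statistic. This is a sketch with hypothetical data for four patients, not the study's analysis; the p-value is omitted because it requires the t distribution, which the Python standard library does not provide.

```python
import math
import statistics
from typing import Sequence, Tuple


def paired_t_statistic(x: Sequence[float], y: Sequence[float]) -> Tuple[float, int]:
    """Return (t, degrees of freedom) for paired observations x and y.

    t = mean(d) / (sd(d) / sqrt(n)), with d the within-patient differences.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need at least two paired observations")
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    standard_error = statistics.stdev(diffs) / math.sqrt(n)
    return statistics.mean(diffs) / standard_error, n - 1


# Hypothetical weeks with complete registration per patient.
texas_weeks = [48, 50, 46, 52]
diary_weeks = [44, 47, 45, 46]
t_value, dof = paired_t_statistic(texas_weeks, diary_weeks)
```

Each patient serves as his or her own control, which is why the test is applied to the within-patient differences rather than to the two group means.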
DISCUSSION
We assessed the validity of TEXAS to record COPD exacerbations in prospective clinical studies, and compared its detection rate, compliance and patient preference with conventional recording methods (weekly paper diary cards and medical record review). The validity of TEXAS was high, independent of the exacerbation definition used. Detection rates and patients' compliance in providing exacerbation-related information differed significantly between the recording strategies, but were highest with TEXAS. Patient preference did not differ significantly between TEXAS and the paper diary cards.
When assessing the validity of any instrument, deciding on the gold standard (i.e. the generally accepted method to measure the outcome) is crucial [27]. TEXAS was developed to record exacerbations based on common definition criteria, such as symptom changes or events [8]. Consequently, we used as our gold standard the information on worsening of symptoms, use of oral medication and/or use of healthcare services that was collected by standardised personal interviews by well-trained professionals during home visits. We believe that this was (and still is) the best available procedure to address our primary study objective.
As the definition of exacerbation has an impact on the effect size of interventions [8, 9], we assumed that it would also affect the validity of our detection method. Therefore, we used four different but commonly used definitions of exacerbation. Overall, the validity of TEXAS was high, but it did differ between the respective exacerbation definitions. Positive predictive values varied between 75% (symptom-based definition) and 90% (event-based definition), which indicates a potential, but small, number of false-positive exacerbations, particularly when using symptom-based definitions. With negative predictive values of >85%, only a few true exacerbations will be missed. The differences in sensitivity suggest that patients were more prone to record symptom changes than use of healthcare services (sensitivity of 91.2% versus 66.2%, respectively), which makes TEXAS less suitable to detect this type of event-based exacerbation. The differences in specificity suggest that patients may perform better in recording the absence of healthcare utilisation than the absence of symptom deterioration (specificity of 97.8% versus 71.4%, respectively).
To further evaluate the validity of TEXAS, our results should be compared with other studies. Recently, the Exacerbations of Chronic Pulmonary Disease Tool (EXACT) patient-reported outcome (PRO) electronic diary has been developed to measure exacerbation frequency, duration and severity [28]. EXACT-PRO introduces a new concept of exacerbation, which makes it the best available method to give insight into the clinical course of an exacerbation. However, exacerbations measured with EXACT-PRO cannot simply be compared with exacerbations based on other definitions. Obviously, this is a benefit of TEXAS; its content has been based on existing and commonly accepted definitions of exacerbation. However, unlike EXACT-PRO, TEXAS fails to provide detailed information on the precise duration and day-to-day clinical course of an exacerbation.
We demonstrated that the use of different detection methods can result in different exacerbation rates. This is important information in view of the interpretation of studies that use exacerbation rates as an outcome. The low number of symptom-based exacerbations retrieved from the medical records should be interpreted with caution, as the general practitioners and chest physicians were not instructed a priori how to record symptom-related items. Also, when defining exacerbation as the use of prescriptions of antibiotics and/or prednisolone, exacerbation rates were lowest with the medical record review method. This may have been caused by the use of standing prescriptions to support exacerbation self-management by patients. With TEXAS, exacerbation rates based on symptoms were much higher than based on unscheduled healthcare utilisation. This is consistent with previous studies that showed that only half of the exacerbations are reported to healthcare professionals [1, 29].
The higher exacerbation detection rates of TEXAS may be related to the higher compliance rate. With TEXAS, patients had one additional month of complete registration data and, as a consequence, of capturing exacerbations compared with the paper diary cards. We adjusted the exacerbation rate for differences in study follow-up time (time-weighted approach) [25, 26], but believe that adjusting for compliance would provide a better estimate of the exacerbation rate. The high compliance is consistent with a previous study on compliance with paper and electronic diaries [30], and can be explained by the benefits of the system, i.e. it requires less self-discipline, patients are called on their preferred day and time, and when using mobile telephones, patients do not have to stay at home. Although not statistically significant, more patients preferred TEXAS compared to the diary cards. A benefit for researchers is the automated data collection, which diminishes the costs usually spent on manual data collection, i.e. sending and receiving paper diaries, and manually entering and cleaning data in a database.
In recent years, the use of telephone devices as instruments to capture exacerbations has rapidly evolved in COPD care. Telehealthcare for COPD seems to have the potential to affect the quality of life of patients and the frequency of emergency department visits [31]. Before we can recommend TEXAS as an instrument for telemonitoring or self-management purposes (i.e. rapid intervention when an exacerbation is imminent), it should be tested in a controlled study that has been specifically designed for this objective.
Our current study has several limitations. First, it may have been affected by bias. Diagnostic suspicion bias was prevented by blinding the technicians who performed the home visits regarding the responses of the patients to TEXAS. However, our results may have been influenced by incorporation bias, as, inevitably, the questions that make up TEXAS were comparable with the questions asked during the home visits. By asking more questions than within TEXAS and by changing the order of the questions during the home visit interviews, we believe that this type of bias has been limited. Secondly, TEXAS is not a daily diary card, but measures exacerbations once every 2 weeks. Hence, when using TEXAS, researchers are unable to detect the exact start and end dates of an exacerbation. Additionally, the time-frame of 2 weeks may have introduced recall bias, which may have had more impact on exacerbations based on symptom changes than on the use of healthcare services. Thirdly, with >1,600 negative calls, it was not realistic to perform a home visit after every call that did not meet our exacerbation definition. Therefore, we verified the absence of an exacerbation in two random negative calls per patient. We believe that this has resulted in an accurate estimate of the diagnostic test characteristics.
In conclusion, this study shows that TEXAS is a valid method to detect exacerbations in prospective clinical COPD studies. Its exacerbation rates and compliance appear to be higher than those of conventional detection methods. The differences in exacerbation rates between the different detection tools indicate that the recording strategy should be taken into account when comparing study results on exacerbation outcomes. Future studies should, therefore, provide at least a detailed description of the exacerbation recording procedure.
Acknowledgments
The authors are grateful to the patients who participated, the lung function technicians for the home visits, the family physicians and chest physicians for their medical records, B. Robberts (Dept of Primary and Community Care, Radboud University Nijmegen Medical Centre, Nijmegen, the Netherlands) for data entry and data cleaning, and AlphaComm Solutions BV (Rotterdam, the Netherlands) for designing the automated call facility and accompanying web application.
Footnotes
This article has supplementary material available from www.erj.ersjournals.com
Support Statement
The study was financed by the Radboud University Nijmegen Medical Centre (Nijmegen, the Netherlands) with additional support from Boehringer Ingelheim BV (Ingelheim, Germany). The funding sources had no involvement in the study design, data collection and analysis, data interpretation, writing of the report or the decision to submit the paper for publication. The corresponding author had full access to the whole dataset and had final responsibility for the decision to submit for publication.
Statement of Interest
A statement of interest for the study itself can be found at www.erj.ersjournals.com/site/misc/statements.xhtml
- Received April 4, 2011.
- Accepted August 30, 2011.
- ©ERS 2012