This report concerns the development and validation of two patient-reported outcomes questionnaires developed to assess chronic obstructive pulmonary disease (COPD) patients’ ability to perform morning activities and to evaluate their morning symptoms.
Based on interviews with COPD patients, the Capacity of Daily Living during the Morning (CDLM) questionnaire and the Global Chest Symptoms Questionnaire (GCSQ) were developed, linguistically validated and incorporated into two multicentre, randomised trials involving a total of 1,100 COPD patients; those trials were registered at ClinicalTrials.gov (NCT00496470 and NCT00542880). Data from these trials were used to determine the reliability, validity and responsiveness of the questionnaires and to derive estimates of minimal important differences (MIDs).
Both questionnaires displayed good-to-high reliability (Cronbach’s α 0.75–0.93). Analysis of convergent validity showed that CDLM and GCSQ scores correlated significantly (p<0.001) with symptoms, health-related quality of life (HRQoL) and use of rescue medication. In both trials, CDLM and GCSQ scores discriminated between patients with different levels of HRQoL, as assessed by the St George's Respiratory Questionnaire for COPD patients (SGRQ-C), but not between patients with different levels of disease severity, as assessed by the Global Initiative for Chronic Obstructive Lung Disease (GOLD) criteria. A significant improvement in CDLM and GCSQ scores occurred in response to treatment. Estimations of MID scores, corresponding to an SGRQ-C MID of 4, were 0.20 for the CDLM questionnaire and 0.15 for the GCSQ.
Both the CDLM questionnaire and the GCSQ are easy-to-use, reliable, responsive, self-administered questionnaires that report on patients' symptoms and ability to perform morning activities.
- Capacity of Daily Living during the Morning questionnaire
- chronic obstructive pulmonary disease
- Global Chest Symptoms Questionnaire
- morning symptoms and activities
- patient-reported outcomes
Traditionally, asthma has been perceived as a variable condition that is worse at night and in the early morning, whereas chronic obstructive pulmonary disease (COPD), between exacerbations, is regarded as a less variable disease. However, there is increasing evidence that COPD displays diurnal variability in physiological parameters of lung function (inspiratory capacity, forced expiratory volume in 1 s (FEV1), forced vital capacity (FVC) and peak expiratory flow (PEF)) 1–4.
Quantification of what this diurnal variability in COPD might mean in terms of patient symptoms has recently been undertaken and has found that most patients, especially those with severe disease, reported the morning as the most burdensome time, impacting on routine morning activities such as washing, drying and dressing 5. Existing patient-reported outcome (PRO) questionnaires utilised in evaluating the impact of treatment on patients with COPD do not specifically address this diurnal/chronobiological variability 6. Furthermore, the heterogeneity of COPD implies that no single measure can fully reflect the burden of the disease on the patient, and it is increasingly recognised that relying on physiological end-points does not capture the full experience of patients and the impact of the disease on their daily activities and health-related quality of life (HRQoL) 7–10.
As the morning appears to be an especially troublesome time of day for patients with COPD, there is a need for PRO questionnaires to assess the burden and extent of morning symptoms and the ability of patients to perform morning activities. However, there are no validated PRO instruments that specifically capture morning symptoms or the ability to perform morning activities. The aim of the present study is to describe the development and validation of the Capacity of Daily Living during the Morning (CDLM) questionnaire and the Global Chest Symptoms Questionnaire (GCSQ) as instruments for the evaluation of COPD patients’ ability to perform routine morning activities and to assess morning symptoms.
Development of the CDLM questionnaire and GCSQ
The development of the PRO tools to assess morning activities and morning symptoms followed an iterative process of patient interviews and respiratory expert review 11.
Concept and item generation
Eight patients (five males, two current smokers and six previous smokers, mean age 68.1 (range 47–86) yrs) with severe COPD (diagnosis of disease with symptoms >2 yrs) were interviewed during May–June 2006 to identify the relevant issues of importance to the patients in general and those related to their morning activities in particular. A structured interview guide was developed and the interviews were administered by a psychologist, aiming to capture in a uniform manner patients’ experiences of COPD in general, their symptoms and feelings in the morning, perceptions of the disease’s impact and progression, and how treatment affects their disease. Based on patients’ descriptions, questions and response options were drafted to gather feedback regarding the patients’ ability to perform morning activities and a global impression of symptoms experienced by the patient at the time of query.
The interview results were assessed by a group of respiratory experts, comprising three respiratory physicians and two health outcomes scientists from the UK, France, Sweden and Poland, to develop concepts and design questions and response options for use in the development of the CDLM questionnaire and the GCSQ. To confirm that questions and response options were clearly understandable and simple to answer, seven further patients with COPD (five males, four current smokers and three previous smokers, mean age 66.7 yrs) were interviewed using a cognitive debriefing interview technique. Six of the seven patients found the questionnaires easy to understand and answer. Subsequently, specialists in respiratory medicine evaluated the interview results and provided input as to the final wording, the selection of final items, and the structure of the CDLM questionnaire and the GCSQ. At this stage, three items were deleted because the evidence indicated they did not measure the intended concepts. The final questionnaires were translated into 21 different languages and linguistically validated prior to use in clinical trials.
Administration of the questionnaires
The CDLM questionnaire was developed as a self-administered daily assessment, and in the present clinical trials it was administered through an e-Diary device (eSense™, PiKo®; PHT Corporation, Geneva, Switzerland). The patients were required to: 1) report on their ability to carry out six different morning activities; and 2) rank the difficulty of performing each of those activities on a five-point Likert-type scale ranging from 0 (not at all difficult) to 4 (extremely difficult) (table 1). To capture the effect of medication, patients were instructed to complete the questions when they had finished all of their morning activities. To ensure timely completion of the questionnaire, the e-Diary was used as a reminder device; 2 h after opening the e-Diary in the morning, an electronic alarm sounded alerting the patients to complete their questionnaire (if they had not already done so). The responses from the two questions for each morning activity were used to calculate a score ranging from 0 (so difficult that the activity could not be carried out by the patient on their own) to 5 (activity was not at all difficult to carry out). There was no weighting of the different morning activities; the total CDLM score was calculated as the average of all morning activities.
The GCSQ was also developed to be self-administered daily, which in the present trials was carried out through an e-Diary device. The GCSQ could be assessed at any specific time to capture the patients’ experience of chest symptoms at that specific moment (table 1). The GCSQ consisted of two questions that required the patient to rate shortness of breath and feelings of chest tightness. The patients recorded their response on a five-point Likert-type scale ranging from 0 (not at all) to 4 (extremely), the total score being calculated as the average score of the two questions.
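The scoring rules described above can be illustrated with a minimal sketch. The text gives only the end-points of the per-activity CDLM score (0 to 5), so the mapping used here (0 if the activity could not be carried out alone, otherwise 5 minus the difficulty rating) is an assumption, and the function names are hypothetical rather than part of the instruments.

```python
def cdlm_activity_score(could_perform_alone: bool, difficulty: int) -> int:
    """Score one morning activity on the 0-5 scale.

    ASSUMPTION: 0 if the activity could not be carried out alone,
    otherwise 5 minus the 0-4 difficulty rating. Only the end-points
    of this scale are stated in the text.
    """
    if not 0 <= difficulty <= 4:
        raise ValueError("difficulty must be between 0 and 4")
    if not could_perform_alone:
        return 0
    return 5 - difficulty


def cdlm_total(responses):
    """Unweighted average over all morning activities.

    `responses` is a list of (could_perform_alone, difficulty) pairs,
    one pair per activity.
    """
    scores = [cdlm_activity_score(ok, diff) for ok, diff in responses]
    return sum(scores) / len(scores)


def gcsq_total(shortness_of_breath: int, chest_tightness: int) -> float:
    """Average of the two GCSQ items, each rated 0 (not at all) to 4 (extremely)."""
    for item in (shortness_of_breath, chest_tightness):
        if not 0 <= item <= 4:
            raise ValueError("GCSQ items must be between 0 and 4")
    return (shortness_of_breath + chest_tightness) / 2


# Hypothetical responses for six morning activities.
example_day = [(True, 0), (True, 1), (True, 2), (True, 0), (False, 4), (True, 3)]
print(cdlm_total(example_day))
print(gcsq_total(1, 3))
```

Under this assumed mapping, a higher CDLM score indicates less difficulty with morning activities, whereas a higher GCSQ score indicates more severe chest symptoms.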
Validation of questionnaires
To validate the questionnaires, results of secondary analyses of blinded data were used from two multicentre, multinational, randomised studies as follows. Study 1: Welte and co-workers 12, 13, registered at ClinicalTrials.gov (NCT00496470); Study 2: Partridge et al. 14, registered at ClinicalTrials.gov (NCT00542880).
Patient selection criteria were similar in both clinical trials and included patients ≥40 yrs of age, with a clinical diagnosis of COPD and symptoms for ≥2 yrs, at least one COPD exacerbation in the previous 12 months, current or previous smoking habit with a smoking history of ≥10 pack-yrs, pre-bronchodilator FEV1 ≤50% of predicted normal, and FEV1/FVC <70% pre-bronchodilator.
Study 1 was a 12-week, placebo-controlled, parallel-group trial comparing the effect of once-daily tiotropium 18 μg with a combination of tiotropium 18 μg plus twice-daily budesonide/formoterol 320/9 μg (Symbicort® Turbuhaler®; AstraZeneca, Lund, Sweden) 12, 13. The Symbicort® dry powder formulation Turbuhaler® is not currently approved in the USA. Study 2 compared, in a cross-over design, the effect of budesonide/formoterol 320/9 μg or fluticasone/salmeterol 500/50 μg (Seretide® Diskus®, GlaxoSmithKline, Greenford, UK), both one inhalation twice daily for 1 week each, separated by a 1–2-week washout period, during which time patients received their prescribed inhaled corticosteroid 14.
Morning activities and morning symptoms assessment
In both studies, patients completed the GCSQ before, and at 5 and at 15 min post-morning dose, while the CDLM questionnaire was completed before noon, after completing all morning activities.
Lung function assessment
Morning pre- and post-dose FEV1 and PEF, in both studies, were measured at home and transmitted wirelessly to an e-Diary. In both trials, post-bronchodilator FEV1 was also assessed at the randomisation visit at the clinic. For the purposes of validation in the present study, the pre-dose morning PEF assessed at home and the post-bronchodilator FEV1 assessed at clinical visit (randomisation visit) were used.
St George’s Respiratory Questionnaire for COPD patients
The St George’s Respiratory Questionnaire for COPD patients (SGRQ-C) 15, a modified and improved version of the original SGRQ 16, was administered at clinic visits at the beginning and end of the treatment period. A change of four units in the SGRQ-C total score was estimated to constitute a clinically meaningful change 16, and is known as a minimal important difference (MID).
COPD symptoms and reliever use
In Study 1, the e-Diary was used daily to record breathlessness, cough, chest tightness and sleep (night-time awakenings due to COPD symptoms), with recordings being made each evening using a five-point Likert-type scale, ranging from 0 (symptom not present) to 4 (symptom severe, almost constant). The e-Diary was also used to record reliever use in both Studies 1 and 2.
Clinical COPD questionnaire
The Clinical COPD Questionnaire (CCQ) is a brief measure of clinical control in patients with COPD, consisting of 10 items that are used to capture a total score and three domain scores (symptoms, functional state and mental state) 17. Patients respond to each question on a seven-point Likert-type scale, ranging from 0 (asymptomatic/no limitation) to 6 (extremely symptomatic/total limitation). To assess clinical control in Study 2, the 24-h version of the CCQ was used daily and recorded in the e-Diary every evening before the intake of study medication.
Unless otherwise specified, blinded baseline data obtained in the run-in period were used in the analyses. For the analysis of responsiveness and estimations of MID, changes in scores from run-in to the average over the entire treatment period (e-Diary data) or to the end-of-treatment period (clinic visit data) were used.
Reliability of the questionnaires was examined through assessment of internal consistency and reproducibility, i.e. test–retest reliability. Internal consistency, i.e. the consistency between the different items of an instrument, was calculated using Cronbach's α. Values >0.70 were considered indicative of sufficient reliability, while values >0.90 represented high reliability. Reproducibility, i.e. the consistency between two or more quantitative measurements, was assessed using the intraclass correlation coefficient (ICC), comparing the scores on day 5 with the scores on day 7 of the run-in period.
Construct validity examines the correlation of a measure being evaluated with variables that are already known to be related to that measurement. Construct validity was assessed by Pearson product–moment correlations between the CDLM or the GCSQ scores and FEV1, PEF, SGRQ-C (total and domain scores), CCQ (total and domain scores; Study 2 only), COPD symptoms (breathlessness, cough, chest tightness and sleep; Study 1 only) and use of reliever medication. Known-groups validity, which examines the ability of a measure to discriminate between specific known groups that may be anticipated to show differences in scores, was tested by comparing the CDLM and GCSQ scores across groups of patients with different levels of disease severity, defined by quartiles of HRQoL (assessed by the SGRQ-C total score) and by FEV1.
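The Pearson product–moment correlation used for construct validity can be sketched as follows. This is a generic textbook implementation with a hypothetical function name and made-up data, not the statistical code used in the trials.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two paired samples."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must be paired samples of length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical paired scores: CDLM totals and SGRQ-C totals.
cdlm = [4.5, 4.0, 3.2, 2.8, 2.0]
sgrq_c = [30.0, 38.0, 45.0, 55.0, 62.0]
print(pearson_r(cdlm, sgrq_c))
```

Because a higher CDLM score reflects less difficulty while a higher SGRQ-C score reflects worse health status, a negative coefficient is the expected direction for this pair; for the GCSQ, where higher scores reflect more symptoms, a positive coefficient would be expected.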
Responsiveness, i.e. the ability of a measure to detect an underlying change, was used to evaluate long-term changes in CDLM or GCSQ scores, as well as short-term changes in GCSQ scores. The responsiveness of CDLM and GCSQ scores to within-group changes from run-in to the treatment period was evaluated by measuring effect size (ES), standardised response mean (SRM) and t-statistics. ES represents the mean change from baseline divided by the sd of the baseline scores, while SRM represents the mean change from baseline divided by the sd of this change. ES and SRM effect scores were categorised as trivial (<0.20), small (≥0.20 and <0.50), medium (≥0.50 and <0.80) or large (≥0.80). t-statistic values >1.96 were considered significant at the p<0.05 level 18, 19.
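The responsiveness statistics defined above can be made concrete with a short sketch. ES and SRM follow the definitions in the text; the t-statistic is computed here as a paired t-statistic on the change scores, which is an assumption, since the exact test is not specified. Function names and data are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def responsiveness(baseline, follow_up):
    """Return (ES, SRM, t) for paired baseline and follow-up scores."""
    changes = [after - before for before, after in zip(baseline, follow_up)]
    mean_change = mean(changes)
    sd_change = stdev(changes)
    es = mean_change / stdev(baseline)   # effect size: change / SD of baseline
    srm = mean_change / sd_change        # standardised response mean
    # ASSUMPTION: paired t-statistic on the change scores.
    t = mean_change / (sd_change / sqrt(len(changes)))
    return es, srm, t


def magnitude(value):
    """Categorise an ES or SRM value using the cut-offs from the text."""
    size = abs(value)
    if size < 0.20:
        return "trivial"
    if size < 0.50:
        return "small"
    if size < 0.80:
        return "medium"
    return "large"


# Hypothetical run-in and treatment-period scores for four patients.
run_in = [2.0, 3.0, 4.0, 3.0]
treatment = [2.5, 3.0, 5.0, 3.5]
es, srm, t = responsiveness(run_in, treatment)
print(magnitude(es), magnitude(srm), t > 1.96)
```

Because SRM divides by the sd of the change rather than the sd of the baseline scores, it exceeds ES whenever individual changes are more homogeneous than baseline scores, which is the pattern seen in the example data.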
ES, SRM and t-statistics were also used to evaluate the ability of GCSQ to detect short-term within-group changes from pre-dose to 5- and 15-min post-morning dose. In Study 1, these short-term changes were assessed during treatment weeks 1, 6 and 12 (for 7 consecutive days in each treatment period); in Study 2, assessment was during the 1-week treatment period. To ascertain the ability of GCSQ to reflect real changes, GCSQ changes were also correlated to the observed changes in PEF measurements collected at pre-dose and at 5- and 15-min post-morning dose.
Anchor- and distribution-based methods were used to estimate MIDs. Changes from the run-in to the treatment period in CDLM and GCSQ scores were regressed against the corresponding changes in SGRQ-C, for which the frequently used MID estimate of 4 16 served as the anchor; geometric regression was used to reduce the influence of measurement errors. This estimate was then evaluated in relation to distribution-based estimates of 0.5 sd 20 and the standard error of measurement (SEM) 18, 19. The distribution-based approach was used to provide support for the anchor-based estimate.
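The MID estimation steps can be sketched as follows. Two assumptions are made and should be read as such: "geometric regression" is taken to mean geometric mean (reduced major axis) regression, whose slope is the ratio of the two sds carrying the sign of the correlation, and SEM is computed with the conventional formula sd × √(1 − reliability), where the reliability coefficient used (Cronbach's α or ICC) is not stated in the text. All names and data are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def geometric_mean_slope(x, y):
    """Slope of the geometric mean regression of y on x.

    ASSUMPTION: this is the 'geometric regression' referred to in the text.
    """
    mean_x, mean_y = mean(x), mean(y)
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sign = 1.0 if cross >= 0 else -1.0
    return sign * stdev(y) / stdev(x)


def anchor_based_mid(anchor_changes, score_changes, anchor_mid=4.0):
    """Score change corresponding to one anchor MID (4 units on SGRQ-C)."""
    return abs(geometric_mean_slope(anchor_changes, score_changes)) * anchor_mid


def standard_error_of_measurement(sd, reliability):
    """SEM = sd * sqrt(1 - reliability); reliability source is assumed."""
    return sd * sqrt(1.0 - reliability)


def half_sd(scores):
    """Distribution-based estimate of 0.5 sd."""
    return 0.5 * stdev(scores)


# Hypothetical changes from run-in to treatment for four patients.
sgrq_c_changes = [-8.0, -4.0, 0.0, 4.0]
cdlm_changes = [0.4, 0.2, 0.0, -0.2]
print(anchor_based_mid(sgrq_c_changes, cdlm_changes))
print(standard_error_of_measurement(0.5, 0.75))
```

Unlike ordinary least squares, the geometric mean slope does not shrink as the correlation between the two change scores weakens, which is one reason it is sometimes preferred when both variables are measured with error.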
The demographic and clinical characteristics of the subjects in Studies 1 and 2 were similar (table 2).
Cronbach’s α and ICC calculations showed that both the CDLM questionnaire and GCSQ exhibited good-to-high reliability (table 3).
Pearson product–moment correlation coefficients were generated for CDLM and GCSQ scores, utilising lung function measures, disease symptoms and HRQoL (CCQ and SGRQ-C) data (table 4).
As an expression of divergent validity (defined as a lack of correlation between conceptually unrelated measures), the correlations between the physiological measures and CDLM and GCSQ were very low (as measured by PEF), or nonsignificant (as measured by FEV1), in both Studies 1 and 2 (table 4). As evidence of convergent validity (defined as validity supported by a substantial correlation of conceptually similar measures), the correlations between COPD symptoms and HRQoL measures, and CDLM and GCSQ scores were moderate to high (table 4). Symptoms of breathlessness, cough, chest tightness and night-time awakenings (assessed only in Study 1) showed high and significant correlations (p<0.001) in the expected direction with CDLM and GCSQ scores. CCQ (total, symptoms, functional state and mental state (assessed only in Study 2)) also showed high and significant correlations (p<0.001) in the expected direction with CDLM and GCSQ scores. Furthermore, in both trials, SGRQ-C (total, symptom, activity and impact), as well as total rescue medication use showed low to moderately sized but statistically significant correlations (p<0.001) in the expected direction with CDLM and GCSQ scores (table 4).
Known-groups validity was assessed by evaluating how CDLM and GCSQ scores differed for various quartiles of patients, grouped according to their SGRQ-C scores (fig. 1) or FEV1 values according to Global Initiative for Chronic Obstructive Lung Disease (GOLD) severity stage (fig. 2).
In both Studies 1 and 2, CDLM scores discriminated between patients with different levels of HRQoL, as assessed by SGRQ-C (fig. 1a). According to separate evaluations of linear trends, the CDLM scores differed significantly between quartiles of patients in both studies (Study 1: t(533) = -14.26, p<0.001; and Study 2: t(357) = -10.69; p<0.001). Similarly, testing of linear trends showed that GCSQ scores differed significantly between quartiles of patients with different disease severity in both studies (Study 1: t(642) = 13.57, p<0.001; and Study 2: t(433) = 7.24, p<0.001) (fig. 1b). However, when GOLD stage severities were used as the measure of disease severity, the CDLM and the GCSQ scores did not distinguish between patients at different GOLD stages in either study (fig. 2).
Long-term responsiveness of the instruments was assessed by comparing end-of-treatment CDLM and GCSQ scores with those during run-in (table 5), whereas short-term effects were measured by comparing scores pre- and post-dose (table 6). In both Studies 1 and 2, significant (p<0.001) improvements in CDLM and GCSQ scores were seen in response to treatment in comparison with the baseline assessments, indicative of responsiveness (table 5). In Studies 1 and 2, ES values ranged 0.23–0.24 for the CDLM score and 0.24–0.32 for the GCSQ score; the corresponding SRM values ranged 0.36–0.44 for the CDLM score and 0.34–0.42 for the GCSQ score.
The changes in GCSQ from pre-dose to 5- and 15-min post-morning dose were significant in both trials, with ES ranging from -0.27 to -0.30 at 5 min post-dose and from -0.40 to -0.48 at 15 min post-dose. The corresponding SRM values ranged from -0.68 to -0.69 at 5 min post-dose and from -0.79 to -0.89 at 15 min post-dose. A relatively large variability was observed in both GCSQ and PEF; thus, the correlation between the change in GCSQ and PEF assessments, although statistically significant, was weak (table 6).
Estimation of minimal important differences
To enable better understanding of what represents a clinically meaningful change, anchor- and distribution-based procedures were used to estimate MIDs for the CDLM questionnaire and GCSQ (table 7). The geometric regression analyses with SGRQ-C, using the established MID of 4, showed that the score change corresponding to a change of four units on SGRQ-C ranged between 0.15 (Study 1) and 0.22 (Study 2) for CDLM, and between 0.12 (Study 2) and 0.13 (Study 1) for GCSQ. Scatter plots were used to visualise the distribution of changes in CDLM and GCSQ scores and their corresponding changes in SGRQ-C scores, on which these geometric regression analyses were made (see figure 1 of the online supplementary material).
As a means of triangulating these MID estimates, the recommended distribution-based estimates SEM and 0.5 sd 20 were also calculated 18, 19. The SEM range was 0.19–0.28 for CDLM and 0.22–0.27 for GCSQ. The 0.5 sd values were somewhat higher, ranging 0.40–0.42 for the CDLM score and 0.32–0.42 for GCSQ score (table 7), thus corroborating the findings of the regression analyses.
Using secondary analyses of data from two multicentre randomised trials 12, 14, we found that the morning questionnaires developed here, the CDLM questionnaire and the GCSQ, were reliable and responsive to treatment effects and could discriminate between patients with different levels of HRQoL as assessed by SGRQ-C.
It is increasingly acknowledged that COPD symptoms and the ability to perform daily activities are at their worst in the mornings compared with other times during any 24-h period 11, 21, 22. Although a number of PRO questionnaires have been developed which assess symptoms and HRQoL in patients with COPD, none of these specifically measure the impact of the disease in the morning and in particular, how the disease affects important routine morning activities, such as washing, drying, dressing and eating breakfast.
It should be highlighted that chest symptoms (as assessed by GCSQ) may impact on a patient’s ability to carry out various activities (as assessed by the CDLM questionnaire), indicating that there is some relationship between the tools. However, it should also be stressed that GCSQ and the CDLM questionnaire measure distinct aspects of the disease, necessitating the validation of both tools. Validation assessment of the CDLM questionnaire and GCSQ showed both to be reliable, with expected correlations observed with all measures. Both the CDLM questionnaire and GCSQ showed moderate-to-high correlations with the conceptually related measures of COPD symptoms, HRQoL (SGRQ-C and CCQ) and use of rescue medication. As the CDLM questionnaire aims to capture morning activities, while the GCSQ captures global chest symptoms, the CDLM scores showed somewhat weaker correlations with COPD symptoms and the symptoms domains of SGRQ-C and CCQ compared to the correlation of these domains with GCSQ. Similarly, the CDLM questionnaire showed somewhat stronger correlations with the activity domain of SGRQ-C compared to the GCSQ. Both questionnaires differentiated between groups of patients with different levels of HRQoL, as assessed by SGRQ-C, and therefore showed evidence of known-groups validity. In this way, both the CDLM questionnaire and the GCSQ were able to distinguish between different levels of HRQoL as indicators of disease severity. The CDLM questionnaire and the GCSQ showed weak correlation with PEF and no correlation with FEV1; this was observed in both trials. Similarly, the CDLM and GCSQ scores did not distinguish between patients with different levels of disease severity when using GOLD severity stage, based on the physiological measure FEV1.
This is in line with previous suggestions that the GOLD stages do not make a distinction between which COPD patients are active or inactive in daily life 22 and is consistent with other data showing poor correlation between PRO measures and physiological parameters such as FEV1 or dyspnoea score 23. However, it should be noted that in the present study, GOLD stage I patients were not represented and the majority were classified as GOLD stage III patients.
The CDLM questionnaire and GCSQ scores revealed significant improvements in response to the changes in treatment from run-in to treatment period in both Studies 1 and 2. The ES and SRM values obtained reflect small-to-moderate effects for these long-term changes 24. With regard to the short-term effects, GCSQ showed significant changes from pre-dose to 5- or 15-min post-morning dose, with greater changes observed at the 15-min post-dose assessment. The correlation between GCSQ changes and the corresponding alterations in PEF was weak, which may have been due to the relatively large variability in changes in GCSQ (range -2.00 to 0.75 and -2.50 to 1.00 for changes at 5- and 15-min post-dose, respectively) and in PEF (range -96.7 to 109.0 and -96.0 to 175.9 for changes at 5- and 15-min post-dose, respectively). The low correlation between changes in GCSQ and PEF indicates the limitations in GCSQ’s ability to reflect relevant changes observed in PEF measurements. It is worth noting that there was a stronger correlation between the GCSQ and the 15-min post-dose PEF assessment (compared with the 5-min post-dose PEF measure), which may indicate that the GCSQ requires a longer treatment time frame to manifest a significant response.
The CDLM questionnaire and GCSQ were responsive in both studies, and differences recorded by SGRQ-C assessment correlated with changes in CDLM and GCSQ scores in the expected direction. An anchor-based approach was used to estimate corresponding MIDs for CDLM and GCSQ scores. Distribution-based approaches (i.e. SEM and 0.5 sd) provided confirmation that these estimates were in a reasonable range, although with somewhat higher estimates of 0.5 sd. As has been pointed out previously 18, distribution-based methods provide guidance and support on clinically significant and meaningful changes, but do not define a minimally important change. It may be argued that the 12-week study (Study 1) should be given more weight than Study 2 in estimating MIDs, since it allows for sufficient changes in the anchor SGRQ-C to occur. However, inclusion of multiple trials should strengthen validation of new tools and consequently, taking Study 2 data into account, we suggest that MIDs of 0.20 for the CDLM questionnaire and 0.15 for GCSQ are the most reasonable point estimates based on the present dataset. As with all MID estimates, future clinical trial data may provide further support and adjustments to these estimates. The consistency observed in the reliability, validity and responsiveness across the two trials further supports the robustness and utility of the CDLM questionnaire and GCSQ tools to evaluate morning activities and morning symptoms in clinical trials, despite the differences in treatment during the run-in and treatment phase as well as the different duration of the trials.
A limitation of this study is the small number of patients used to develop the CDLM questionnaire and the GCSQ. While the reliability and validity of the questionnaires were assessed using two independent trials involving a total of ∼1,100 patients, the support for the reliability and validity of the tools may be limited to the patient populations studied in these trials, which included patients at GOLD stages II to IV. The validity and responsiveness of these tools for patients with mild COPD (GOLD stage I) remain to be established; however, as the morning is less burdensome to these patients, the need for these tools amongst this group is reduced. Although only 25% of the patients in Studies 1 and 2 were female, no differences between sexes were observed in either trial in the analyses of score changes from baseline to treatment period. Although assessment of differences between sexes was not an objective of the present report, and the studies were not powered for such an analysis, this finding suggests that no differences between sexes are expected in the performance of the instruments developed here.
The questionnaires were developed to be self-administered daily, and in the present trials e-Diaries were used for data capture; however, we did not assess the possible impact of health literacy of the subjects on the administration of the questionnaires via this device. Validation was via e-Diary, which provides a more reliable means of data capture and should thus avoid the problem of missing data. Potentially, the questionnaires could also be administered in paper format, but further work would need to be done to assess the validity of utilising the questionnaires in this way.
Both the CDLM questionnaire and the GCSQ are reliable and responsive PRO questionnaires that can measure the ability of patients with COPD to perform morning activities and their morning symptoms, respectively. Estimations of MID scores, corresponding to an SGRQ-C MID of 4, were 0.20 for the CDLM questionnaire and 0.15 for GCSQ. The CDLM questionnaire and GCSQ could be incorporated into multinational clinical trials to assess the impact of COPD on morning symptoms and the patient’s ability to perform morning activities. Further evaluation of the CDLM questionnaire and GCSQ will determine the utility of these tools in general clinical practice.
This article has supplementary material available from www.erj.ersjournals.com
This study was funded by AstraZeneca. Editorial assistance with the development of this manuscript was provided by M. Tadayyon from MediTech Media Ltd (London, UK). M.R. Partridge was involved in the original development of the questionnaires, in their refinement and their use (as a principal investigator in Study 2), and has been involved in every draft of the paper. M. Miravitlles participated in data interpretation and has been involved in every draft of the manuscript. E. Ståhl was the initiator of the development of the questionnaires, supported the validation work undertaken, and advised on the manuscript during its preparation. N. Karlsson was involved in the planning, conducting, data analyses and writing of the manuscript and the work described. K. Svensson was involved in statistical analyses and data interpretation of the work described. T. Welte was primary investigator in one of the included studies (Study 1) and was involved in the discussion of the study results as well as the preparation of the manuscript.
Statement of Interest
Statements of interest for all authors of this study and for the study itself can be found at www.erj.ersjournals.com/misc/statements.dtl
- Received August 3, 2009.
- Accepted October 23, 2009.
- ©ERS 2010