Abstract
The divergence between clinical trial results and real-world outcomes is largely unknown for many cancer types. The present study aims to assess the overall efficacy–effectiveness gap (difference between outcomes in clinical trials and the real world) in systemic treatment for metastatic nonsmall cell lung cancer (NSCLC).
All patients diagnosed with stage IV NSCLC between 2008 and 2014 within a network of seven Dutch large teaching hospitals (Santeon) were studied. For every patient, an efficacy–effectiveness (EE) factor was calculated by dividing individual patients' overall survival (OS) by the pooled median OS assessed from clinical trials with the respective treatment.
From 2989 diagnosed patients, 1214 (41%) started with first-line treatment. For all studied regimens, real-world OS was shorter than OS reported in clinical trials. Overall, the EE factor was 0.77 (95% CI 0.70–0.85; p<0.001). Real-world patients completed their treatment plan less often and proceeded less frequently to further lines of treatment. These parameters together with Eastern Cooperative Oncology Group performance status explained 35% of the variation in EE factor.
Survival of patients with metastatic NSCLC treated with chemotherapy or targeted therapy in real-world practice is nearly one-quarter shorter than for patients included in trials. Patients' performance status, earlier discontinuation and fewer subsequent lines of treatment partly explained this difference.
Survival of patients with metastatic NSCLC treated with chemotherapy or targeted therapy in real-world practice is nearly one quarter shorter than for patients included in clinical trials. These real-world data provide useful information for clinicians. http://ow.ly/Khd230minFF
Introduction
Nonsmall cell lung cancer (NSCLC) represents 77% of all lung cancer diagnoses in the Netherlands [1]. The majority of patients present with locally advanced or metastatic disease at the time of diagnosis [2, 3]. The overall 1-year survival rate of metastatic NSCLC (stage IV) is only 22% [1]. Palliative treatment provided to these patients can consist of symptomatic relief by best supportive care or systemic treatment targeted at tumour tissue.
Numerous phase 3 clinical trials have shown the superiority of systemic treatment over best supportive care in patients with metastatic NSCLC [2, 4]. However, data from oncology clinical trials, although providing critical evidence of clinical activity, do not provide adequate information to determine the impact of treatments when used in the real-world setting. Due to strict patient inclusion criteria, important patient characteristics predictive for treatment response are often underrepresented in clinical trials. This results in uncertainty about how results from clinical trials translate to the real-world population. Yet this information is crucial for both patients and physicians when deciding what type of treatment to choose, if any, especially in the palliative setting, where quality of life may outweigh the possible extension of survival by systemic therapy.
Some efforts have been made to provide insight into treatment outcomes in real-world populations with lung cancer versus clinical trials. For example, Zhu et al. [5] showed a shorter median survival estimate in routine practice compared to participants in the Eastern Cooperative Oncology Group (ECOG) 4599 trial, and Jungels et al. [6] showed comparable response rates, but shorter survival in the real world for pemetrexed in patients previously treated with chemotherapy. Another example is a Canadian study which showed that second-line treatment with erlotinib resulted in a shorter median survival in clinical practice versus in a clinical trial setting [7]. Outside the few single treatment comparisons, a comprehensive overview of the possible divergence between clinical trials and routine care settings is still lacking for lung cancer [8].
Considering that a significant proportion of metastatic NSCLC patients are treated outside clinical trials, the present study aims to explore the overall effectiveness of systemic treatment in real-world practice versus efficacy data from clinical trials (efficacy–effectiveness (EE) gap), and to look into the factors that may explain a gap.
Methods
Data source
This study was conducted using clinical data originating from a network of seven large (nonuniversity) teaching hospitals geographically spread in the Netherlands, named Santeon (online supplementary appendix 4). Santeon was established in 2007 and serves >12% of the Dutch patient population.
In 2012, Santeon constructed the Care for Outcome registry, which includes clinical and outcome data from all patients diagnosed with lung cancer in one of the hospitals from 2008 onwards. These data include tumour characteristics, patient characteristics, treatment planning and clinical outcomes. Information on how data is procured, standardised and validated can be found elsewhere [9].
Parallel to the Care for Outcome registry, Santeon established the Santeon Farmadatabase, which comprises all prescribed and dispensed drugs at the individual patient level for all patients receiving care in one of the hospitals from 2010 onwards. For every prescription the database includes, among others, drug name, dosage, date of administration and administration route. Full details of the Santeon Farmadatabase are described elsewhere [10].
This study was discussed by a medical research ethics committee and the need for informed consent was waived because the study was considered exempt from review.
Study participants
Within the Care for Outcome registry, all patients with metastatic (stage IV) NSCLC diagnosed between 2008 and 2014 were selected for this study (staging in 2008–2009 was based on the sixth edition of the American Joint Committee on Cancer TNM (tumour, node, metastasis) staging system for lung cancer and from 2010 onwards was based on the seventh edition; in both editions, the same criteria for stage IV NSCLC were used). Patients with vital status “emigrated” (at January 31, 2017) were excluded, due to their unknown survival status. For the remaining patients, the following clinical patient characteristics were collected: age at diagnosis, sex, Charlson comorbidity index (CCI) and separate comorbidities (e.g. diabetes), ECOG performance status (PS), histology and year of diagnosis.
Next, for every patient we searched the Farmadatabase to determine whether or not they received systemic treatment for metastatic NSCLC. For the years 2008 and 2009, individual pharmacy systems were checked for missing data about systemic treatment.
Identification of systemic treatment per patient
For every patient, an overview of systemic treatments was constructed from the recorded prescription data, including the identification of regimens and whether it was first, second or further line of treatment. The initial systemic treatment following the date of diagnosis was defined as first-line treatment. Switching to another regimen >90 days after the theoretical end date of the previous regimen was (pragmatically) considered as a subsequent line of treatment. Switches to another regimen <90 days after the theoretical end date of the previous regimen were reviewed manually to determine a switch due to disease progression (subsequent line) or toxicity (same line of treatment).
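The switching rule above can be sketched in a few lines. This is a minimal illustration, not the study's actual procedure: the function name `classify_switch`, the label strings and the `reviewed_reason` argument are hypothetical, and the handling of a gap of exactly 90 days (sent to manual review here) is an assumption, because the text only specifies >90 and <90 days.

```python
from datetime import date

GAP_DAYS = 90  # threshold from the text; exactly 90 days is not specified there


def classify_switch(previous_end, next_start, reviewed_reason=None):
    """Classify a regimen switch as a new line or the same line of treatment.

    previous_end    -- theoretical end date of the previous regimen
    next_start      -- start date of the new regimen
    reviewed_reason -- outcome of manual review: "progression", "toxicity" or None
    """
    gap = (next_start - previous_end).days
    if gap > GAP_DAYS:
        # Long gap: pragmatically counted as a subsequent line.
        return "subsequent line"
    # Short gap (assumption: a gap of exactly 90 days is also reviewed).
    if reviewed_reason == "progression":
        return "subsequent line"
    if reviewed_reason == "toxicity":
        return "same line"
    return "needs manual review"
```

For example, a switch 120 days after the theoretical end date is classified as a subsequent line without review, whereas a switch after 30 days stays flagged until a reviewer records progression or toxicity as the reason.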
Based on all identified regimens, a list was constructed that covered >90% of all first-line treatment regimens. For practical reasons, the remaining (rarely applied) regimens were put into an “other” category.
Systematic literature review for reference outcomes
A systematic literature search and meta-analysis were conducted to obtain a pooled efficacy result for the first-line treatment regimens from the aforementioned list. First, an overview of all corresponding phase III studies was constructed from an extensive search in the PubMed, Embase, and CENTRAL (Cochrane library) databases (searched up to September 1, 2017). The exact details of this search are provided in online supplementary appendix 1. An article was considered eligible if all the following criteria were met: 1) patients diagnosed with NSCLC; 2) main article of a phase III randomised trial; 3) intervention under study is one of the regimens identified in our data; 4) patients with stage IV disease (possibly in combination with stage IIIB); and 5) overall survival (OS) as outcome. To create a homogeneous selection of articles, we excluded articles about a secondary analysis (e.g. subgroup analysis, post hoc analysis or intermediate analysis); articles with randomisation between variations of the same systemic treatment (e.g. differences in timing, dose or duration); articles about systemic treatment together with radiotherapy; articles with randomisation after progression of disease; and articles that provided neither confidence intervals for the OS nor a survival curve.
The first selection of articles was screened for eligibility based on title and abstract by two reviewers (CC and EvdG or FS or BP). Consensus was sought in case of differences between reviewers. Subsequently, full-text articles were examined by a single reviewer (CC), with a second reviewer in case of uncertainty about eligibility (EvdG). No reference tracking was performed. Online supplementary appendix 2 provides in more detail per regimen the yield of the systematic review and the meta-analysis data. From all included articles, the outcome (OS in months), inclusion period, inclusion criteria, line of treatment, patient characteristics, percentage of patients with a completed treatment plan and percentage of patients who received a subsequent line of systemic treatment were extracted. In case the median OS was not described, this outcome was derived from the survival curve.
The computer programme Review Manager (version 5.3; The Nordic Cochrane Centre, The Cochrane Collaboration, Copenhagen, Denmark; 2014) was used to combine all reported OS outcomes from the clinical trials to estimate the “reference outcome” (pooled OS) per regimen from all included articles. Fixed-effect or random-effect (DerSimonian and Laird) measures were calculated depending on the level of heterogeneity (p<0.05 level). The required standard error was calculated using the equation se=(upper limit 95% CI – lower limit 95% CI)/(2×1.96), or using the interquartile range (IQR) derived from the Kaplan–Meier curve, where sd≈(IQR)/1.35 and se=sd/√N.
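The standard-error formulas and the pooling step can be illustrated with a short sketch. It is not the Review Manager implementation: the function names are hypothetical, and the code assumes generic inverse-variance weighting for the fixed-effect estimate and the standard DerSimonian and Laird moment estimator of the between-trial variance for the random-effects estimate.

```python
import math

Z_95 = 1.96  # normal quantile for a two-sided 95% confidence interval


def se_from_ci(lower, upper):
    """Standard error from a 95% CI: (upper - lower) / (2 * 1.96)."""
    return (upper - lower) / (2 * Z_95)


def se_from_iqr(iqr, n):
    """Standard error from an IQR: sd is approximately IQR / 1.35, se = sd / sqrt(n)."""
    return (iqr / 1.35) / math.sqrt(n)


def pool_fixed(estimates, ses):
    """Fixed-effect (inverse-variance) pooled estimate and its standard error."""
    weights = [1.0 / se ** 2 for se in ses]
    pooled = sum(w * y for w, y in zip(weights, estimates)) / sum(weights)
    return pooled, math.sqrt(1.0 / sum(weights))


def pool_random(estimates, ses):
    """Random-effects pooled estimate (DerSimonian and Laird), its se and tau^2."""
    weights = [1.0 / se ** 2 for se in ses]
    fixed, _ = pool_fixed(estimates, ses)
    # Cochran's Q measures heterogeneity around the fixed-effect estimate.
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, estimates))
    df = len(estimates) - 1
    c = sum(weights) - sum(w ** 2 for w in weights) / sum(weights)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    # Re-weight each trial with the between-trial variance added.
    star = [1.0 / (se ** 2 + tau2) for se in ses]
    pooled = sum(w * y for w, y in zip(star, estimates)) / sum(star)
    return pooled, math.sqrt(1.0 / sum(star)), tau2
```

As a purely illustrative check with made-up numbers, two trials reporting median OS of 10 and 14 months, each with a standard error of 1 month, give a pooled estimate of 12 months under both models, but the random-effects standard error is larger because the between-trial variance is added to each trial's own variance.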
Real-world treatment outcomes
For every patient, OS was calculated based on the time between start date of systemic treatment and date of death. Patients still alive at January 31, 2017 were given this end of follow-up date as imputed date of death (n=54). Subsequently, we calculated an EE factor for every patient by dividing the individual real-world OS by the pooled clinical trial OS for the first-line treatment regimen. This EE factor should be interpreted as how an individual outcome relates to the median OS from the corresponding clinical trial (e.g. an EE factor of 0.80 means that survival is 20% shorter). Besides survival, toxicity was assessed using percentage of treatment switches, dose reductions (≤80% of the initial dose) and early discontinuation (fewer than four cycles or tyrosine kinase inhibitor use <1 month) as proxy.
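The per-patient calculation described above amounts to a simple ratio. The sketch below is illustrative only: the function names and the days-to-months conversion factor are assumptions, not taken from the study.

```python
from datetime import date

FOLLOW_UP_END = date(2017, 1, 31)  # end of follow-up stated in the text
DAYS_PER_MONTH = 30.4375           # assumption: average month length (365.25 / 12)


def overall_survival_months(treatment_start, date_of_death=None):
    """OS in months from start of systemic treatment to death.

    Patients without a recorded death are given the end of follow-up
    as imputed date of death, as described in the text.
    """
    end = date_of_death if date_of_death is not None else FOLLOW_UP_END
    return (end - treatment_start).days / DAYS_PER_MONTH


def ee_factor(real_world_os_months, pooled_trial_os_months):
    """Individual real-world OS divided by the pooled trial median OS."""
    if pooled_trial_os_months <= 0:
        raise ValueError("pooled trial OS must be positive")
    return real_world_os_months / pooled_trial_os_months
```

With these definitions, `ee_factor(8.0, 10.0)` returns 0.8, which corresponds to the example in the text of a survival that is 20% shorter than the trial median.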
Statistical analysis
Statistical software (SPSS version 24 for Windows; IBM, Armonk, NY, USA) was used for statistical analysis. Continuous data were expressed as mean±sd or median (range) when appropriate. Categorical data were analysed using Chi-square and continuous data using t-tests, rank tests and one-way ANOVA when appropriate. First, to assess the presence of an EE gap, the distribution of the calculated EE factors was tested relative to 1.0 using the Wilcoxon signed-rank test. If that distribution does not have a median of 1.0, the null hypothesis (that the median OS in the real-world population is similar to that reported in the clinical trials) is rejected. The latter test was performed overall and per regimen.
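To make the test of the EE factors against 1.0 concrete, the following sketch implements a one-sample Wilcoxon signed-rank test in plain Python. It is a simplified stand-in for the SPSS procedure, under stated assumptions: observations equal to the hypothesised median are dropped, tied absolute differences receive average ranks, and the two-sided p-value uses the normal approximation without tie or continuity correction. The function name is hypothetical.

```python
import math


def wilcoxon_signed_rank(values, hypothesised_median=1.0):
    """Return (W+, z, two-sided p) for H0: median equals hypothesised_median."""
    diffs = [v - hypothesised_median for v in values if v != hypothesised_median]
    n = len(diffs)
    if n == 0:
        raise ValueError("all observations equal the hypothesised median")

    # Rank the absolute differences, giving tied values their average rank.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1

    # W+ is the sum of ranks belonging to positive differences.
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    mean = n * (n + 1) / 4
    sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (w_plus - mean) / sd
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail probability
    return w_plus, z, p
```

A negative z indicates that EE factors tend to lie below 1.0, i.e. that real-world survival tends to be shorter than the trial reference. For small samples an exact test would be preferable to the normal approximation used here.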
Next, a multivariable linear regression analysis was applied after log-transformation of the EE factor (because of the non-normal distribution) to study the association between the patient characteristics and the magnitude of the EE gap. All available characteristics were explored as potential prognostic factors. In this analysis, missing values were imputed by regression imputation (single run with all available characteristics in the model). The explanatory effect of the final model was assessed by variance analysis.
To assess the robustness of our main analysis, we conducted four sensitivity analyses: 1) calculating the real-world OS not from start of treatment, but based on date of diagnosis and based on start date of systemic treatment minus 14 days (this 2-week window was chosen because most clinical trial protocols require first drug administration within 2 weeks of randomisation); 2) patients still alive at January 31, 2017 were given 1 year extra survival time (imputed date of death: January 31, 2018) instead of end of follow-up date; 3) replacing missing values in the variable ECOG PS by both the highest/worst value and the lowest/best value instead of the imputed value; and 4) a statistical bootstrap analysis to incorporate the uncertainty in the estimated pooled effect of the reference outcome for comparison with the real-world outcome. For this bootstrap analysis, a dataset was created by computer-based sampling (in Excel 2010; Microsoft, Redmond, WA, USA) with replacement from the original observations. The resampling and estimation procedure was repeated 10 000 times, thus providing a distribution of the EE factor with a percentile-based confidence interval.
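A percentile bootstrap of this kind can be sketched as follows. The sketch is not a reconstruction of the Excel procedure used in the study: the function name is hypothetical, and drawing the reference OS from a normal distribution around the pooled estimate is an assumption made here to represent its uncertainty.

```python
import random
import statistics


def bootstrap_median_ee(os_months, pooled_os, pooled_se, n_iter=10_000, seed=0):
    """Percentile bootstrap for the median EE factor.

    os_months -- observed real-world OS per patient (months)
    pooled_os -- pooled trial median OS (months)
    pooled_se -- standard error of the pooled estimate
    Returns (median of bootstrap medians, lower 2.5%, upper 97.5%).
    """
    rng = random.Random(seed)  # fixed seed for reproducibility
    medians = []
    for _ in range(n_iter):
        # Resample patients with replacement.
        sample = rng.choices(os_months, k=len(os_months))
        # Assumption: reference uncertainty modelled as a normal draw.
        reference = rng.gauss(pooled_os, pooled_se)
        if reference <= 0:
            continue  # skip implausible non-positive reference values
        medians.append(statistics.median(sample) / reference)
    medians.sort()
    lower = medians[int(0.025 * len(medians))]
    upper = medians[min(len(medians) - 1, int(0.975 * len(medians)))]
    return statistics.median(medians), lower, upper
```

Setting `pooled_se` to 0 reduces the procedure to an ordinary bootstrap of the real-world outcomes alone, which makes it easy to see how much of the interval width comes from the uncertainty in the reference outcome.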
Results
From the Care for Outcome registry we were able to identify 2989 patients diagnosed with stage IV NSCLC in the period 2008–2014. After the exclusion of seven patients with vital status “emigrated”, the final sample size was 2982 patients. Of these patients, 1214 (41%) received first-line systemic treatment. Table 1 outlines the baseline patient characteristics per regimen and overall. The mean age was 63 years; 54% of all patients had a CCI of 0 (no comorbidities), and the majority (84%) of patients had an ECOG PS of 0 or 1 at the time of diagnosis (5% missing data). The most frequently received chemotherapy regimen was cisplatin-pemetrexed (347 (29%) out of 1214 patients), followed by cisplatin-gemcitabine (n=214, 18%), and carboplatin-pemetrexed (n=213, 18%). Epidermal growth factor receptor inhibitors (gefitinib and erlotinib) were used by 5% of the patients. Overall, eight regimens were responsible for 92% of the variety in applied first-line treatments (n=1122 patients).
For all regimens, the median OS in the real-world setting is shorter than the clinical trial reference OS (table 2), with the difference ranging from 1.3 to 7.7 months, depending on regimen. Overall, the distribution of the EE factor is significantly different from a hypothesised median of 1.00 (median EE factor 0.77, 95% CI 0.70–0.85; p<0.001), and the median EE factor is <1.00 for all individual treatment regimens (figure 1).
The multivariable linear regression analysis showed that a patient's ECOG PS is significantly associated with the magnitude of the EE factor (B −0.130, 95% CI −0.163–−0.098; p<0.001). The negative B-value indicates a larger EE gap for patients with a higher/worse ECOG PS (category 0–4). The EE gap was 19% in patients with an ECOG PS of 0–1 (EE factor 0.81, p<0.001) and 61% in patients with ECOG PS ≥2 (EE factor 0.39, p<0.001; n=140, 12%). The other variables in the model (age at diagnosis, sex, CCI (and separate comorbidities), histology and year of diagnosis) showed no statistically significant association with the EE gap. However, ECOG PS alone explained only 5.2% of the variation in EE factor.
The four sensitivity analyses confirmed the robustness of our findings. Independently of how the real-world OS was calculated, there was a significant divergence from the trial outcomes (the EE factor based on date of diagnosis was 0.89 (p<0.001), and the EE factor based on start date of systemic treatment minus 14 days was 0.82 (p<0.001)). Furthermore, giving an additional 1 year survival time to patients still alive at January 31, 2017 resulted in a similar significant overall EE gap (EE factor of 0.77, p<0.001). Replacing missing values in the variable ECOG PS by both the highest/worst value (PS 4) and the lowest/best value (PS 0) instead of the imputed value did not change the finding that ECOG PS is associated with the magnitude of the EE factor (B −0.083, p<0.001 and B −0.126, p<0.001, respectively). Finally, the EE factors per chemotherapy regimen derived from the bootstrap analysis agreed very closely with the ones derived from the main statistical analysis; the same regimens appeared to have a statistically significant EE gap and the largest (although small) difference was for gefitinib (absolute difference of 0.03 between the original and the bootstrapped EE factor).
In the real world, patients completed their treatment plan less often than in clinical trials (56% versus 61%, p=0.024), and fewer patients received a subsequent line of chemotherapy in the real world compared to patients in clinical trials (34% versus 46%, p<0.001) (table 3). The level of dose reductions ranged up to 39% (carboplatin-gemcitabine) of the patients (online supplementary appendix 3). Overall, switching to another regimen due to toxicity occurred in 12% of the patients (n=131). By adding the variables “early discontinuation (<4 cycles)” and “subsequent line of chemotherapy” to the prognostic model for the EE factor, the explanatory effect of the model increased to 35%.
Discussion
This study showed that in patients with metastatic NSCLC treated with first-line systemic treatment, the median OS is nearly one-quarter shorter in real-world practice than in clinical trials (EE factor 0.77, p<0.001).
To our knowledge, this is the first study that provides a complete overview on the efficacy–effectiveness gap for different systemic treatments (chemotherapy and targeted therapy) in a large unselected population of metastatic NSCLC patients. The constant pattern of reduced effectiveness that we observed adds to the conclusion that the existence of a gap is a general phenomenon irrespective of the systemic treatment regimen. It also exists for targeted therapy.
The magnitude of the EE gap found in our study is in line with previous studies of routine care versus trials for specific regimens in metastatic NSCLC. Zhu et al. [5] observed a median survival estimate of 9.7 months for bevacizumab-carboplatin-paclitaxel in routine practice in stage IV NSCLC patients versus a median survival of 12.3 months for participants in the ECOG 4599 trial (equivalent to an EE factor of 0.79). The study by Sheikh and Chambers [7] showed that second-line treatment of advanced/metastatic NSCLC with erlotinib resulted in a median survival of 5.2 months in clinical practice, whereas the reference clinical trial reported a median OS of 6.7 months (EE factor 0.78).
In previous studies regarding prognostic factors in advanced NSCLC, ECOG PS has been shown to be an important independent prognostic parameter [11]. Patients with ECOG PS ≥2 usually account for a small proportion of patients enrolled in trials of first-line treatment for advanced disease, but represent a significantly higher proportion (up to 30–40%) when population-based surveys are conducted [11]. However, in our study population, the proportion of patients with ECOG PS ≥2 was only 12.5%, which is not very different from the clinical trials in our meta-analyses (15%). This means that, although our data confirm performance status as a prognostic factor for worse survival outcome, it is probably not the sole driver of the observed EE gap. To drive the gap, a factor needs to be associated with worse outcome in the real-world setting as well as being more prevalent in the real world than in clinical trials. Other factors must therefore play a role. Two possible explanations from our data are the observed lower frequency of patients who received a subsequent line of treatment in the real-world population, and a lower percentage of patients who completed four or more cycles in the real-world population (addition of these two measures increased the explanatory effect of the model from 5% to 35%). These measures may possibly reflect a difference in comorbidity that influences vitality to undergo systemic treatment and/or selection towards more motivated patients in trials. Regarding comorbidities, in our dataset we observed no evident difference in frequencies between patients with or without early discontinuation of their systemic treatment (online supplementary appendix 3, table S11). However, we acknowledge that the available comorbidity data are rather coarse (e.g. chronic obstructive pulmonary disease (COPD) yes/no instead of COPD stages) and that the list of comorbidities is limited.
The latter is supported by the results from the variance analysis, indicating that other unmeasured factors are involved. Other factors that are often mentioned in relation to an EE gap are the many exclusion criteria of clinical trials and that participating in a clinical trial itself is beneficial (Hawthorne effect [12]). Unfortunately, these factors could not be assessed further in this project because the required data were not available in the Care for Outcome registry, nor were the required individual patient-level data from the corresponding clinical trials.
A strength of this study is that it is based on a large unselected population of patients diagnosed with stage IV NSCLC in the Netherlands, providing a general overview of most applied systemic treatment options and their outcomes in a western European country with a high standard level of healthcare. In addition, we captured a time frame of >7 years, reducing the risk for bias from temporal factors. Finally, this study is based on high-resolution data. The Care for Outcome registry includes detailed and validated patient data with a very low number of missing values (only one variable with 5% missing data).
A limitation of this study could be our approach to calculate the real-world OS. Because most clinical trials calculate the OS from date of randomisation, our calculation based on date of first drug administration holds a risk of underestimating the OS. However, it is reassuring that our sensitivity analysis regarding this matter confirmed our main conclusion. Furthermore, one could argue about our approach to compare median OS between real-world and clinical trials primarily. An alternative way could be a hazard ratio based approach, but we considered this not feasible because of the unavailability of individual patient data from the included clinical trials and several other methodological drawbacks when it comes to extracting the necessary proxy data from published papers [13]. Regarding the consequence of not being able to censor patients in our approach, we consider the potential risk of bias rather small, because only 4% of the cohort was still alive at end of follow-up time. Our second sensitivity analysis based on a different imputed date of death for these patients also confirmed the absence of any substantial impact thereof. Thirdly, the reference median OS from the clinical trials might be overestimated, because of the allowance of patients with stage IIIB as well in almost all trials included in the meta-analysis. However, exclusion of trials with >10% of stage IIIB patients from the meta-analysis did result in only very small alterations of the reference outcome, and with very little impact on the calculation of EE factors (overall 0.75, p<0.001). Finally, a drawback could be that the external validity of our findings might change due to the recent introduction of novel treatment options (e.g. immunotherapy and other targeted therapies) not covered by the time frame under study. Nevertheless, currently, the vast majority of patients receive one of the regimens studied in our project. 
Thus our findings have potential to improve shared decision making by adding insight into real-world effectiveness data to the conversation. For example, the data can be used to inform a male patient with ECOG PS 1 planning to start four cycles of cisplatin-pemetrexed that his OS is likely to be 10 weeks shorter than in the clinical trial report, and that preceding patients like him completed the four cycles in 60% of the cases.
In conclusion, our results show that patients treated in real-world practice have a nearly one-quarter shorter survival than those in clinical trials. The constant pattern of reduced effectiveness shows that the existence of this gap is a general phenomenon irrespective of the type of systemic treatment regimen. Patients' performance status, earlier discontinuation and fewer subsequent lines of treatment partly explain the difference. Clinical decision making should be more strongly founded on real-world results.
Supplementary material
Please note: supplementary material is not edited by the Editorial Office, and is uploaded as it has been supplied by the author.
Supplementary material: Appendices 1 to 4. ERJ-01100-2018_Supplement
Acknowledgements
The authors thank M. van Hulst (Martini Hospital, Groningen, The Netherlands), M.J. Deenen (Catharina Hospital, Eindhoven, The Netherlands), S.F. Oude Wesselink (Medisch Spectrum Twente, Enschede, The Netherlands), E.A.F. Haak (OLVG, Amsterdam, The Netherlands), and J. van der Mee (Canisius Wilhelmina Hospital, Nijmegen, The Netherlands) for their efforts to complete and validate the hospital pharmacy data. Furthermore, we thank R.H.H. Groenwold (University Medical Center Utrecht, The Netherlands) for assistance with the bootstrap procedure, and C. Sloof (St Antonius Hospital, Utrecht/Nieuwegein, The Netherlands) for literature search assistance. No one received financial compensation for these contributions.
Footnotes
This article has supplementary material available from erj.ersjournals.com
The Santeon NSCLC study group (collaborators) are: A.J. Polman, Medisch Spectrum Twente, Enschede, The Netherlands; B.E.E.M. van den Borne, Catharina Hospital, Eindhoven, The Netherlands; J.W.G. van Putten, Martini Hospital, Groningen, The Netherlands; A.A.J. Smit, OLVG, Amsterdam, The Netherlands; A. Termeer, Canisius Wilhelmina Hospital, Nijmegen, The Netherlands.
Author contributions: E.M.W. van de Garde, F.M.N.H. Schramel and H.J.M. Groen obtained research funding. C.M. Cramer-van der Welle, E.M.W. van de Garde, B.J.M. Peters, F.M.N.H. Schramel and H.J.M. Groen were involved in conception and study design. E.M.W. van de Garde contributed to study supervision. C.M. Cramer-van der Welle, B.J.M. Peters, E.M.W. van de Garde and O.H. Klungel were responsible for data analysis and interpretation. C.M. Cramer-van der Welle and B.J.M. Peters were responsible for the preparation and writing of the manuscript. All authors contributed to the manuscript and approved the final manuscript.
Conflict of interest: B.J.M. Peters has nothing to disclose. C.M. Cramer-van der Welle has nothing to disclose.
Conflict of interest: F.M.N.H. Schramel has nothing to disclose.
Conflict of interest: O.H. Klungel reports grants from GSK, outside the submitted work.
Conflict of interest: H.J.M. Groen reports fees (paid to institution) for advisory work from BMS, Roche, Novartis, Merck and Pfizer, and institutional grants from Boehringer-Ingelheim, outside the submitted work.
Conflict of interest: E.M.W. van de Garde reports grants from Dutch Cancer Society and Roche Netherlands BV, during the conduct of the study.
Support statement: This work was supported by the Dutch Cancer Society (grant number SAN2016-7942). Funding information for this article has been deposited with the Crossref Funder Registry.
- Received June 12, 2018.
- Accepted October 13, 2018.
- Copyright ©ERS 2018.
This article is open access and distributed under the terms of the Creative Commons Attribution Non-Commercial Licence 4.0.