Abstract
The purpose of our study was to assess the robustness of volumetric measurement of malignant pleural mesothelioma (MPM) before and after chemotherapy compared with modified RECIST (response evaluation criteria in solid tumours) criteria.
30 patients with digitally available chest computed tomography (CT) scans before and after three cycles of chemotherapy were included. Three readers independently assessed tumour response using two different methods: 1) the modified RECIST criteria; and 2) the tumour volumetric approach using dedicated software (Myrian®; Intrasense, Paris, France). Inter-rater reliability of unidimensional and volumetric measurements was assessed using intraclass correlation. Tumour response classification for modified RECIST was compared to the volumetric approach applying unidimensional RECIST volumetric equivalent criteria.
The determination of unidimensional tumour measurement (RECIST) revealed a low inter-rater reliability (0.55) and a low interobserver agreement for tumour response classification (general κ 0.33). Only 14 patients were classified equally. A high inter-rater reliability (0.99) and interobserver agreement (general κ 0.9) were found for absolute tumour volumes (volumetric measurements). 27 cases were classified equally. The number of cases classified as “stable disease” was higher for the volumetric approach using tumour-equivalent criteria compared to modified RECIST.
Volumetric measurement of MPM on CT using Myrian® software is a reliable, reproducible and sensitive method to measure tumour volume and, thus, therapy response after induction chemotherapy.
- Computed tomography
- induction chemotherapy
- malignant pleural mesothelioma
- response evaluation criteria in solid tumours
- therapy response
- volumetry
Although survival rates for patients with malignant pleural mesothelioma (MPM) are still very low, improved outcome is observed after multimodality therapy including platinum-based chemotherapy [1]. Adequate methods are necessary to assess tumour volume per se as a prognostic marker for overall survival [2] and its response to chemotherapy. Currently, there is no satisfactory “gold standard” technique for tumour measurement in MPM. The reason is the irregular “rind-like” growth of pleural mesothelioma, which poses major challenges for adequate tumour assessment. The World Health Organization (WHO) introduced bidimensional response criteria for tumours in general [3]. This method has been used for many years, but was insufficient for some patients and did not suit the growth pattern of MPM. In recent years, unidimensional response criteria based on RECIST (response evaluation criteria in solid tumours) have been suggested and investigated to evaluate the response to treatment in solid tumours [4]. The modified RECIST criteria developed by Byrne and Nowak [5] also take into account the rind-like growth of the tumour and have become the standard for MPM, although the method shows not only a high interobserver variability but also, according to theoretical studies, an overclassification of tumours [6, 7].
Volumetry is gaining increasing importance in the assessment of tumour volume [8–12]. In liver surgery, volumetry is a well-established method for the measurement of tumour size prior to surgery in order to decide whether the remaining liver volume will be sufficient [11, 12]. To measure the tumour volume on cross-section images, special software is necessary that allows segmentation of the tumour. Although many programmes focus on hepatic measurements, they can also be used for other tumours, for example MPM. Preliminary studies using volumetry for MPM showed promising results [2, 13–15].
The aim of this study was to compare the volumetric approach to the measurements assessed by the modified RECIST criteria concerning the interobserver reliability and tumour response after induction chemotherapy.
PATIENTS AND METHODS
Patient selection
We retrospectively analysed digitally available chest computed tomography (CT) scans obtained before and after chemotherapy in 30 patients with biopsy-proven MPM in stage cT1–3 cN0–2 cM0, including all histological subtypes considered for a multimodality approach. The patients were treated at the Dept of Medical Oncology and the Division of Thoracic Surgery of the University Hospital Zurich (Zurich, Switzerland) during the period from May 1999 until January 2008. The study was approved by the local ethics committee (Cantonal Ethical Committee of Zurich, Zurich, Switzerland) and informed consent was obtained.
All patients were treated with induction chemotherapy followed by extrapleural pneumonectomy (EPP). Neoadjuvant chemotherapy consisted of three cycles of cisplatin 80 mg·m−2 on day 1 and gemcitabine 1,000 mg·m−2 on days 1, 8 and 15 administered every 28 days or, since March 2003, cisplatin 80 mg·m−2 on day 1 and pemetrexed 500 mg·m−2 on day 1 administered every 21 days with vitamin supplementation.
CT imaging
For the underlying analysis, only patients with digitally available CT imaging data were included. Pre- and post-chemotherapy chest CT images were available for 30 patients (median (range) age 60 yrs (48–71 yrs); females n = 2, males n = 28). One of three different CT scanners was used: GE LightSpeed VCT (GE Health Systems, Milwaukee, WI, USA), Siemens Somatom Sensation and Siemens Somatom Definition (both Siemens, Erlangen, Germany). 22 patients received intravenous iodinated contrast agent on both examinations (20 mL iodixanol (270 mg iodine per mL): Visipaque; Amersham Biosciences, Amersham, UK). The scan delay after starting contrast material injection was 30 s (n = 13) and 60 s (n = 9). Eight patients did not receive a contrast agent on the pre-chemotherapy chest CT as the scan was used for positron emission tomography fusion.
The mean±sd time delay between the pre- and post-chemotherapeutic CT was 100±13.5 days. The slice thickness ranged between 2 mm and 3.75 mm; in 19 cases, pre- and post-chemotherapeutic CT had identical slice thickness.
All data were stored on an internal picture archiving and communication system and sent to a dedicated workstation for tumour volumetry.
Imaging analysis
Modified RECIST criteria
Response to chemotherapy was evaluated by modified RECIST criteria and tumour volumetry [5]. Three readers performed imaging analysis independently: a trainee thoracic surgeon (M. Tutic; 3 yrs experience) and two radiologists (R.P. Götti and T. Frauenfelder; 2 and 10 yrs experience, respectively). Modified RECIST criteria were assessed using a dedicated film reading workstation (Impax 5.2; AGFA, Bonn, Germany). Both pre- and post-chemotherapeutic CT images were available simultaneously on two screens and could be linked by each reader individually at the same anatomical position. The thickness of the tumour was measured perpendicular to the chest wall and mediastinum at two positions on three different levels with a minimal craniocaudal distance of 1 cm according to the modified RECIST criteria [5] (fig. 1). The images were stored locally but were not visible to the other readers.
Volumetric approach
All three readers independently measured the tumour volume using dedicated software featuring semi-automatic segmentation with linear interpolation, allowing manual adjustments if necessary (Myrian®; Intrasense, Paris, France). Although this software was originally developed for liver segmentation and volumetry, it can also be used for other types of volumetry, as the linear interpolation segmentation algorithm is not liver-specific.
The segmentation and tumour volume quantification consisted of the following steps: 1) the normal lung tissue, including the bronchi and vessels, was marked semi-automatically by thresholding and region growing; 2) pleural effusion and atelectatic lung were marked with a magnetic lasso function; and 3) after fixing normal lung tissue, pleural effusion and atelectatic lung, the outer part of the pleura was segmented semi-automatically. Manual interaction could be reduced by segmenting only every fourth to fifth slice. Interpolation between the marked slices was performed automatically using a linear algorithm (fig. 2). The volume was calculated by multiplying the number of voxels included in the segmented tumour by the voxel volume. The resulting volume values were saved as screenshots to the picture archiving and communication system.
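The final step of this workflow, converting a segmented region into a volume, can be illustrated with a minimal Python sketch. It is not the vendor's implementation: the nested-list mask layout, the function name and the toy voxel spacing are assumptions made purely for illustration.

```python
def tumour_volume_ml(mask, spacing_mm):
    """Volume of a segmented region: voxel count times single-voxel volume.

    mask: nested lists [slice][row][column] of 0/1 flags (hypothetical layout).
    spacing_mm: (dx, dy, dz) voxel edge lengths in millimetres.
    """
    dx, dy, dz = spacing_mm
    voxel_volume_mm3 = dx * dy * dz
    # Count every voxel flagged as tumour across all slices.
    n_voxels = sum(flag for slice_ in mask for row in slice_ for flag in row)
    return n_voxels * voxel_volume_mm3 / 1000.0  # 1 mL = 1000 mm^3


# Toy mask: 2 slices of 2 x 3 voxels, 5 voxels flagged as tumour.
toy_mask = [
    [[1, 1, 0], [0, 1, 0]],
    [[0, 1, 0], [0, 1, 0]],
]
# Assumed spacing: 0.8 x 0.8 mm in-plane, 2.5 mm slice thickness.
toy_volume = tumour_volume_ml(toy_mask, (0.8, 0.8, 2.5))
```

Because the result scales with the voxel volume, scans with different slice thickness need their own spacing values.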
The time needed to apply modified RECIST and volumetric measurement, including any editing, was ∼3 min and 10–15 min per case, respectively.
Analysis of data
For each patient, the sum of all measurements based on modified RECIST after chemotherapy was subtracted from the sum before chemotherapy, and the tumour volume after chemotherapy was subtracted from the tumour volume before chemotherapy. Each result was divided by the corresponding value before chemotherapy and multiplied by 100. Thus, the per cent change of measurements according to modified RECIST and of total tumour volume was calculated.
For volumetry, volume equivalent criteria of spherical tumours were applied. Progressive disease (PD) corresponded to a >73% increase of tumour volume, partial response (PR) was defined as a >65% decrease in volume and stable disease (SD) as a change between -65% and 73%. This definition reflects spherical growth, which has previously been suggested by Armato et al. [6] and Oxnard et al. [7].
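The volume equivalent thresholds follow from cubing the unidimensional limits of a sphere: a 30% decrease in diameter corresponds to a volume decrease of about 65.7%, and a 20% increase to a volume increase of about 72.8%. The Python sketch below reproduces this arithmetic and applies the limits stated above. The function names are hypothetical, and the sign convention (negative values for shrinkage) is an assumption chosen to match the way the thresholds are written.

```python
def volume_equivalent(diameter_change_pct):
    """Volume change (%) of a sphere whose diameter changes by the given %."""
    scale = 1.0 + diameter_change_pct / 100.0
    return (scale ** 3 - 1.0) * 100.0


def percent_change(before, after):
    """Signed change relative to baseline; negative means shrinkage (assumed)."""
    return (after - before) / before * 100.0


def classify(change_pct, pr_limit=-65.0, pd_limit=73.0):
    """Return 'PR', 'PD' or 'SD' using the volume equivalent limits in the text."""
    if change_pct < pr_limit:
        return "PR"
    if change_pct > pd_limit:
        return "PD"
    return "SD"


pr_volume = volume_equivalent(-30.0)  # 30% diameter decrease, about -65.7
pd_volume = volume_equivalent(20.0)   # 20% diameter increase, about 72.8

# Toy example: a tumour shrinking from 400 mL to 300 mL (-25%) stays "SD".
toy_class = classify(percent_change(400.0, 300.0))
```

The toy case shows how wide the resulting SD band is: a tumour losing a quarter of its volume is still classified as stable.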
A kappa statistic was used to assess agreement of tumour response classification between readers. Inter-rater agreement was considered as poor (κ≤0.2), fair (κ = 0.21–0.4), moderate (κ = 0.41–0.6), good (κ = 0.61–0.80) or excellent (κ = 0.81–1.0) [16].
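The text does not state which multi-rater kappa was computed. As an illustration only, the sketch below assumes Fleiss' kappa, a common choice for three or more readers, and maps the result onto the agreement bands listed above. The toy ratings are invented and are not study data.

```python
def fleiss_kappa(ratings, categories):
    """Fleiss' kappa for subjects each rated by the same number of raters.

    ratings: one list of labels per patient, e.g. ["SD", "SD", "PR"].
    """
    n_subjects = len(ratings)
    n_raters = len(ratings[0])
    counts = [[row.count(c) for c in categories] for row in ratings]

    # Observed agreement: share of agreeing rater pairs per patient, averaged.
    p_i = [
        (sum(n * n for n in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_bar = sum(p_i) / n_subjects

    # Chance agreement from the overall proportion of each category.
    totals = [sum(row[j] for row in counts) for j in range(len(categories))]
    p_e = sum((t / (n_subjects * n_raters)) ** 2 for t in totals)

    return (p_bar - p_e) / (1.0 - p_e)


def agreement_label(kappa):
    """Agreement bands as given in the text."""
    if kappa <= 0.2:
        return "poor"
    if kappa <= 0.4:
        return "fair"
    if kappa <= 0.6:
        return "moderate"
    if kappa <= 0.8:
        return "good"
    return "excellent"


# Toy ratings for four patients by three readers (not study data).
toy_ratings = [
    ["SD", "SD", "SD"],
    ["PR", "PR", "PR"],
    ["SD", "SD", "PR"],
    ["PD", "PD", "PD"],
]
toy_kappa = fleiss_kappa(toy_ratings, ["PD", "SD", "PR"])
toy_label = agreement_label(toy_kappa)
```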
Inter-rater reliability of unidimensional and volumetric measurements for absolute values was assessed using intraclass correlation in an ANOVA with the patient and reader as random factors. Inter-rater reliability is the ratio of the patient variance component to the sum of all variance components [17]. The Wilcoxon signed rank test was used to compare absolute measurements between readers. A p-value <0.05 was considered to indicate statistical significance. Inter-rater agreement of the absolute measurements was assessed using Bland–Altman analysis. The difference in measurements according to modified RECIST was correlated to the cube root of total tumour volume change using Pearson correlation.
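The ratio of variance components described here can be sketched from a two-way ANOVA with one measurement per patient and reader. The Python code below is an illustrative estimate using the standard expected mean squares for a two-way random model. It is not the SPSS procedure used in the study, it does not truncate negative variance estimates, and the toy volumes are invented.

```python
def icc_two_way_random(data):
    """Patient variance divided by the sum of all variance components.

    data[i][j] is the measurement of patient i by reader j.
    """
    n = len(data)       # patients
    k = len(data[0])    # readers
    grand = sum(sum(row) for row in data) / (n * k)
    row_means = [sum(row) / k for row in data]
    col_means = [sum(data[i][j] for i in range(n)) / n for j in range(k)]

    # Sums of squares for patients, readers and residual error.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in data for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    # Variance components from the expected mean squares.
    var_patient = (ms_rows - ms_error) / k
    var_reader = (ms_cols - ms_error) / n
    var_error = ms_error
    return var_patient / (var_patient + var_reader + var_error)


# Toy tumour volumes in mL for four patients and three readers (not study data).
toy_volumes = [
    [410.0, 405.0, 412.0],
    [150.0, 155.0, 149.0],
    [820.0, 810.0, 826.0],
    [300.0, 298.0, 305.0],
]
toy_icc = icc_two_way_random(toy_volumes)
```

When readers differ by only a few millilitres while patients differ by hundreds, the patient component dominates and the coefficient approaches 1.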
All statistical analysis was performed using dedicated software: SPSS version 13.0 (SPSS, Chicago, IL, USA) and Microsoft Excel 2003 (Microsoft, Redmond, WA, USA), including Analyse-it version 2.12 (Analyse-it Software Ltd, Leeds, UK).
RESULTS
Between May 1999 and January 2009, 159 patients were enrolled with the intention to treat them with induction chemotherapy followed by EPP. Digital CT scans from 30 patients (cisplatin+gemcitabine: n = 2; cisplatin+pemetrexed: n = 28) pre- and post-chemotherapy were available for this analysis.
Modified RECIST criteria
Figure 3a demonstrates the classification according to the modified RECIST criteria by the three readers. 14 out of 30 cases were identically classified into PD, SD or PR using the modified RECIST criteria. In 16 cases there was a mismatch. In 15 cases of mismatch, the classification of the patients was different between one reader and the other two readers. In one case, all three readers classified tumour response differently. There was no systematic bias visible. The general kappa value was 0.33 between the three readers, corresponding to a fair inter-rater agreement concerning tumour response (table 1).
No significant differences were found (p≥0.47) when comparing absolute values (cm) of tumour response for modified RECIST criteria between each reader. The intraclass correlation coefficient was 0.55, indicating a poor correlation of tumour response between all three readers (table 1 and fig. 4a).
Bland–Altman analysis for testing the degree of agreement between the three readers for tumour response revealed large mean differences (table 1, fig. 4b). The limits of agreement were vast compared to the maximal difference between pre- and post-chemotherapeutic measurements (reader 1: 5.05 cm; reader 2: 4.14 cm; reader 3: 6.41 cm), reflecting the poor agreement.
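For readers unfamiliar with the quantities behind a Bland–Altman plot, the following Python sketch computes the mean difference and limits of agreement for one pair of readers. It assumes the conventional limits of mean ± 1.96 standard deviations of the paired differences; the text does not state the exact settings used, and the toy values are invented.

```python
from statistics import mean, stdev


def bland_altman(reader_a, reader_b):
    """Return (mean difference, lower limit, upper limit) for paired readings."""
    diffs = [a - b for a, b in zip(reader_a, reader_b)]
    bias = mean(diffs)
    spread = 1.96 * stdev(diffs)  # conventional 95% limits (assumed)
    return bias, bias - spread, bias + spread


# Toy changes in summed tumour thickness (cm) for four patients (not study data).
toy_reader_1 = [5.0, 3.0, 4.0, 6.0]
toy_reader_2 = [4.0, 3.5, 2.0, 6.5]
toy_bias, toy_lower, toy_upper = bland_altman(toy_reader_1, toy_reader_2)
```

Limits of agreement that are wide relative to the measured changes themselves, as reported above for modified RECIST, mean that reader disagreement can be as large as the treatment effect being measured.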
Tumour volumetry
Figure 3b shows no mismatch between the three readers in the classification of tumour response for the volumetric approach using volume equivalent criteria. This result led to a high general kappa value of 0.89 between the three readers, indicating an excellent inter-rater agreement concerning tumour response (table 1).
The volumetrically measured tumour response did not show a significant difference between the three readers (p≥0.42). The intraclass correlation coefficient for volumetric tumour response was 0.99 between all three readers, indicating a very close agreement between the measured volumes of all three readers (table 1 and fig. 5a).
Bland–Altman analysis for testing the degree of agreement of tumour response between the three readers revealed small mean differences (≤66 mL), indicating a good agreement between the readers (table 1 and fig. 5b). The maximal changes in tumour volume were 560 mL, 557 mL and 567 mL for readers 1, 2 and 3, respectively.
Comparison of modified RECIST to volumetric tumour response
The Pearson correlation of the measured changes according to the modified RECIST criteria and volumetry was 0.57 for reader 1 (p = 0.0009), 0.67 for reader 2 (p<0.0001) and 0.45 for reader 3 (p = 0.0129), indicating a moderate correlation.
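As described in the Methods, the unidimensional change was correlated with the cube root of the volume change, which puts both quantities on a length scale. A minimal Python sketch of that calculation is given below. The helper names are hypothetical, the sign-preserving cube root is an assumption about how negative changes were handled, and the toy values are invented.

```python
import math


def signed_cube_root(x):
    """Cube root that keeps the sign, so shrinkage stays negative (assumed)."""
    return math.copysign(abs(x) ** (1.0 / 3.0), x)


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Toy data: volume changes that are exact cubes of the linear changes.
toy_linear_change = [-2.0, -1.0, 0.0, 1.0, 3.0]
toy_volume_change = [-8.0, -1.0, 0.0, 1.0, 27.0]
toy_r = pearson_r(
    toy_linear_change,
    [signed_cube_root(v) for v in toy_volume_change],
)
```

In this idealised case the correlation is perfect; the moderate coefficients reported above indicate that the unidimensional measurements track the volume change only loosely.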
In eight cases (reader 1), nine cases (reader 2) and seven cases (reader 3), the tumour response was conflicting when comparing the changes based on modified RECIST to the percentage change based on volumetry (table 2).
When classifying the tumour response according to the modified RECIST and volume equivalent criteria, a large number of cases classified as PR and SD on modified RECIST would have been classified as SD and PD in the volumetric approach.
DISCUSSION
The results of our study show a high intraclass correlation and interobserver agreement for the absolute tumour volumes measured by specialised software. Our results indicate that volumetry is highly reliable, reproducible and reader independent compared with the modified RECIST criteria and is thus useful for the assessment of chemotherapy response on CT scans.
The radiological evaluation of response to chemotherapy during treatment of MPM is challenging because of the special rind-like growth pattern of the tumour. Other features, including the involvement of multiple thoracic levels, separate nodular or pleural thickening, growth along fissures and accompanying atelectasis, pleural fluid and fibrosis, also make the accurate assessment of this special tumour difficult [18]. As the WHO and RECIST criteria could not be used for the distinct growth pattern of MPM, Byrne and Nowak [5] introduced the modified RECIST criteria. This method was specifically developed for better assessment of changes in pleural mesothelioma, measuring the tumour unidimensionally at two sites at three different levels on axial cross-section images [5]. It has the drawback that the position of tumour measurement can be chosen freely by each reader, which leads to a large intra- and interobserver variability [6, 7].
Volumetry has gained a wide interest and acceptance for pre-operative assessment of liver volume in cases of living liver donor transplantation. Additionally there are several studies using volumetric evaluation of tumour response for different kinds of tumours, e.g. liver metastases, lung nodules or lymph nodes [8–10, 12]. These studies showed a low intra- and interobserver variability for tumour volumetry.
There are also some reports about the use of a volumetric approach for MPM [2, 7, 13–15]. The volumetric approach was used for two different types of outcome study. Pass et al. [2] and Lee et al. [15] primarily focused on the prognostic information for overall survival provided by the volume determined pre-operatively or pre-therapeutically. Other studies, like ours, have investigated the volumetric approach to assess tumour response after chemotherapy [7, 13, 14]. All these studies had small numbers of patients (maximum n = 55) [13]. Therefore, no references or standards in terms of PD, SD or PR have been defined for tumour volumetry. For modified RECIST criteria, tumour response is defined as a 30% decrease for PR and a 20% increase for PD. Alternative response criteria for a typical spherical tumour model are a volume decrease of 65% for PR and a 73% volume increase for PD. These alternative response criteria can be adopted for MPM, but are not well suited to the volumetric approach [7, 13] because of the wide range of the SD classification. Although this would not be a limitation per se, it diminishes the value of volumetry, which can discriminate even minimal changes of volume.
Ak et al. [13] retrospectively defined a ≥15% increase in tumour volume as PD and a ≥50% decrease in tumour volume as a PR based on the overall survival time. However, the patients analysed did not receive standardised chemotherapy and the pre- and post-CT scan intervals are not known. Therefore, a direct comparison with our results is not possible. Plathow et al. [14] did not find a difference between modified RECIST and volumetric approach using modified RECIST equivalent criteria. This finding corresponds to our results, as in both studies the changes in classification were equal.
Different methods were used for tumour volume calculation, such as the Cavalieri principle, which is a point-count method [13], model-based tumour volumetry [7] or voxel-based volumetry [2, 14]. The software used in our study performs voxel-based volumetry. The voxel size was <3 mm3.
We acknowledge the following limitations in our study. For volumetry, all image data had to be available digitally, so our study only included a small number of patients. Therefore, reference standards for volumetric response could not be determined.
To perform volumetry accurately, the source data should allow a clear discrimination of the different structures, such as the atelectatic lung, pleural fluid, tumour and lymph nodes. The use of different CT techniques (with or without contrast material and with different delays after contrast material injection) might be another limitation of the study; however, it did not influence the results, as the tumour could be delineated on all CT scans. Nevertheless, the reduced differentiation required additional manual interaction. On delayed-phase contrast-enhanced CT (∼120 s after i.v. contrast material injection), the different entities can be distinguished best, allowing fast and accurate tumour volumetry. Based on this experience, we changed our scanning protocol for patients with MPM, performing only a chest CT scan with a delay of 120 s after starting i.v. contrast material injection.
Tumour volumetry is still more time consuming than modified RECIST, since 10–15 min per reading is necessary. This is especially demanding if numerous manual adjustments have to be performed. New segmentation algorithms (e.g. object-based segmentation) may enhance semi-automatic segmentation. Using the interpolation algorithm, there might be tiny marking errors on some slices, but they do not significantly influence the result, as shown by the high inter-rater correlation.
In conclusion, tumour volumetry is, according to our results, a reproducible and reliable method to show small tumour changes, having a high interobserver agreement compared with the modified RECIST criteria. We will continue to prospectively validate the value of tumour volumetry to determine chemotherapy response criteria correlating best with survival data.
Footnotes
Statement of Interest
None declared.
- Received September 14, 2010.
- Accepted November 27, 2010.
- ©ERS 2011