Abstract
Low-dose computed tomography (CT) screening for lung cancer can reduce lung cancer mortality, but overdiagnosis, false positives and invasive procedures for benign nodules are worrying. We evaluated the utility of positron emission tomography (PET)-CT in characterising indeterminate screening-detected lung nodules.
383 nodules, examined by PET-CT over the first 6 years of the COSMOS (Continuous Observation of Smoking Subjects) study to diagnose primary lung cancer, were reviewed and compared with pathological findings (surgically-treated patients) or follow-up (negative CT for ⩾2 years, considered negative); 196 nodules were malignant.
The sensitivity, specificity and accuracy of PET-CT for differentially diagnosing malignant nodules were, respectively, 64%, 89% and 76% overall, and 82%, 92% and 88% for baseline-detected nodules. Performance was lower for nodules found at repeat annual scans, with sensitivity ranging from 22% for nonsolid to 79% for solid nodules (p=0.0001). For nodules ⩾15 mm, sensitivity was 87% and specificity 73%; for solid nodules ⩾15 mm, sensitivity reached 98%.
PET-CT was highly sensitive for the differential diagnosis of indeterminate nodules detected at baseline, nodules ⩾15 mm and solid nodules. Sensitivity was low for sub-solid nodules and nodules discovered after baseline, for which other methods, e.g. volume doubling time, should be used.
PET-CT is good at differentially diagnosing large, solid and baseline-detected lung nodules in the screening setting http://ow.ly/A1amh
Introduction
Computed tomography (CT) screening for lung cancer remains controversial [1–4]. Although the National Lung Screening Trial (NLST) found that low-dose CT (LDCT) reduced lung cancer mortality by about 20% [5], the high rate of indeterminate and false-positive nodules was worrying [6].
The diagnostic work-up of CT nodules considers size (diameter or volume), characteristics (density, morphology, homogeneity) and volume doubling time (VDT) [7–13]. Several studies have investigated the ability of positron emission tomography (PET) to characterise lung nodules [14–18]. However, its role in diagnostic algorithms for screening-detected nodules has not been defined. In a previous study on PET-CT at baseline screening, we found an overall sensitivity of 88% for diagnosing malignancy, while for solid nodules >10 mm, sensitivity was 100%, suggesting PET-CT as an alternative to invasive procedures in the screening setting [19]. The study also indicated that the maximum standardised uptake value corrected for body weight (SUVbw,max) positivity threshold could be lowered from 2.0 to 1.5 for nodules <10 mm. A Danish study reported similar findings and concluded that the best malignancy predictor was combined VDT and PET-CT [20]. In the current study, we retrospectively assessed the ability of PET-CT to diagnose indeterminate nodules detected during the first 6 years of the COSMOS (Continuous Observation of Smoking Subjects) study [7].
Materials and methods
Patients
In 2004 and 2005 we enrolled asymptomatic volunteers, aged ⩾50 years, current or former smokers (⩾20 pack-years), in the nonrandomised, single-centre screening study, COSMOS [17]. Ex-smokers had stopped ⩽10 years previously. Those with previous (within 5 years) malignant disease (except treated nonmelanoma skin cancer) were excluded. The study was approved by the ethics committee of the European Institute of Oncology, Milan, Italy. Those recruited gave written consent to receive annual LDCT for 10 consecutive years.
Diagnostic protocol
Noncalcified nodules ⩽5 mm underwent repeat LDCT 1 year later. Noncalcified nodules between 5.1 and 8 mm underwent repeat LDCT 3 months later. Solid or partially solid nodules >8 mm underwent PET-CT unless they appeared clearly benign (axial longest diameter more than twice minimum diameter, thickening of fissures, liquid density, inside apical scar), in which case they were reinvestigated by LDCT 1 year later.
Baseline nonsolid nodules between 5.1 and 8 mm underwent repeat LDCT 6 months later; those >8 mm also underwent repeat LDCT; if any were progressive in diameter or density, PET-CT was scheduled. After 2006, lesions >8 mm thought to be due to infection were treated with oral antibiotics for 7 days and LDCT repeated 1 month later before scheduling PET-CT.
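The triage rules described above can be summarised as a small decision function. This is only an illustrative sketch: the `Nodule` class, its field names and the returned strings are hypothetical, the 3-month interval for nonsolid nodules of 5.1–8 mm found after baseline is our reading of the general rule, and the repeat interval for nonsolid nodules >8 mm is left open because the text does not give it.

```python
from dataclasses import dataclass


@dataclass
class Nodule:
    """Hypothetical record for a noncalcified nodule seen on LDCT."""
    diameter_mm: float
    density: str                       # "solid", "partially solid" or "nonsolid"
    baseline: bool = True              # detected at the baseline round
    clearly_benign: bool = False       # e.g. fissure thickening, apical scar
    progressive: bool = False          # grew in diameter or density on repeat LDCT
    suspected_infection: bool = False  # rule introduced after 2006


def next_step(n: Nodule) -> str:
    """Return the next work-up step under a simplified reading of the protocol."""
    if n.density not in ("solid", "partially solid", "nonsolid"):
        raise ValueError(f"unknown density: {n.density!r}")
    if n.diameter_mm <= 5:
        return "repeat LDCT at 1 year"
    if n.diameter_mm > 8 and n.suspected_infection:
        return "antibiotics for 7 days, then repeat LDCT at 1 month"
    if n.density == "nonsolid":
        if n.progressive:
            return "PET-CT"
        if n.diameter_mm <= 8:
            # 6 months is stated for baseline nonsolid nodules only.
            return "repeat LDCT at 6 months" if n.baseline else "repeat LDCT at 3 months"
        return "repeat LDCT (interval not stated)"
    # Solid or partially solid nodules from here on.
    if n.diameter_mm <= 8:
        return "repeat LDCT at 3 months"
    if n.clearly_benign:
        return "repeat LDCT at 1 year"
    return "PET-CT"
```

For example, `next_step(Nodule(12.0, "solid"))` returns `"PET-CT"`, whereas the same nodule flagged as `clearly_benign` is sent to LDCT at 1 year.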
Growing or PET-positive nodules suspicious for malignancy underwent minimally invasive surgical biopsy and additional interventions. Highly suspicious nodules not amenable to biopsy were usually investigated by CT-guided fine-needle aspiration.
Resectable cases underwent standard anatomical resection plus radical lymph node dissection via lateral muscle-sparing thoracotomy or, after 2007, by robot-assisted videothoracoscopy.
LDCT
The multidetector (8- or 16-slice) CT scanner was a High/Light Speed Advantage (General Electric, Milwaukee, WI, USA). Scans were taken without contrast in a single breath with the machine set at 120 kVp, 30 mA, 1.75:1 pitch ratio and 2.5 mm slice thickness. Images were retro-reconstructed using standard and lung algorithms at 1.25 mm. The dose equivalent per patient was estimated at 0.81 mSv.
An experienced team of radiologists read the images (Advantage Windows 4.2 workstation; General Electric) using lung parenchyma windows (WW 1300, WL 480) with maximum intensity projection reconstruction, and mediastinum windows (WW 300, WL 35). Maximum axial diameter was measured by electronic calipers. Lesions >5 mm were evaluated at multidisciplinary meetings involving at least two radiologists. Nodules were classified as solid (lung parenchyma within completely obscured), partially solid (containing areas that completely obscured the lung parenchyma) or nonsolid (none of the parenchyma within nodule obscured).
PET-CT
PET acquisition using 18F-fluorodeoxyglucose is described elsewhere [7]. Briefly, an in-line Discovery LS (GE Medical Systems, part of General Electric) was used, consisting of an Advance NXi PET scanner and an 8-slice Light Speed Plus CT scanner (2D modality, matrix size of 128 × 128). Patients were fasted for 6 h prior to radiotracer injection. Glucose levels were checked and were always <200 mg·dL−1 (the cut-off was recently changed from <150 mg·dL−1 to 200 mg·dL−1). The time between injection and acquisition was ∼50 min. While waiting, patients were required to drink two or three glasses of water and empty the bladder. The CT of PET-CT was performed without breath-holding, so some nodules (particularly small sub-solid nodules in inferior lobes) were not well visualised. In such cases, the nuclear medicine physician identified the nodule on axial LDCT images, defined a circular region of interest, and copied it onto the corresponding axial PET image and onto four superior and four inferior axial PET images (relative to the axial reference CT image), to reduce the possibility of uptake underestimation due to respiratory motion. Uptake was assessed visually as negative or positive and semi-quantitatively (SUVbw,max) on black-and-white scale PET images, not on fusion PET-CT images.
Solid nodules with SUVbw,max >2.0 and nonsolid nodules with SUVbw,max >1.5 were considered malignant. The PET findings were assessed by a single nuclear medicine physician but were discussed at multidisciplinary meetings.
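The semi-quantitative positivity rule can be written as a short function. This is a minimal sketch with a hypothetical function name; because the text gives thresholds only for solid and nonsolid nodules, any other density value is rejected rather than guessed.

```python
def pet_positive_by_suv(suv_bw_max: float, density: str) -> bool:
    """Apply the stated SUVbw,max cut-offs (strictly greater than)."""
    thresholds = {"solid": 2.0, "nonsolid": 1.5}
    if density not in thresholds:
        # The source does not state a cut-off for other nodule types.
        raise ValueError(f"no threshold stated for density {density!r}")
    return suv_bw_max > thresholds[density]
```

Under this rule an SUVbw,max of 1.8 is positive for a nonsolid nodule but negative for a solid one.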
Pathology and treatment
The nature of indeterminate nodules was determined by cytological/histological examination if biopsied/removed, or by follow-up of ⩾2 years. For diagnostic wedge resection, a videothoracoscopic approach was preferred whenever lesion site and size allowed; otherwise, lateral muscle-sparing thoracotomy was employed. Wedges underwent intraoperative frozen section examination. If malignant, anatomic resection with curative intent was performed immediately.
Statistical analysis
Reference findings were pathological status of resected tissue or ⩾24 months of follow-up. We calculated the sensitivity and specificity of PET-CT, with 95% confidence intervals for the whole group, and by sex, age (<60 or ⩾60 years), nodule type and nodule size; by visual evaluation, and also according to three SUVbw,max cut-offs (1.5, 2.0 and 2.5). We also calculated the sensitivity, specificity, positive and negative predictive values (PPV and NPV, respectively) and accuracy of PET-CT, with 95% confidence intervals, for detecting specific lung cancer subtypes.
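For readers less familiar with these measures, the following sketch shows how they are derived from a 2×2 table of PET-CT result against reference status. The function name and the counts in the usage note are hypothetical and are not study data.

```python
def diagnostic_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Compute standard diagnostic measures from 2x2 counts.

    tp/fn: malignant nodules that were PET-positive/PET-negative.
    fp/tn: benign nodules that were PET-positive/PET-negative.
    Raises ZeroDivisionError if a row or column of the table is empty.
    """
    total = tp + fp + fn + tn
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "accuracy": (tp + tn) / total,
    }
```

With illustrative counts of 8 true positives, 1 false positive, 2 false negatives and 9 true negatives, sensitivity is 8/10, specificity 9/10 and accuracy 17/20.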
We used receiver operating characteristic (ROC) curve analysis to assess the diagnostic value of nodule size and SUVbw,max, calculating area under the curve (AUC) as a measure of diagnostic efficiency.
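The AUC can be read as the probability that a randomly chosen malignant nodule has a higher value than a randomly chosen benign one, with ties counted as one half. The sketch below computes it directly from that definition; it is not the SAS procedure used in the study, and the function name and example values are hypothetical.

```python
def roc_auc(malignant_values, benign_values) -> float:
    """AUC as the fraction of malignant/benign pairs ranked correctly."""
    if not malignant_values or not benign_values:
        raise ValueError("both groups must contain at least one value")
    score = 0.0
    for m in malignant_values:
        for b in benign_values:
            if m > b:
                score += 1.0   # malignant nodule ranked higher
            elif m == b:
                score += 0.5   # tie counts as half
    return score / (len(malignant_values) * len(benign_values))


# Illustrative SUV values only (not study data).
auc = roc_auc([3.0, 2.0, 1.2], [1.0, 1.5, 2.0])
```

An AUC of 0.5 corresponds to no discrimination and 1.0 to perfect separation of the two groups.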
SUV distributions in subgroups are presented in box and whisker plots. Differences in median SUV between subgroups were assessed by nonparametric two-sample median test. Analyses were performed with SAS version 8.2 (SAS Institute Inc., Cary, NC, USA). The p-values are two-sided.
Results
From October 2004 to October 2005, 5203 volunteers, mean±sd age 57.7±5.6 years, entered the study [7]. Over the first 6 years, 443 PET-CT examinations were performed. 60 were excluded from this analysis: 49 because they were performed for collateral findings, three because patients had previous lung cancer and eight because patients were not followed up long enough to ascertain whether they were cancer-free. The analysis is based on the remaining 383 examinations to diagnose suspected primary lung cancer in 351 volunteers.
241 suspicious nodules underwent CT-guided biopsy or surgery: cytology or histology revealed 196 malignant nodules and 45 benign nodules. 142 nodules were followed up by LDCT and defined as benign after ⩾24 months. 10 patients had multiple PET scans for various reasons.
Table 1 shows the characteristics of the patients and their PET-investigated nodules. 70% of investigated nodules were in males; 60% were in volunteers aged ⩾60 years. 149 indeterminate nodules were investigated by PET as a work-up of baseline LDCT. Fewer were investigated at subsequent rounds, but the number remained fairly constant over the next 5 years (58, 63, 45, 47 and 21 each year). The size distribution was: 147 measuring <10 mm, 106 between 10 and 15 mm, and 118 measuring ⩾15 mm. Most (69.2%) nodules were solid, 17.8% were sub-solid and 12.5% were nonsolid.
187 nodules were benign (45 by cytology/histology and 142 by follow-up) and 196 were malignant; 185 were diagnosed within a year, 127 (69%) at stage I. Nodule size (p=0.005) and type (p=0.02) were significantly associated with cancer (Table 1). A higher proportion of nodules investigated at follow-up rounds than at baseline were malignant (p<0.0001).
SUVbw,max
The mean±sd SUVbw,max measurements were 1.19±0.56 for the 142 benign lung nodules not investigated by cytology/histology, 2.06±2.36 for the 45 benign lung nodules biopsied/removed surgically and 4.76±5.13 for the 196 malignant nodules. For the 234 nodules considered negative by visually assessed PET-CT, the SUVbw,max was 1.09±0.25, compared with 6.46±5.24 for the 144 nodules considered positive (fig. 1).
Diagnostic performance of PET-CT
Overall sensitivity, specificity and accuracy (95% CI) of visually evaluated PET-CT in distinguishing malignant from benign nodules were 64% (56–70%), 89% (83–93%) and 76% (71–80%), respectively. Corresponding figures were 82% (69–90%), 92% (84–96%) and 88% (81–92%) for baseline nodules, and 87% (77–93%), 73% (57–85%) and 82% (74–88%) for nodules ⩾15 mm. Performance was significantly lower for nodules on annual repeat scans than baseline (sensitivity range 30–71% versus sensitivity 82%; p<0.0001), and for nonsolid compared with solid nodules (sensitivity 22% (10–40%) versus 79% (71–86%); p<0.0001) (table 2). Accuracy was greater for males than females (79% (73–83%) versus 69% (60–77%); p=0.05) and greater for those aged ⩾60 years than those aged <60 years (80% (74–85%) versus 69% (61–77%); p=0.02). Performance was similar for nodules in upper versus lower lobes and right versus left lung.
Considering size and density together, PET-CT sensitivity (95% CI) was poor for sub-solid nodules <15 mm (21% (11–36%)), intermediate for sub-solid nodules ⩾15 mm (64% (43–81%)) and solid nodules <15 mm (65% (53–76%)), and high for solid nodules ⩾15 mm (98% (88–100%)). For sub-solid nodules, sensitivity thus increased with size; for those <10 mm it was particularly poor (17% (6–40%)), but there were no false positives. In general, a negative PET in a sub-solid nodule did not exclude cancer, whereas cancer was present in 86% (67–95%) of PET-positive sub-solid nodules (the PPV). For solid nodules <10 mm, sensitivity was 51% (36–66%), but false positives were rare and PPV was 85% (64–95%). For solid nodules ⩾15 mm, sensitivity was excellent (98% (88–100%)), implying that PET-negative nodules were rarely cancer and invasive diagnostic procedures could be avoided.
Figure 2 shows a 10-mm solid nodule with irregular margin appearing at second screening. Unexpectedly, it was PET-negative; needle biopsy was also negative and surgical biopsy was not performed. The nodule was stable 5 years later.
For solid nodules ⩾15 mm, the only false-negative PET was a pT1N0M0 adenocarcinoma diagnosed at screening 1 year later. However, PPV was suboptimal at 85% (73–92%), with nine out of 59 PET-positives shown to be false positives and PET positivity responsible for surgical biopsy of the nodule in six of these. Four out of the six surgically biopsied cases occurred during the first 2 years of screening, before a second LDCT 1 month later was introduced for nodules >8 mm thought to be due to infection. From rounds 3 to 6, 64 cases underwent second LDCT after antibiotics: in 54 of these the nodule shrank, in the remaining 10 it remained unchanged or increased in size.
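As a worked example, the PPV quoted above follows from the stated counts: nine false positives among 59 PET-positive nodules leaves 50 true positives. The sketch below computes the proportion with a 95% Wilson score interval. The choice of the Wilson method is an assumption, since the text does not say how the confidence intervals were calculated, and the function name is hypothetical.

```python
import math


def wilson_interval(successes: int, n: int, z: float = 1.96):
    """Wilson score interval for a binomial proportion (z=1.96 for ~95%)."""
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("need n > 0 and 0 <= successes <= n")
    p = successes / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half_width, centre + half_width


# 59 PET-positive solid nodules >=15 mm, of which 9 were false positives.
true_pos, pet_pos = 59 - 9, 59
low, high = wilson_interval(true_pos, pet_pos)
print(f"PPV = {true_pos / pet_pos:.0%}, 95% CI {low:.0%} to {high:.0%}")
```

With these inputs the interval rounds to 73–92%, in line with the reported values, although that agreement should not be read as confirmation of the method actually used.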
Figure 3 illustrates a PET-positive nodule that was benign after resection. Resection could probably have been avoided if, instead of PET-CT, antibiotics had been administered followed by repeat LDCT. Figure 4 shows a new solid suspicious nodule that was smaller at second LDCT 1 month later, after antibiotics.
The analyses using three different SUVbw,max thresholds (online supplementary tables S1a–c) produced results consonant with visual assessment. For the 1.5, 2.0 and 2.5 thresholds, sensitivity was 67% (60–73%), 60% (53–67%) and 51% (44–59%), respectively; specificity was 80% (73–85%), 88% (82–92%) and 91% (86–95%), respectively; and accuracy was 73% (63–77%), 74% (69–78%) and 71% (66–76%), respectively.
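The trade-off between sensitivity and specificity as the SUV threshold rises can be illustrated with a simple sweep. The SUV values below are invented for illustration and are not study data; the function name is also hypothetical.

```python
def sens_spec_at(threshold, suv_malignant, suv_benign):
    """Sensitivity and specificity when SUV > threshold is called positive."""
    sensitivity = sum(s > threshold for s in suv_malignant) / len(suv_malignant)
    specificity = sum(s <= threshold for s in suv_benign) / len(suv_benign)
    return sensitivity, specificity


# Hypothetical SUVbw,max values for malignant and benign nodules.
suv_malignant = [0.9, 1.6, 1.8, 2.2, 2.4, 3.1, 4.7, 6.0]
suv_benign = [0.8, 1.0, 1.1, 1.3, 1.7, 1.9, 2.3, 2.8]

sweep = {t: sens_spec_at(t, suv_malignant, suv_benign) for t in (1.5, 2.0, 2.5)}
for threshold, (sens, spec) in sweep.items():
    print(f"threshold {threshold}: sensitivity {sens:.0%}, specificity {spec:.0%}")
```

In this toy data set, as in the study, a higher threshold lowers sensitivity and raises specificity.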
Table 3 shows visually assessed PET-CT sensitivity according to cancer subtype. Sensitivity (95% CI) was 64% (56–70%) overall, but lowest for adenocarcinoma (53% (46–63%)). Sensitivity was highest for poorly differentiated (87% (72–94%)) and stage II–IV (78% (66–87%)) disease, and lowest for well-differentiated (22% (11–40%)) and stage I (38% (29–47%)) disease. Similar results were found using the three SUV thresholds (online supplementary table S2).
The AUC (ROC) to assess the discriminative ability of SUV in identifying lung cancer versus non lung cancer was 0.78 for all nodules, 0.86 for solid nodules, but only 0.65 for sub-solid nodules (online supplementary fig. S1).
Discussion
It is essential to limit harm to screened individuals, so invasive procedures such as fine-needle aspiration biopsy and surgery must be avoided whenever possible. We included PET-CT in the COSMOS work-up for indeterminate screening-detected nodules, with the aim of reducing the use of more invasive procedures [19]. Our indications for PET-CT were nodule size >8 mm, or a growing nodule on repeat LDCT, sometimes <8 mm.
For the 383 PET-CT examinations, the sensitivity, specificity and accuracy of visually evaluated PET-CT, in differentially diagnosing malignant nodules, were 64%, 89% and 76%, respectively. PET performance varied with nodule diameter: accuracy increased from 70% for nodules <10 mm to 82% for nodules ⩾15 mm (p=0.06), in line with previous experience [7]. Nodule type had a strong influence (p<0.0001) on accuracy, which ranged from 84% for solid nodules to 46% for nonsolid nodules. Sensitivity for solid nodules varied with size: high (98%) for nodules ⩾15 mm, but only 51% for nodules <10 mm.
The findings most relevant to PET-CT use in screening are nodule size and type. For solid nodules ⩾15 mm (21% of our series), sensitivity was high (98%), so a positive finding is almost always cancer. The NPV was also high (95%), indicating that invasive procedures can be avoided when PET is negative, with low risk of delayed diagnosis. In fact, we had only one false-negative PET in this group, which was still stage I when it was diagnosed at screening 1 year later.
For solid nodules <10 mm (30% of total) and sub-solid nodules of any size (28% of total), NPV was low, as expected [21], so a negative PET is unreliable and other methods like VDT or detailed nodule characteristics are required to provide an indication of malignancy before scheduling surgical biopsy. Conversely, for sub-solid nodules <10 mm (11% of total), PPV was high, so PET positivity was a strong predictor of malignancy, indicating that these cases should proceed to surgery. For sub-solid nodules ⩾10 mm, a few false positives greatly reduced PPV.
As noted, after round 2, lesions >8 mm suspicious for infection received antibiotics and repeat LDCT 1 month later. PET-CT was performed only if the lesion was unresponsive to antibiotics. This innovation avoided numerous useless and potentially harmful (because of inflammation) PET-CT examinations.
A Danish retrospective study evaluated visually assessed PET and VDT in screening: the combination predicted malignancy with high (90%) sensitivity and good (82%) specificity [20]. The overall sensitivity of PET-CT alone in our study (64%) was somewhat lower than the 71% of the Danish study, where 8 (40%) out of the overall 20 malignancies were identified at baseline, compared with 60 (30%) out of 196 in our experience. Baseline cancers are more often PET-positive because they tend to be larger and denser [22].
The significantly better PET-CT accuracy we found in older (⩾60 years of age) than younger individuals could be due to the higher proportion of aggressive cancers found in older high-risk individuals [23]. As regards the significantly better diagnostic performance of PET-CT at baseline than subsequently (table 2), this is probably due to the lower frequency of nonsolid cancers at baseline compared with later [22].
In a previous study, we found that 18F-fluorodeoxyglucose uptake by nodules furnished important prognostic information [24]. Stage I PET-positive cancers had significantly worse survival than stage I PET-negative cancers. PET-CT may also guide extent of surgical resection and lymph node dissection. Our previous study on lymph node dissection [25] and the recent study by Okada et al. [26] both suggested sub-lobar resection as an alternative to lobectomy in PET-negative cancers. This option is being investigated in randomised trials [27].
We do not use fine-needle aspiration biopsy routinely in our screening protocol. Typically, we proceed to surgery when nodule characteristics, including VDT [7, 18], often backed up by PET-CT, indicate malignancy. The disadvantage is that a pre-operative pathological diagnosis is not available and a wedge resection with frozen section examination is required before radical surgery. This procedure employs more materials and takes about 45 min longer than standard lobectomy, and also requires a pathologist in attendance. However, it is well accepted by patients, who rarely insist on a pre-operative pathological diagnosis.
Nevertheless, this policy goes against the International Association for the Study of Lung Cancer (IASLC) Screening Workshop Guidelines [4], which recommend CT-guided biopsy for suspicious nodules, not least because it often facilitates surgical decision-making. The New York Early Lung Cancer Action Project (ELCAP) group [11] reported good diagnostic performance for CT-guided biopsy (overall 82% sensitivity and 88% accuracy), although the results were less good for nodules <8 mm. However, CT-guided biopsy is highly operator dependent and sensitivity can be very low [28].
Nodules on PET-CT are commonly evaluated by semi-quantitative SUVbw,max, which is operator independent, as well as by visual assessment of uptake by the nuclear medicine physician, whose experience can markedly influence the result. In the present study, we evaluated both methods and found visual assessment afforded higher accuracy (76%) than any SUV threshold (online supplementary tables S1a–c and S2), with a good compromise between sensitivity (64%) and specificity (89%); increasing the SUV threshold from 1.5 to 2.5 decreased sensitivity (67% to 51%) and increased specificity (80% to 91%).
An important consequence of our use of visually assessed PET-CT (in combination with repeat LDCT 1 month later for suspected inflammatory nodules) was that only 14% of benign cases underwent surgical biopsy [24]. This is less than in the NLST, where 25% (90 out of 360) of suspicious cases underwent invasive procedures for benign nodules [29], and less than in the NELSON study (Dutch–Belgian randomised lung cancer screening trial), where 27% underwent surgery for benign disease at baseline and 21% in subsequent rounds [12]. By contrast, Ashraf et al. [20] reported that only two (10%) benign cases underwent surgery using a protocol that combined PET-CT with VDT.
As regards the cost of PET-CT and its use as a second-line modality, this can be approached by evaluating malignancy rate: 51% of our PET-CT cases had lung cancer, whereas 37% (20 out of 54) had lung cancer in the study by Ashraf et al. [20]. Ashraf et al. [20] scheduled solid nodules 5–15 mm and nonsolid nodules ⩽20 mm not considered definitely benign, for PET-CT. We selected noncalcified nodules >8 mm and nodules >5 mm growing in diameter or density at repeat LDCT.
A strength of the current study is that, to our knowledge, it presents the largest series of screened patients undergoing PET-CT as part of a diagnostic algorithm; almost all cases received PET-CT before surgical resection, most at the European Institute of Oncology, Milan. A limitation is that we do not have VDTs of all benign nodules and cannot compare VDT with PET, or assess the two tests combined. In addition, our results may not be applicable to populations with higher rates of endemic inflammatory lung conditions than the Italian population.
Some data indicate that, although widely accepted, 2-year stability is not a good indicator of nodule benignity [30, 31]. For example, Yankelevitz et al. [30] found that 2-year stability on CT had low sensitivity (40%) and moderate accuracy (60%) when used to define a nodule as benign. Thus, some of the nodules considered benign by the 2-year criterion may prove to be malignant. Follow-up continues for these nodules with long doubling time, and those that eventually turn out to be malignant tend to have good prognoses.
To conclude, our PET-CT findings are encouraging overall and suggest the technique has a role in the diagnostic work-up of indeterminate nodules identified on LDCT screening. It is more useful for nodules detected at baseline, while sensitivity is low for sub-solid nodules and nodules discovered after baseline. For these, other diagnostic modalities, particularly VDT, are more useful.
Acknowledgements
The authors thank Don Ward for help with the English.
Footnotes
For editorial comments see Eur Respir J 2015; 45: 314–316 [DOI: 10.1183/09031936.00192714].
This article has supplementary material available from erj.ersjournals.com
Conflict of interest: None declared.
- Received April 7, 2014.
- Accepted July 24, 2014.
- Copyright ©ERS 2015