Validation of Lung EpiCheck, a novel methylation-based blood assay, for the detection of lung cancer in European and Chinese high-risk individuals

Aim Lung cancer screening reduces mortality. We aim to validate the performance of Lung EpiCheck, a six-marker panel methylation-based plasma test, in the detection of lung cancer in European and Chinese samples. Methods A case–control European training set (n=102 lung cancer cases, n=265 controls) was used to define the panel and algorithm. Two cut-offs were selected, low cut-off (LCO) for high sensitivity and high cut-off (HCO) for high specificity. The performance was validated in case–control European and Chinese validation sets (cases/controls 179/137 and 30/15, respectively). Results The European and Chinese validation sets achieved AUCs of 0.882 and 0.899, respectively. The sensitivities/specificities with LCO were 87.2%/64.2% and 76.7%/93.3%, and with HCO they were 74.3%/90.5% and 56.7%/100.0%, respectively. Stage I nonsmall cell lung cancer (NSCLC) sensitivity in European and Chinese samples with LCO was 78.4% and 70.0% and with HCO was 62.2% and 30.0%, respectively. Small cell lung cancer (SCLC) was represented only in the European set and sensitivities with LCO and HCO were 100.0% and 93.3%, respectively. In multivariable analyses of the European validation set, the assay's ability to predict lung cancer was independent of established risk factors (age, smoking, COPD), and overall AUC was 0.942. Conclusions Lung EpiCheck demonstrated strong performance in lung cancer prediction in case–control European and Chinese samples, detecting high proportions of early-stage NSCLC and SCLC and significantly improving predictive accuracy when added to established risk factors. Prospective studies are required to confirm these findings. Utilising such a simple and inexpensive blood test has the potential to improve compliance and broaden access to screening for at-risk populations.


Introduction
Lung cancer is the leading cause of death from cancer, with 1.76 million deaths worldwide in 2018 [1]. Risk factors include age, smoking, family history and occupational/asbestos exposure. The 5-year survival rate for lung cancer is only 18.6%, mainly due to diagnosis at late stages [2]. Screening with low-dose computed tomography (LDCT) has been proven to reduce lung cancer mortality in high-risk populations [3,4]. However, LDCT has a significant rate of false positives and overdiagnosis, involves radiation hazard, is reader dependent and requires substantial infrastructure. In the USA, up to 14% of the eligible population undergo lung cancer screening [5]. Current barriers include limited infrastructure and gaps in knowledge and awareness among referring physicians and the public. Importantly, lung cancer screening targets a very high-risk population, representing merely a quarter of lung cancer patients [6].
Several types of tumour-derived biomarkers have been assessed for lung cancer detection, including circulating tumour cells, exosomes, mutations and methylation changes in cell-free (cf)DNA, microRNA and proteins [7,8]. Genome-wide hypomethylation and hypermethylation changes are found in lung cancer and could potentially serve as markers [9].
EpiCheck is a simple ultrasensitive PCR-based assay that detects cancer-associated hypermethylation changes in a selected panel of markers from any body fluid or tissue. The urine-based Bladder EpiCheck demonstrated sensitivity of 92% for high-grade urothelial carcinoma with specificity of 88% in bladder cancer patients undergoing surveillance [10].
The purpose of this study is to validate the performance of Lung EpiCheck®, a six-methylation-marker blood test, in lung cancer detection.

Study samples
Training set samples were used to select the markers for the panel using Nucleix's proprietary bioinformatics techniques (Nucleix, Rehovot, Israel). Six markers were selected based on their synergistic information and an algorithm calculating the EpiScore was developed and locked down (supplementary figure S1). Two cut-offs were defined to allow for different clinical scenarios, a low cut-off (LCO) of EpiScore ⩾60, favouring high sensitivity, and a high cut-off (HCO) of EpiScore ⩾70, favouring high specificity. The European validation set was a new set of samples used to validate the performance of the assay using the pre-defined algorithm and cut-offs.
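The two-cut-off rule can be sketched as follows. This is a minimal illustration only: the function and constant names are hypothetical, the 0-100 range is the EpiScore range reported for the assay, and the proprietary algorithm that computes the EpiScore itself is not reproduced here.

```python
# Cut-offs as defined in the text (both are inclusive thresholds).
LOW_CUTOFF = 60   # LCO: EpiScore >= 60, favouring high sensitivity
HIGH_CUTOFF = 70  # HCO: EpiScore >= 70, favouring high specificity


def classify_episcore(score: float) -> dict:
    """Return the test call at each cut-off for a given EpiScore.

    Hypothetical helper for illustration; not part of the Lung EpiCheck software.
    """
    if not 0 <= score <= 100:
        raise ValueError("EpiScore is expected to lie between 0 and 100")
    return {
        "positive_lco": score >= LOW_CUTOFF,
        "positive_hco": score >= HIGH_CUTOFF,
    }


if __name__ == "__main__":
    # A score between the two cut-offs is positive at LCO only.
    print(classify_episcore(65))
```

Any sample positive at HCO is necessarily positive at LCO, so the two calls define three groups: negative at both, positive at LCO only, and positive at both.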
The training and the European validation sets were obtained by applying a single protocol: a case-control study performed on samples from sequential recruitment in 18 departments and clinics in 16 healthcare organisations, and three biobanks in Europe and Israel (supplementary table S2). Samples were collected from July 2016 to March 2018. The initial series of cases and controls were used for training and the subsequent series was used for validation. Cases were recruited from pulmonology, thoracic surgery and oncology departments and clinics in Europe and Israel. Present and past smokers, serving as controls, were recruited from blood collection stations in primary care clinics and from general surgery departments in Israel. Potential participants were randomly approached as they came to perform a blood test for any reason (table 1). Sample processing was performed at the Nucleix laboratory (Rehovot). Disease staging of the cases was according to the American Joint Committee on Cancer (AJCC) staging manual, 7th and 8th editions (AJCC7 and AJCC8). Adenocarcinomas were included only if classified as invasive adenocarcinomas according to International Association for the Study of Lung Cancer/American Thoracic Society/European Respiratory Society classification [11].
The Chinese validation set was a small feasibility study assessing the applicability of Lung EpiCheck for lung cancer detection in a Chinese population. This was a blinded, case-control, single-centre study performed in the Lung Cancer Center/Lung Cancer Institute at the West China Hospital (Sichuan University, Chengdu, China). Samples were collected from January 2018 to November 2018. Patients suspected or confirmed to have lung cancer arriving for lung surgery were enrolled. Healthy volunteers were enrolled sequentially as controls (table 1). Sample processing was performed on site. Disease staging for the cases was according to AJCC8.
Relevant medical, smoking and family history data were collected prior to study-related procedures. The study was approved by the ethics committees of the various institutions involved, and all subjects provided signed informed consent. The study registration number is NCT02373917.

Lung EpiCheck testing
Lung EpiCheck (Nucleix) is a blood test that detects lung cancer-associated hypermethylation in six markers in cfDNA. Plasma is separated from a 10 mL EDTA tube within 4 h of blood draw by two consecutive centrifugations at 1500×g for 10 min and stored at −20°C to −80°C until DNA extraction. Lung EpiCheck's reagents and methylation-sensitive enzymes are used for DNA extraction, digestion and amplification in real-time PCR (ABI 7500 Fast Dx; Thermo Fisher Scientific, Carlsbad, CA, USA). Three PCR wells are amplified for the markers and one for an internal control to verify the quality of plasma separation by detecting leukocyte-derived DNA. Lung EpiCheck software analyses the PCR output calculating an EpiScore, a numerical score (0-100) reflecting the overall methylation level in the assay's markers.

Statistical analysis
The groups' baseline characteristics were compared using the Chi-squared test for categorical parameters and Wilcoxon's rank-sum test for continuous parameters. Sensitivity and specificity were calculated for the entire sample and for different subgroups of interest along with 95% exact binomial confidence intervals. The predictive ability of the continuous EpiScore was evaluated via logistic regression and the corresponding area under the receiver operating characteristic curve (AUC) was calculated. Positive likelihood ratio (LR+) and negative likelihood ratio (LR−) were calculated for the entire sample and for different subgroups of interest along with 95% exact binomial confidence intervals (LR+ = sensitivity/(1−specificity), LR− = (1−sensitivity)/specificity). A multivariable logistic regression was used to examine the relationship between the true patient status (lung cancer case or control) and their EpiScore result. The contribution of the EpiScore result was examined adjusting for the patient's personal characteristics and known risk factors for lung cancer. An additional multivariable logistic regression analysis was performed to examine whether the EpiScore outcome is affected by a patient's personal characteristics or known risk factors for lung cancer. Both analyses used the subset of patients for whom all relevant information was available. The contribution of each predictor in the model was evaluated via odds ratio and the overall prediction ability of the model was evaluated via AUC.
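The sensitivity, specificity and likelihood ratio definitions above can be illustrated with a short sketch. It assumes that "exact binomial" refers to the Clopper-Pearson interval, which is the usual meaning but is not stated explicitly in the text. The function names are hypothetical, and the counts in the usage example are back-calculated from the reported European validation percentages at LCO (87.2% of 179 cases, 64.2% of 137 controls) rather than taken from the study tables. Confidence intervals for the likelihood ratios are not attempted here.

```python
from math import comb, inf


def binom_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def exact_ci(k: int, n: int, alpha: float = 0.05) -> tuple:
    """Clopper-Pearson (exact binomial) interval for k successes in n trials."""

    def solve(f, target: float) -> float:
        # f is decreasing in p; bisect for the p where f(p) == target.
        lo, hi = 0.0, 1.0
        for _ in range(60):
            mid = (lo + hi) / 2
            if f(mid) > target:
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2

    # Lower bound: P(X >= k | p) = alpha/2, i.e. cdf(k - 1) = 1 - alpha/2.
    lower = 0.0 if k == 0 else solve(lambda p: binom_cdf(k - 1, n, p), 1 - alpha / 2)
    # Upper bound: P(X <= k | p) = alpha/2.
    upper = 1.0 if k == n else solve(lambda p: binom_cdf(k, n, p), alpha / 2)
    return lower, upper


def diagnostic_metrics(tp: int, fn: int, tn: int, fp: int) -> dict:
    """Sensitivity, specificity (with exact 95% CIs) and likelihood ratios."""
    sens = tp / (tp + fn)
    spec = tn / (tn + fp)
    return {
        "sensitivity": sens,
        "sensitivity_ci": exact_ci(tp, tp + fn),
        "specificity": spec,
        "specificity_ci": exact_ci(tn, tn + fp),
        "lr_pos": sens / (1 - spec) if spec < 1 else inf,  # LR+ = sens/(1 - spec)
        "lr_neg": (1 - sens) / spec if spec > 0 else inf,  # LR- = (1 - sens)/spec
    }


if __name__ == "__main__":
    # Assumed counts, back-calculated from the reported LCO percentages.
    for name, value in diagnostic_metrics(tp=156, fn=23, tn=88, fp=49).items():
        print(name, value)
```

The bisection on the binomial distribution function keeps the sketch free of third-party dependencies; a statistics library's beta quantile function would give the same bounds more directly.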

Results
The training set included 367 subjects (102 cases and 265 controls). Cases were significantly older with a higher number of pack-years compared to controls, while sex and number of years since quitting smoking were similar.
Data are presented as n/N or % (95% CI), unless otherwise stated. LCO: low cut-off; HCO: high cut-off; AUC: area under the curve; NSCLC: nonsmall cell lung cancer; SCLC: small cell lung cancer. In the cases of complete or quasi-complete separation, exact p-values were calculated. #: patients with adenosquamous carcinoma histology were grouped with the squamous cell carcinoma histology subtype; ¶: tumour sizes were missing or unmeasurable for n=3 stage II, n=14 stage III and n=10 stage IV; +: stage III tumours without stage IIIA/IIIB classification were included in the advanced stages group; §: p-value calculation for the comparison of histological subtypes did not include the "unknown" group; ƒ: p-value calculation for the comparison by tumour size did not include the "unknown" group.
https://doi.org/10.1183/13993003.02682-2020
In a multivariable analysis of patients with smoking information (n=242), established risk factors for lung cancer (age, smoking status, pack-years and quit years) and sex did not influence having a positive Lung EpiCheck result at either cut-off (figure 2). Presence of COPD significantly decreased the chance of having a positive Lung EpiCheck result at LCO. A trend of higher EpiScores in patients without COPD versus patients with the condition was maintained when looking at various statistical measures of EpiScore (mean, median, 1st and 3rd quartile) in cases and in controls separately, but when combining the two groups, this trend was reversed (data not shown). The only factor driving a positive result was the group (case/control) with odds ratio (95% CI) of 18.2 (7.2-45.7), p<0.0001 with LCO and 23.7 (10.1-55.5), p<0.0001 with HCO. Likelihood ratios are reported in supplementary table S5.
A multivariable analysis was performed to assess the accuracy of lung cancer prediction based on risk factors alone, or in combination with EpiScore (figure 3). In our data, age, sex, smoking status, quit years, pack-years and COPD together yielded an AUC (95% CI) of 0.852 (0.805-0.900). Adding EpiScore significantly increased the AUC to 0.942 (0.913-0.971), p<0.0001. This analysis was performed on a subset of 242 patients with full smoking history, and the Lung EpiCheck AUC was similar to that of the entire  No tests failed in the Chinese set.

Discussion
Our data demonstrate that Lung EpiCheck achieved performance characteristics suggesting that, if prospectively validated, it may be suitable for clinical use in early detection of lung cancer. The predictive performance of Lung EpiCheck in the European validation data was very high, with AUC of 0.882. While maximising sensitivity (87.2%) with the LCO, the specificity remained good (64.2%), and while maximising specificity (90.5%) using the HCO, the sensitivity remained high (74.3%). In the Chinese set (AUC 0.899), specificity was high at both cut-offs (LCO 93.3% and HCO 100.0%), with good sensitivity with LCO (76.7%). Differences between the validation sets are probably due to including young nonsmoking controls and surgical patients with small resectable tumours in the Chinese set. Detection at early stages is the key performance factor for Lung EpiCheck to ensure patients are detected in time for curative treatment. In the European set, the AUCs were consistently high in stage I NSCLC (0.797), stages I and II NSCLC (0.830) and early-stage (stages I, II and IIIA) NSCLC (0.862). In the validation sets, Lung EpiCheck detected 70-80% of stage I NSCLC with LCO, detecting tumours as small as 8 mm (adenocarcinoma in the Chinese set). With HCO, European results remained strong with stage I NSCLC sensitivity of 62.2%, but Chinese performance deteriorated to 30.0%. Early-stage performance should be interpreted with caution, as the controls were not scanned or followed up to ensure no asymptomatic cancer existed. These results compare favourably with published results of other blood tests for lung cancer detection, which report stage I sensitivity of ∼40% (22-71%) and many of which are from training sets [13-17].
Sensitivity for NSCLC tumours was significantly impacted by size, but even in stage I NSCLC ⩽20 mm, there is a good signal of effectiveness, with ⩾50% of tumours detected with LCO (four out of seven and two out of four in the European and Chinese validation sets, respectively). With HCO this stage I NSCLC ⩽20 mm sensitivity was similar in the European set (three out of seven), but all were missed in the Chinese set. Likelihood ratios can be used to simply and quickly estimate the post-test probability; however, there is no established gold-standard threshold for determining an acceptable likelihood ratio in the developing field of biomarkers for lung cancer screening. Regardless, we believe that the likelihood ratios achieved by Lung EpiCheck appear to be in a good range.
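The post-test probability calculation referred to above follows the standard odds form of Bayes' theorem: the pre-test odds are multiplied by the likelihood ratio and converted back to a probability. A minimal sketch is given below. The function name and the input values in the usage example are hypothetical and chosen only to show the arithmetic; they are not results from this study.

```python
def post_test_probability(pre_test_prob: float, likelihood_ratio: float) -> float:
    """Convert a pre-test probability and a likelihood ratio into a post-test probability."""
    if not 0 < pre_test_prob < 1:
        raise ValueError("pre-test probability must lie strictly between 0 and 1")
    if likelihood_ratio < 0:
        raise ValueError("likelihood ratio cannot be negative")
    pre_odds = pre_test_prob / (1 - pre_test_prob)   # probability -> odds
    post_odds = pre_odds * likelihood_ratio          # apply LR+ or LR-
    return post_odds / (1 + post_odds)               # odds -> probability


if __name__ == "__main__":
    # Hypothetical inputs: 1% pre-test probability and a positive likelihood ratio of 2.
    print(post_test_probability(0.01, 2.0))
```

At low pre-test probabilities the post-test probability is roughly the pre-test probability multiplied by the likelihood ratio, which is why the prevalence of the screened population matters as much as the test characteristics.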
Both the Centers for Medicare & Medicaid Services [18] and the U.S. Preventive Services Task Force [19] recommend lung cancer screening with LDCT for high-risk populations, but national screening rates are very low, at up to 14% [5]. Obstacles to lung cancer screening uptake are probably due to patient and primary care provider concerns about costs, inconvenience and possible risks associated with radiation and false-positive results. Additional limiting factors are absence of efficient proven programmes or lack of programme infrastructure. Offering a simple blood test to noncompliant eligible patients, as a tool to motivate them to get LDCT, could help overcome some of these barriers. Ease and safety of a blood test could encourage patients to be tested, and a positive blood test result could potentially convince patients to participate in LDCT screening programmes. If performance is confirmed in a prospective clinical study, prioritising eligible patients for LDCT based on such a test could alleviate system constraints by reducing the number of unnecessary procedures and providing effective patient selection. Reducing the number of LDCTs could also indirectly reduce the number of false-positive findings, and their adverse outcomes and costs.
Alternatively, such a test can be used to better identify high-risk people who should undergo LDCT. Currently, high-risk populations are defined by demographic and exposure factors (age and smoking history) with very limited discrimination of AUC 0.6-0.7 [20]. Consequently, in the USA, a mere 2.51 cancers are detected per 1000 LDCT scans [21] and >50% of lung cancer patients will not be considered eligible [7]. More elaborate risk models, such as PLCOm2012 (Prostate, Lung, Colorectal and Ovarian Cancer Screening Trial), report better performance (AUC 0.7-0.8) as they include other surrogate markers such as COPD and history of cancer, but they are more cumbersome and harder to implement in the clinical routine. In our analysis, the relationship between case status and a positive Lung EpiCheck result did not vary substantially by value of other risk factors (age, pack-years, quit years, sex, smoking status and COPD). Presence of these risk factors did not change or impact the strong relationship between lung cancer and the test result. Moreover, combining Lung EpiCheck with risk factors achieved a very high discrimination (AUC 0.942), which could allow better selection of high-risk populations for lung cancer screening. Further validation is required to confirm Lung EpiCheck predictive performance and to define the best way to combine Lung EpiCheck and risk factors.
Consistent with published evidence showing that cfDNA levels in the blood correlate with tumour burden in NSCLC [22] and other solid tumours [23-26], Lung EpiCheck sensitivity correlated with tumour size. Two studies found cfDNA signal to inversely correlate with survival in patients with newly diagnosed lung cancer [27,28], suggesting that it is also a prognostic marker for aggressiveness of the tumour. Further investigations are needed to inform whether there is a lower size limit of detectability by cfDNA, and if lack of cfDNA signal is an independent prognostic factor, or potentially a sign of overdiagnosis. Either way, a liquid biopsy for early detection must be very sensitive, in order to pick up the signal in the blood of early curable cancers. Lung EpiCheck's good preliminary results in early cancer classification can be explained by an analytical sensitivity of 1:200 000 [29], which is 20-200-fold higher than other liquid biopsy products available [30-32].
Cost-effectiveness is an important consideration in the applicability of screening tests. In a recently published model, in order to maintain the cost-effectiveness threshold of USD 50 000 per life-year gained, a marker added to risk factors to improve selection of patients for lung cancer screening could cost up to USD 300 [33]. Therefore, the next-generation sequencing based liquid biopsy tests, common in advanced settings, are not viable in this field, as their running costs alone are currently much higher at USD 1000-2000 per test. Similarly, other available lung cancer biomarker tests have sensitivity and specificity below screening requirements, further lowering the price at which they could be cost-effective. In contrast, Lung EpiCheck, with its high preliminary performance and its simple PCR-based technology, appears to be potentially well situated to be cost-effective and commercially viable.
Mutations are established key factors of cancer development (e.g. driver mutations, acquired therapy resistance) as well as important targets for treatment, and could be potential markers for lung cancer detection. However, mutations in genes such as p53 [34] drive clonal haematopoiesis [35] in up to 21% of healthy elderly people. This can act as a serious confounder and can generate false-positive results when using these mutations in blood tests for early detection of cancer. Alternatively, and unhampered by such problems, methylation changes have recently emerged as promising markers for cancer detection [16,36].

Limitations
Case-control studies are prone to selection bias, as cases and controls do not necessarily come from the same population and are not truly comparable. This is reflected in our study by the significant difference between the groups in parameters such as age, sex, smoking status and pack-years. Cases were patients diagnosed with lung cancer due to any reason (symptoms, incidental finding, screening), and do not reflect a screening population. Controls did not receive LDCT screening, nor were they followed up after blood draw; therefore, it is not known whether lung cancer cases were among them and were missed. Therefore, a prospective study in high-risk individuals undergoing LDCT with follow-up for lung cancer incidence is essential to confirm the current study findings. Staging was performed locally according to local standard of care in each site, so in the European validation there was a mix of AJCC7 and AJCC8. This probably translates to a potential overlap between large stage I and small stage II cancers. Collection of some data was limited in biobank samples, leading to missing smoking histories in 19% of the European validation set, limiting the multivariable analyses to a subpopulation of that set. Personal history of cancer is a known risk factor for lung cancer [37]; however, to ensure that the signal emerges from lung cancer, such patients were excluded. The Chinese set was a small single-centre study that included surgical patients only, and is therefore not representative of the Chinese lung cancer population; additionally, the controls were young, healthy and mostly nonsmokers, and are not representative of patients at risk. A larger prospective study in China is warranted to confirm the performance of Lung EpiCheck in this population.
Our current findings need to be validated in prospective trials.

Conclusions
Lung EpiCheck demonstrated strong suggestive performance in lung cancer prediction in case-control European and Chinese samples, detecting up to 78% of stage I tumours, up to 100% of SCLC and significantly improving predictive accuracy when added to established risk factors. Prospective studies are required to confirm these findings. Utilising such a simple and inexpensive blood test to select people for lung cancer screening has the potential to improve compliance and broaden access to screening for high-risk populations.