Abstract
Lung cancer is the major cause of death from neoplastic disease in the world, and even with politically-motivated smoking cessation campaigns throughout Europe, the disease remains the major cause of death. The development of molecular epidemiological population-based research into early lung cancer detection, such as the Liverpool Lung Project (LLP), may provide a way forward. This is the first major molecular epidemiological study of detection of early lung cancer.
The use of molecular epidemiological risk assessments prior to clinical diagnosis and markers of preclinical carcinogenesis in patients with a high risk of developing lung cancer will reduce the incidence of clinically-detectable lung cancer, given the appropriate intervention strategies.
The aims are as follows: 1) to prepare a molecular genetic and epidemiological risk assessment model based on environmental exposures and genetic predisposition; 2) to develop an archive of specimens relating to at-risk individuals and those with lung cancer; 3) to redefine lung cancer based on molecular pathology using the fields of expression profiling, genetic instability and molecular cytogenetics; 4) to identify and assess novel markers of precarcinogenesis in high-risk populations; and 5) to facilitate the development of new treatment strategies (e.g. chemoprevention programmes and targeted drug therapies).
The LLP has two components: 1) a case-controlled study of newly-diagnosed cases of lung cancer that will provide a baseline risk assessment; and 2) a prospective cohort study to be carried out over a 10-yr period that will identify markers of preclinical carcinogenesis. In-depth interviews are carried out using structured and semi-structured questionnaires. Sputum, blood and tumour specimens are collected and will be assessed for specific molecular markers (e.g. genetic instability, mutation and expression profiling, and methylation status).
Conclusions from The Liverpool Lung Project will be based around molecular-epidemiological and genotyping risk assessment models, as well as redefining the disease, and ultimately contributing to the development of new early lung cancer detection and treatment strategies.
This research project was funded by the Roy Castle Lung Cancer Foundation, Liverpool, UK.
Lung cancer is the major cause of death from neoplastic disease in the world. Europe is presently beset by a lung cancer epidemic, and even with politically motivated smoking cessation campaigns, the disease remains the major cause of death. Unfortunately, it is now clear that new smoking cohorts are emerging in children in their early teens. When this information is considered, alongside the fact that the majority of individuals who develop lung cancer in the USA are former smokers, this creates a major social problem.
There are >38,000 new cases of lung cancer per year in the UK, and this incidence is amongst the highest in Europe. Although the incidence is declining in some male populations, it is rising steadily in women 1, 2. Merseyside has some of the highest incidence rates in the UK, with little evident decline in males and a 30% increase in females aged <75 yrs between 1992 and 1995 3. In the Liverpool area, the cumulative rate (0–74 yrs) in 1994–1995 was 11.6% for males and 7.2% for females, compared with a national average of 7.2% for males and 3.1% for females in 1995 3, 4. Detection of lung cancer usually occurs late in the disease, when it is beyond effective treatment; consequently, there is a high mortality rate and a 5‐yr overall survival rate of 6% in Merseyside 5.
The development of a molecular epidemiological population-based study of early lung cancer detection could provide answers to some of the major questions posed in lung cancer. It is for this reason that the current authors have set up the Liverpool Lung Project (LLP). Conclusions from the LLP will be based around molecular-epidemiological and genotyping risk assessment models, redefining the disease, and ultimately contributing to the development of new treatment strategies (e.g. chemoprevention and targeted drugs).
This is the first major molecular epidemiological study of the detection of early lung cancer; however, substantial pieces of work have been published on the epidemiology of lung cancer 6 in the fields of smoking associations 7, occupational exposures 8, 9, diet 10 and social status 11. Recently, there has been a dramatic increase in the number of publications in the field of genotype studies and deoxyribonucleic acid (DNA) repair 12, 13 associations with lung cancer. The future lies in combining the data from such analyses 14, in order to develop robust risk assessment models that can contribute to future chemopreventive programmes 15. The LLP's objective is to achieve these aims. Currently, the present authors have enrolled >3,000 individuals at a cost of ∼£1 million·yr−1. This study has been planned over a 10-yr period in order to recruit 500 cases of lung cancer for the cohort study, which will provide adequate statistical power.
Risk factors for lung cancer
Smoking and lung cancer
Use of tobacco products and, in particular, cigarette smoking, are responsible for the majority of lung cancer cases, with an estimated attributable risk of ∼90% in males and 80% in females 16, 17. Such attributable proportions are place specific and depend on the prevalence of tobacco smoking and of other exposures. It is important to realise that the sum of the proportions of a disease attributable to different risk factors may exceed 100% because of the multiple pathways in the carcinogenic process. The fact that 90% of lung cancers are due to tobacco does not mean that only 10% are attributable to all other causes. The concept of interactions in the contribution to risk is exemplified by the importance of gene-environment interactions and the carcinogenic potency of complex mixtures. The incidence rates in nonsmokers in the USA have been reported as 14.7 per 100,000 in males and 12.0 per 100,000 in females 18, also indicating that other risk factors may be important. The smoking-associated risks are dependent on the age of starting to smoke, the duration of smoking, and the level and pattern of smoking.
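The point that attributable proportions need not sum to 100% can be illustrated with a small numerical sketch. The incidence rates below are hypothetical (they are not taken from any study cited here) and assume two exposures, A and B, whose joint effect is multiplicative; the function name is likewise only illustrative.

```python
# Hypothetical incidence rates per 100,000 person-years (illustrative only).
# Keys are (exposed to A, exposed to B).
rates = {
    (False, False): 10.0,
    (True, False): 100.0,   # A alone: 10-fold increase
    (False, True): 20.0,    # B alone: 2-fold increase
    (True, True): 200.0,    # both: multiplicative joint effect
}


def attributable_fraction(rates, factor):
    """Fraction of cases among the doubly exposed that would be
    avoided if the named factor ("A" or "B") alone were removed."""
    both = rates[(True, True)]
    without = rates[(False, True)] if factor == "A" else rates[(True, False)]
    return (both - without) / both


af_a = attributable_fraction(rates, "A")  # (200 - 20) / 200
af_b = attributable_fraction(rates, "B")  # (200 - 100) / 200
total = af_a + af_b

print(f"Attributable to A: {af_a:.0%}")
print(f"Attributable to B: {af_b:.0%}")
print(f"Sum: {total:.0%}")
```

In this hypothetical setting, removing either exposure prevents a large share of the same cases, so the two fractions overlap and their sum exceeds 100%.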
It is clear that the risk of lung cancer may be substantially reduced, dependent on the duration of smoking and age at cessation. It is also becoming apparent that many exsmokers remain at high risk because of genetic damage present in the bronchoalveolar epithelial cells. Smoking during adolescence may lead to accumulated DNA damage demonstrable many years later 19.
Smoking levels in adults declined to 28% in 1996, but this decline is concentrated in the higher socioeconomic groups (SEG) and in particular age groups. Rates have declined very little in people aged <24 yrs since the 1970s, with no decrease over this time in females aged <19 yrs. Smoking prevalence in 1998 was highest at 42% amongst those aged 20–24 yrs 20, and 26% of 15 yr olds were regular smokers, while 63% had tried smoking. Smoking is a major confounder in the assessment of other risk factors, and accurate estimates of smoking levels are required for adjustment in multivariate analysis.
The role of passive smoking has also been examined with relation to an increase in lung cancer risk in nonsmoking adults living with smokers. A recent meta-analysis 21 found an excess risk of 24% among lifelong nonsmokers with partners who smoked (relative risk 1.24, 95% confidence interval 1.13–1.36). Furthermore, a significant dose-response relationship was identified.
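As a brief illustration of how the quoted figures relate to one another, the sketch below derives the excess risk from the relative risk and back-calculates an approximate standard error from the confidence interval. It assumes the interval is a conventional Wald-type interval, symmetric on the log scale; that is an assumption about the meta-analysis, not something stated in the text.

```python
import math

# Figures quoted in the text for lifelong nonsmokers with partners who smoked.
rr, lower, upper = 1.24, 1.13, 1.36

# Excess risk expressed as a percentage of the baseline risk.
excess_risk_percent = (rr - 1) * 100

# Assumption: 95% CI = exp(log(RR) +/- 1.96 * SE), so SE can be recovered
# from the width of the interval on the log scale.
se_log_rr = (math.log(upper) - math.log(lower)) / (2 * 1.96)
z_score = math.log(rr) / se_log_rr

# An interval that excludes 1.0 indicates significance at the 5% level.
excludes_no_effect = lower > 1.0

print(f"Excess risk: {excess_risk_percent:.0f}%")
print(f"Approximate SE of log(RR): {se_log_rr:.3f}")
print(f"Approximate z-score: {z_score:.2f}")
```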
Occupational exposure and lung cancer
Lung cancer is ranked second only to bladder cancer in the proportion of cases thought to be due to occupational exposures 22, with an estimated attributable risk in the region of 15% 23. Associations between certain exposures and a significantly increased risk of developing lung cancer have been reported in many studies. Twenty-two chemicals, groups of chemicals or mixtures that are used in industrial or agricultural settings have been classified as established human carcinogens in the International Agency for Research on Cancer (IARC) monograph series, of which 14 are believed to act upon lung tissue 24. A further 22 chemicals have been classified as probably carcinogenic and 91 as possibly carcinogenic 25. Many of the current estimates of attributable risk for occupational exposures in lung cancer are wide (4–40%) 23, 26 and would be expected to vary in time and place. The synergistic effect of cigarette smoking and exposure to specific chemicals may, in part, account for high rates of lung cancer among workers in particular industries.
Air pollution and lung cancer
The dramatic increase in morbidity and mortality that occurred as a result of high air pollution levels in London in the 1950s directed attention to the relationship between air pollution and health. Apart from the obvious human cost of morbidity and mortality, a report by the British Lung Foundation 27 has placed the approximate health costs arising from road transport air pollution at >£11 billion per annum. Over 40 compounds considered to be carcinogenic or probably carcinogenic are found in air pollution. The presence of these compounds among air pollutants supports the hypothesis that air pollution may increase the risk of lung cancer 28.
A major drawback of previous studies has been the inadequate characterisation of air pollution exposure 29. Very few studies have been conducted that provide detailed information on personal exposures, interperson variability in exposure and a correlation of these exposures with levels measured at fixed-site monitors. Air pollution comprises a large number of compounds that are usually correlated over relatively short time periods, but changes in emissions over long time periods may result in substantial modification.
Diet and lung cancer
Dietary factors may be important in the aetiology of lung cancer; however, research on the association between diet and lung cancer remains inconclusive. Smoking levels and diet correlate with social deprivation, and smokers tend to consume diets lower in antioxidants and with fewer vegetables and fruits 30–32. The evidence on fat and cholesterol from both case-controlled and cohort studies is mixed. Some studies suggested increased risk with higher fat or cholesterol intake 33–38, while others found no association 39–41. The role of a high-fat diet in relation to increased risk of lung cancer is biologically plausible 42. Alcohol consumption is strongly associated with smoking, and a significant increase in lung cancer risk with high alcohol intake, adjusted for cigarette smoking, has been shown 43–47. Alcohol may act as a solvent for carcinogens, especially those in cigarette smoke 48, but residual confounding with tobacco may exist and explain much of the observed association.
There have been many studies of fruit and vegetable intake and most are consistent with a protective effect 38, 44, 49–52. Some found a statistically significant decrease in risk with higher carotenoid intake.
Many studies have also found a protective effect for vitamin C, although in many cases the association is weak 33, 44, 53–58. It is possible, however, that the decrease in risk is actually due to some other compound in foods containing these substances, or an aspect of lifestyle related to consumption of such foods. Selenium plays an important part in the metabolism of glutathione peroxidase, an enzyme that protects against oxidative change. Three large studies have examined the role of selenium and found a protective effect 59–62, but a recent review of epidemiological studies on diet and lung cancer over the last 20 yrs 63 concludes that they have not provided overwhelming evidence that diets low in fats and high in fruit, vegetables and antioxidants are associated with reduced lung cancer risk.
Social class and lung cancer
Inequalities in health are one critical facet of social inequality 64. Health inequalities are manifested in numerous ways (e.g. lack of access to healthcare, lack of health education, a concomitant lack of understanding of the significance of symptoms, differential referral patterns to cancer specialists). These could explain why lower SEG is associated with poorer survival from lung cancer. As lung cancer is a disease affecting many older people, an additional group who may be socially disadvantaged is the elderly, having lower levels of diagnostic investigation and treatment than their younger counterparts. A previous local study 65 demonstrated variations in treatment by age and district of residence.
There is a well-documented inverse relationship between social class and lung cancer 66–68. Essentially, the risk of developing lung cancer is significantly higher in the more disadvantaged sections of society. A recent study, examining the relationship between social deprivation and cancer in Scotland 69, observed a three-fold difference for lung cancer between the most and least deprived areas. Factors that could, in part, explain the association between lung cancer and lower SEG include smoking, diet, occupational exposures, and exposure to environmental pollution associated with area of residence. Results from a recent cohort study suggest an additional risk due to poor lung health, deprivation and poor socioeconomic conditions throughout life 70.
Family history and lung cancer
Family studies have shown a two- to three-fold increase in risk in nonsmokers who have relatives with lung cancer compared to nonsmokers with no family history 71, 72. Many studies suggest that there may be inherited tumour suppressor genes or oncogenes relating to the development of lung cancer, or a genetically-determined ability to metabolise carcinogens. One study suggested that virtually all lung cancer occurs among gene carriers 73, but the current evidence is conflicting and many studies lack the power to detect small risks due to factors other than smoking. While a shared environment may also explain the familial aggregation, some genetic markers for susceptibility have been suggested. These include polymorphisms of CYP1A1, which is responsible for the metabolic activation of benzopyrene; of CYP2D6; and of glutathione S-transferase (GST), which catalyses the conjugation of polycyclic aromatic hydrocarbons (PAH).
Genetic susceptibility and lung cancer
Humans are constantly exposed to chemical carcinogens in their everyday lives, but only a small proportion of those with the highest exposure (i.e. smokers) develop lung cancer. Epidemiological evidence suggests that the genes controlling the metabolism of carcinogens and antioxidant or nutritional status are associated with lung cancer risk, possibly through their ability to modulate DNA damage by carcinogens. Since many carcinogenic compounds require metabolic activation to enable them to react with cellular macromolecules, individual features of carcinogen metabolism may play an essential role in the development of environmental cancer 74.
Cytochrome P450s are a superfamily of oxidising enzymes, the majority of which are involved in the metabolism of xenobiotics 75. Most xenobiotics found in tobacco smoke require metabolic activation before they exert carcinogenic activity. For instance, transformation to the ultimate carcinogen of Benzo‐α‐pyrene (BP) can occur by co-oxidation in the presence of various fatty acids. It is conceivable that BP from tobacco smoke is readily oxidised to the ultimate carcinogen as a consequence of a high fat diet 42. CYP1A1, one of the most extensively studied P450s, is not expressed above a basal level in any human tissue except the lungs of smokers 1, but has been shown to be highly inducible by PAH 77. The combined effect of vitamin status and genetic susceptibility on DNA damage may explain why individuals exposed to PAH have greater lung cancer risk than others with comparable exposures 78.
Aromatic amines (aryl- and heterocyclic) are a class of carcinogens present in both diet and cigarette smoke 79. They can be N‐ or O‐acetylated by the polymorphic arylamine N‐acetyltransferase (NAT) 1 or NAT2 enzymes, resulting in activation or, in some cases, detoxification. NAT2 is considered to be a susceptibility factor for a number of malignancies. Carriers of the NAT2*4/*4 genotype, with its especially high acetylation capacity, are at significantly increased risk of lung cancer 80.
The relative risk of individuals with a combination of both a homozygous rare allele of CYP1A1 and a null GST1 was “remarkably high” at 5.8 for lung cancer and 9.1 for squamous cell carcinoma, compared with other combinations of genotypes 81. The role of genetic susceptibility remains somewhat controversial. Braun et al. 82 concluded that genetic factors could not be used to predict lung cancer risk in male smokers >50 yrs. This came from their analysis of 15,924 pairs of mono- and dizygotic twins. There was no greater concordance in lung cancer death between monozygotic twins than with dizygotic twins, even when smoking histories were similar. However, there is a considerable body of evidence to the contrary.
The limitation of most of the studies undertaken to date has been a low number of cases included. In a recent review of 17 studies, an average of only 137 cases (range 35–447) were analysed for GSTM1 polymorphisms 83. Heterogeneity of these studies is the main reason for uncertain results and even meta-analysis has failed to show a precise estimate of the true odds ratio. The failings in study design and lack of statistical power in many studies have been recently well described by D'Errico et al. 84.
Early detection biomarkers
Deoxyribonucleic acid methylation analysis
Aberrant DNA methylation within CpG islands is common in human malignancies, leading to abrogation or overexpression of a broad spectrum of genes 85. Abnormal methylation has also been shown to occur in CpG-rich regulatory elements in intronic and coding parts of genes for certain tumours 86. Using restriction landmark genomic scanning, Costello et al. 87 were able to show that methylation patterns are tumour-type specific. Highly characteristic DNA methylation patterns could also be shown for breast cancer cell lines 88. Genome-wide assessment of methylation status represents a molecular fingerprint of cancer tissues, as does large-scale messenger ribonucleic acid (mRNA) expression monitoring, and therefore should allow tumour class prediction and discovery.
Recently, an application of methylation specific polymerase chain reaction (MSP) in serum of patients with nonsmall-cell lung cancer has been published 89. The assumption is that tumour cells may release DNA into the circulation, which is enriched in serum and plasma. After purification, 1 mL of serum yields ∼50 ng of DNA. In a series of 22 patients, four genes were examined for the presence of methylated CpG islands. In 68% (15 out of 22) of the tumours, aberrant methylation in at least one of the genes was present, but not in the normal tissue. In 11 out of 15 (50% of total), abnormal methylated DNA was demonstrated in the matched serum samples. In another study by the same group, bronchoalveolar lavage (BAL) samples were examined. In 12 cases methylation of one gene (p16) was found in BAL fluid (19 tumours with methylation out of a total of 50). The other 38 were negative with MSP 90. Recently, the feasibility of using methylation studies to detect early cancer has been demonstrated, with MSP in sputum with a sensitivity of 1/50,000 alleles 91 and detection of p16 and O6-methylguanine-DNA methyltransferase promoter methylation 1–3 yrs before cancer diagnosis. It is of note that p16INK4 promoter hypermethylation and p53 mutations have been found at a high frequency in exfoliative material (i.e. sputum, BAL, brushings) from symptomatic chronic smokers and mark the development of lung cancer 92, 93. These findings emphasise the possible relevance of methylation detection of early lung cancer.
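The detection rates quoted for the serum and BAL studies depend on which denominator is used (all patients, or only those whose tumour carried the methylation marker). The sketch below simply recomputes the proportions from the counts given in the text; the helper name `percent` is hypothetical.

```python
def percent(numerator, denominator):
    """Return a proportion as a percentage rounded to the nearest integer."""
    return round(100 * numerator / denominator)


# Serum study: counts as quoted in the text.
patients = 22
tumour_methylated = 15   # tumours with aberrant methylation in >= 1 gene
serum_positive = 11      # matched serum samples with methylated DNA

tumour_rate = percent(tumour_methylated, patients)
serum_rate_all = percent(serum_positive, patients)
serum_rate_marked = percent(serum_positive, tumour_methylated)

# BAL study: counts as quoted in the text.
bal_patients = 50
bal_tumour_methylated = 19   # tumours with p16 methylation
bal_positive = 12            # BAL samples positive by MSP

bal_rate_all = percent(bal_positive, bal_patients)
bal_rate_marked = percent(bal_positive, bal_tumour_methylated)

print(f"Tumours methylated: {tumour_rate}% of patients")
print(f"Serum positive: {serum_rate_all}% of patients, "
      f"{serum_rate_marked}% of methylated tumours")
print(f"BAL positive: {bal_rate_all}% of patients, "
      f"{bal_rate_marked}% of methylated tumours")
```

Expressed against methylated tumours only, serum detection corresponds to roughly three quarters of marker-positive cases, whereas against all patients it is one half.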
Genomic instability
Genomic instability is the most common molecular abnormality in human tumour cells 94, 95. One form of genomic instability is allelic imbalance or loss of heterozygosity (LOH), which reflects chromosomal changes such as aneuploidy, polyploidy, and losses and amplifications of chromosomal regions. The other form is microsatellite instability (MIN or MSI), also referred to as microsatellite alterations (MA) or replication errors (RER), which represents replication and DNA repair infidelity. The high incidence of genomic instability in lung tumours has been well established 96–100 and in some cases it has been associated with prognosis 101–103. The present authors have recently demonstrated genetic alterations in 97.6% of lung tumours examined by a panel of 12 microsatellite markers selected at specific locations 104, and have calculated the threshold of LOH detection to be 23% by assessing the interassay variation.
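To make the role of a detection threshold concrete, the following is a minimal sketch of how allelic imbalance might be scored at a heterozygous microsatellite marker. It assumes one common convention, in which the tumour allele ratio is normalised to the matched normal ratio and LOH is called when the relative reduction reaches the threshold; this is not necessarily the exact definition used by the authors, and the peak heights and function names are hypothetical.

```python
def allelic_imbalance(normal_peaks, tumour_peaks):
    """Relative reduction of one allele in tumour versus matched normal DNA.

    Each argument is a pair of peak heights (allele 1, allele 2) for a
    heterozygous marker. Returns a value between 0 (no change) and 1.
    """
    n1, n2 = normal_peaks
    t1, t2 = tumour_peaks
    ratio = (t1 / t2) / (n1 / n2)
    if ratio > 1:
        ratio = 1 / ratio  # express the change as a reduction, whichever allele is lost
    return 1 - ratio


def is_loh(normal_peaks, tumour_peaks, threshold=0.23):
    """Call LOH when the imbalance reaches the threshold (23% in the text)."""
    return allelic_imbalance(normal_peaks, tumour_peaks) >= threshold


# Hypothetical peak heights (arbitrary fluorescence units).
normal = (1000.0, 950.0)
tumour_with_loss = (900.0, 400.0)
tumour_balanced = (980.0, 900.0)

print(is_loh(normal, tumour_with_loss))
print(is_loh(normal, tumour_balanced))
```

Setting the threshold from interassay variation, as described above, guards against calling LOH on differences that fall within the normal run-to-run noise of the assay.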
Lung cancer is the most common cause of neoplasia-related death worldwide. Moreover, it usually has a very poor prognosis with a ≤6% 5‐yr survival. One of the reasons for this low survival is that cancer is most often diagnosed when it is beyond effective treatment. Thus, there is an increasing demand for new early lung cancer detection tools 105, 106. Lung cancer develops through a multistage process with increasing genomic instability. Genetic alterations have been detected in preneoplastic lung lesions 107–109 as well as in bronchial tissue from smokers with no evidence of lung malignancy 110, 111. DNA aberrations precede morphological transformation 112 and, thus, are favourable markers and potential tools for the identification of individuals at high risk for developing lung cancer. It has been previously shown that genomic instability can be detected in bronchial lavage (BL) and sputum and this may be one of the ways forward to assist in early diagnosis of lung cancer 113–116. The present authors have demonstrated genomic instability in the BL from a number of individuals with no clinical evidence of lung cancer, posing a question about exclusive occurrence of genomic instability in cancer. This observation was also supported by reports of genomic instability in nonmalignant diseases 117–123.
The technological advantages of fluorescence polymerase chain reaction (PCR)-based assays provide the ability to detect DNA changes from minute amounts of starting material in multiplex reactions 124. Furthermore, automated analysis on sequencers/genetic analysers not only increases throughput but also reduces operator errors during analysis. The present authors have already examined the incidence of genetic alterations in 65 microsatellite markers in lung cancer specimens and have shown that allelic imbalance in some chromosomal regions appears to be cancer specific while others are not 125. The aim of this aspect of the study is thus to identify cancer-specific microsatellite marker assay(s) to be applied to BL, sputum and plasma/serum specimens in order to assist in the early diagnosis of the disease. Ninety-six lung tumour specimens have already been screened with 65 microsatellite markers, and the 24 most informative markers have been chosen to be assayed for cancer specificity by examining 100 BL specimens from lung cancer patients and 100 specimens from individuals with nonmalignant lung diseases.
Genome-wide lung cancer expression analysis: identification and investigation of genes linked to the lung cancer phenotype
Chronic exposure of the bronchial epithelium to carcinogenic agents (such as those present in tobacco smoke) appears to lead to focal epithelial changes (hyperplasia, dysplasia, carcinoma in situ) that are scattered throughout the tracheobronchial tree 126, 127. These overt cellular changes may be preceded by the widespread proliferation of apparently histologically normal but genetically damaged clones of cells, presumably as the result of an acquired growth advantage. This growth advantage will be conferred by gene mutation and gene misregulation. In frequent smokers, lung tumours are therefore likely to arise from, and against, a background of multiclonal, genetically altered, premalignant cells.
The present authors and others have shown that both genetic damage and inappropriate gene expression may be useful predictors both of early neoplasia or preneoplasia and of the clinical behaviour of lung tumours 113, 128–131.
One of the most promising approaches to the early detection of lung cancer is based on the observation that premalignant cells are shed from developing lesions, most likely as a consequence of the level of structural disorder of the tissue, and that these cells may be detected in the sputum of individuals perhaps several years before overt disease can be recognised by conventional procedures 128, 132. In order to use such a technology most effectively, a complete picture of the nature of gene expression in tumour cells is required, such that a panel of highly specific antibody diagnostics (or reverse transcriptase PCR (RT-PCR) assays) against key targets may be developed and tested. The LLP-generated sputum and blood samples will constitute an extremely valuable experimental resource for the validation of novel diagnostic reagents.
The aim is therefore to carry out a genome-wide expression analysis of human lung cancer in order to redefine the disease at the molecular level. The power of such an approach has recently been highlighted in a study of B‐cell lymphoma 133.
Cytological analysis of lung tumours
Cytological changes in clinical samples have long formed the basis of cancer diagnosis. The simplest method of screening for cancerous changes in the airways is to analyse exfoliated cells in sputum samples 112, 134. Alternatively, when patients present for evaluation of a possible lung tumour, a sample of cells is collected by BL 113.
However, several studies have indicated that multiple genetic changes are already present in the apparently normal bronchial epithelium of smokers 110, 111. A number of biomarkers have been evaluated for early diagnosis of lung cancer but none have yet been found robust and reliable enough for routine screening 135. It is clear that much more needs to be known about the development of lung cancer before suitable markers can be used for screening. Cell-based studies have the benefit of allowing the analysis of very small samples with high sensitivity and specificity. These studies can be based on DNA changes, or on the downstream effects of genetic change resulting in alterations to cellular proteins, since both DNA and protein remain relatively stable in clinical samples.
Molecular cytogenetic analysis of lung tumours
Molecular cytogenetic analysis of cytological specimens has the potential to detect changes in single cells and thus is likely to be more sensitive than PCR-based methods of screening. Chromosome copy changes have been reported in the normal bronchial epithelium of lung cancer patients 136–138, but most fluorescence in situ hybridisation (FISH) studies have used centromeric probes, which give good signals but may not be sufficiently informative for tumour-specific changes 139–141. A more specific set of probes is required to provide better discrimination between general epithelial disturbance and actual tumour development. This means that there should be a thorough analysis of chromosomal changes in lung tumours.
Identification of tumour-specific chromosome changes in lung cancer
Lung tumours frequently show extensive chromosome changes, with gains and losses of many chromosome segments, often generating marker chromosomes whose origin is not immediately apparent 142–144. Molecular cytogenetics, using FISH techniques, allows us to examine the content of marker chromosomes in great detail. However, the complexity of chromosome changes seen in lung tumours makes it difficult to determine what changes, if any, are important in the development of the disease. An answer will only emerge after the thorough analysis of many tumours.
Analysis of chromosome changes in lung cancer by comparative genomic hybridisation
Conventional cytogenetic analysis is done on metaphase chromosomes, which means that the cells must be cultured in the laboratory to obtain sufficient dividing cells for analysis. Unfortunately, the success rate for culturing lung tumours is known to be low (<30%) and squamous cell carcinomas are particularly difficult in this respect 142. Consequently, comparative genomic hybridisation (CGH) is a particularly valuable method for analysing the gain or loss of chromosome segments in these types of tumours 145. A number of CGH studies have been performed on lung tumours and a body of data is accumulating 146, 147; however, the technique is complex and inadequate quality control can affect the reliability of results. It is also possible that variations in carcinogen exposure and genetic background in different populations may result in differences in chromosomal changes. Therefore, it is important to establish the pattern of chromosomal changes in lung tumours in individuals from the Merseyside region in relation to other populations.
Evaluation of protein markers in lung tumours
The downstream effects of genetic changes in cells can also be analysed by comparing the distribution of RNA or proteins in normal and tumour cells. These patterns can be examined across a range of cell types within tissue sections or tissue arrays, using in situ hybridisation or immunohistochemistry. Changes in the normal pattern of staining can then be correlated with disease stage, and potential candidates evaluated for clinical significance.
Risk assessment research
The way forward for improved management and prognosis for the common cancers lies with early detection of disease. Economics will dictate that screening for lung cancer will not be available for the entire population. Therefore, there is a major need to be able to identify those people at highest risk of disease who would benefit from prevention measures. The causes of common cancers may have their basis in environmental exposures occurring in the genetically predisposed host. It is essential, therefore, that the interaction between lifestyle factors and susceptibility genes can be studied to produce a risk assessment model. The major strength of the LLP lies in the potential to carry out this work.
Objectives of the Liverpool Lung Project
The use of molecular-epidemiological risk assessments prior to clinical diagnosis and markers of preclinical carcinogenesis in patients with a high risk of developing lung cancer will reduce the incidence of clinically detectable lung cancer, given the appropriate intervention strategies.
Aims
There are five major aims of the LLP: 1) to prepare a molecular genetic and epidemiological risk assessment model based on the analysis of environmental exposures and genetic predisposition, which will provide an algorithm to measure an individual's risk for developing lung cancer; 2) to develop an archive of specimens relating to at-risk individuals and those with lung cancer; 3) to redefine lung cancer based on molecular pathology using the fields of expression profiling, genetic instability and molecular cytogenetics; 4) to identify and assess novel markers of precarcinogenesis in the high-risk populations; and 5) to facilitate the development of new treatment strategies, i.e. chemoprevention programmes and targeted drug therapies.
Study design
If the results are to be widely applicable, it is important that the study is population based. Therefore, the LLP is being conducted in a defined geographical area of Merseyside, based on contiguous electoral wards with a high incidence of lung cancer. The LLP has two components: first, a case-controlled study of newly diagnosed cases of lung cancer, which will provide a baseline risk assessment; and second, a prospective cohort study, which provides serial samples to identify markers of preclinical carcinogenesis and contemporaneous lifestyle data (fig. 1).
Case-controlled study
Twelve hundred and fifty newly diagnosed cases of lung cancer will be entered into the study. Two controls per case, matched for age and sex, will be randomly selected from the study area population. All cases of epithelial tumours of the trachea, bronchus and lung will be included. Approaching newly diagnosed lung cancer cases to invite them to take part in a research project is difficult. Cases are initially identified from many sources including pathology reports, specialist lung cancer nurses, the palliative care team, and clinicians in the lung cancer rapid access and oncology clinics. This work is carried out by a research nurse, working with the clinical teams.
Cohort study
All residents aged 45–79 within the study area (total of 326,000) will be eligible for entry into the study. A random selection of 7,500 people will be chosen from this population via the general practitioners' (GPs') lists. All the GPs who have practices within the study area have been asked to collaborate with the project and an ∼2% sample of each GP's patients will be included. An anonymous list of patients registered with collaborating GPs is held in the study centre. Randomised samples are drawn from each GP list, with individuals identified by a study number. Invitation letters for cohort and control subjects are sent out from the Central Operations Group office.
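The per-list random draw described above can be sketched as follows. This is a minimal illustration only: the practice codes, the record keys, the `LLP000001`-style study number format and the fixed seed are all hypothetical and are not taken from the study protocol.

```python
import random

def draw_gp_samples(gp_lists, fraction=0.02, seed=2002):
    """Draw a simple random sample of roughly `fraction` from each GP list.

    gp_lists maps a (hypothetical) practice code to a list of anonymised
    record keys. Returns a dict: study_number -> (practice, record_key).
    """
    rng = random.Random(seed)  # fixed seed so the draw can be reproduced
    selected = {}
    next_number = 1
    for practice in sorted(gp_lists):
        patients = gp_lists[practice]
        n_draw = round(len(patients) * fraction)
        # rng.sample draws without replacement, so nobody is picked twice
        for key in rng.sample(patients, n_draw):
            study_number = f"LLP{next_number:06d}"  # hypothetical format
            selected[study_number] = (practice, key)
            next_number += 1
    return selected

# Two made-up anonymised lists, purely for illustration
gp_lists = {
    "GP-A": [f"A{i}" for i in range(1500)],
    "GP-B": [f"B{i}" for i in range(2200)],
}
sample = draw_gp_samples(gp_lists)
print(len(sample), "subjects selected")
```

Sampling the same fraction from every list keeps each practice represented in proportion to its size, which is one simple way of obtaining the ∼2% sample per GP.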
Exclusions
All exclusions are monitored, analysed for bias and recorded. Primary reasons for exclusion include: subject refusal, inability to contact the subject, the subject being resident outside the study area, a clinician's refusal to allow the subject to be approached, and the subject being too ill or unsuitable for interview. The subject may be considered unsuitable if, for example, they have advanced senile dementia or are otherwise unable to understand the project sufficiently to give informed consent.
Power calculations
Cohort study
In a cohort of 7,500 persons followed up over 10 yrs, the present authors expect to see 500 cases, assuming an incidence rate of approximately eight in 1,000. This will give a power of >95% to detect risk ratios of 2.0 at the 5% significance level. The pilot study has demonstrated that 1,000 people can be recruited into the cohort, provide samples and complete the questionnaires in a single research clinic in 1 yr.
Case-controlled study
Approximately 1,000 new cases of lung cancer are diagnosed in the LLP study area every year, around 350 in females and 650 in males <80 yrs. For exposures with a control prevalence of 5%, the proposed study would be able to detect relative risks of 1.75 at the 5% significance level with a power of 95%, or relative risks of 1.5 with a power of 80%, given a recruitment of 1,250 cases.
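The order of magnitude of these power figures can be checked with a standard normal approximation for comparing two proportions. The sketch below is not the authors' calculation: it assumes an unmatched comparison, treats the relative risk as an odds ratio, uses a two-sided 5% test and takes two controls per case, as stated in the study design.

```python
import math

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def exposure_prevalence_in_cases(p_control, odds_ratio):
    """Exposure prevalence among cases implied by the odds ratio."""
    odds = odds_ratio * p_control / (1.0 - p_control)
    return odds / (1.0 + odds)

def approx_power(p_control, odds_ratio, n_cases,
                 controls_per_case=2, z_alpha=1.96):
    """Approximate power of a two-sided test for a difference in
    exposure prevalence between cases and controls (unmatched)."""
    p_case = exposure_prevalence_in_cases(p_control, odds_ratio)
    n_controls = n_cases * controls_per_case
    se = math.sqrt(p_case * (1.0 - p_case) / n_cases
                   + p_control * (1.0 - p_control) / n_controls)
    z_effect = abs(p_case - p_control) / se
    return normal_cdf(z_effect - z_alpha)

for rr in (1.75, 1.5):
    print(rr, round(approx_power(0.05, rr, 1250), 3))
```

Under these assumptions the approximation comes out broadly in line with the quoted figures, although a calculation that accounts for the matching would differ in detail.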
Analysis
Multivariate conditional logistic regression will be used for both the case-controlled and cohort studies, the latter being analysed primarily as a nested study. The risk assessment model will be based on data from the case-controlled study. This will later be validated and possibly refined by data from the cohort study. Variable selection will be critical and several different approaches will be taken and compared; black-box stepwise selection will not be used alone. This is an ongoing area of statistical research, which furthers the work embodied in Miller 148. Given the nature of the data collection, the present authors also propose to incorporate and extend the methods of Carroll et al. 149 to allow for measurement error.
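For reference, the conditional likelihood that underlies conditional logistic regression can be written for the 1:2 matched design as follows. The notation is an assumption introduced here for illustration: matched set i contains one case (index j = 0) and two controls (j = 1, 2), x_ij is the covariate vector of subject j in set i, and β is the vector of log odds ratios.

```latex
\[
  L(\beta) \;=\; \prod_{i=1}^{N}
  \frac{\exp\!\left(\beta^{\top} x_{i0}\right)}
       {\sum_{j=0}^{2} \exp\!\left(\beta^{\top} x_{ij}\right)}
\]
```

Because each factor compares the case only with the controls in its own matched set, the set-specific intercepts cancel and the matching variables (age and sex) do not need to be modelled explicitly.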
Some simpler nonglobal, and even univariate, analyses will also be presented for ease of interpretation and comparison. There will be some survival analysis regarding time to onset of disease following occurrence of precursor markers, based on the cohort study. Subanalyses will be made by tumour cell type, and the present authors will build risk assessment models for environmental, occupational, lifestyle, dietary, and genetic risks in isolation and in subsets. This will enable the present authors: 1) to study their contributions to the total attributable risk, and 2) to validate the findings against those of other studies where global models have not been possible. Due to the dangers of multiple hypothesis testing, all subgroup analyses will be presented with suitably strong caveats. There will be no interim analysis of either the case-controlled study or the cohort study.
Ethical issues and informed consent
Ethical approval for the LLP was initially obtained from the three Local Research Ethics Committees between October 1997 and February 1998.
Further ethical approval was obtained in March 2000 for the more detailed information sheets and consent form for DNA and tissue samples, which had been drawn up to comply with the draft new Medical Research Council guidelines 150.
Epidemiological data
The methodology, with regard to the structured interview and obtaining specimens, is essentially the same for the two elements of the LLP. An in-depth interview will be carried out using structured and semi-structured questionnaires. Those subjects who return for follow-up will complete a short form of the questionnaires aimed at recording any changes in lifestyle over the interval since their last attendance.
The lifestyle questionnaire has been developed in-house and covers areas such as active and passive smoking history, residential history, medical history, family history and hormonal history (for females). The residential history will be correlated with environmental pollution monitoring data. This work includes the production of detailed maps for each pollutant, from current and historical data.
The methodology for assessing occupational exposure has been developed by Siemiatycki et al. 151 in Canada. These methods are based on expert judgement applied to job descriptions obtained through detailed and structured interviews.
Occupational exposure assessment
One of the most complicated aspects of population-based studies of occupational cancer risk is retrospective exposure assessment. New methods have been developed in order to avoid the errors involved in using job titles as surrogates for exposures, or self-reporting of exposures. These methods are based on expert judgement applied to job descriptions obtained through detailed and structured interviews 151–154.
This methodology has been adopted for the multicentre case-controlled study of Occupation, Environment and Lung Cancer in Countries of Central and Eastern Europe co-ordinated by IARC.
Dietary assessment
The role of dietary factors will be assessed using a food frequency questionnaire, closely based on that used for the multicentre European Prospective Investigation of Cancer study of diet and cancer, modified for local diets. The questionnaire will be validated with a 5‐day food diary.
Estimation of exposure to environmental pollution
The assessment of an individual's long-term exposure to air pollutants is not a simple task. The available data from monitoring programmes must be extrapolated, or at least interpolated, both in time and space. However, the density of the monitoring networks is not sufficient to map air pollutant concentrations by simple numerical interpolation, since the temporal and spatial variability of air pollutant concentrations is too large. Scientists at the National Environmental Technology Centre, AEA Technology, have developed alternative mapping methods for air pollutant concentrations based on the relationships found between pollution measurements and variables such as population density, land use or road traffic density, for which detailed spatial data are available. These methods allow the production of air pollutant maps at 1 km² resolution. Currently, maps are available for 1994 and 1996 for nitrogen dioxide, sulphur dioxide, carbon monoxide, benzene and particles with a 50% cut-off aerodynamic diameter of 10 µm.
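One simple way in which a residential history could be combined with such gridded maps is a time-weighted average over the grid cells in which a subject has lived. The sketch below is an assumption made for illustration, not the method used in the LLP: the grid cell identifiers, the concentrations and the residence durations are invented, and a single map is used for all years.

```python
def time_weighted_exposure(residences, grid_concentration):
    """Average concentration over a residential history, weighted by
    the number of years lived at each address.

    residences: list of (grid_cell, years_lived) tuples.
    grid_concentration: dict mapping grid_cell -> annual mean concentration.
    """
    total_years = sum(years for _, years in residences)
    if total_years <= 0:
        raise ValueError("residential history must cover a positive duration")
    weighted = sum(grid_concentration[cell] * years
                   for cell, years in residences)
    return weighted / total_years

# Invented values: two 1 km grid cells and a 40-year residential history
no2_map = {(334, 390): 40.0, (335, 391): 30.0}
history = [((334, 390), 10), ((335, 391), 30)]
print(time_weighted_exposure(history, no2_map))
```

A fuller treatment would need a separate map for each period, since the text notes that the data must be extrapolated in time as well as in space.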
Conclusion
Consideration must be given to the redefinition of lung cancer using molecular biology approaches because the classic pathological and clinical methods are inaccurate. The present authors are planning to undertake genome-wide expression profiling, and so are in a particularly strong position to undertake this task. The importance placed internationally on spiral computed tomography (CT) early lung cancer detection trials has been taken into account; spiral CT will play an important role in the public image of, and clinical approach to, early lung cancer detection. Thus, the present authors are in a particularly strong position to propose future research programmes that may be run in collaboration with such trials.
The results of the LLP research programme will make a significant contribution to the risk assessment of individuals who may develop lung cancer, as well as providing methods of identifying genetic aberrations in bronchial cells prior to a clinical diagnosis of the disease. The present authors are now in a position to make a major contribution to the clinical and scientific community by undertaking a redefinition of lung cancer based on expression profiling, which will lead to the provision of specific early detection markers as well as providing targets for future therapy.
The flow diagram of the LLP (fig. 2) indicates the contribution this research programme can make to the identification of individuals who are at risk of developing lung cancer and to the molecular/pathological definition of the disease, and shows how these research modalities feed into future intervention and treatment modalities.
This is a unique project whose major strength lies in the combination of epidemiology and molecular genetics. Its successful completion should have a major impact on the prevention and management of lung cancer.
Appendix
Liverpool Lung Project management
The cohort study design is based on randomised selection of a population sample invited to take part by letter. Subjects who agree to take part are then given an appointment for a nurse-led research clinic. This enables full informed consent to be obtained, a full medical history to be taken and a basic health check to be carried out. Blood and sputum samples can be collected in a safe environment, according to the study protocols, and the subject can then spend time with one of the medical interviewers to complete the questionnaires.
The project manager is responsible for the day-to-day running of the project. The project management requires clerical support and fully staffed clinic facilities, which include a receptionist, a research nurse and two medical interviewers. Furthermore, laboratory resources are required to process the specimens and undertake the cytology screening; these include a technician to log and bar-code all samples, cytology screeners and a technician to prepare DNA from the blood and sputum specimens. In addition, a full-time research nurse is required to liaise with clinical staff throughout the six hospital trusts involved with the project and to recruit new lung cancer cases. The LLP currently has two research clinics: the “Tockman Clinic” on the Cardiothoracic Centre National Health Service (NHS) Trust site, which was opened in June 1998, and the Mobile Unit, which was launched in July 1999.
The research clinics
The organisation of the project is based around dedicated, nurse-led community clinics. One of these clinics is a mobile unit. Many of the study population live in socially and materially deprived areas and mobile clinics enable access, raise the profile of the project and give ownership of the project to local communities.
One research clinic can handle ∼50 appointments a week, giving a maximum annual capacity of ∼2,400. The initial interview may take ≤2 hrs if a subject has a complicated occupational history, whereas a follow-up appointment usually takes half this time.
Basic health advice is given, where appropriate, and subjects are provided with a referral letter for their GP if the nurse is concerned for a person's health. Most referrals relate to undiagnosed hypertension (systolic >160 mmHg, diastolic >105 mmHg).
Sputum induction
Sputum induction appears to be a safe procedure for patients with asthma and chronic obstructive pulmonary disease 155, 156, although there is no evidence for its use in a population sample. Although the technique produces small changes in spirometry and arterial oxygen saturation, these changes are generally asymptomatic and supplemental oxygen is not required. Oxygen levels are monitored throughout the procedure with a transcutaneous handheld pulse oximeter (NPB‐40; Nellcor Puritan Bennett UK Ltd, Bicester, UK). Baseline spirometry is performed, including forced expiratory volume in one second (FEV1). Post-induction FEV1 is also measured to monitor the drop in pulmonary function caused by sputum induction. The procedure is generally considered safe at FEV1 levels of 0.7 L or >50% of the predicted value. A drop in FEV1 is generally detected after 7–10 mins and sputum induction may be discontinued if there is a drop in FEV1 of >20%. If the patient continues to complain of shortness of breath, bronchodilation with 2.5 mg salbutamol via nebuliser is considered. Any patient with a history of reversible airways disease should receive salbutamol 2.5 mg via nebuliser for pre-induction bronchodilation.
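The >20% criterion above amounts to a simple percentage calculation on the baseline and post-induction FEV1 values. The sketch below only illustrates that arithmetic; the function names and the example readings are hypothetical, and the decision to stop remains a clinical judgement.

```python
def fev1_drop_percent(baseline_l, post_l):
    """Percentage fall in FEV1 relative to the baseline value (litres)."""
    if baseline_l <= 0:
        raise ValueError("baseline FEV1 must be positive")
    return (baseline_l - post_l) / baseline_l * 100.0

def exceeds_drop_threshold(baseline_l, post_l, threshold_percent=20.0):
    """True if the fall in FEV1 is greater than the threshold (>20% here)."""
    return fev1_drop_percent(baseline_l, post_l) > threshold_percent

# Hypothetical readings: a fall from 2.5 L to 1.9 L, and from 2.5 L to 2.2 L
print(exceeds_drop_threshold(2.5, 1.9))
print(exceeds_drop_threshold(2.5, 2.2))
```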
Subjects are advised not to undertake active exercise immediately after sputum induction because of the risk of asymptomatic persistent desaturation.
A portion of the sputum sample obtained is retained for routine cytology screening. Copy cytology reports are sent to the subject's GP. There is little evidence in the current literature for the mechanism of the progression of clones of dysplastic cells. A protocol has been agreed for the management of subjects with mild, moderate or severe atypia. All such subjects are recalled for sputum induction on a 6‐monthly basis. Subjects with persistent mild atypia, or moderate or severe atypia are referred to one of two specialist respiratory physicians for further investigation. The follow-up protocol includes annual CT scan and bronchoscopy, together with sputum induction carried out in the research clinic.
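The recall and referral rules in this protocol can be summarised as a small decision function. This is a sketch under stated assumptions: the text does not define "persistent", so it is taken here to mean mild atypia on two consecutive samples, and the grade labels and return format are hypothetical.

```python
GRADES = ("none", "mild", "moderate", "severe")

def follow_up(current, previous=None):
    """Return the recall interval (months) and whether to refer.

    current / previous: atypia grade on this and the preceding sample.
    Assumption: 'persistent' mild atypia means mild on both samples.
    """
    if current not in GRADES:
        raise ValueError(f"unknown atypia grade: {current!r}")
    if current == "none":
        return {"recall_months": None, "refer": False}
    persistent_mild = current == "mild" and previous == "mild"
    refer = current in ("moderate", "severe") or persistent_mild
    # All subjects with any grade of atypia are recalled 6-monthly
    return {"recall_months": 6, "refer": refer}

print(follow_up("mild"))
print(follow_up("mild", previous="mild"))
print(follow_up("severe"))
```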
Information technology security
The computer system is protected by a level three firewall and secure sign-on technology. Parts of the database are further protected so that only specific members of staff have access to patient identifiers. This level of security complies with the NHS code of connection.
Management of research specimens
The members of the laboratory staff in the Roy Castle International Centre for Lung Cancer Research support the research clinics by preparing the solutions that are used for sputum induction. They also produce all sample and accompanying data-sheet bar codes to ensure efficient specimen recognition and tracking. Once collected by the research clinics the blood and sputum samples are transported back to the centre, bar coded and logged into the centre's secure database.
Sputum
Before the sputum can be examined cytologically, the lung epithelial cells within the sample need to be fixed onto a glass slide. Firstly, the samples are treated with dithiothreitol to remove the sticky mucus and then the epithelial cells are centrifuged onto a slide using a cytospin. Four slides are produced, two of which are stained with H & E and two with Papanicolaou. These slides are then screened for the presence of any abnormal cells by Medical Laboratory Scientific Officers under the supervision of a Consultant Cytologist, who is based at the Royal Liverpool University Hospital. If the results of this screening procedure are unclear, repeat samples may be requested and, if cellular abnormalities are present, the individuals will be referred to one of the collaborating chest physicians, as per a specific protocol, for further investigation. Such subjects will remain on joint follow-up.
All cytology slides produced are stored in a specially designed room where they will be archived until the end of the project. Remaining specimen material is also stored for future use, with subject consent. The development of any new technique to identify malignant or premalignant cells will, of necessity, be compared to the current cytological method of disease identification. Cytology reports are entered into the database and future sputum sample movement is then tracked via the database.
When a batch of 96 different samples has been collected, DNA is extracted from a small amount of each sputum using the Qiagen DNeasy 96 well kit (Qiagen, Crawley, UK). This DNA is checked for quality by PCR, aliquoted, bar-coded and stored at −85°C. The storage positions of both the sputum and the resultant DNA samples are entered into the database; this capability greatly facilitates use of these samples in the research programme. The DNA prepared from these samples is at present primarily used by the Genetic Instability Research Group.
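Tracking a batch of 96 bar-coded samples through extraction can be illustrated by mapping each bar code to a well on a standard 8 × 12 plate. The sketch below is hypothetical: the row-by-row fill order and the `SP00001`-style bar codes are assumptions, not details taken from the LLP database.

```python
import string

ROWS = string.ascii_uppercase[:8]  # plate rows A to H
COLS = range(1, 13)                # plate columns 1 to 12

def plate_layout(barcodes):
    """Map 96 unique sample bar codes to wells A1..H12, row by row."""
    if len(barcodes) != 96:
        raise ValueError("a full batch needs exactly 96 samples")
    if len(set(barcodes)) != 96:
        raise ValueError("bar codes within a batch must be unique")
    wells = [f"{row}{col}" for row in ROWS for col in COLS]
    return dict(zip(wells, barcodes))

# Hypothetical bar codes for one batch of sputum samples
batch = [f"SP{i:05d}" for i in range(1, 97)]
layout = plate_layout(batch)
print(layout["A1"], layout["B1"], layout["H12"])
```

Recording the well alongside the freezer position gives each DNA aliquot a traceable link back to the original sputum sample.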
Blood
The blood samples are separated by centrifugation into plasma, white cells and red cells; the red cells are discarded. The plasma and white cells are each divided into three portions and are stored at −85°C until required; these samples will be stored for the duration of the LLP.
As with the samples described above, the storage positions of the whole blood and the resultant DNA are recorded on the LLP database.
Bronchial lavage and tumour specimens
One member of staff is based at one of the collaborating hospitals to obtain, following consent, samples of tumour tissue, blood and bronchial lavage. In all cases great care is taken not to compromise the diagnostic/therapeutic process. Solid tissue is either snap frozen in liquid nitrogen or fixed in formalin and embedded in wax, then sectioned for routine H & E staining. Bronchial lavage is mixed with Saccomanno's fixative, and routine cytology examination is carried out. All slides prepared from these samples are bar-coded and will be stored until the end of the project. Bronchial lavage samples are coded and stored in a specially designed storage room. The blood sample and all solid tissues are also coded and stored: the blood sample and frozen tissue at −85°C, and embedded material at room temperature.
Wax-embedded material can be used in the research projects, either as cut sections, in microarray, or as a source of DNA. Microarray technology allows efficient use of valuable material, as very small specific samples can be taken and embedded in a fresh wax block, thus producing a “mini archive” of solid material in each wax block. Results obtained from use of the above samples indicating mutation, loss of heterozygosity (LOH) or chromosomal alterations will be entered into the database alongside epidemiological information.
Freezer security
Plasma, white blood cells, whole blood, and DNA extracted from the white blood cells and plasma are bar coded and kept at −85°C. The tumour blocks and their associated DNA and RNA are bar coded if within the LLP and stored at −85°C. Bronchial lavage and sputum DNA specimens are also bar coded and stored at −85°C. The freezer room was purpose built with air conditioning and open vents to maintain the room temperature at ≤21°C. The freezers are connected to an uninterruptible power supply and generator back-up in the event of a power failure. The generator is tested monthly. The freezers are also connected to liquid carbon dioxide back-up in the event of freezer failure. Each freezer is also connected via its remote alarm connection to the building management system computer, which will send an alarm signal to the laboratory manager's office and a staff call-out list.
The present authors believe this to be the first European early lung cancer detection study, based on nurse-led community clinics, allowing active follow-up and producing updated lifestyle information and a specimen archive. Over 3,000 individuals have currently been recruited into this project.
Acknowledgments
The authors wish to thank all of their scientific and clinical colleagues who have contributed much to the development and success of the Liverpool Lung Project.
Footnotes
- Previous articles in this series: No. 1: Steels E, Paesmans M, Berghmans T, et al. Role of p53 as a prognostic factor for survival in lung cancer: a systematic review of the literature with a meta-analysis. Eur Respir J 2001; 18: 705–719. No. 2: van Klaveren RJ, Habbema JDF, Pedersen JH, de Koning HJ, Oudkerk M, Hoogsteden HC. Lung cancer screening by low-dose spiral computed tomography. Eur Respir J 2001; 18: 857–866. No. 3: Brambilla E, Travis WD, Colby TV, Corrin B, Shimosato Y. The new World Health Organization classification of lung tumours. Eur Respir J 2001; 18: 1059–1068. No. 4: Brock CS, Lee SM. Anti-angiogenic strategies and vascular targeting in the treatment of lung cancer. Eur Respir J 2002; 19: 557–570. No. 5: Hirsch FR, Merrick DT, Franklin WA. Role of biomarkers for early detection of lung cancer and chemoprevention. Eur Respir J 2002; 19: 1151–1158.
- Received October 29, 2001.
- Accepted April 18, 2002.
- © ERS Journals Ltd