Radiological diagnosis of interstitial lung disease: is it all about pattern recognition?
- 1Dept of Radiology, King's College Hospital Foundation Trust, London, UK
- 2Departments of Medicine and Pathology/Molecular Medicine, McMaster University, Firestone Institute for Respiratory Health, St Joseph's Healthcare, Hamilton, ON, Canada
- Simon L.F. Walsh, Dept of Radiology, King's College Hospital Foundation Trust, Denmark Hill, Brixton, London SE5 9RS, UK. E-mail: slfwalsh{at}gmail.com
Abstract
Radiological diagnosis of ILD is pattern-based and linked to underlying histology. The future of radiological diagnosis in ILD may be the identification of disease behaviour-based radiological phenotypes that predict disease outcome. http://ow.ly/Dg0Q30l4Rh8
Hypersensitivity pneumonitis (HP) is a complex fibroinflammatory lung condition that arises from repeated exposure, usually to aerosolised organic antigens, in sensitised individuals. Although HP is a well-recognised clinical entity, the underlying mechanisms that drive disease progression are poorly understood.
Making a diagnosis of HP depends on the presence of variable combinations of clinical features, including serum antibodies to inciting antigens, lymphocytosis on bronchoalveolar lavage, compatible features on high-resolution computed tomography (HRCT) and, if available, loosely formed granulomas in a bronchiolocentric location on lung biopsy [1]. However, in many cases, making a confident diagnosis of HP is hampered by marginal test results, nonspecific HRCT features and, perhaps most importantly, the lack of internationally agreed diagnostic guidelines for the disease. These difficulties were brought into sharp relief by a recent study of multidisciplinary practice that reported poor diagnostic agreement (weighted κ-coefficient (κw)=0.29) between expert multidisciplinary groups assigning a diagnosis of HP to a set of standardised cases drawn from a tertiary referral centre for diffuse lung diseases [2]. In contrast, interobserver agreement for idiopathic pulmonary fibrosis (IPF) was good (κw=0.71), reflecting the positive impact evidence-based guidelines have on diagnostic performance. Adding to these challenges, HP is highly variable in its presentation, response to antigen avoidance and treated course in individual patients. The response to the “HP challenge” has been a groundswell of research focused on disentangling the complexities of HP diagnosis, all of it aimed squarely at developing a case definition for HP that can be readily applied in routine clinical practice. This effort has produced diagnostic algorithms for HP that combine salient clinical variables with HRCT features, as well as clinical perspectives recommending distinct but broadly similar diagnostic approaches to HP [3–5].
HRCT plays a central role in the evaluation of patients with diffuse lung diseases and often has a significant impact on subsequent management decisions, including the need for lung biopsy. Historically, radiological diagnosis has been inextricably linked to histopathology. The first systematic studies of computed tomography in diffuse lung disease began in the mid-1980s, and the development of HRCT in the early 1990s was followed by more than 10 years of HRCT-pathology correlative studies that bridged the gap between the microscopic world of interstitial lung disease (ILD) and its macroscopic appearances on HRCT. This era, typified by the meticulous work of the radiologist–pathologist duo of Nestor Müller and Roberta Miller, formed the foundations upon which HRCT diagnosis stands today. During that period, the primacy of histopathology as a diagnostic test in ILD was reflected in two ways: first, the official American Thoracic Society (ATS)/European Respiratory Society (ERS) classification of idiopathic interstitial pneumonias (IIPs) at the time was based on histopathological morphology; and second, histopathology was the diagnostic reference standard against which the veracity of all other diagnostic tests, including HRCT, was gauged.
As with other ILDs, the HRCT appearances of “biopsy-proven” HP were studied during this period. Early work identified poorly defined ground-glass nodules, heterogeneous or peribronchovascular axial distribution of disease, mosaic attenuation and diffuse ground-glass opacification as important features of HP on HRCT [6–8]. Next, Silva et al. [9] evaluated the accuracy of HRCT for distinguishing chronic fibrotic hypersensitivity pneumonitis (cHP) from IPF and nonspecific interstitial pneumonia (NSIP); this is the classic imaging conundrum faced by practising ILD radiologists in multidisciplinary meetings the world over. In this highly cited study, the authors concluded that lobular areas of air trapping, centrilobular nodules and an absence of lower zone predominance predicted a clinical diagnosis of cHP, but also highlighted that separating cHP from IPF and NSIP is possible in only ∼50% of cases.
In a study presented in this issue of the European Respiratory Journal, Salisbury et al. [10] add to the growing HP literature by developing, for the first time, a “rule-in” radiological diagnosis model for HP. First, they assembled a diverse cohort of ILD patients (n=356) who underwent diagnostic case review at their local multidisciplinary ILD conference. Additional HP patients were identified by their electronic medical records. Importantly, the outcome for analysis was clinical diagnosis following multidisciplinary discussion (or documented diagnosis in the clinical records), and HRCT in isolation was not used to verify these diagnoses. Imaging was semiquantitatively scored and consensus reached using a standard approach by three radiologists blinded to the clinical data. Interestingly, the authors generated several comparator HRCT scores such as “mosaic attenuation or air trapping greater than reticulation” (see later). Multivariable logistic regression identified HRCT features associated with HP and an HRCT-HP diagnosis model was generated using the most highly correlated features. In the analysis, the best performing HRCT variables after controlling for age, sex and smoking status were diffuse axial disease distribution and the novel “mosaic attenuation or air trapping greater than reticulation” score. An HP predictive model using scores based on the binary categorisations of these patterns (i.e. present or absent) was created (range 0–3); a score of 3, meaning there is a diffuse axial distribution of disease and mosaic attenuation or air trapping exceeding the extent of reticulation, gave a specificity for HP of >90% in the derivation cohort. The model was then validated in an external cohort generated from the Lung Tissue Research Consortium database (n=438, a score of 3 gave a specificity for HP of 95.8%).
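To make the arithmetic behind a “rule-in” specificity figure concrete, the following is a minimal Python sketch and not the authors' implementation. It treats the maximum model score as the joint presence of the two HRCT features described above and computes specificity as the proportion of non-HP cases that the rule correctly leaves unflagged. The `Case` type, the function names and the toy cohort are hypothetical and chosen purely for illustration; the toy counts are not data from the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Case:
    diffuse_axial: bool           # diffuse axial distribution of disease
    mosaic_gt_reticulation: bool  # mosaic attenuation/air trapping exceeds reticulation
    has_hp: bool                  # reference diagnosis (hypothetical label)


def rule_in(case: Case) -> bool:
    """Assumed stand-in for the maximum score: both features present."""
    return case.diffuse_axial and case.mosaic_gt_reticulation


def specificity(cases) -> float:
    """Proportion of non-HP cases that the rule does not flag."""
    non_hp = [c for c in cases if not c.has_hp]
    if not non_hp:
        raise ValueError("need at least one non-HP case")
    true_negatives = sum(1 for c in non_hp if not rule_in(c))
    return true_negatives / len(non_hp)


def sensitivity(cases) -> float:
    """Proportion of HP cases that the rule flags."""
    hp = [c for c in cases if c.has_hp]
    if not hp:
        raise ValueError("need at least one HP case")
    true_positives = sum(1 for c in hp if rule_in(c))
    return true_positives / len(hp)


# Invented toy cohort: 4 HP cases and 10 non-HP cases (illustration only).
TOY_CASES = (
    [Case(True, True, True)] * 2        # HP, flagged by the rule
    + [Case(True, False, True)] * 2     # HP, missed by the rule
    + [Case(True, True, False)] * 1     # non-HP, falsely flagged
    + [Case(False, False, False)] * 6   # non-HP, correctly unflagged
    + [Case(False, True, False)] * 3    # non-HP, correctly unflagged
)

if __name__ == "__main__":
    print(f"specificity = {specificity(TOY_CASES):.2f}")
    print(f"sensitivity = {sensitivity(TOY_CASES):.2f}")
```

The sketch also illustrates why a rule-in threshold can be highly specific while missing a substantial share of true HP cases: requiring both features keeps false positives low at the cost of sensitivity.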
A significant strength of this study is its derivation–validation cohort design, which helps to prevent spurious conclusions being drawn from models that are over-fitted to a single population. However, equally important is clinical applicability. This is especially relevant when diagnostic algorithms are based on HRCT patterns for which interobserver variability is a well-known problem. The authors address this issue head-on by reporting κw for each radiologist pair. Interestingly, there was strikingly poor agreement for one of the fundamental patterns in their model, the extent of reticulation (κ=0.06 for radiologist 1 versus radiologist 3, which is essentially agreement no better than chance). What the authors did not do was evaluate rapid scoring of cases based on their final model by a group of radiologists and clinicians of varying levels of experience. Since the model only requires a gestalt impression of the relative extents of mosaic attenuation/air trapping and reticulation, they may have found agreement on the overall model scores to be a great deal better. This would be a useful test of the algorithm's reproducibility.
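For readers less familiar with how a weighted κ is obtained for a pair of observers, the following is a minimal Python sketch of Cohen's weighted κ for two raters scoring the same cases on an ordinal scale. The weighting scheme used in the study is not stated here, so linear disagreement weights are assumed by default (quadratic weights are offered as an option), and the function name and example ratings are hypothetical.

```python
from collections import Counter


def weighted_kappa(rater_a, rater_b, categories, quadratic=False):
    """Cohen's weighted kappa for two raters on ordered categories.

    Disagreement weights are |i - j| / (k - 1), optionally squared.
    Returns 1 for perfect agreement and about 0 for chance-level agreement.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    k = len(categories)
    if k < 2:
        raise ValueError("need at least two categories")
    index = {c: i for i, c in enumerate(categories)}
    n = len(rater_a)

    def weight(i, j):
        d = abs(i - j) / (k - 1)
        return d * d if quadratic else d

    # Joint and marginal counts of category indices.
    joint = Counter((index[a], index[b]) for a, b in zip(rater_a, rater_b))
    marg_a = Counter(index[a] for a in rater_a)
    marg_b = Counter(index[b] for b in rater_b)

    # Weighted disagreement actually observed between the two raters.
    observed = sum(weight(i, j) * c / n for (i, j), c in joint.items())
    # Weighted disagreement expected if the raters scored independently.
    expected = sum(
        weight(i, j) * marg_a[i] * marg_b[j] / (n * n)
        for i in range(k)
        for j in range(k)
    )
    if expected == 0:
        raise ValueError("kappa is undefined when both raters use one category")
    return 1 - observed / expected


if __name__ == "__main__":
    # Hypothetical ordinal extent scores (0-3) from two radiologists.
    reader_1 = [0, 1, 2, 3, 2, 1, 0, 3]
    reader_2 = [0, 2, 2, 3, 1, 1, 1, 2]
    print(round(weighted_kappa(reader_1, reader_2, [0, 1, 2, 3]), 2))
```

Because the statistic subtracts the disagreement expected by chance, a value near zero, such as the κ=0.06 quoted above, indicates that two observers agree little more often than independent scoring would produce.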
Clinical applicability aside, there are two difficulties associated with radiological diagnosis that are not addressed. The first is the ubiquitous and often debated issue of diagnostic reference standards. There is no reference standard against which radiological diagnosis can be validated. Histological diagnosis cannot be used because surgical lung biopsy is generally performed in those cases in which HRCT appearances are not definitive, i.e. they are often “atypical”; an HRCT study that is based on biopsy diagnosis is a study in which HRCT is intrinsically less helpful and selectively disadvantaged. In addition, histological diagnosis is subject to high levels of interobserver variability [11] and to sampling error [12], and may be modified in up to 20% of patients following multidisciplinary discussion [13]. Radiological diagnosis also cannot be standardised against multidisciplinary diagnosis because multidisciplinary diagnosis incorporates and is influenced by radiological diagnosis, and is not, therefore, an independent reference standard [14, 15]. The authors address this issue in the penultimate paragraph of the discussion and emphasise that all attempts to verify clinical diagnosis were blinded to the HRCT findings. For these reasons, radiological diagnosis should be validated against outcome. This approach has been used in an international study of multidisciplinary IPF diagnosis and to verify the diagnostic accuracy of a novel deep learning algorithm for classifying fibrotic lung disease on HRCT [2, 16]. It is also an approach endorsed by the Standards for Reporting of Diagnostic Accuracy initiative [17, 18].
This raises important questions about the utility of our current “histospecific” HRCT diagnostic categories. Although clinicians and patients need to know the “label” of a disease, they also need to know the likely disease behaviour since this will determine clinical management. Aside from patients with usual interstitial pneumonia (UIP) on biopsy, histology does not reliably inform prognosis, and likewise, apart from UIP on HRCT, radiological diagnosis does not reliably predict outcome. Based on the findings of Salisbury et al. [10], a pattern of diffuse axial fibrotic disease with mosaic attenuation and air trapping may allow a confident radiological diagnosis of HP, but it does not tell us about the likely behaviour of the disease. Perhaps, therefore, it is time to rethink radiological diagnosis in ILD; since the primary utility of diagnosis is to inform the clinician of the natural history and treated course of the disease, standardising diagnosis against subsequent disease behaviour may make more sense. Interestingly, these concepts are fuelling a crucial ongoing debate regarding the “progressive fibrotic ILD phenotype”, a label that amalgamates all patients with fibrotic lung disease who exhibit IPF-like disease behaviour regardless of clinical diagnosis. It is an approach formally endorsed in the 2013 ATS/ERS IIP classification statement and argued for in a recently published IPF Working Group statement [14, 19]. For the radiologist, UIP is the classic progressive fibrotic phenotype, but self-sustaining progressive fibrosis is not confined to patients with radiologic UIP. Identifying new radiological phenotypes that reliably inform future disease behaviour (possibly in concert with a blood-based biomarker) may be the future of ILD interpretation on HRCT.
Footnotes
Conflict of interest: None declared.
- Received July 15, 2018.
- Accepted July 16, 2018.
- Copyright ©ERS 2018