Abstract
Sputum induction is a simple and noninvasive procedure for Pneumocystis carinii pneumonia (PCP) diagnosis in human immunodeficiency virus-1-positive patients, although less sensitive than bronchoalveolar lavage (BAL). In order to obtain an overview of the diagnostic accuracy of sputum induction, a systematic review and meta-analysis of studies reporting the comparative sensitivity and specificity of BAL (the “gold standard”) and sputum induction was performed.
The odds ratio and related 95% confidence interval were calculated using summary receiver operating characteristic curves as well as fixed-effect and random-effect models. Based on pooled data, the negative and positive predictive values were calculated for a range of PCP prevalence using a Bayesian approach.
Seven prospective studies assessed the comparative accuracy of BAL and sputum induction. On the whole, sputum induction demonstrated 55.5% sensitivity and 98.6% specificity. The sensitivity of sputum induction was significantly higher with immunofluorescence than with cytochemical staining (67.1 versus 43.1%). In settings of 25–60% prevalence of PCP, the positive and negative predictive values were 86–96.7 and 66.2–89.8%, respectively, with immunofluorescence, and 79–94.4 and 53–83.5% with cytochemical staining.
In conclusion, in a setting of low prevalence of Pneumocystis carinii pneumonia, sputum induction, particularly with immunostaining, appears to be adequate for clinical decision-making.
The use of effective prophylaxis for opportunistic infections and, more recently, the availability of highly active antiretroviral therapy (HAART) have considerably altered the natural history of human immunodeficiency virus (HIV)-1 infection 1–4. The new effective antiretroviral treatment provides substantial protection against all major opportunistic infections. For instance, HAART has not only changed the morbidity and mortality of Pneumocystis carinii pneumonia (PCP) but also clinicians' attitudes towards prophylaxis directed against this opportunistic infection 1, 4–10. Despite these advances, PCP has not lost its clinical relevance, particularly in persons who are not aware of their HIV-1 infection status or in those who fail to respond to antiretroviral treatment.
In this dynamic context, reappraisal of strategies for the management of infectious complications of HIV is the subject of intense investigation 1–10. Rapid and specific diagnosis of PCP is still needed in HIV-infected patients.
Laboratory techniques for the diagnosis of PCP usually rely on microscopic demonstration of P. carinii by means of conventional cytochemical procedures or immunocytochemical staining 11–14. Polymerase chain reaction assays have been used to diagnose PCP, but this technique is not yet suitable for implementation in routine diagnostic laboratories 11. Immunostaining seems to increase sensitivity and specificity, but the quality of the clinical material from the respiratory tract is of crucial importance for diagnostic accuracy 11–16. Indeed, fibreoptic bronchoscopy with bronchoalveolar lavage (BAL), with a sensitivity of >90%, is recognised as the procedure of choice for the diagnosis of PCP 11, 12.
The sputum induction technique for PCP diagnosis was originally described by Bigby et al. 16. Since then, this technique has emerged as a simple and noninvasive procedure, although rather wide variation in diagnostic accuracy (25–90%) has been reported 11–17. The wide range of accuracy reported may be related to factors in the population studied or features related to study design, execution of the test and staining techniques 18. For instance, data from early reports suggest that the sensitivity of sputum induction techniques can be increased by use of monoclonal antibodies 15. Moreover, although sensitivity and specificity remain constant when the technique is applied to populations with different prevalences of disease, the related predictive values are subject to wide variation 19.
Data from a systematic review and meta-analysis of studies evaluating the sputum induction procedure in comparison with fibreoptic bronchoscopy with BAL for the diagnosis of PCP in HIV-1-infected patients are presented. The aim of the study was to provide an overview of the diagnostic accuracy of the sputum induction procedure and evaluate the predictive values of the test according to different prevalences of PCP.
Material and methods
Literature search
The English-language literature published between January 1982 and April 2001 was searched using Medline. For the search, the terms “Pneumonia, Pneumocystis carinii” and either “Diagnosis” or “Sputum” had to be present in the title or abstract. The search for pertinent studies was completed by consulting bibliographies from original articles and reviews retrieved via Medline.
Protocol of the analysis
All papers considered possibly eligible were reviewed independently by the present authors, who assessed whether the paper was concerned with the comparison of the sputum induction procedure with the reference standard for the diagnosis of PCP in a relevant clinical population. A relevant clinical population was defined as a group of HIV-positive individuals requiring fibreoptic bronchoscopic investigation for suspected PCP or other pulmonary diseases. This choice was made in an attempt to cover the spectrum of diseases that is likely to be encountered in the current or future use of this diagnostic test.
Studies were included for analysis if: they compared the sputum induction procedure to BAL with fibreoptic bronchoscopy (the “gold standard”); they reported data on false-positive, true-positive, false-negative, and true-negative results of the diagnostic test; and the comparison between tests was performed prospectively in a consecutive series of patients.
Sensitivity was defined as the percentage of patients with a diagnosis of PCP by BAL who were correctly found to be positive using the sputum induction procedure. Specificity was the percentage of patients with negative BAL results correctly identified by a negative result from the sputum induction procedure.
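As an illustration of these definitions, the following minimal sketch computes sensitivity and specificity from a 2×2 table of induced sputum results cross-classified against BAL. The class name `TwoByTwo` and the counts are hypothetical and are not taken from any of the included studies.

```python
from dataclasses import dataclass


@dataclass
class TwoByTwo:
    """Induced sputum results cross-classified against BAL (reference)."""
    tp: int  # sputum positive, BAL positive
    fn: int  # sputum negative, BAL positive
    fp: int  # sputum positive, BAL negative
    tn: int  # sputum negative, BAL negative

    def sensitivity(self) -> float:
        # Proportion of BAL-positive patients detected by induced sputum
        return self.tp / (self.tp + self.fn)

    def specificity(self) -> float:
        # Proportion of BAL-negative patients with a negative induced sputum
        return self.tn / (self.tn + self.fp)


# Hypothetical counts, for illustration only
example = TwoByTwo(tp=30, fn=20, fp=1, tn=49)
print(f"sensitivity = {example.sensitivity():.1%}")  # 60.0%
print(f"specificity = {example.specificity():.1%}")  # 98.0%
```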
Articles were examined in detail by two of the authors. Any disagreement was resolved by consensus with a third party.
Following the approach of Lijmer et al. 20, studies were scored based on the sampling method as consecutive, nonconsecutive and case-control. When the comparison between tests was performed prospectively, in a blinded or nonblinded fashion, in a consecutive series of patients, the study was scored as consecutive. When not all of the patients presenting with the relevant conditions were included in order of entry or in a random design, the study was referred to as nonconsecutive. When the study evaluated a group of patients already known to have the disease and a separate group of subjects not considered a relevant clinical population, it was scored as a case-control study. Moreover, the methods of data collection and reporting were also considered. Data collection was categorised as prospective, retrospective or, when the description of the procedure was insufficient, unknown. The description of the characteristics of the study population and criteria for performing the test were also analysed.
Statistical analysis
In order to remove the influence of threshold variation from the accuracy of sputum induction procedures, receiver operating characteristic (ROC) curves with sensitivity plotted on the y-axis and 1-specificity on the x-axis were used 21–24. False-positive and true-positive rates were transformed to their logits, U and V, respectively, after increasing each observed frequency by 0.5 25–27.
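The logit transformation with the 0.5 correction can be sketched as follows. The function names and the counts are hypothetical. The quantities D = V − U and S = V + U are those of the usual summary ROC regression (D regressed on S), which is assumed here rather than spelled out in the text.

```python
import math


def logit(p: float) -> float:
    return math.log(p / (1.0 - p))


def roc_logits(tp: int, fn: int, fp: int, tn: int) -> tuple[float, float]:
    """Return (U, V), the logits of the false- and true-positive rates.

    Each observed frequency is increased by 0.5 before the rates are
    formed, so a study with a zero cell still yields finite logits.
    """
    tp_c, fn_c, fp_c, tn_c = (x + 0.5 for x in (tp, fn, fp, tn))
    fpr = fp_c / (fp_c + tn_c)
    tpr = tp_c / (tp_c + fn_c)
    return logit(fpr), logit(tpr)


# Hypothetical study with no false-positive results
u, v = roc_logits(tp=30, fn=20, fp=0, tn=50)
d = v - u  # log diagnostic odds ratio (assumed summary ROC notation)
s = v + u  # proxy for the positivity threshold (assumed summary ROC notation)
print(f"U = {u:.3f}, V = {v:.3f}, D = {d:.3f}, S = {s:.3f}")
```

A regression slope of D on S close to zero would indicate that the odds ratio does not vary with the threshold.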
Since there is a constant odds ratio irrespective of the threshold, other meta-analytical methods were used for calculating a common weighted odds ratio. For this purpose, the fixed-effect method of Mantel-Haenszel 28, 29 and the random-effect model of DerSimonian and Laird 30 were used. The diagnostic odds ratio describes the odds of positive test results in individuals with the disease compared with the odds of positive test results in those without disease, and corresponds to particular pairings of sensitivity and specificity 18. Heterogeneity was assessed by means of the Cochran Q method 31. Since microscopic examination of induced sputum was performed using two methods, immunofluorescence or conventional staining, two separate analyses were performed in subgroups of specimens identified on the basis of the method used.
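The pooling methods named above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the software used for the analysis: the study counts are hypothetical, the function names are invented for the example, and inverse-variance weights on the log odds ratio (with a 0.5 correction) are assumed for the Cochran Q statistic and the random-effect step.

```python
import math

# Each tuple is (tp, fn, fp, tn); the counts are hypothetical.
studies = [(20, 20, 1, 39), (15, 30, 2, 40), (28, 12, 1, 45)]


def mantel_haenszel_or(tables):
    """Fixed-effect Mantel-Haenszel pooled odds ratio."""
    num = sum(tp * tn / (tp + fn + fp + tn) for tp, fn, fp, tn in tables)
    den = sum(fp * fn / (tp + fn + fp + tn) for tp, fn, fp, tn in tables)
    return num / den


def log_or_and_variance(tp, fn, fp, tn):
    """Log odds ratio and its variance, after adding 0.5 to each cell."""
    a, b, c, d = (x + 0.5 for x in (tp, fn, fp, tn))
    return math.log(a * d / (b * c)), 1 / a + 1 / b + 1 / c + 1 / d


def dersimonian_laird(tables):
    """Return (pooled OR, Cochran Q, tau^2) for the random-effect model."""
    ys, vs = zip(*(log_or_and_variance(*t) for t in tables))
    w = [1 / v for v in vs]
    y_fixed = sum(wi * yi for wi, yi in zip(w, ys)) / sum(w)
    # Cochran Q: weighted squared deviations from the fixed-effect mean
    q = sum(wi * (yi - y_fixed) ** 2 for wi, yi in zip(w, ys))
    # Moment estimator of the between-study variance, truncated at zero
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (len(tables) - 1)) / c)
    w_re = [1 / (v + tau2) for v in vs]
    y_re = sum(wi * yi for wi, yi in zip(w_re, ys)) / sum(w_re)
    return math.exp(y_re), q, tau2


pooled_or, q, tau2 = dersimonian_laird(studies)
print(f"Mantel-Haenszel OR = {mantel_haenszel_or(studies):.1f}")
print(f"Random-effect OR = {pooled_or:.1f}, Q = {q:.2f}, tau^2 = {tau2:.3f}")
```

Q is referred to a chi-squared distribution with k − 1 degrees of freedom, where k is the number of studies. When the between-study variance estimate is zero, the random-effect weights reduce to the fixed-effect inverse-variance weights.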
The sensitivity and specificity of the two staining techniques were evaluated by comparison of m proportions from several independent samples 32.
Finally, the post-test probability was calculated for a range of PCP prevalences using a Bayesian approach 33. The overall values were based on pooled data corrected by taking the common odds ratio of Mantel-Haenszel into account.
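For reference, the standard form of Bayes' theorem for a dichotomous test is given below. The symbols are introduced here for illustration and do not appear in the original: Se denotes sensitivity, Sp specificity and p the pretest probability (prevalence) of PCP.

```latex
\mathrm{PPV} = \frac{Se \cdot p}{Se \cdot p + (1 - Sp)(1 - p)},
\qquad
\mathrm{NPV} = \frac{Sp\,(1 - p)}{Sp\,(1 - p) + (1 - Se)\,p}
```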
All calculations were carried out by implementation of the appropriate algorithms on Excel worksheets. EasyMA (available from 34) 35, meta.exe, version 4.38 (obtainable upon request from authors of 36), Stata release 7.0 (Stata Corporation, College Station, TX, USA) and Minitab programs (available from 37) were also utilised.
Results
The computer search yielded 1,191 possible articles. Of these, 1,155 were eliminated by review of titles and/or abstracts. The remaining 36 articles were reviewed in detail, and 29 of these were excluded (see Appendix). Overall, data retrieved from seven publications were included in the meta-analysis 38–44. Table 1 summarises the characteristics of the seven studies included in the meta-analysis. Altogether, 322 individuals (160 patients with PCP and 162 with other lung infections) were studied using the sputum induction procedure in comparison to BAL. Sputum and BAL specimens from all of these patients were processed using conventional staining methodologies, which included P. carinii cytochemical stains for cyst wall and/or intracystic bodies and trophozoites. In three of these studies, which included a total of 156 patients, the sputum specimens were also processed by immunofluorescence staining with monoclonal antibodies 41, 42, 44.
Study quality
The method of patient enrolment was consecutive in all of the included studies. Four studies provided an adequate description of the independence of interpretation of test results 39, 42–44. Details of staining techniques and specimen collection procedures were sufficiently described in all but one 38 of the seven publications. Information on the study population was provided in all of the studies.
Results of the meta-analysis
Odds ratios and related 95% confidence intervals for each study are shown in figure 1. The summary ROC curves showing true-positive versus false-positive rates from individual studies are shown in figure 2. The common diagnostic odds ratios and the corresponding sensitivities and specificities are shown in table 2. The fixed-effect model was used for analysis of immunofluorescence data, as the homogeneity hypothesis was tenable (p=0.5).
The random-effect model was more appropriate for data from cytochemical staining of sputum, as the homogeneity hypothesis was rejected (p=0.001).
The summary ROC curves did not provide significant parameters for the regression line. Moreover, the regression coefficient β assumed a negative value, thus supporting the hypothesis that the odds ratio for an association between cytochemical or immunofluorescent staining of induced sputum specimens and the standard reference is not dependent on the threshold used.
Cytochemical versus immunofluorescent staining of induced sputum specimens
The diagnostic odds ratio was considerably higher when specimens were examined by means of immunofluorescence compared to cytochemical staining. This was also confirmed by comparison of sensitivity and specificity. The overall sensitivity was significantly greater with immunofluorescent staining than with cytochemical staining (67.1 versus 43.1%, p=0.001), whereas the specificity was comparable.
The a posteriori positive and negative predictive values are shown in table 3. The relationship between prevalence of PCP and positive and negative predictive values is shown in figure 3. These overall values were calculated according to different prevalences of PCP, and were based on pooled sensitivity and specificity data (67.1 and 96.5% for immunofluorescence and 43.1 and 96.2% for cytochemical staining, respectively).
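The dependence of the predictive values on prevalence can be reproduced approximately from the pooled sensitivity and specificity quoted above. The sketch below is an illustration, not the authors' worksheet: the function name and the set of prevalences are assumptions, and small differences from the published figures are to be expected because the inputs are rounded.

```python
def predictive_values(sens: float, spec: float, prevalence: float):
    """Return (PPV, NPV) from Bayes' theorem for a given prevalence."""
    tp = sens * prevalence
    fp = (1 - spec) * (1 - prevalence)
    tn = spec * (1 - prevalence)
    fn = (1 - sens) * prevalence
    return tp / (tp + fp), tn / (tn + fn)


# Pooled (sensitivity, specificity) as reported in the text
POOLED = {
    "immunofluorescence": (0.671, 0.965),
    "cytochemical": (0.431, 0.962),
}

for name, (sens, spec) in POOLED.items():
    for prev in (0.25, 0.40, 0.60):  # illustrative prevalences
        ppv, npv = predictive_values(sens, spec, prev)
        print(f"{name:18s} prevalence={prev:.0%} PPV={ppv:.1%} NPV={npv:.1%}")
```

At 25% prevalence this gives negative predictive values of about 89.8% for immunofluorescence and 83.5% for cytochemical staining.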
Discussion
Owing to its high sensitivity and specificity, bronchoscopy with BAL is regarded as the gold standard for PCP diagnosis in HIV-1-infected patients 11–13. However, although safer than and at least as sensitive as fibreoptic bronchoscopy with transbronchial biopsy, BAL with fibreoptic bronchoscopy is still considered an invasive diagnostic method 12, 13, 45. As a consequence, the increased prevalence of PCP observed concomitant with the explosion of the acquired immune deficiency syndrome (AIDS) epidemic has brought about an increased need for less invasive diagnostic methods 12, 46.
Sputum induction has emerged as a sensitive and cost-effective diagnostic strategy for PCP in HIV-1-infected patients 11, 12, 16, 46. This technique has gained popularity because it is less expensive, saves time and is less invasive than bronchoscopy with BAL. In patients without HIV-1 infection, however, this procedure has not been studied extensively. Moreover, the reduced organism burdens observed in PCP associated with conditions other than AIDS seem to be a limiting factor for the widespread use of sputum induction in patients without HIV-1 infection 47.
The reported sensitivities of the induced sputum technique often differ greatly from one report to another 12, 13, 38–44. In this context, a meta-analysis can provide an overall summary of diagnostic accuracy, as well as estimates of accuracy according to the characteristics of the study design, population examined and test used 21–24. Moreover, the decreased prevalence of PCP related to the widespread use of prophylaxis, and more recently, to the availability of HAART, provides further grounds for a re-evaluation of P. carinii diagnosis.
Seven studies investigating the diagnostic accuracy of sputum induction in comparison with fibreoptic bronchoscopy with BAL were included in the present meta-analysis. In an observational evaluation of the methodological features of studies evaluating diagnostic tests, selection of nonrepresentative patients or use of different reference standards were found to cause significant overestimation of the accuracy of a diagnostic test 20. Based on these observations, several studies with methodological shortcomings that may have affected the estimation of diagnostic accuracy of sputum induction were excluded.
Quality assessment of the studies included in the analysis detected methodological flaws in three reports, in which interpretation of test results was not blinded. All of the included studies were consecutive, representative of a relevant clinical population of HIV-infected patients and applied the same reference standard.
Conversely, the experimental design of the studies included in the present analysis may have introduced some limits to the general applicability of the summary estimates in routine clinical practice. For instance, the use of adequate techniques for induction, collection and processing of induced sputum specimens is of crucial importance to the diagnostic yield of the procedure. Although these techniques do not require specialised skills or equipment, the availability of dedicated experienced personnel certainly increases the quality of induced specimens 40. Moreover, interpretation of stained specimens for P. carinii can be more problematic for induced sputum than BAL specimens. The use of monoclonal antibodies, however, seems to overcome this problem even in laboratories with little previous experience of identifying the microorganism 15, 48.
Using meta-analysis, the present report provides an overall summary of diagnostic accuracy. Across the whole population examined, sputum induction demonstrated 55.5% sensitivity and 98.6% specificity. A significantly higher sensitivity from immunofluorescence staining compared to cytochemical staining of sputum specimens (67.1 versus 43.1%) was clearly identifiable. Conversely, the specificity of the two staining techniques was comparable.
The overall measurement of diagnostic accuracy applied was the summary odds ratio computed using the fixed-effect and/or random-effect model. Random-effect summaries yield higher estimated variances and, consequently, broader confidence intervals than fixed-effect summaries when there is evidence of substantial heterogeneity among individual studies. If heterogeneity is found, the fixed-effect model yields an artificially narrow confidence interval. In this case, the random-effect model, which incorporates a moment estimator of the between-trial components of variance, is more appropriate 49. Since with cytochemical staining of sputum the homogeneity hypothesis was rejected, the odds ratio was calculated using the random-effect model.
The estimate of the summary odds ratio for cytochemical staining of sputum induction was 19.1, corresponding to 43.1% sensitivity and 96.2% specificity. With the fixed-effect model, the summary diagnostic odds ratio for immunofluorescence was 56.9, corresponding to 67.1% sensitivity and 96.5% specificity, and was reasonably consistent across the studies (test for heterogeneity, p=0.5).
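As a rough consistency check, the diagnostic odds ratio can be recomputed from each sensitivity/specificity pairing using the definition of the odds of a positive result in diseased versus nondiseased individuals. Se and Sp are shorthand introduced here for sensitivity and specificity.

```latex
\mathrm{DOR} = \frac{Se}{1 - Se} \times \frac{Sp}{1 - Sp}

\text{Cytochemical staining: } \frac{0.431}{0.569} \times \frac{0.962}{0.038} \approx 19.2

\text{Immunofluorescence: } \frac{0.671}{0.329} \times \frac{0.965}{0.035} \approx 56.2
```

Both values are close to the reported summary odds ratios of 19.1 and 56.9. The small differences are probably due to rounding of the published percentages, although this cannot be verified from the text alone.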
With the summary ROC curve, the regression coefficient did not differ significantly from zero, thus supporting the assumption that the odds ratio does not depend on threshold.
Whether sputum induction is appropriate for a particular task depends ultimately on the predictive values in the intended setting, which, in turn, critically depend on the prevalence of the condition to be detected. Based on the sensitivity and specificity obtained from meta-analysis, the interpretations of the sputum induction results for the different pretest probabilities were examined using Bayes' theorem to generate post-test probabilities. For instance, in a setting of 25% prevalence of PCP, a negative result excludes infection in 89.8% of sputum specimens processed with immunological staining, and in 83.5% of specimens processed with cytochemical staining. By contrast, in a setting of higher prevalence, a negative result gives an excessive chance of P. carinii being present (table 3 and fig. 3). A high prevalence of PCP is still found in patients who failed to respond to HAART, are noncompliant with antiretroviral and/or prophylactic regimens, or are unaware of their HIV status, or in countries with a large burden of HIV-1 infection and no access to effective antiretroviral treatment 6, 50. In these circumstances, as in the pre-HAART era, fibreoptic bronchoscopy with BAL should be performed in patients with negative induced sputum results 51.
Conversely, a positive result in a setting of 25% PCP prevalence gave a 13 and 21% possibility of a false-positive result with sputum specimens stained with monoclonal antibodies and conventional staining techniques, respectively. With a higher prevalence of PCP, a positive result was almost invariably associated with the disease, independent of the staining technique.
Another potential problem is that the induced sputum technique may be less effective than bronchoscopy with BAL in the diagnosis of other lung diseases in HIV-1-infected patients. In two studies, bronchoscopy with BAL performed concomitantly with or after the induced sputum technique provided additional diagnoses of lung disease other than that caused by P. carinii 40, 52. In two other comparative studies, however, sputum induction was also found to be reliable for the diagnosis of other infectious causes of pneumonitis, including tuberculosis 38, 53.
The present meta-analysis supports the argument that the sputum induction technique is a valuable tool for the diagnosis of Pneumocystis carinii pneumonia in human immunodeficiency virus-infected patients. Since the specificity of induced sputum specimen results is high, a positive result retains a high predictive value even in a context of low Pneumocystis carinii pneumonia prevalence. The interpretation of a negative result may be more problematic, especially if obtained with cytochemical staining. Owing to its greater sensitivity, the immunofluorescent staining of a sputum induction specimen is preferable to cytochemical staining. This test could be of particular value in settings in which the prevalence of Pneumocystis carinii pneumonia is low. In agreement with the data of Chouaid et al. 46, in such a situation, immunofluorescence staining of induced sputum may be the most cost-effective diagnostic strategy. Furthermore, a marked reduction in the need for bronchoscopy after the widespread use of protease inhibitors has already been reported 54. In conclusion, the present data provide compelling evidence of the value of the sputum induction procedure in the diagnosis of Pneumocystis carinii pneumonia, and underscore the importance of a reappraisal of strategies for the diagnosis of infectious complications in the era of highly active antiretroviral therapy.
Appendix
The following studies comparing induced sputum procedures and bronchoscopy for the diagnosis of PCP in HIV-1-infected patients were not included in the analysis because they failed to meet the predefined inclusion criteria. Eleven studies were excluded because bronchoscopy with BAL was performed only in subsets of patients (usually in patients with negative induced sputum results) 6, 16, 46, 51, 55–61. Fifteen studies (seven case-control 48, 62–67 and eight retrospective and/or nonconsecutive 51, 68–74) were excluded on the basis of design and/or because they did not permit direct comparison of accuracy for every individual in the sample. Three studies used a different diagnostic standard reference 52, 75, 76.
- Received February 18, 2002.
- Accepted May 2, 2002.
- © ERS Journals Ltd