Abstract
Lipoarabinomannan (LAM) is a potential marker of active tuberculosis (TB). We performed a systematic review and meta-analysis regarding use of urinary LAM assays for diagnosing active TB.
We systematically searched for published and unpublished studies that evaluated urinary LAM for active TB diagnosis. Extracted data were pooled using bivariate random effects models and hierarchical summary receiver operating characteristic curves. Heterogeneity was explored through subgroup analysis and meta-regression. Quality was assessed according to standardised QUADAS (Quality Assessment of Diagnostic Accuracy Studies) criteria.
In seven studies that assessed test accuracy in microbiologically confirmed cases only, estimates of sensitivity ranged from 13% to 93%, while specificity ranged from 87% to 99%. In five studies that assessed accuracy in clinical and confirmed TB cases, sensitivity ranged from 8% to 80%, while specificity ranged from 88% to 99%. In five studies with results stratified by HIV status, sensitivity was 3–53% higher in HIV-positive than HIV-negative subgroups; sensitivity was highest with advanced immunosuppression.
The LAM urinary assay has several characteristics that make it attractive for diagnosing active TB, but has suboptimal sensitivity for routine clinical use. Further studies are needed to evaluate the potential value of the LAM assay in individuals with advanced HIV or for diagnosis of paediatric TB.
Despite the enormous global burden of tuberculosis (TB), case detection rates are low, posing serious hurdles for TB control [1]. Diagnosis of active TB continues to rely on tests such as smear microscopy, culture and chest radiography. These tests have several limitations, are not point-of-care tests and perform poorly in populations affected by the HIV epidemic [2].
As a result of recent efforts to develop new tools for TB diagnosis, several new diagnostics have been introduced and evaluated [2–4]. However, an accurate point-of-care test that could be used within peripheral clinical settings with limited laboratory facilities has not yet been successfully developed [5].
Lipoarabinomannan (LAM) is a structurally important 17.5-kD heat-stable glycolipid found in the cell wall of Mycobacterium tuberculosis. LAM can account for up to 15% of the total bacterial weight and serves as an immunogenic virulence factor that is released from metabolically active or degrading bacterial cells during TB infection [6, 7]. Detection of LAM antigens in urine has several potential advantages compared with currently used diagnostics. Urine samples are simple to collect, process and store. There are far fewer infection control concerns compared with sputum. Urine is a particularly attractive specimen in young children, who are often unable to produce sputum.
Based on early data, a prototype urinary LAM detection test was produced by Chemogen Inc. (Portland, ME, USA) and a commercial version of this test is now marketed as the Clearview™ TB ELISA (Alere Inc. (formerly Inverness Medical Innovations Inc.), Waltham, MA, USA) [8]. Several studies have evaluated LAM-based diagnostics, but to date there has been no systematic review on this topic. Hence, we planned a systematic review of the existing evidence base on the accuracy of LAM antigen detection for diagnosis of active TB.
METHODS
We followed a standard protocol for systematic reviews and meta-analyses [9], and used methods recommended by the Cochrane Diagnostic Test Accuracy Working Group [10].
Search strategy
We systematically searched three databases for relevant citations: PubMed, Embase and Web of Science (from earliest records to September 4, 2010). No language restrictions were imposed on the search criteria. Reference lists from included studies were hand searched and experts and industry representatives were contacted to identify additional studies.
Eligibility criteria
Pre-determined eligibility criteria included the use of any form of LAM detection in urine specimens in patients suspected or known to have active pulmonary TB, and the use of an accepted reference standard. Accepted reference standards included: positive culture of M. tuberculosis; visualisation of acid-fast bacilli from a clinical specimen (including histopathology); or positive nucleic acid amplification for M. tuberculosis. Studies that used clinical information to contribute to the classification of patients were also included, and analysed according to how that clinical information was used (see section on outcome measures and subgroups).
Study selection
Screening of titles and abstracts was performed by one reviewer (J. Minion) and selection of full text was performed by two independent reviewers (J. Minion and E. Leung). Articles retrieved for full text review along with reasons for exclusion are available from the authors.
Data extraction
We created and piloted a data extraction form with a subset of eligible studies. Based on experience gained from the pilot extractions, the form was finalised. All studies included in the final review were extracted independently by two reviewers (J. Minion and E. Leung) and any disagreements were resolved by consensus.
Assessment of study quality
We used the QUADAS (Quality Assessment of Diagnostic Accuracy Studies) criteria to assess the quality of diagnostic accuracy studies [11]. All criteria were classified as “yes”, “no” or “unclear” based on information available in the publication. Studies were judged according to the data used for the meta-analysis, which may not have been all data available in the publication. This would apply if the study assessed the performance of the LAM assay in TB suspects as well as healthy controls; if possible, data from healthy controls were excluded for the main analysis.
Outcome measures and subgroups
Three definitions of reference standards were used according to how studies dealt with “reference indeterminate” results. These patients had negative microbiological results (i.e. smear and culture negative), but were considered “clinical cases” of TB on the basis of clinical and radiographical features and/or response to treatment. Analysis A considered only patients with microbiologically confirmed TB to be cases (reference positive); clinical cases were excluded from the analysis and all remaining patients were considered reference negative. Analysis B also defined only patients with microbiologically confirmed TB as reference positive, but all others were reference negative, meaning that clinical cases were considered not to have active TB. Analysis C considered both patients with microbiologically confirmed TB and the clinical cases as reference positive (active TB), and all other patients were considered reference negative. In summary, in the three analyses, the microbiologically negative clinical cases were excluded (A), included as reference negative (B) or included as reference positive (C).
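The effect of the three definitions on a single study's 2×2 table can be illustrated with a minimal sketch. The counts below are hypothetical and are not taken from any included study; the class and function names are illustrative only.

```python
from dataclasses import dataclass


@dataclass
class Group:
    """LAM results within one reference-standard group (hypothetical counts)."""
    lam_pos: int
    lam_neg: int


def accuracy(confirmed: Group, clinical: Group, non_tb: Group, analysis: str):
    """Return (sensitivity, specificity) under analysis 'A', 'B' or 'C'."""
    tp, fn = confirmed.lam_pos, confirmed.lam_neg
    fp, tn = non_tb.lam_pos, non_tb.lam_neg
    if analysis == "B":      # clinical cases counted as reference negative
        fp += clinical.lam_pos
        tn += clinical.lam_neg
    elif analysis == "C":    # clinical cases counted as reference positive
        tp += clinical.lam_pos
        fn += clinical.lam_neg
    elif analysis != "A":    # in A, clinical cases are simply left out
        raise ValueError("analysis must be 'A', 'B' or 'C'")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical study: 50 confirmed cases, 20 clinical cases, 100 without TB.
confirmed = Group(lam_pos=30, lam_neg=20)
clinical = Group(lam_pos=5, lam_neg=15)
non_tb = Group(lam_pos=4, lam_neg=96)

results = {a: accuracy(confirmed, clinical, non_tb, a) for a in "ABC"}
for name, (sens, spec) in results.items():
    print(f"Analysis {name}: sensitivity={sens:.3f}, specificity={spec:.3f}")
```

In this hypothetical example, reclassifying the clinical cases changes only specificity in analysis B and only sensitivity in analysis C, relative to analysis A.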
Pre-specified subgroup analyses were performed according to patient HIV status, whether urine was fresh or frozen prior to LAM assay testing, and the version of the LAM assay used. Subgroup analysis was planned for different CD4 count measurements within the HIV-positive subgroup. Meta-regression was performed relating the proportion of HIV-positive patients in a study (independent variable) to the resultant LAM sensitivity (dependent variable). The proportion of specimens testing positive by the LAM assay was compared between categories of varying TB likelihood and presumed bacillary burden: smear-positive specimens; culture-positive specimens; smear-negative specimens; clinically diagnosed patients; suspects investigated for active TB; and healthy or not-at-risk controls.
Analysis
Data were analysed using STATA/IC 11.0 (StataCorp LP, College Station, TX, USA). Forest plots visually displaying sensitivity and specificity estimates and their 95% confidence intervals (using exact methods for proportions) from each study were constructed using MetaDiSc 1.4 software [12]. Hierarchical summary receiver operating characteristic (HSROC) curves were analysed to explore the influence of threshold effects and produce a global summary of test accuracy [13].
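As an illustration of an exact confidence interval for a proportion, the sketch below implements the Clopper-Pearson interval using only the Python standard library. It assumes that "exact methods" refers to the Clopper-Pearson approach, which is the usual meaning but is not stated here; the function names are illustrative and this is not the MetaDiSc implementation.

```python
from math import comb


def binom_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _solve(f, target: float) -> float:
    """Bisection on [0, 1] for f(p) = target, where f is decreasing in p."""
    lo, hi = 0.0, 1.0
    for _ in range(100):
        mid = (lo + hi) / 2
        if f(mid) > target:   # p still too small
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def exact_ci(x: int, n: int, alpha: float = 0.05):
    """Clopper-Pearson interval for x positives out of n specimens."""
    # Lower limit solves P(X >= x | p) = alpha/2.
    lower = 0.0 if x == 0 else _solve(
        lambda p: binom_cdf(x - 1, n, p), 1 - alpha / 2)
    # Upper limit solves P(X <= x | p) = alpha/2.
    upper = 1.0 if x == n else _solve(
        lambda p: binom_cdf(x, n, p), alpha / 2)
    return lower, upper


# Hypothetical study: 28 LAM-positive results among 50 confirmed TB cases.
low, high = exact_ci(28, 50)
print(f"sensitivity = {28 / 50:.2f}, 95% CI {low:.3f} to {high:.3f}")
```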
Accuracy measures were pooled using bivariate random effects regression models [14], using the user-written program “metandi”, and meta-regression was performed using “metareg” in STATA [15]. Heterogeneity of accuracy estimates was assessed using the I² statistic [16]. Subgroups with fewer than four studies were combined using univariate random effects models because bivariate random effects regression models do not converge with small numbers of studies. The proportions of LAM-positive specimens by likelihood category were also combined using univariate random effects models.
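For reference, the inconsistency statistic is commonly computed from Cochran's chi-squared heterogeneity statistic Q and its degrees of freedom as I² = max(0, (Q − df)/Q) × 100%. The short sketch below applies this standard formula to the chi-squared values reported with the analysis A forest plots (fig. 3); the function name is illustrative.

```python
def i_squared(q: float, df: int) -> float:
    """Inconsistency I^2 (%) from Cochran's Q and its degrees of freedom."""
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100.0


# Chi-squared values and degrees of freedom as reported for figure 3.
sens_i2 = i_squared(202.55, 6)   # sensitivity panel
spec_i2 = i_squared(38.63, 6)    # specificity panel
print(f"I2 sensitivity: {sens_i2:.1f}%, I2 specificity: {spec_i2:.1f}%")
```

Rounded to one decimal place, these reproduce the 97.0% and 84.5% values given in the figure 3 legend.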
RESULTS
Selection of included studies is summarised in figure 1. We identified 3,141 citations from database searches and an additional 11 through other sources. After duplicates were excluded, 1,559 unique articles remained. Screening of titles and abstracts identified 25 potentially relevant articles that were then retrieved for full text review. Of these, nine studies were considered eligible for this review.
Selection of included studies from systematic review. LAM: lipoarabinomannan.
Characteristics of included studies
Studies that met our selection criteria are described in table 1.
Many studies reported data that could be extracted and analysed in different ways according to how specimens from clinical cases were classified (see section on outcome measures and subgroups). Out of the nine total studies, seven reported results that allowed an analysis that excluded clinical cases (analysis A). Seven studies reported results that allowed an analysis using strict microbiological criteria as the reference standard (analysis B), and five studies reported results that could be analysed to include the clinical cases, along with microbiologically confirmed cases, as active TB (analysis C).
A summary of the quality of the studies, judged using the QUADAS criteria for diagnostic studies [11], is displayed graphically in figure 2. Seven of the studies did not report adequate information to assess all 14 criteria. The included studies met between four and 14 out of the 14 quality indicators.
Quality of studies using QUADAS (Quality Assessment of Diagnostic Accuracy Studies) criteria. Note that QUADAS criteria were judged on data used for meta-analysis, not necessarily all data presented in studies.
Accuracy estimates
When clinical cases were excluded (analysis A), seven studies reported a wide range of sensitivity estimates (13–93%), while specificity estimates were less variable (87–99%) (fig. 3). In the seven studies that could be analysed to consider only microbiologically confirmed cases as active TB (analysis B), sensitivity estimates varied from 13% to 81%, while specificity estimates ranged from 79% to 100% (online supplementary fig. 1). In the analysis combining clinical and microbiological cases as active TB (analysis C), sensitivity ranged from 8% to 80%, while specificity ranged from 88% to 99% in five studies (online supplementary fig. 2).
Forest plots of studies contributing to analysis A (n=7). Squares represent point estimates of a) sensitivity and b) specificity from each study reporting results eligible for analysis by excluding microbiologically negative clinical cases. The size of the square is proportionate to the size of the study. Solid lines represent 95% confidence intervals. a) Chi-squared value 202.55, six degrees of freedom (p<0.0001), inconsistency (I²) 97.0%. b) Chi-squared value 38.63, six degrees of freedom (p<0.0001), I² 84.5%.
When these studies were combined there was significant heterogeneity in nearly all of the pooled estimates; hence, these should be interpreted with caution (table 2). Using bivariate random effects models, pooled sensitivity estimates (with 95% CI) ranged between 34 (14–62)% and 60 (38–79)% depending on the definition of the reference standard (table 2). Sensitivity was highest when clinical cases were excluded from the analysis (analysis A), and lowest if clinical cases were considered active TB (analysis C). Pooled specificity estimates (with 95% CI) for analyses A, B and C were 93 (88–96)%, 93 (83–97)% and 94 (87–98)%, respectively.
HSROC curves
Online supplementary figures 3–5 plot the sensitivity (true-positive rate) against 1-specificity (false-positive rate) and partial curves are constructed from the bivariate random effects regression models used to calculate the pooled estimates. This visually demonstrates the large variation in sensitivity estimates between studies.
Linear regression of sensitivity on the proportion of HIV-positive subjects included in studies. Open circles represent studies reporting HIV prevalence; sizes of the circles depend on the precision of each study estimate (i.e. the inverse of its within-study variance). The line represents fitted values for the linear regression equation: sensitivity=0.17 (se 0.18)+0.0042 (se 0.0027)×%HIV. 95% CIs#: α=-0.30 to 0.64; β=-0.0027 to 0.011. Logistic model (not displayed): logit(sensitivity)=-1.63 (se 0.87)+0.021 (se 0.013)×%HIV. 95% CIs#: α=-3.86 to 0.60; β=-0.012 to 0.053. #: not statistically significant.
Proportion of specimens that were lipoarabinomannan (LAM) positive, by clinical status. Pooled proportions were calculated using univariate random effects models. Error bars represent 95% confidence intervals. Smear+: positive smear and culture; culture+: all positive cultures regardless of smear status; smear-: negative smear with positive culture; clinical+: negative culture with clinically diagnosed active tuberculosis (TB); suspect controls: patients with signs and symptoms compatible with active TB, but TB was ruled out; healthy controls: participants without signs or symptoms of active TB.
Subgroup analyses and meta-regression
Five studies compared the performance of the LAM assay in HIV-positive patients and HIV-negative patients. All studies found improved sensitivity in HIV-positive populations (between 2.5% and 52.8% absolute increase in sensitivity), although four out of the five also found small decreases in specificity (between 4.9% and 7.3% absolute decrease in specificity, with the fifth study reporting 0.9% absolute increase). Results that were stratified by HIV status were combined, again separating data depending on their definitions of reference standards (table 2). In all analyses, the LAM assay had higher sensitivity in HIV-positive than HIV-negative patients with little difference in specificity. The largest difference was seen when strict microbiological criteria were used as the reference standard (analysis B): in HIV-negative patients, LAM had an overall sensitivity (95% CI) of 18 (10–9)%, compared with 56 (40–71)% in HIV-positive patients.
Three studies reported diagnostic accuracy based on CD4 counts within HIV-positive TB suspects. These were not meta-analysed due to differences in CD4 cut-offs and different reference standards; however, in all studies, the LAM assays had higher sensitivity in patients with more severe immunosuppression (table 3).
Analysis of subgroups in which the urine used in the assay was either fresh or previously frozen found no statistically significant differences between these groups (table 4). Pooled sensitivity estimates from the two studies evaluating the early noncommercial version of the test were significantly higher than estimates from subsequent studies using either of the commercial assays.
Meta-regression was performed using the overall proportion of HIV-positive TB suspects in each study as an independent predictor of the estimated overall sensitivity, weighted inversely by the standard error of the sensitivity and combined using random effects (fig. 4).
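To show what the fitted meta-regression equations imply, the sketch below evaluates the linear and logistic models using the coefficients given in the legend of figure 4. The function names are illustrative. Because neither slope was statistically significant and the confidence intervals are wide, the fitted values are descriptive only.

```python
from math import exp


def linear_sensitivity(pct_hiv: float) -> float:
    """Fitted sensitivity from the linear model (coefficients from fig. 4)."""
    return 0.17 + 0.0042 * pct_hiv


def logistic_sensitivity(pct_hiv: float) -> float:
    """Fitted sensitivity from the logistic model (coefficients from fig. 4)."""
    logit = -1.63 + 0.021 * pct_hiv
    return 1.0 / (1.0 + exp(-logit))


# Fitted sensitivity for study populations with 0%, 50% and 100% HIV prevalence.
for pct in (0, 50, 100):
    print(f"{pct:3d}% HIV: linear={linear_sensitivity(pct):.2f}, "
          f"logistic={logistic_sensitivity(pct):.2f}")
```

The logistic form keeps fitted sensitivity between 0 and 1 for any HIV prevalence, which the linear form does not guarantee.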
Influence of clinical status on LAM accuracy
We estimated the pooled sensitivity of LAM in patient groups with different definitions of disease (smear-positive TB, all culture-positive TB, smear-negative TB and clinically positive TB) or suspicion of disease (symptomatic non-TB patients and healthy non-TB patients). As shown in figure 5, the proportion of LAM positives increased with greater bacillary burden in TB cases and was also higher in TB suspects than healthy controls.
DISCUSSION
The LAM urine assay was initially seen as a potentially revolutionary diagnostic for active TB. With its potential to be used as a simple point-of-care test, lack of biosafety concerns, and use of a noninvasive, convenient patient specimen, the LAM assay was fast-tracked for commercial development. Despite very positive initial evaluations, larger and more recent studies have failed to demonstrate adequate sensitivity for TB diagnosis under routine conditions in unselected patients.
Multiple explanations may contribute to this observation. The initial study conducted by Hamasur et al. [17] in 2001 can be considered a proof-of-principle demonstration, so higher accuracy may be expected. Tessema et al. [18] evaluated the original noncommercialised test methods described by Hamasur et al. [17] and found higher sensitivity than most subsequent studies using the Chemogen or Alere (Inverness) assays. The differences in the sensitivity of the different versions of the LAM assay may result from simplification of the test for widespread use. However, methodological differences between the earlier and later studies, including differences in study design, patient population (including proportion of HIV-infected patients), disease severity, specimen handling and reference standards are very plausible explanations for the large range of sensitivity estimates reported.
The Chemogen and Alere (Inverness) assays use the same polyclonal antibodies, but the manufacturing process is different. Issues such as pH, viscosity, matrix composition and spraying protocols could have had an impact on test performance. Additionally, preliminary evaluations have found improved sensitivity with a point-of-care lateral flow dipstick prototype (Determine TB; Alere Inc.) compared with the Clearview™ TB ELISA (Alere Inc.) in an HIV-infected patient population [27]. Further evaluations of this new assay will be important, since the point-of-care format has the greatest potential to have an impact upon TB patient care.
This review does not exclude the possibility that the LAM assay is useful in the diagnosis of TB. The practical advantages of the assay are important and the sensitivity of the LAM assay might be superior to microscopy in HIV-positive patients who are more likely to have smear-negative TB and to have disseminated TB. Indeed, the Clearview™ TB ELISA is licensed for use only as a screening test in HIV-positive TB suspects [8], so evaluations of its use in HIV-negative populations must be considered “off label”. Although all studies in this review that reported results by HIV subgroup found higher sensitivity in HIV-positive populations, not all found greater yield with LAM than smear microscopy. The pooled results in figure 5 show that only 57% of smear-positive specimens were LAM positive, suggesting that LAM would not be able to replace smear microscopy. However, we also found that LAM was positive in 41% of smear-negative specimens; this is a relatively large incremental yield compared with smear microscopy. This suggests that the LAM assay and smear microscopy may detect different groups of TB patients, and might be best used in combination, as was reported by Shah et al. [23].
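The idea of incremental yield can be illustrated with simple arithmetic. The sketch below takes the pooled LAM-positive proportions quoted above (57% in smear-positive and 41% in smear-negative, culture-positive specimens) and assumes, purely for illustration, that half of culture-positive cases are smear positive. That proportion is an assumption rather than a finding of this review, and the pooled proportions come from different studies, so the result is a rough illustration only.

```python
def yields(smear_pos_fraction: float,
           lam_in_smear_pos: float = 0.57,
           lam_in_smear_neg: float = 0.41):
    """Return the fraction of culture-positive cases detected by
    (smear alone, LAM alone, smear or LAM)."""
    smear_alone = smear_pos_fraction
    lam_alone = (smear_pos_fraction * lam_in_smear_pos
                 + (1 - smear_pos_fraction) * lam_in_smear_neg)
    # LAM adds only the smear-negative cases that it detects.
    combined = smear_pos_fraction + (1 - smear_pos_fraction) * lam_in_smear_neg
    return smear_alone, lam_alone, combined


# Assumed (hypothetical) smear-positive fraction among culture-positive cases.
smear, lam, both = yields(0.5)
print(f"smear={smear:.3f}, LAM={lam:.3f}, smear or LAM={both:.3f}")
```

Under this assumption LAM alone detects fewer cases than smear alone, while the combination detects more than either test.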
There are several possible reasons to explain the higher sensitivity of LAM assays in immunosuppressed patients. One theory cites the correlation of higher sensitivity with greater bacillary burden (fig. 5) [19, 26], and assumes relatively greater multiplication of M. tuberculosis bacilli in patients with impaired immune function. Alternatively, due to the general lack of cavity formation in immunosuppressed patients, the bacteria are forced to replicate in tissue that would facilitate the diffusion of shed LAM into the circulation. A second explanation is that a larger degree of antigen–antibody complex formation in TB patients without immune suppression interferes with LAM excretion in the urine [28]. Finally, HIV-related podocyte dysfunction, which is more common in advanced HIV, can increase glomerular permeability [29] and might result in increased levels of LAM in patients' urine [27].
Peter et al. [27] have proposed a clinical algorithm for the diagnostic work-up of TB suspects in low-resource settings with high TB and HIV prevalence. In this algorithm, the LAM assay would serve as a “rule-in” test when screening patients at high risk of smear-negative, HIV-associated TB. This role depends upon the LAM assay having a high and reliable specificity. Although most studies have reported high predictive values for positive LAM results, and we found pooled specificity estimates ≥90%, there remains unexplained variability from individual studies (range 79–100%). While this may reflect a shortcoming of the assay, other explanations can be considered. It is possible that subclinical TB is being under-diagnosed, especially in HIV-infected patients, leading to misclassification bias. It is also possible that false-positive results are due to cross-reactivity with nontuberculous mycobacteria (NTM) or other microorganisms. Indeed, Dheda et al. [25] found significant LAM positivity in cultures containing common oral flora, such as Actinobacteria and Candida species. The contamination of urine specimens with normal flora including Candida species, the contamination of reagents and nonsterile containers with NTM, and the colonisation of patients with NTM may all lead to lower predictive values of positive LAM test results.
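The dependence of a rule-in strategy on specificity can be shown with Bayes' theorem. The sketch below uses the pooled sensitivity in HIV-positive patients under analysis B (56%) and varies specificity across the range observed in individual studies (79–100%). The 20% TB prevalence is a hypothetical value chosen for illustration, and the function name is illustrative.

```python
def positive_predictive_value(sens: float, spec: float, prev: float) -> float:
    """Probability of TB given a positive test, by Bayes' theorem."""
    true_pos = sens * prev
    false_pos = (1 - spec) * (1 - prev)
    return true_pos / (true_pos + false_pos)


SENSITIVITY = 0.56   # pooled estimate in HIV-positive patients (analysis B)
PREVALENCE = 0.20    # hypothetical TB prevalence among those tested

for spec in (0.79, 0.93, 1.00):
    ppv = positive_predictive_value(SENSITIVITY, spec, PREVALENCE)
    print(f"specificity={spec:.2f}: PPV={ppv:.2f}")
```

At this assumed prevalence, a fall in specificity from 93% to 79% lowers the positive predictive value substantially, which is why variability in specificity matters for a rule-in role.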
Finally, the benefits of using a noninvasive, easily collected specimen such as urine would be greatly appreciated in the diagnosis of paediatric TB. Considering the inadequate diagnostic yield of sputum-based diagnostics in young children, further evaluations of urinary LAM assays in paediatric populations are warranted.
This review had several strengths, including a broad and inclusive search of the published literature as well as efforts to identify currently unpublished studies. Selection of included studies and extraction of data was performed by two independent reviewers. Additionally, rigorous statistical methods were employed using bivariate random effects models and HSROC curves, as recommended by the Cochrane Diagnostic Test Accuracy Working Group for diagnostic meta-analyses [10]. This review was limited by the small number of studies reporting evaluations of the LAM assay, especially for important subgroups. Most of the pooled estimates should be interpreted with caution in view of their significant heterogeneity.
We performed three analyses depending on how specimens from clinical cases were classified. Differences between the pooled estimates using the three approaches demonstrate how the definition of reference standard can influence the results of diagnostic studies. Although a strict microbiological definition of TB is the most objective reference standard, a certain number of patients will have active TB despite negative cultures [30]. Use of radiological findings, clinical features and treatment response is more subjective, and may lead to over-diagnosis of TB, but exclusion of these clinical cases may lead to overestimates of accuracy.
In conclusion, the LAM urine assay has many characteristics that make it a potentially useful rule-in TB diagnostic, but this review found inadequate sensitivity to use the LAM assay for the diagnosis of active TB in unselected cohorts. The assay performs better in HIV-infected patients, especially those with severe immunodeficiency, but even in HIV-infected persons the sensitivity is suboptimal. Further studies are warranted to evaluate the added value of the LAM assay in the diagnosis of active TB in individuals with advanced HIV and in children, as well as to assess newer versions of this test with technical advances compared with those reviewed here.
Footnotes
This article has supplementary material available from www.erj.ersjournals.com
Support Statement
This study was supported by funding from the Technology, Research, Education and Technical Assistance for Tuberculosis (TREAT TB) initiative by the International Union Against Tuberculosis and Lung Disease (IUATLD), with funding support from the United States Agency for International Development (USAID). J. Minion is a recipient of a Quebec Respiratory Health Training Fellowship. M. Pai is a recipient of a Canadian Institutes of Health Research (CIHR) New Investigator Award and also a consultant to the Bill & Melinda Gates Foundation. D. Menzies is a recipient of a Chercheur National salary award from the Fonds de Recherche en Santé du Québec (FRSQ). K. Dheda is supported by a South African Research Chair Initiative award, a Medical Research Council Career Development Fellowship, the European Union (Framework Programme 7, project TBsusgent) and the European and Developing Countries Clinical Trials Partnership (EDCTP; projects Trials of Excellence in Southern Africa (TESA) and New and Emerging Technologies for TB diagnosis (TB-NEAT)). None of these funding agencies had any role in the design and conduct of this study.
Statement of Interest
None declared.
- Received February 10, 2011.
- Accepted May 7, 2011.
- ©ERS 2011