Abstract
Although interferon-γ release assays (IGRAs) are intended for diagnosing latent tuberculosis (TB), we hypothesised that in a high-burden setting: 1) the magnitude of the response when using IGRAs can distinguish active TB from other diagnoses; 2) IGRAs may aid in the diagnosis of smear-negative TB; and 3) IGRAs could be useful as rule-out tests for active TB.
We evaluated the accuracy of two IGRAs (QuantiFERON®-TB Gold In-tube (QFT-GIT) and T-SPOT®.TB) in 395 patients (27% HIV-infected) with suspected TB in Cape Town, South Africa.
IGRA sensitivity and specificity (95% CI) were 76% (68–83%) and 42% (36–49%) for QFT-GIT and 84% (77–90%) and 47% (40–53%) for T-SPOT®.TB, respectively. Although interferon-γ responses were significantly higher in the TB versus non-TB groups (p<0.0001), varying the cut-offs did not improve discriminatory ability. In smear-negative patients, depending on whether culture-negative patients with clinically diagnosed TB were included in or excluded from the analysis, the negative predictive value (NPV) ranged from 85% to 89% for QFT-GIT and from 87% to 92% for T-SPOT®.TB, and was 98% for chest radiograph. Overall accuracy was independent of HIV status and CD4 count.
In a high-burden setting, IGRAs alone do not have value as rule-in or -out tests for active TB. In smear-negative patients, chest radiography had better NPV even in HIV-infected patients.
Tuberculosis (TB) remains a major global health concern [1, 2]. Diagnosis is the crucial first step to effectively reduce TB cases. However, TB control still relies on tests such as culture, smear microscopy and chest radiographs, despite their known limitations. Culture, the reference standard for active TB, is time-consuming and often not available in resource-poor settings. Smear microscopy, the most rapid and widely used TB test, is highly specific but has poor sensitivity [3]. Furthermore, chest radiograph lacks specificity. The shortcomings of imperfect TB diagnostic tests are even more severe in HIV-infected individuals, as smear positivity can be as low as 20% [4] and the clinical and radiographic signs are often atypical [5–7].
More recently, two quantitative T-cell interferon-γ release assays (IGRAs), namely QuantiFERON®-TB Gold In-tube (QFT-GIT; Cellestis, Carnegie, VIC, Australia) and T-SPOT®.TB (Oxford Immunotec, Abingdon, UK), have been developed as replacements for the tuberculin skin test (TST) for the diagnosis of latent TB infection (LTBI). Since IGRAs cannot distinguish between LTBI and active TB, their use for the diagnosis of active disease has been extensively debated [8–10]. Based on available data from low- and intermediate-burden settings [11–16], many national guidelines have argued against the use of IGRAs for diagnosing active TB [17–19]. This is supported by a recent meta-analysis which showed the limited utility of IGRAs for ruling in or ruling out active TB [20]. Nevertheless, many private providers in high-burden countries (e.g. South Africa and India) are using IGRAs for this purpose [21], and many investigators continue to recommend their use for active TB [22–25]. Thus, there is growing concern about the inappropriate use of IGRAs for the diagnosis of active TB in high-burden settings, particularly to rule in disease and initiate therapy.
There are currently few data from high-burden settings on which to base clinical recommendations. Preliminary evidence shows that given the high levels of interferon (IFN)-γ seen in these settings, altering the cut-off may have discriminatory value [24]. Furthermore, data from low-burden settings suggest that IGRAs may have rule-out value for active TB when combined with other clinical investigations [12, 16, 26]. Thus, even if IGRAs cannot be used to confirm active TB, there is a need to evaluate whether they can be used to exclude active TB in high-burden settings.
We hypothesised that: 1) the magnitude of the IFN-γ response and alternative cut-offs could be useful in discriminating between TB and other diagnoses; and 2) IGRAs may be useful in ruling out active TB when combined with smear microscopy or chest radiography. Thus, can IGRAs aid in rapidly excluding a diagnosis of active TB, particularly in smear-negative patients, with or without information about the chest radiograph? To address these unresolved questions in both HIV-infected and -uninfected patients, we evaluated both commercial IGRAs in 500 consecutive patients with suspected TB who were recruited at two primary care clinics in Cape Town, South Africa.
METHODS
At the University of Cape Town (Cape Town, South Africa), a primary study (TB-NEAT) was conducted to evaluate several TB diagnostic tests and their contributions to the diagnosis of active TB in an HIV-endemic setting. The study consecutively recruited 500 outpatients with suspected pulmonary TB at two primary care clinics over a 3-yr period. To qualify as a TB suspect, an individual had to present with at least one of the following symptoms: cough for >2 weeks, coughing up phlegm, haemoptysis, fatigue, night sweats, fever for >2 weeks, weight loss, loss of appetite, or being bedridden. Only patients aged ≥18 yrs were enrolled into the study. After giving written informed consent, all patients underwent extensive diagnostic testing, which included two sputum cultures, two sputum smears, chest radiography, both IGRAs, HIV testing, and CD4 cell counts for those who were HIV-infected. All patients were interviewed and a questionnaire was completed to capture epidemiological data. The study was approved by the University of Cape Town's Health Sciences Faculty Research Ethics Committee (REC REF 421/2006).
Since culture is considered the reference standard for active TB in adults, a confirmed TB case was defined by at least one of the cultures growing Mycobacterium tuberculosis on the MGIT 960 liquid culture system (BD Diagnostic Systems, Sparks, MD, USA) and obtained from a patient whose clinical presentation was consistent with TB. A patient needed two negative cultures to be classified as having a final culture-negative result. All accuracy measures were calculated using culture as the reference standard (i.e. patients were classified as culture positive or negative). A patient was classified as smear negative if both sputum smears were negative. In addition, analyses were conducted where the culture-negative patients were stratified by whether or not they were clinically suspected of having TB (and hence, treated empirically for TB). Chest radiographs were evaluated and scored by two trained and independent observers using a computerised chest radiograph reading and recording system (CRRS) to determine the extent of disease and presence of cavitation and/or fibrosis [27]. All discrepancies were resolved through a consensus read. Results were classified as consistent or inconsistent with active TB. Radiographs that were taken >3 months after the study entry date were discarded.
Laboratory technicians were blinded to culture results and performed the IGRAs according to the manufacturer’s guidelines. IGRA results were interpreted according to the recommended cut-offs: 0.35 IU·mL−1 for the QFT-GIT and 5 spot-forming units (SFU) for either of the two antigens, early secretory antigenic target (ESAT)6 or culture filtrate protein (CFP)10, for the T-SPOT®.TB. For the QFT-GIT, results were indeterminate if the positive control minus the negative control was <0.05 IU·mL−1 or the negative control was >8.0 IU·mL−1. For the T-SPOT®.TB, results were indeterminate if the negative control had >10 SFU or the positive control had <20 SFU. To avoid overestimating the sensitivity of IGRAs, indeterminate results were included as false negatives if they occurred in culture-positive patients. Indeterminate results were excluded for specificity calculations.
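The interpretation rules described above can be sketched in a few lines of Python. This is only an illustration: the function and argument names are hypothetical, the thresholds are copied from the text as written, and treating a response equal to the cut-off as positive is an assumption.

```python
from typing import Iterable, Optional, Tuple

QFT_CUTOFF = 0.35   # IU/mL, recommended cut-off quoted in the text
TSPOT_CUTOFF = 5    # SFU, applied to either ESAT6 or CFP10


def classify_qft(antigen_response: float, positive_control: float,
                 negative_control: float) -> Optional[bool]:
    """Return True (positive), False (negative) or None (indeterminate)."""
    # Indeterminate criteria as stated in the text.
    if positive_control - negative_control < 0.05 or negative_control > 8.0:
        return None
    # Assumption: a response equal to the cut-off counts as positive.
    return antigen_response >= QFT_CUTOFF


def classify_tspot(esat6_sfu: int, cfp10_sfu: int,
                   positive_control_sfu: int,
                   negative_control_sfu: int) -> Optional[bool]:
    """Return True (positive), False (negative) or None (indeterminate)."""
    if negative_control_sfu > 10 or positive_control_sfu < 20:
        return None
    # Either antigen reaching the cut-off makes the test positive.
    return max(esat6_sfu, cfp10_sfu) >= TSPOT_CUTOFF


def sensitivity_specificity(results: Iterable[Optional[bool]],
                            culture_positive: Iterable[bool]
                            ) -> Tuple[float, float]:
    """Score IGRA results against culture.

    Indeterminate results count as false negatives in culture-positive
    patients and are excluded from the specificity calculation.
    Assumes at least one scorable patient in each culture group.
    """
    tp = fn = tn = fp = 0
    for result, has_tb in zip(results, culture_positive):
        if has_tb:
            if result is True:
                tp += 1
            else:               # negative or indeterminate
                fn += 1
        elif result is not None:
            if result:
                fp += 1
            else:
                tn += 1
    return tp / (tp + fn), tn / (tn + fp)
```

Counting indeterminate results as false negatives lowers the sensitivity estimate, which is the conservative choice described in the text.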
Receiver operating characteristic curve analysis was used to determine alternative cut-offs for the IGRA. For this analysis, all negative and zero IFN-γ responses to the QFT-GIT were re-scaled to 0.01 IU·mL−1, while 10 IU·mL−1 was the maximum response, since the test cannot resolve results beyond this value. For T-SPOT®.TB, the highest number of SFUs for either the ESAT6 or CFP10 antigen was used. The median IFN-γ responses for TB and non-TB patients were compared using a nonparametric test for equality of medians with Pearson's Chi-squared statistic.
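The re-scaling and the cut-off search described above could look roughly like the following sketch. The helper names are hypothetical, the data in the usage example are invented, and calling a result positive when the response is at or above the cut-off is an assumption rather than something stated in the text.

```python
from typing import List, Optional, Sequence, Tuple


def rescale_qft(value: float) -> float:
    """Map negative/zero responses to 0.01 IU/mL and cap at 10 IU/mL."""
    if value <= 0:
        return 0.01
    return min(value, 10.0)


def roc_points(responses: Sequence[float],
               has_tb: Sequence[bool]) -> List[Tuple[float, float, float]]:
    """Return (cut-off, sensitivity, specificity) for each observed value.

    Assumes both groups are non-empty and that a response >= cut-off
    is called positive.
    """
    n_tb = sum(1 for d in has_tb if d)
    n_non_tb = len(has_tb) - n_tb
    points = []
    for cutoff in sorted(set(responses)):
        tp = sum(1 for r, d in zip(responses, has_tb) if d and r >= cutoff)
        tn = sum(1 for r, d in zip(responses, has_tb)
                 if not d and r < cutoff)
        points.append((cutoff, tp / n_tb, tn / n_non_tb))
    return points


def highest_cutoff_reaching(points: Sequence[Tuple[float, float, float]],
                            target_sensitivity: float) -> Optional[float]:
    """Largest cut-off whose sensitivity is at least the target, if any."""
    eligible = [c for c, sens, _ in points if sens >= target_sensitivity]
    return max(eligible) if eligible else None


if __name__ == "__main__":
    # Invented responses, for illustration only.
    raw = [-0.2, 0.0, 0.2, 0.5, 3.0, 12.0]
    tb = [False, False, True, False, True, True]
    pts = roc_points([rescale_qft(v) for v in raw], tb)
    print(highest_cutoff_reaching(pts, 0.90))
```

Lowering the cut-off can only keep sensitivity the same or raise it, while specificity stays the same or falls, so any alternative cut-off has to be judged on both measures together.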
RESULTS
Demographic and clinical characteristics
Of the 500 patients recruited, a final culture result could only be determined for 395 (79%) patients. Of the 105 patients who were excluded, 68 (65%) had one or more unknown culture result while 37 (35%) had at least one contaminated culture. There were significantly more males (p=0.04) and fewer chest radiographs consistent with active TB among the excluded patients (p=0.01; data not shown). Among the 395 patients included in the analysis, 259 (66%) were male. A total of 276 (70%) patients were black African, while the rest were classified as white or mixed race. The mean±sd age of the cohort was 40±12 yrs. HIV status was available for 349 patients: 108 (27% of the cohort) were infected and 241 (61%) were uninfected. The status was missing for the remaining 46 (12%) patients, who refused testing. Table 1 provides the demographic and clinical characteristics for the cohort of 395 patients, stratified by HIV status if known.
For the reference standard results, 138 (35%) and 257 (65%) of patients were classified as culture positive and negative, respectively. A total of 92 (23%) of the patients were smear positive, while 294 (74%) were smear negative. Compared with culture, the sensitivity (95% CI) for smear was 69% (61–77%) and its specificity was expectedly high at 100% (98–100%). Results for chest radiography were unknown or had to be discarded for 84 (21%) of the patients. While the specificity of chest radiography was only 28% (22–34%), its sensitivity was very high at 99% (95–100%).
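Each accuracy estimate above is a proportion with a 95% confidence interval. The text does not say which interval method was used, so the sketch below uses the Wilson score interval purely as an assumption, and the counts in the usage example are illustrative rather than taken from the study's tables.

```python
import math
from typing import Tuple


def wilson_interval(successes: int, total: int,
                    z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for a binomial proportion (default 95%)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    z2 = z * z
    denom = 1 + z2 / total
    centre = (p + z2 / (2 * total)) / denom
    half_width = (z * math.sqrt(p * (1 - p) / total
                                + z2 / (4 * total * total))) / denom
    # Clip to [0, 1] to absorb floating-point rounding at the extremes.
    return max(0.0, centre - half_width), min(1.0, centre + half_width)


if __name__ == "__main__":
    # Illustrative counts: 92 true positives among 133 culture-positive
    # patients with a smear result (hypothetical denominator).
    low, high = wilson_interval(92, 133)
    print(f"sensitivity = {92 / 133:.2f} (95% CI {low:.2f} to {high:.2f})")
```

An exact (Clopper-Pearson) interval would be slightly wider, so small differences from published limits are expected with this method.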
IGRAs as rule-out tests for active TB
The QFT-GIT gave indeterminate results in 47 (12%) of the tests, 16 with TB and 31 without. The T-SPOT®.TB gave indeterminate results in seven (2%) of the tests, one with TB and six without TB. The specificities (95% CI) for the IGRAs were low at 42% (36–49%) for QFT-GIT and 46% (39–52%) for T-SPOT®.TB as were the positive predictive values at 44% (38–51%) and 47% (40–53%) for QFT-GIT and T-SPOT®.TB, respectively. The sensitivities, 76% (68–83%) for QFT-GIT and 84% (77–90%) for T-SPOT®.TB, were higher but still missed culture-confirmed TB cases (results stratified by HIV status are shown later). The negative predictive values (NPV) (95% CI) were 74% (66–82%) and 84% (76–90%) for QFT-GIT and T-SPOT®.TB, respectively. When culture-negative patients empirically treated for TB were excluded, the NPV (95% CI) was 66% (56–76%) for QFT-GIT and 76% (66–85%) for T-SPOT®.TB. As shown in figure 1, reducing the manufacturer's suggested cut-off for the QFT-GIT from 0.35 IU·mL−1 to 0.16 IU·mL−1 would increase the sensitivity to ≥90%. For the T-SPOT®.TB, reducing this cut-off slightly from 5 to 4 SFU would increase the sensitivity to ≥90%.
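The predictive values reported in this section depend on the proportion of patients with culture-confirmed TB as well as on sensitivity and specificity. A minimal sketch of that relationship, using Bayes' rule and a hypothetical function name, is given below. It ignores the special handling of indeterminate results, so it only approximates the published figures.

```python
from typing import Tuple


def predictive_values(sensitivity: float, specificity: float,
                      prevalence: float) -> Tuple[float, float]:
    """Return (PPV, NPV) for a test applied at a given disease prevalence."""
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    false_pos = (1 - specificity) * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


if __name__ == "__main__":
    # T-SPOT.TB point estimates from this section and the 35%
    # culture-positive proportion in the cohort.
    ppv, npv = predictive_values(0.84, 0.46, 0.35)
    print(f"PPV = {ppv:.2f}, NPV = {npv:.2f}")
```

With these inputs the NPV comes out at about 0.84, in line with the reported value. Because NPV falls as prevalence rises, the same assay would exclude disease less reliably in a population with a higher proportion of active TB.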
On a continuous scale, the median IFN-γ response for the QFT-GIT in non-TB patients was 0.59 IU·mL−1 compared with 2.14 IU·mL−1 for confirmed TB patients (Pearson's Chi-squared 16.53; p<0.001). The median number of SFU for the T-SPOT®.TB was eight in non-TB patients and 28 in TB patients (Pearson's Chi-squared 30.92; p<0.001). While the magnitudes of the IFN-γ response are significantly different between the two groups for both IGRAs, there is substantial overlap between TB and non-TB patients (figure 2).
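The comparison of medians above corresponds to a median test with Pearson's Chi-squared statistic. The sketch below shows one way such a test can be computed. It assumes no continuity correction and counts values strictly above the pooled median as "above"; the statistical package and exact options used in the study are not stated, so the details here are assumptions.

```python
import math
from statistics import median
from typing import Sequence, Tuple


def median_test(group_a: Sequence[float],
                group_b: Sequence[float]) -> Tuple[float, float]:
    """Return (Pearson chi-squared, p-value) for equality of two medians."""
    pooled_median = median(list(group_a) + list(group_b))
    # 2x2 table: rows are groups, columns are above / not above the median.
    table = []
    for group in (group_a, group_b):
        above = sum(1 for x in group if x > pooled_median)
        table.append([above, len(group) - above])
    n = len(group_a) + len(group_b)
    row_totals = [sum(row) for row in table]
    col_totals = [table[0][j] + table[1][j] for j in range(2)]
    if 0 in row_totals or 0 in col_totals:
        raise ValueError("table has an empty margin; test is undefined")
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (table[i][j] - expected) ** 2 / expected
    # Survival function of a chi-squared variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value
```

A significant result from this test only says that the group medians differ. It says nothing about how much the two distributions overlap, which is what limits the usefulness of any single cut-off.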
Among smear-negative patients, both IGRAs performed similarly. The NPV (95% CI) was 89% (82–95%) for QFT-GIT and 92% (85–96%) for T-SPOT®.TB. When culture-negative patients empirically treated for TB were excluded, the NPV was 85% (75–92%) and 87% (78–94%) for QFT-GIT and T-SPOT®.TB, respectively. As shown in table 2, the chest radiograph had a higher NPV than the IGRAs. As mentioned previously, chest radiograph results alone gave near-perfect sensitivity in unselected patients. In smear-negative patients, its sensitivity and NPV (95% CI) were 97% (86–100%) and 98% (91–100%), respectively.
Results stratified by HIV status
The rate of indeterminate results was higher for HIV-infected patients for the QFT-GIT (25 versus 6%), while the rate for T-SPOT®.TB was 2% regardless of HIV status. When stratified by HIV status, the sensitivity (95% CI) of QFT-GIT was lower in HIV-infected compared with HIV-uninfected patients: 67% (51–80%) versus 82% (71–89%). The results for T-SPOT®.TB were more similar across the groups. In HIV-infected patients, the sensitivity was 82% (67–93%), while it was 85% (76–92%) in HIV-uninfected patients. In smear-negative patients, the NPV was 88% (71–97%) and 91% (77–97%) for QFT-GIT and T-SPOT®.TB, respectively, among HIV-infected patients. The IGRAs corresponded well for smear-negative, HIV-uninfected patients, with both assays giving a NPV of 91% (81–97%). However, chest radiography performed better than the IGRAs. While the sensitivity and NPV were near-perfect in uninfected individuals, they were actually 100% in those infected with HIV (table 3).
Overall, the magnitude of IFN-γ responses for both non-TB and TB patients was lower in HIV-infected compared with HIV-uninfected patients. For the QFT-GIT, the median IFN-γ response in non-TB versus TB patients was 0.08 versus 1.57 IU·mL−1 (Chi-squared 7.5; p=0.006) in HIV-infected patients and 0.81 versus 2.28 IU·mL−1 (Chi-squared 11.47; p<0.001) in HIV-uninfected patients. For the T-SPOT®.TB, the median number of SFU was 4 versus 20 in HIV-infected patients (Chi-squared 19.41; p<0.001) and 12 versus 29 (Chi-squared 14.93; p<0.001) in HIV-uninfected patients. Despite these significant differences, figure 3 shows that the magnitude of the response has limited discriminatory ability because of the substantial overlap between TB and non-TB patients.
Among the 108 HIV-infected patients, CD4 cell counts were available for 101 (94%) of them. The median CD4 count was 182 cells·μL−1 (range 10–935). A total of 53 (52%) patients had CD4 counts <200 cells·μL−1 and 48 (48%) patients had CD4 counts of ≥200 cells·μL−1. When stratified by CD4 cell counts, the sensitivity (95% CI) of the IGRAs was actually higher in patients with <200 cells·μL−1 compared with those with ≥200 cells·μL−1: 76% (53–92%) versus 61% (36–83%) for QFT-GIT and 90% (67–99%) versus 78% (52–94%) for T-SPOT®.TB. In smear-negative patients, the NPV (95% CI) of QFT-GIT was 89% (65–99%) and 83% (52–98%) for CD4 counts of <200 cells·μL−1 and ≥200 cells·μL−1, respectively. T-SPOT®.TB results were similar across the groups: 91% (71–99%) and 90% (67–99%). Again, chest radiography performed better than the IGRAs. Both the sensitivity and NPV reached 100% regardless of the degree of immunosuppression in HIV-infected individuals (table 4).
DISCUSSION
IGRAs, like the TST, were designed for diagnosis of LTBI and not active TB. Limited evidence has shown that IGRAs have modest predictive value for progression to active disease, perhaps of the same magnitude as the TST, which means that we still do not have adequate biomarkers for predicting disease progression [28]. There is growing concern about the use of IGRAs in high-burden settings to rule in and out active disease. To our knowledge, this is the first prospective study in a high-burden setting that recruited consecutive adult patients with suspected TB and performed a head-to-head comparison of both IGRA assays for the diagnosis of active TB. There are four major findings of our study: 1) IGRAs have no rule-in value for active TB in a high-burden setting and using different cut-offs does not improve the rule-in ability; 2) these conclusions hold true even in HIV-infected patients; 3) IGRAs on their own have no rule-out value for active TB (i.e. a negative test cannot exclude active disease); and 4) although the NPV of the T-SPOT®.TB was higher than the QFT-GIT assay, it was not high enough to confidently rule out TB in smear-negative patients (i.e. negative smears followed by negative T-SPOT®.TB) though a similar result could be achieved with a chest radiograph.
Since IGRAs are unable to distinguish between LTBI and active TB, they will always have poor specificity in areas with high prevalence of LTBI. We calculated the specificity in TB suspect patients who turned out to have alternative diagnoses, which better represents the accuracy in routine clinical practice. Our results show that the background LTBI prevalence for TB suspect patients in our setting ranges from 54 to 58%. Furthermore, given the inadequate sensitivity of the IGRAs, they cannot be used alone to rule out active TB. This is particularly important for HIV-infected patients. In our analysis, the sensitivity in HIV-infected patients was lower for the QFT-GIT and there were substantially more indeterminate reactions. One study showed a similar result [29], while another study found that the QFT-GIT sensitivity was higher in HIV-positive patients (81 versus 73%) [30]. In the current study, the sensitivity for T-SPOT®.TB was >80% and less affected by HIV status. A study conducted among TB-suspected patients with advanced HIV disease reported a suboptimal sensitivity of 73% for the T-SPOT®.TB [31]. A recent meta-analysis has shown that IGRA sensitivity tends to be lower in HIV-infected individuals [32]. In addition, a recent study found that the QFT-GIT but not the T-SPOT®.TB was affected by the degree of immunosuppression [33]. The reason for the assay-specific performance variability in HIV-infected patients remains unclear but may be related to the inherently better sensitivity of the ELISPOT technique, serum interleukin-10 levels or the immunomodulatory effect of TB 7.7 [34]. The higher sensitivity of both IGRAs for patients with CD4 counts of <200 cells·μL−1 compared with those without immunosuppression is harder to explain but could be due to the small numbers and overlapping 95% confidence intervals. Furthermore, HIV-infected patients have attenuated host immunity and are more prone to infection with less virulent strains. There is evidence that strain differences in Africa impact on IGRA T-cell responses [35]. Thus, we cannot exclude the possibility that strain differences may partly explain this finding.
Previous IGRA studies in high-burden settings have been conducted among confirmed TB patients. One study from India [36] and another from South Africa [30] reported sensitivities of 91 and 76%, respectively, for the QFT-GIT, compared with our figure of 76% in TB-suspected patients. Both studies reported that the QFT-GIT plus TST combination achieved a sensitivity of ≥96% and could be useful for excluding active TB. However, the TST is not used routinely for the diagnosis of active TB in high-burden settings, is labour intensive and would compromise an already overwhelmed healthcare system. Thus, the routine diagnostic work-up for active TB in adults consists of smear, culture and chest radiography.
We used the CRRS, a validated tool for reporting radiological abnormalities on chest films with good interobserver agreement. Chest radiography offers more clinically useful information on extent of disease and is more easily available compared with the IGRAs in most settings. The role of chest radiography for diagnosing active TB in HIV-endemic settings has been inconsistent [37], largely due to the variability in reporting methods. The development of the CRRS was intended to address this limitation, and the high sensitivity and NPV of the CRRS found in our study will require confirmation in other similar settings. One study has shown that the sensitivity and NPV of the CRRS are inadequate to be used as a screening tool for patients with advanced HIV disease who are starting antiretroviral treatment [38]. More studies are needed to evaluate the potential usefulness of the CRRS as a rule-out test in patients with and without profound immunosuppression.
Compared with conventional diagnostic tests, the IGRAs are expensive and require more laboratory infrastructure. While the QFT-GIT yielded more indeterminate results, the T-SPOT®.TB had more unknown test results due to logistical laboratory issues. This finding could help explain, in part, the difference in diagnostic accuracy between the two assays. Other studies have also shown that the T-SPOT®.TB is prone to technical errors in the processing stage due to its operational complexity [39–41]. While IGRAs have no real value for ruling in TB, we evaluated their role as rule-out tests for smear-negative TB. Although the T-SPOT®.TB assay had a higher NPV than the QFT-GIT in this context, it was still not high enough to confidently rule out TB (i.e. approximately one in 10 smear-negative patients with a negative IGRA would still have TB and be missed). Nevertheless, the same result could be achieved with a chest radiograph, which is more readily accessible to national TB programmes and hospitals in high-burden settings. Thus, inappropriate use of the commercial IGRAs is not only a waste of resources for national TB programmes and the patients themselves but may also contribute to fostering drug resistance due to either under- or overtreatment. Indeed, a World Health Organization (WHO) expert group has reviewed the evidence on IGRAs for low- and middle-income countries and recommended against their use for diagnosing active TB [42]. Our data are supportive of this WHO recommendation against the use of IGRAs for active TB diagnosis in high-TB and HIV burden settings.
In South Africa and other low- or middle-income countries, neither chest radiography nor IGRAs are the usual standard of care, although chest radiography is more easily available. Our study shows that active TB can be reliably excluded in a patient with a chest film that is not consistent with active disease. Thus, this diagnostic strategy can provide patients with rapid exclusion of active TB on an “inform and advise” basis without further investigations, thereby reducing the patient load in high-volume clinics.
Conclusions
Our study confirms that commercial IGRAs, like the TST, cannot be used to rule in active TB in areas with a high prevalence of LTBI. They should also not be used to rule out disease when performed alone, due to their suboptimal sensitivity. When combined with a negative smear, the T-SPOT®.TB assay may be able to rule out active TB reliably, although a similar result could be achieved with a chest radiograph. These findings have great relevance for clinical practice in high-burden, low-resource settings and are consistent with recent WHO recommendations on IGRAs in low- and middle-income countries.
Acknowledgments
The authors are grateful to R. Ditsoane and D. Siganga (Lung Infection and Immunity Unit, University of Cape Town, Cape Town, South Africa) for their contributions to the TB-NEAT study.
Footnotes
Support Statement
This study was supported by a TBsusgent grant from the European Commission (EU-FP7), the EDCTP (TESA and TB-NEAT), and the Canadian Institutes of Health Research (CIHR MOP-89918). D.I. Ling is supported by fellowships from the MUHC Research Institute and McGill University Faculty of Medicine. M. Pai is supported by a CIHR New Investigator Award from the Canadian Institutes of Health Research. R. Van Zyl-Smit is supported by a Fogarty International Clinical Research Scholars/Fellows Support Centre NIH grant R24TW007988.
Statement of Interest
A statement of interest for M. Pai can be found at www.erj.ersjournals.com/site/misc/statements.xhtml
- Received November 25, 2010.
- Accepted February 7, 2011.
- ©ERS 2011