Brief Report
Sample size calculation should be performed for design accuracy in diagnostic test studies

https://doi.org/10.1016/j.jclinepi.2004.12.009

Abstract

Background and Objectives

Guidelines for conducting studies and reading medical literature on diagnostic tests have been published; requirements for the selection of cases and controls, and for ensuring a correct reference standard, are now clarified. Our objective was to provide tables for sample size determination in this context.

Study Design and Setting

In the usual situation, where the prevalence Prev of the disease of interest is <0.50, one first determines the minimal number Ncases of cases required to ensure a given precision of the sensitivity estimate. Computations are based on the binomial distribution, for user-specified type I and type II error levels. The minimal number Ncontrols of controls is then derived so as to allow for representativeness of the study population, according to Ncontrols = Ncases [(1 − Prev)/Prev].

Results

Tables give the values of Ncases corresponding to expected sensitivities from 0.60 to 0.99, acceptable lower 95% confidence limits from 0.50 to 0.98, and 5% probability of the estimated lower confidence limit being lower than the acceptable level.

Conclusion

When designing diagnostic test studies, sample size calculations should be performed in order to guarantee design accuracy.

Introduction

The importance of sample size calculation in medical research is emphasized in all sets of good clinical practice guidelines. Previous articles have dealt with this issue under various circumstances, and in particular for two-group comparisons within clinical trials [1]. Simel et al. [2] deal with sample sizes based on desired confidence intervals for likelihood ratios. Knottnerus and Muris [3] deal with the whole strategy needed for development of diagnostic tests, but do not provide practical tables for calculating sample sizes in the very situation clinician epidemiologists face when dealing with confidence intervals for sensitivity or specificity.

From a statistical point of view, sample size issues for diagnostic test assessment studies have formal counterparts within the field of clinical trials, so that answers could be derived from published equations and tables [4], at least in principle; how to perform such derivations may not be clear to clinicians.

Moreover, in the case of binary (yes/no) outcome tests, a normal approximation to the binomial distribution is often used [5]. Although the accuracy of the approximation is usually good, modern software allows for exact calculations to be carried out at virtually no extra cost. For instance, a SAS macro is available to compute exact binomial confidence limits [6].
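The exact limit in question is the one-sided Clopper–Pearson bound, obtained by inverting the binomial tail probability. As a minimal illustration (not the cited SAS macro, which is described in [6]), the lower limit can be computed by bisection in a few lines of Python:

```python
from math import comb

def binom_sf(x, n, p):
    """Exact tail probability P(X >= x) for X ~ Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(x, n + 1))

def exact_lower_limit(x, n, alpha=0.05):
    """One-sided exact (Clopper-Pearson) lower confidence limit for a
    binomial proportion: the value L solving P(X >= x | n, L) = alpha."""
    if x == 0:
        return 0.0
    lo, hi = 0.0, 1.0
    for _ in range(60):                  # bisection: P(X >= x | p) rises with p
        mid = (lo + hi) / 2
        if binom_sf(x, n, mid) < alpha:
            lo = mid                     # tail still below alpha: L is larger
        else:
            hi = mid
    return (lo + hi) / 2
```

For example, exact_lower_limit(18, 20) returns the exact one-sided 95% lower limit for 18 positive results out of 20 cases, with no recourse to a normal approximation.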

Our objective was to describe the determination of sample size for binary diagnostic test assessment studies, and to provide exact tables based on the binomial distribution.


Definitions

Assessing a diagnostic test procedure with binary (yes/no) outcome entails determining the operating characteristics of the test with respect to some disease of interest. The intrinsic characteristics of the test are sensitivity and specificity. Sensitivity (Se) is the probability that the test outcome is positive in a patient who has the disease, and is estimated by the proportion of positive test results among a sample of patients with the disease (cases). Specificity (Sp) is the probability that the test outcome is negative in a patient who does not have the disease, and is estimated by the proportion of negative test results among a sample of patients without the disease (controls).
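Both estimates are simple proportions from the two arms of the study. A minimal sketch (the function name and the counts are ours, chosen for illustration):

```python
def operating_characteristics(tp, fn, tn, fp):
    """Estimate sensitivity and specificity from a 2x2 diagnostic table.
    tp/fn: positive/negative test counts among cases (diseased);
    tn/fp: negative/positive test counts among controls (disease-free)."""
    se = tp / (tp + fn)   # sensitivity: positive results among cases
    sp = tn / (tn + fp)   # specificity: negative results among controls
    return se, sp

# Hypothetical counts: 85 of 100 cases test positive, 90 of 100 controls
# test negative -> Se = 0.85, Sp = 0.90
se, sp = operating_characteristics(85, 15, 90, 10)
```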

Results

Sample sizes corresponding to lower 95% confidence limits, to be violated with probability <5%, are presented in Table 1 and Table 2.

Whenever disease prevalence is <0.50, the following guidelines should be followed. The first step requires an assumption on the expected value of the new diagnostic test sensitivity. The second step is to specify the minimum acceptable lower confidence limit, together with the required probability (which was set here at 0.95) that this limit is not violated. The third step derives the number of controls from the disease prevalence, according to Ncontrols = Ncases [(1 − Prev)/Prev].
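One way to implement this procedure is sketched below, under stated assumptions: a one-sided exact limit, a search for the smallest n meeting the power requirement, and log-space binomial probabilities to avoid underflow at the sample sizes involved. The function names are ours, and the search convention may differ in minor ways from the one used to build the tables:

```python
from math import ceil, exp, lgamma, log

def log_pmf(k, n, p):
    """log P(X = k) for X ~ Binomial(n, p), computed in log space."""
    return (lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
            + k * log(p) + (n - k) * log(1 - p))

def binom_sf(x, n, p):
    """Exact tail probability P(X >= x)."""
    return sum(exp(log_pmf(k, n, p)) for k in range(x, n + 1))

def critical_count(n, se_lower, alpha):
    """Smallest count c whose exact one-sided lower limit reaches se_lower,
    i.e. smallest c with P(X >= c | n, se_lower) <= alpha."""
    tail = 0.0
    for k in range(n, -1, -1):
        tail += exp(log_pmf(k, n, se_lower))
        if tail > alpha:
            return k + 1
    return 0

def exact_n_cases(se_expected, se_lower, alpha=0.05, beta=0.05, n_max=1000):
    """First two steps: smallest number of cases for which the exact lower
    (1 - alpha) confidence limit stays >= se_lower with probability
    >= 1 - beta, assuming the true sensitivity is se_expected."""
    for n in range(1, n_max + 1):
        c = critical_count(n, se_lower, alpha)
        if binom_sf(c, n, se_expected) >= 1 - beta:
            return n
    return None

def n_controls(n_cases, prevalence):
    """Third step: controls for representativeness,
    Ncontrols = Ncases * (1 - Prev) / Prev."""
    return ceil(n_cases * (1 - prevalence) / prevalence)
```

With se_expected = 0.85 and se_lower = 0.80, this search should land close to the tabulated value of 624 cases (Table 1); because exact binomial power oscillates with n, the precise result depends on the search convention.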

Discussion

We have tabulated sample sizes needed for assessing diagnostic procedures (Table 1, Table 2), to help physicians and clinical epidemiologists when designing a study on diagnostic test assessment.

If one expects a sensitivity of 0.85 for a screening test and considers that the lower 95% confidence limit should not fall below 0.80, with 0.95 probability, the exact number of cases is 624 (Table 1). The corresponding normal approximation yields 621 cases, which is only 0.5% lower than the exact value.
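The normal-approximation figure can be reproduced with the standard one-sided formula for comparing a binomial proportion against a boundary value, n = [z_alpha sqrt(p0 q0) + z_beta sqrt(p1 q1)]^2 / (p1 − p0)^2 with p0 the acceptable lower limit and p1 the expected sensitivity. A sketch (our naming; the article's own derivation appears in the full text):

```python
from math import ceil, sqrt
from statistics import NormalDist

def approx_n_cases(se_expected, se_lower, alpha=0.05, beta=0.05):
    """Normal-approximation sample size for the one-sided comparison of a
    binomial proportion (p1 = se_expected) against a boundary (p0 = se_lower):
    n = [z_a*sqrt(p0*q0) + z_b*sqrt(p1*q1)]^2 / (p1 - p0)^2."""
    z_a = NormalDist().inv_cdf(1 - alpha)   # one-sided confidence level
    z_b = NormalDist().inv_cdf(1 - beta)    # required assurance probability
    p0, p1 = se_lower, se_expected
    num = z_a * sqrt(p0 * (1 - p0)) + z_b * sqrt(p1 * (1 - p1))
    return ceil((num / (p1 - p0)) ** 2)

# approx_n_cases(0.85, 0.80) -> 621, about 0.5% below the exact 624
```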

