Brief Report
Sample size calculation should be performed for design accuracy in diagnostic test studies
Introduction
The importance of sample size calculation in medical research is emphasized in all sets of good clinical practice guidelines. Previous articles have dealt with this issue under various circumstances, in particular for two-group comparisons within clinical trials [1]. Simel et al. [2] deal with sample sizes based on desired confidence intervals for likelihood ratios. Knottnerus and Muris [3] cover the overall strategy needed for developing diagnostic tests, but do not provide practical tables for calculating sample sizes in the common situation where clinical epidemiologists must work with confidence intervals for sensitivity or specificity.
From a statistical point of view, sample size issues for diagnostic test assessment studies have formal counterparts within the field of clinical trials, so that answers could be derived from published equations and tables [4], at least in principle; how to perform such derivations may not be clear to clinicians.
Moreover, in the case of binary (yes/no) outcome tests, a normal approximation to the binomial distribution is often used [5]. Although the accuracy of the approximation is usually good, modern software allows for exact calculations to be carried out at virtually no extra cost. For instance, a SAS macro is available to compute exact binomial confidence limits [6].
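As a sketch of what such an exact computation involves (this is our own illustration, not the SAS macro of [6]), the one-sided Clopper-Pearson lower confidence limit can be obtained with nothing more than the binomial tail probability and a bisection:

```python
from math import comb

def binom_tail(x, n, p):
    """P(X >= x) for X ~ Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(x, n + 1))

def cp_lower(x, n, alpha=0.05):
    """One-sided exact (Clopper-Pearson) lower confidence limit for a
    binomial proportion x/n: the p solving P(X >= x | n, p) = alpha."""
    if x == 0:
        return 0.0
    lo, hi = 0.0, 1.0
    for _ in range(100):            # bisection; binom_tail increases with p
        mid = (lo + hi) / 2
        if binom_tail(x, n, mid) < alpha:
            lo = mid                # tail too small: true limit is higher
        else:
            hi = mid
    return (lo + hi) / 2
```

For the boundary case x = n the tail reduces to p^n, so the lower limit is alpha^(1/n), which gives an easy sanity check on the bisection.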
Our objective was to describe the determination of sample size for binary diagnostic test assessment studies, and to provide exact tables based on the binomial distribution.
Definitions
Assessing a diagnostic test procedure with binary (yes/no) outcome entails determining the operating characteristics of the test with respect to some disease of interest. The intrinsic characteristics of the test are sensitivity and specificity. Sensitivity (Se) is the probability that the test outcome is positive in a patient who has the disease, and is estimated by the proportion of positive test results among a sample of patients with the disease (cases). Specificity (Sp) is the probability that the test outcome is negative in a patient who does not have the disease, and is estimated by the proportion of negative test results among a sample of patients without the disease (controls).
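These definitions translate directly into code; a minimal sketch (the variable names are ours, not the paper's):

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Point estimates of Se and Sp from a 2x2 diagnostic table:
    tp/fn are counted among diseased cases, tn/fp among non-diseased controls."""
    se = tp / (tp + fn)   # proportion of positive results among cases
    sp = tn / (tn + fp)   # proportion of negative results among controls
    return se, sp

# e.g. 85 true positives and 15 false negatives among 100 cases,
# 90 true negatives and 10 false positives among 100 controls
se, sp = sensitivity_specificity(tp=85, fn=15, tn=90, fp=10)  # → (0.85, 0.9)
```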
Results
Sample sizes corresponding to lower 95% confidence limits, to be violated with probability <5%, are presented in Tables 1 and 2.
Whenever disease prevalence is <0.50, the following guidelines should be followed. The first step requires an assumption about the expected sensitivity of the new diagnostic test. The second step is to specify the minimum acceptable lower confidence limit, together with the required probability (set here at 0.95) that this limit is not violated.
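Under these assumptions, the exact search the tables summarize can be sketched as follows (our own naive implementation, not the authors' program): find the smallest number of cases n for which some critical count x both keeps the one-sided exact lower limit at or above the minimum acceptable value and is reached with the required probability.

```python
from math import comb

def binom_tail(x, n, p):
    """P(X >= x) for X ~ Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(x, n + 1))

def exact_n(p_min, p_expected, alpha=0.05, power=0.95, n_max=2000):
    """Smallest n such that, with probability >= `power` when the true
    sensitivity is `p_expected`, the observed count x yields an exact
    one-sided (1 - alpha) lower limit of at least `p_min`.
    The lower limit of x/n is >= p_min exactly when P(X >= x | p_min) <= alpha."""
    for n in range(1, n_max + 1):
        for x in range(n + 1):
            # smallest critical count whose lower limit does not violate p_min
            if binom_tail(x, n, p_min) <= alpha:
                if binom_tail(x, n, p_expected) >= power:
                    return n
                break
    return None
```

With p_min = 0.80 and p_expected = 0.85 this search should reproduce the 624 cases reported in Table 1, although in this naive form it is slow for large n; the published tables spare the reader the computation.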
Discussion
We have tabulated sample sizes needed for assessing diagnostic procedures (Tables 1 and 2), to help physicians and clinical epidemiologists when designing a study on diagnostic test assessment.
If one expects a sensitivity of 0.85 for a screening test and requires that the lower 95% confidence limit not fall below 0.80, with 0.95 probability, the exact number of cases is 624 (Table 1). The corresponding normal approximation yields 621 cases, which is only 0.5% lower than the exact value.
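The normal approximation quoted here has a closed form. A sketch, in our notation (p0 is the minimum acceptable lower limit, p1 the expected sensitivity); this is one standard form of the one-sided approximation, and it reproduces the 621 cases above:

```python
from math import sqrt, ceil
from statistics import NormalDist

def approx_n(p0, p1, alpha=0.05, beta=0.05):
    """Normal-approximation sample size for requiring that the one-sided
    (1 - alpha) lower confidence limit stay above p0 with probability
    1 - beta, when the true proportion is p1."""
    z_a = NormalDist().inv_cdf(1 - alpha)
    z_b = NormalDist().inv_cdf(1 - beta)
    n = ((z_a * sqrt(p0 * (1 - p0)) + z_b * sqrt(p1 * (1 - p1))) / (p1 - p0)) ** 2
    return ceil(n)

approx_n(0.80, 0.85)   # → 621, matching the approximation quoted in the text
```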
References
[1] et al. Estimating sample sizes for binary, ordered categorical, and continuous outcomes in two group comparisons. BMJ (1995).
[2] Simel et al. Likelihood ratios with confidence: sample size estimation for diagnostic test studies. J Clin Epidemiol (1991).
[3] Knottnerus and Muris. Assessment of the accuracy of diagnostic tests: the cross-sectional study. J Clin Epidemiol (2003).
[4] et al. Sample size tables for clinical studies (1997).
[5] et al. Statistical methods in medical research (1987).
[6] Simple SAS macros for the calculation of exact binomial and Poisson confidence limits. Comput Biol Med (1992).