Nonparametric estimation of Shannon’s index of diversity when there are unseen species in sample

Chao, Anne; Shen, Tsung-Jen

doi:10.1023/A:1026096204727

Published: December 2003

Environmental and Ecological Statistics, Volume 10, pages 429–443 (2003)

Abstract

A biological community usually has a large number of species with relatively small abundances. When a random sample of individuals is selected and each individual is classified according to species identity, some rare species may not be discovered. This paper is concerned with the estimation of Shannon’s index of diversity when the number of species and the species abundances are unknown. The traditional estimator that ignores the missing species underestimates when there is a non-negligible number of unseen species. We provide a different approach based on unequal probability sampling theory because species have different probabilities of being discovered in the sample. No parametric forms are assumed for the species abundances. The proposed estimation procedure combines the Horvitz–Thompson (1952) adjustment for missing species and the concept of sample coverage, which is used to properly estimate the relative abundances of species discovered in the sample. Simulation results show that the proposed estimator works well under various abundance models even when a relatively large fraction of the species is missing. Three real data sets, two from biology and the other one from numismatics, are given for illustration.
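The abstract names two ingredients: a sample-coverage correction for the relative abundances of observed species and a Horvitz–Thompson adjustment for species that were never seen. The sketch below shows how these two ideas are commonly combined in the form usually attributed to this paper. The formulas are an assumption here, since this page does not state them: coverage is estimated as 1 − f1/n (with f1 the number of singleton species and n the sample size), each observed proportion is shrunk by that coverage, and each entropy term is divided by the estimated probability that the species appears in a sample of size n. The function names and the fallback for an all-singleton sample are illustrative choices, not taken from the article.

```python
import math


def shannon_plugin(counts):
    """Traditional estimator: plug observed proportions into Shannon's index."""
    counts = [x for x in counts if x > 0]
    n = sum(counts)
    if n == 0:
        raise ValueError("sample must contain at least one individual")
    return -sum((x / n) * math.log(x / n) for x in counts)


def shannon_coverage_ht(counts):
    """Coverage-adjusted, Horvitz-Thompson-weighted estimate of Shannon's index.

    Assumed form (not stated on this page):
        C   = 1 - f1 / n                  estimated sample coverage
        p_i = C * x_i / n                 coverage-adjusted relative abundance
        H   = -sum_i p_i * log(p_i) / (1 - (1 - p_i) ** n)
    """
    counts = [x for x in counts if x > 0]
    n = sum(counts)
    if n == 0:
        raise ValueError("sample must contain at least one individual")

    # f1: number of species observed exactly once (singletons).
    f1 = sum(1 for x in counts if x == 1)
    if f1 == n:
        # Assumption: when every individual is a singleton the coverage
        # estimate would be zero, so nudge f1 down to keep it positive.
        f1 = n - 1

    coverage = 1.0 - f1 / n

    total = 0.0
    for x in counts:
        p = coverage * x / n
        # Estimated probability that this species shows up in n draws.
        inclusion = 1.0 - (1.0 - p) ** n
        total -= p * math.log(p) / inclusion
    return total


if __name__ == "__main__":
    # Hypothetical abundance counts, one entry per observed species.
    sample = [10, 5, 3, 1, 1, 1]
    print("plug-in estimate:  ", shannon_plugin(sample))
    print("adjusted estimate: ", shannon_coverage_ht(sample))
```

For the hypothetical sample above, which contains three singletons, the adjusted value comes out larger than the plug-in value, in line with the abstract's remark that the traditional estimator underestimates when species are missing. When a sample has no singletons and every species is well represented, the two estimates nearly coincide.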

References

Ashbridge, J. and Goudie, I.B.J. (2000) Coverage-adjusted estimators for mark-recapture in heterogeneous populations. Communications in Statistics-Simulation, 29, 1215–37.
Basharin, G.P. (1959) On a statistical estimate for the entropy of a sequence of independent random variables. Theory of Probability and Its Applications, 4, 333–6.
Batten, L.A. (1976) Bird communities of some Killarney woodlands. Proceedings of the Royal Irish Academy, 76, 285–313.
Bunge, J. and Fitzpatrick, M. (1993) Estimating the number of species: a review. Journal of the American Statistical Association, 88, 364–73.
Bunge, J., Fitzpatrick, M., and Handley, J. (1995) Comparison of three estimators of the number of species. Journal of Applied Statistics, 22, 45–59.
Chao, A. and Lee, S.-M. (1992) Estimating the number of classes via sample coverage. Journal of the American Statistical Association, 87, 210–17.
Chao, A., Hwang, W.-H., Chen, Y.-C., and Kuo, C.-Y. (2000) Estimating the number of shared species in two communities. Statistica Sinica, 10, 227–46.
Chao, A., Ma, M.-C., and Yang, M.C.K. (1993) Stopping rules and estimation for recapture debugging with unequal failure rates. Biometrika, 80, 193–201.
Colwell, R.K. and Coddington, J.A. (1994) Estimating terrestrial biodiversity through extrapolation. Philosophical Transactions of the Royal Society, London B, 345, 101–18.
Efron, B. and Tibshirani, R.J. (1993) An Introduction to the Bootstrap, Chapman and Hall, New York.
Engen, S. (1978) Stochastic Abundance Models, Halsted Press, New York.
Esty, W. (1986) The efficiency of Good's nonparametric coverage estimator. The Annals of Statistics, 14, 1257–60.
Good, I.J. (1953) The population frequencies of species and the estimation of population parameters. Biometrika, 40, 237–64.
Haas, P. and Stokes, L. (1998) Estimating the number of classes in a finite population. Journal of the American Statistical Association, 93, 1475–87.
Holst, L. (1981) Some asymptotic results for incomplete multinomial or Poisson samples. Scandinavian Journal of Statistics, 8, 243–6.
Horvitz, D.G. and Thompson, D.J. (1952) A generalization of sampling without replacement from a finite universe. Journal of the American Statistical Association, 47, 663–85.
Hutcheson, K. and Shenton, L.R. (1974) Some moments of an estimate of Shannon's measure of information. Communications in Statistics, 3, 89–94.
Janzen, D.H. (1973a) Sweep samples of tropical foliage insects: description of study sites, with data on species abundances and size distributions. Ecology, 54, 659–86.
Janzen, D.H. (1973b) Sweep samples of tropical foliage insects: effects of seasons, vegetation types, elevation, time of day, and insularity. Ecology, 54, 687–708.
MacArthur, R.H. (1957) On the relative abundances of bird species. Proceedings of National Academy of Science, U.S.A., 43, 293–295.
Magurran, A.E. (1988) Ecological Diversity and Its Measurement, Princeton University Press, Princeton, New Jersey.
Mandelbrot, B. (1977) Fractals, Form, Chance and Dimension, Freeman, San Francisco.
Norris III, J.L. and Pollock, K.H. (1998) Non-parametric MLE for Poisson species abundance models allowing for heterogeneity between species. Environmental and Ecological Statistics, 5, 391–402.
Peet, R.K. (1974) The measurement of species diversity. Annual Review of Ecology and Systematics, 5, 285–307.
Pielou, E.C. (1975) Ecological Diversity, Wiley, New York.
Smith, W. and Grassle, J.F. (1977) Sampling properties of a family of diversity measures. Biometrics, 33, 283–92.
Solow, A.R. (1993) A simple test for change in community structure. Journal of Animal Ecology, 62, 191–3.
Thompson, S.K. (1992) Sampling, Wiley, New York.
Zahl, S. (1977) Jackknifing an index of diversity. Ecology, 58, 907–13.
Zipf, G.K. (1965) Human Behavior and Principle of Least Effort, Addison-Wesley, New York.

Author information

Authors and Affiliations

Institute of Statistics, National Tsing Hua University, Hsin-Chu, TAIWAN, 30043
Anne Chao & Tsung-Jen Shen

About this article

Cite this article

Chao, A., Shen, TJ. Nonparametric estimation of Shannon’s index of diversity when there are unseen species in sample. Environmental and Ecological Statistics 10, 429–443 (2003). https://doi.org/10.1023/A:1026096204727

Issue Date: December 2003
DOI: https://doi.org/10.1023/A:1026096204727
