Estimation of test error rates, disease prevalence and relative risk from misclassified data: a review

Abstract

We review methods for the analysis of categorical clinical and epidemiological data in which the observations are subject to misclassification. Under certain conditions, it is possible to estimate parameters such as sensitivity, specificity, relative risk, or predictive value, even though no definitive classification (gold standard) is available. The parameter estimates are obtained by modelling the data using maximum likelihood, with or without constraints. The models recognize that the true classification of an individual is unknown, and so are sometimes referred to as "latent class" models. The latent class approach provides a unified framework for various methods found in a dispersed literature, characterising each by the number of populations or subgroups in the data and the number of observations made on each individual; the statistical degrees of freedom are implied by the sampling design. Data sets with fewer than three replicate observations per individual necessarily require constraints for parameter estimation to be possible. Data sets with three or more replicates lead directly to estimates of the misclassification rates, subject to some simple assumptions. Some more complex problems are also discussed, including data where the response variable has more than two levels, sequential and irregular designs, and the effects of assumption violations.
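
The link between sampling design and degrees of freedom can be illustrated with a simple count. The following is a minimal sketch, not the authors' procedure: it assumes binary tests, two latent classes, conditional independence of the tests given true status, and error rates shared across populations (itself a constraint). The function name `latent_class_counts` and its arguments are hypothetical. Having at least as many degrees of freedom as parameters is a necessary condition for estimation, not a guarantee that the model is identifiable.

```python
def latent_class_counts(n_tests: int, n_populations: int = 1) -> dict:
    """Compare degrees of freedom with parameter count for a two-class model.

    Assumptions of this sketch (not taken from the paper): binary tests,
    conditional independence given true status, and sensitivity and
    specificity shared across all populations.
    """
    # Each population yields a 2^R table of test outcomes whose cell
    # probabilities sum to one, leaving 2^R - 1 free cells.
    degrees_of_freedom = n_populations * (2 ** n_tests - 1)
    # One sensitivity and one specificity per test, one prevalence per population.
    n_parameters = 2 * n_tests + n_populations
    return {
        "degrees_of_freedom": degrees_of_freedom,
        "n_parameters": n_parameters,
        "needs_constraints": n_parameters > degrees_of_freedom,
    }


if __name__ == "__main__":
    # (tests, populations) designs: one or two tests in a single population,
    # three tests in a single population, two tests in two populations.
    for tests, pops in [(1, 1), (2, 1), (3, 1), (2, 2)]:
        print(tests, pops, latent_class_counts(tests, pops))
```

Under these assumptions, one or two tests in a single population leave more parameters than degrees of freedom, while three tests give seven of each. Two tests in two populations give six of each, but only because the error rates are held equal across the populations.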

Authors

Walter SD; Irwig LM

Journal

Journal of Clinical Epidemiology, Vol. 41, No. 9, pp. 923–937

Publisher

Elsevier

Publication Date

January 1, 1988

DOI

10.1016/0895-4356(88)90110-2

ISSN

0895-4356

Associated Experts

Stephen Walter

Professor Emeritus, Faculty of Health Sciences
