When no gold standard is available to evaluate a diagnostic or screening test, as is often the case, an imperfect reference standard test must be used instead. Furthermore, the errors of the test and its reference standard may not be independent. Several authors have argued that positively dependent errors will lead to overestimation of test performance. Although positive dependence does increase agreement between the test and the reference standard, it is not clear whether test accuracy will necessarily be overestimated in this situation, and the case of negatively associated test errors is even less clear. To examine this issue in more detail, we derive the apparent sensitivity, specificity, and overall accuracy of a test relative to an imperfect reference standard, together with the bias in these parameters. We demonstrate that either positive or negative bias can occur when the reference standard is imperfect. The type and magnitude of bias depend on several components: the disease prevalence, the true test sensitivity and specificity, the covariance between the two tests' false‐negative errors among the true disease cases, and the covariance between their false‐positive errors among the true noncases. If, for example, sensitivity and specificity are 0.8 for both the test and the reference standard and the errors have a moderate positive dependence, test sensitivity is underestimated at low prevalence but overestimated at high prevalence, while the opposite occurs for specificity. We illustrate these ideas through general numerical calculations and an empirical example of screening for breast cancer with magnetic resonance imaging and mammography. Copyright © 2012 John Wiley & Sons, Ltd.
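The prevalence-dependent bias described above can be checked with a short numerical sketch. The function below applies the standard conditional-dependence decomposition of the joint cell probabilities (e.g., joint positives among cases = Se_t·Se_r + covD); the function name and the covariance value of 0.05 are illustrative assumptions, not taken from the paper.

```python
def apparent_accuracy(p, se_t, sp_t, se_r, sp_r, cov_d=0.0, cov_nd=0.0):
    """Apparent sensitivity/specificity of test T relative to reference R.

    p           : true disease prevalence
    se_*, sp_*  : true sensitivity/specificity of the test (t) and reference (r)
    cov_d       : covariance of the two tests' errors among true cases
    cov_nd      : covariance of the two tests' errors among true noncases
    """
    # Joint probabilities of agreement, mixing true cases and noncases
    p_both_pos = p * (se_t * se_r + cov_d) \
        + (1 - p) * ((1 - sp_t) * (1 - sp_r) + cov_nd)
    p_both_neg = p * ((1 - se_t) * (1 - se_r) + cov_d) \
        + (1 - p) * (sp_t * sp_r + cov_nd)
    p_ref_pos = p * se_r + (1 - p) * (1 - sp_r)  # P(reference positive)
    apparent_se = p_both_pos / p_ref_pos          # P(T+ | R+)
    apparent_sp = p_both_neg / (1 - p_ref_pos)    # P(T- | R-)
    return apparent_se, apparent_sp

# Example from the abstract: Se = Sp = 0.8 for both tests, with a
# moderate positive error dependence (cov = 0.05, an assumed value).
for prev in (0.1, 0.9):
    se, sp = apparent_accuracy(prev, 0.8, 0.8, 0.8, 0.8,
                               cov_d=0.05, cov_nd=0.05)
    print(f"prevalence={prev}: apparent Se={se:.3f}, apparent Sp={sp:.3f}")
```

With these assumed values, apparent sensitivity falls below the true 0.8 at prevalence 0.1 but rises above it at prevalence 0.9, with the mirror-image pattern for specificity, matching the direction of bias stated in the abstract.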