Diagnostic test evaluation: How do we test the tests?

We often rely on diagnostic tests to identify diseased animals. However, most diagnostic tests, including those commonly described as “Gold Standards”, are imperfect and generate both false negatives and false positives. In the rare situations where a perfect Gold Standard test exists, it is often expensive or time-consuming to use. Without such a test, we still need to be able to ‘tune’ a diagnostic test to specific situations and quantify its accuracy.


Diagnostic tests are often based on measuring a continuous quantity, for example the level of a chemical in the blood. Typically, values from diseased animals differ from those of non-diseased animals, but the two distributions overlap. A cut-off value is therefore usually defined: animals above it are considered diseased, and animals below it are classified as normal. Because of the overlap, however, some animals will be assigned to the wrong group.
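
The trade-off behind a cut-off can be illustrated with a minimal sketch. It assumes, purely for illustration, that the measurement is normally distributed in each group; the means and standard deviations below are invented and do not come from any real test.

```python
from statistics import NormalDist

# Hypothetical distributions of a blood measurement (arbitrary units).
# These parameters are illustrative assumptions, not real data.
HEALTHY = NormalDist(mu=10.0, sigma=2.0)
DISEASED = NormalDist(mu=15.0, sigma=3.0)


def sensitivity(cutoff, diseased=DISEASED):
    """Probability that a diseased animal scores above the cut-off."""
    return 1.0 - diseased.cdf(cutoff)


def specificity(cutoff, healthy=HEALTHY):
    """Probability that a non-diseased animal scores at or below the cut-off."""
    return healthy.cdf(cutoff)


# Raising the cut-off trades sensitivity for specificity.
for cutoff in (10.0, 12.0, 14.0):
    print(f"cut-off {cutoff:4.1f}: "
          f"sensitivity {sensitivity(cutoff):.3f}, "
          f"specificity {specificity(cutoff):.3f}")
```

Moving the cut-off upwards reduces false positives but increases false negatives, so no single value removes both kinds of error while the distributions overlap.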

Sometimes we need to quantify the accuracy of a test when there is no perfect “Gold Standard” way to distinguish diseased from non-diseased animals. Under certain circumstances, modern statistical methods allow us to do this.

Our Role

By comparing antibody levels from animals known to have sheep scab against those from animals without it, BioSS was able to define a cut-off value that simultaneously minimised the number of false positive and false negative results. Using simulation techniques, however, we showed that this choice was sub-optimal in several realistic situations. In particular, the optimum cut-off varies with the proportion of (truly) infected animals and with the relative cost of a false negative and a false positive result. Taking account of our uncertainty about the true prevalence in the field, we devised a testing regime that will nearly always generate at least one positive test result from a group of animals if at least one is infested with mites (i.e. it assures a very high group-level test sensitivity).
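
Why the best cut-off shifts with prevalence and cost can be seen in a rough sketch. This is not the simulation BioSS carried out: the antibody distributions, the cost values and the function names are all hypothetical, and the group-level formula assumes that test results from different animals are independent and ignores false positives from uninfested animals.

```python
from statistics import NormalDist

# Hypothetical antibody-level distributions (arbitrary units); not BioSS data.
UNINFESTED = NormalDist(mu=10.0, sigma=2.0)
INFESTED = NormalDist(mu=15.0, sigma=3.0)


def expected_cost(cutoff, prevalence, cost_fn, cost_fp):
    """Expected misclassification cost per animal tested at this cut-off."""
    false_neg_rate = INFESTED.cdf(cutoff)          # infested, at or below cut-off
    false_pos_rate = 1.0 - UNINFESTED.cdf(cutoff)  # uninfested, above cut-off
    return (prevalence * false_neg_rate * cost_fn
            + (1.0 - prevalence) * false_pos_rate * cost_fp)


def best_cutoff(prevalence, cost_fn=1.0, cost_fp=1.0):
    """Grid search over cut-offs from 5.0 to 20.0 in steps of 0.01."""
    grid = [5.0 + 0.01 * i for i in range(1501)]
    return min(grid,
               key=lambda c: expected_cost(c, prevalence, cost_fn, cost_fp))


def group_sensitivity(animal_sensitivity, n_infested_tested):
    """Chance of at least one positive among the infested animals tested,
    assuming their results are independent of one another."""
    return 1.0 - (1.0 - animal_sensitivity) ** n_infested_tested


# The cost-minimising cut-off moves as prevalence or relative cost changes.
for prevalence, cost_fn in ((0.5, 1.0), (0.05, 1.0), (0.5, 10.0)):
    cutoff = best_cutoff(prevalence, cost_fn=cost_fn)
    print(f"prevalence {prevalence:.2f}, false-negative cost {cost_fn:4.1f}: "
          f"best cut-off {cutoff:.2f}")
```

In this toy setting, a lower prevalence pushes the cost-minimising cut-off upwards, while a costlier false negative pulls it downwards. The last function shows why testing several animals raises group-level sensitivity even when the test on a single animal is far from perfect.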

Diagnostic tests exist for a virus causing transmissible lung tumours in sheep (Ovine Pulmonary Adenocarcinoma). However, none are perfect. By simultaneously conducting multiple tests on several flocks with different, but unknown, disease status, and then applying modern, computationally intensive statistical methods, we were able to estimate the accuracy of all the tests and the risk of infection on each farm. This information is now being used to inform a flock health scheme.
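
The text above does not name the statistical model used. One widely used family for this kind of problem is the latent class model, in which each animal's true status is unobserved and only the pattern of test results is seen. The sketch below computes the probability of each result pattern under the common simplifying assumption that the tests are independent given true status; the function name and all numerical values are hypothetical, and the estimation step itself (typically computationally intensive) is not shown.

```python
from itertools import product


def pattern_probabilities(prevalence, sensitivities, specificities):
    """Probability of every combination of test results (1 = positive) for one
    animal, assuming the tests are independent given true infection status."""
    probs = {}
    for pattern in product((0, 1), repeat=len(sensitivities)):
        p_if_infected = 1.0
        p_if_clear = 1.0
        for result, se, sp in zip(pattern, sensitivities, specificities):
            p_if_infected *= se if result else (1.0 - se)
            p_if_clear *= (1.0 - sp) if result else sp
        probs[pattern] = (prevalence * p_if_infected
                          + (1.0 - prevalence) * p_if_clear)
    return probs


# Two hypothetical flocks sharing the same tests but differing in prevalence.
SENSITIVITIES = [0.90, 0.80]  # illustrative values only
SPECIFICITIES = [0.95, 0.99]  # illustrative values only
for flock, prevalence in (("flock A", 0.30), ("flock B", 0.05)):
    probs = pattern_probabilities(prevalence, SENSITIVITIES, SPECIFICITIES)
    print(flock, {k: round(v, 4) for k, v in probs.items()})
```

Because flocks with different prevalences produce different mixtures of the same result patterns, comparing the observed pattern counts across flocks carries information about both the test properties and each flock's risk of infection.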

Future Developments

BioSS is drawing on its previous experience to develop new statistical techniques that simultaneously estimate the risk of disease in a population and the effect of multiple risk factors, while allowing for, and estimating, the properties of imperfect tests. This contrasts with the most commonly used methods, which are likely to give biased results.
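
A simple illustration of how an imperfect test can bias results is the gap between the proportion of animals testing positive and the true prevalence. The sketch below is not the new technique described above: it uses the standard Rogan-Gladen correction, which assumes that the sensitivity and specificity are already known, and the numbers are hypothetical.

```python
def apparent_prevalence(true_prevalence, sensitivity, specificity):
    """Expected proportion of animals testing positive with an imperfect test."""
    return (true_prevalence * sensitivity
            + (1.0 - true_prevalence) * (1.0 - specificity))


def corrected_prevalence(apparent, sensitivity, specificity):
    """Rogan-Gladen correction; requires sensitivity + specificity > 1."""
    informativeness = sensitivity + specificity - 1.0
    if informativeness <= 0.0:
        raise ValueError("test is no better than chance; cannot correct")
    estimate = (apparent + specificity - 1.0) / informativeness
    return min(1.0, max(0.0, estimate))  # keep the estimate within [0, 1]


# Hypothetical example: a rare disease and a test with modest accuracy.
true_prev = 0.05
observed = apparent_prevalence(true_prev, sensitivity=0.80, specificity=0.95)
print(f"true prevalence {true_prev:.3f}, proportion testing positive {observed:.4f}")
print(f"corrected estimate {corrected_prevalence(observed, 0.80, 0.95):.3f}")
```

When the disease is rare, false positives can make up a large share of all positive results, so treating the raw proportion of positives as the prevalence overstates the risk.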

