Published: 08 May 2019 | Last Updated: 08 May 2019 12:18:13

New research, based on the VetCompass™ programme at the Royal Veterinary College (RVC), uses machine learning to detect false-positive disease references in veterinary clinical notes.

Clinicians often refer to a disease in their clinical notes even though the patient doesn’t have that disease. For some diseases, the majority of references appear in the notes of patients who don’t suffer from the disease at all. These references occur because clinicians use their notes to speculate about possible diseases (e.g. in a differential diagnosis) or to record that a disease has been ruled out. This causes a problem when epidemiologists try to find patients who have a disease, because a simple keyword search will also match many patients who don’t have it.
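The keyword-matching problem described above can be illustrated with a minimal Python sketch. The notes, patient identifiers and the `keyword_search` helper are invented for illustration and are not taken from VetCompass™ data or the paper.

```python
import re

# Hypothetical clinical notes, invented for illustration only.
notes = {
    "p1": "Diagnosed with diabetes mellitus, started insulin.",
    "p2": "PU/PD. DDx: diabetes, renal disease. Bloods pending.",
    "p3": "Glucose normal, diabetes ruled out.",
    "p4": "Vaccination booster, no concerns.",
}

def keyword_search(notes, keyword):
    """Return the ids of all patients whose notes mention the keyword."""
    pattern = re.compile(re.escape(keyword), re.IGNORECASE)
    return sorted(pid for pid, text in notes.items() if pattern.search(text))

matches = keyword_search(notes, "diabetes")
print(matches)
```

In this toy example the search returns three patients, but only p1 actually has the disease: p2 is a differential diagnosis and p3 records that the disease was ruled out.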

This problem has been studied for decades in clinical text. This new research, based on VetCompass™ data, shows how to create very large training sets for machine-learning algorithms without the need for a person to read each disease reference and decide whether it is a true disease reference or not. That kind of manual annotation can’t scale to corpora the size of VetCompass™. Our method produced a training dataset around 100 times larger than the largest previous dataset. We also achieved state-of-the-art classification performance with a bidirectional long short-term memory model trained to distinguish disease references in the notes of patients with the disease diagnosis from those in the notes of patients without it.
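One plausible reading of the annotation-free approach is sketched below in Python: every disease reference inherits its label from whether the patient has a recorded diagnosis, so no one has to read the individual references. This is an assumption made for illustration, not the paper's actual pipeline; the notes, the `diagnosed_ids` set, the window size and both helper functions are hypothetical.

```python
import re

# Hypothetical notes and diagnosis records, invented for illustration.
notes = {
    "p1": "Diagnosed with diabetes mellitus, started insulin.",
    "p2": "PU/PD. DDx: diabetes, renal disease. Bloods pending.",
    "p3": "Glucose normal, diabetes ruled out.",
}
diagnosed_ids = {"p1"}  # assumed to come from structured diagnosis data

def extract_references(text, keyword, window=40):
    """Return the text surrounding each mention of the keyword."""
    snippets = []
    for m in re.finditer(re.escape(keyword), text, flags=re.IGNORECASE):
        start = max(0, m.start() - window)
        end = min(len(text), m.end() + window)
        snippets.append(text[start:end])
    return snippets

def build_training_set(notes, diagnosed_ids, keyword):
    """Label each reference 1 if the patient is diagnosed, else 0."""
    examples = []
    for pid, text in notes.items():
        label = 1 if pid in diagnosed_ids else 0
        for snippet in extract_references(text, keyword):
            examples.append((snippet, label))
    return examples

training_set = build_training_set(notes, diagnosed_ids, "diabetes")
labels = [label for _, label in training_set]
```

The resulting (snippet, label) pairs could then be fed to a sequence classifier such as the bidirectional LSTM mentioned above. Because the labels come from existing records rather than human readers, the size of the training set is limited only by the size of the corpus.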

RVC machine learning researcher and VetCompass technical lead Noel Kennedy said:

“The VetCompass corpus is a great resource for computer scientists to work with.  We are just getting started with the possible applications for machine learning on such an extensive dataset.”

The full paper is freely available online: Kennedy, N., Brodbelt, D.C., Church, D.B. and O'Neill, D.G. (2019) “Detecting false-positive disease references in veterinary clinical notes without manual annotations” Nature Digital Medicine Available:

