Date: 2025-11-27

Since releasing the first iteration of our Open E. coli Models in June 2025, we have carried out a validation exercise to assess how the advanced and light models perform when faced with new data. For the classification model (with a threshold of 500 CFU/100 ml) trained with a random temporal split, accuracy dropped from 80.9% to 80.4% (advanced) and from 86.7% to 75.2% (light). In contrast, with a random geographical split, the performance of the light version increased from 73.8% to 79.7%, while that of the advanced version decreased from 80.4% to 78.7%. The details of the validation can be explored in the model output report on GitHub, shared together with the Open E. coli Models.
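To make the accuracy figures above more concrete, here is a minimal Python sketch of how a threshold-based classification accuracy can be computed. It is an illustration only and not the code used in the Open E. coli Models: the function names and the sample values are made up, and we assume that a sample counts as an exceedance when its concentration is strictly above 500 CFU/100 ml.

```python
# Illustrative sketch only; names and data are hypothetical.
THRESHOLD_CFU_PER_100ML = 500  # threshold quoted in the text


def to_labels(concentrations, threshold=THRESHOLD_CFU_PER_100ML):
    """Convert concentrations (CFU/100 ml) to True/False exceedance labels.

    Assumption: an exceedance means strictly above the threshold.
    """
    return [value > threshold for value in concentrations]


def accuracy(observed_labels, predicted_labels):
    """Return the fraction of samples where prediction and observation agree."""
    if not observed_labels or len(observed_labels) != len(predicted_labels):
        raise ValueError("inputs must be non-empty and of equal length")
    matches = sum(o == p for o, p in zip(observed_labels, predicted_labels))
    return matches / len(observed_labels)


# Made-up validation samples: measured concentrations and model predictions.
observed = to_labels([120, 640, 480, 900, 30])
predicted = [False, True, True, False, False]

print(f"Accuracy: {accuracy(observed, predicted):.1%}")  # Accuracy: 60.0%
```

In a validation exercise like the one described, the same calculation would be repeated on samples the model has not seen before, for example later dates for a temporal split or different sites for a geographical split.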

Ultimately, we have developed models that can support the monitoring of microbial water quality using low-cost, commonly available datasets. The models released today can enable a proactive approach to water quality management and reduce human exposure to excessive E. coli concentrations.

A collaborative and open-source approach

The overarching objective of River Deep Mountain AI is to bring key stakeholders involved in waterbody health together and to collaboratively develop open-source AI/ML models that can inform effective actions to tackle waterbody pollution.

All our models will be released open source to democratise artificial intelligence and benefit the entire water sector.

Access our Open E. coli Models via GitHub.

River Deep Mountain AI is funded by the Ofwat Innovation Fund and consists of 6 core partners: Northumbrian Water, Cognizant Ocean, Xylem Inc, Water Research Centre Limited, The Rivers Trust and ADAS. The project is further supported by 6 water companies across the United Kingdom and Ireland.