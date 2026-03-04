
As part of the Ofwat-funded innovation project River Deep Mountain AI (RDMAI), we are releasing our open-source anomaly detection model, built to enhance the value of continuous water quality monitoring.

In response to the critical condition of UK rivers, the government has strengthened its commitment to environmental protection and made early detection of water pollution a top priority. Strengthened regulation, including the Environment Act 2021 (UK Parliament, 2021), is driving the water sector towards robust monitoring approaches. This will transform how the sector identifies, interprets, and responds to environmental risks and incidents. Under Section 82 of the Environment Act 2021, water companies in England will be required to continuously monitor upstream and downstream of sewerage outfalls. It is estimated that 40,000 multi-parameter sondes will need to be deployed across England to meet this requirement. These sondes will record data at high frequency (every 15 minutes at high-risk times and every hour at other times), generating millions of data points annually.
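For a rough sense of scale, the short Python sketch below turns the figures quoted above (40,000 sondes, hourly and 15-minute sampling) into yearly reading counts. It assumes every sonde records for the whole year and counts one reading per timestamp, so a multi-parameter sonde would produce several values for each. The function name is illustrative only.

```python
# Back-of-envelope estimate; the sonde count and sampling intervals come from
# the article, the "every sonde runs all year" simplification is an assumption.
HOURS_PER_YEAR = 365 * 24  # ignoring leap years


def readings_per_year(interval_minutes: int) -> int:
    """Timestamps one sonde records in a year at a fixed sampling interval."""
    return HOURS_PER_YEAR * 60 // interval_minutes


hourly = readings_per_year(60)          # baseline sampling
quarter_hourly = readings_per_year(15)  # high-risk sampling

SONDES = 40_000
low = SONDES * hourly
high = SONDES * quarter_hourly
print(f"between {low:,} and {high:,} timestamps per year across all sondes")
```

The real total would sit somewhere between the two bounds, depending on how often sondes switch to the 15-minute interval.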

Traditional, low-frequency water quality monitoring has relied on time-intensive manual review. These review processes often fail to distinguish between natural variations (trends and seasonality) and pollution-driven anomalies. Moreover, the unique hydrological pattern of each river makes it difficult to identify anomalies in the complex data readings. With high-frequency Section 82 sonde data, the sheer volume produced will make manual interpretation and timely identification of anomalies impossible.

Our open-source anomaly detection model is designed to automatically distinguish pollution-driven anomalies from natural variations, and it can be applied at scale to high-frequency water quality monitoring data.

Leveraging AI for Anomaly Detection 

We’re proud to be publishing the first iteration of our AI-based Anomaly Detection model for river water quality monitoring. Built on high-frequency, multi-parameter datasets covering pH, dissolved oxygen, turbidity, ammonia, temperature and electrical conductivity, the model is designed to handle high volumes of data with minimal human intervention.

At its core, the framework uses advanced time-series decomposition techniques, blending Multiple Seasonal-Trend decomposition using Loess (MSTL), harmonic regression, and Butterworth filtering to untangle trends, seasonal cycles, and residual patterns in each parameter. This step is crucial for removing trends and seasonality, which can otherwise mask potential anomalies.
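To make the decomposition idea concrete, here is a minimal Python sketch. It is not the released RDMAI code. It assumes a single, regularly sampled hourly series with one daily cycle, and combines only two of the named techniques: harmonic regression for the seasonal component and a zero-phase low-pass Butterworth filter for the trend. MSTL is left out for brevity. The function names, the two-harmonic fit, the seven-day cut-off and the synthetic data are all illustrative assumptions.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt


def harmonic_terms(t_hours, period_hours, n_harmonics):
    """Sine/cosine regressors for one seasonal period and its harmonics."""
    columns = []
    for k in range(1, n_harmonics + 1):
        angle = 2.0 * np.pi * k * t_hours / period_hours
        columns.extend([np.sin(angle), np.cos(angle)])
    return np.column_stack(columns)


def decompose(values, t_hours, period_hours=24.0, n_harmonics=2,
              trend_cutoff_hours=7 * 24.0):
    """Split a regularly sampled series into trend, seasonal and residual."""
    # 1. Harmonic regression: least-squares fit of an intercept plus
    #    sinusoids; only the sinusoidal part is treated as seasonal.
    design = np.column_stack(
        [np.ones_like(t_hours),
         harmonic_terms(t_hours, period_hours, n_harmonics)]
    )
    coef, *_ = np.linalg.lstsq(design, values, rcond=None)
    seasonal = design[:, 1:] @ coef[1:]

    # 2. Zero-phase low-pass Butterworth filter on the deseasonalised
    #    series: variation slower than the cut-off is treated as trend.
    sample_rate = 1.0 / (t_hours[1] - t_hours[0])  # samples per hour
    sos = butter(2, 1.0 / trend_cutoff_hours, btype="low",
                 fs=sample_rate, output="sos")
    trend = sosfiltfilt(sos, values - seasonal)

    # 3. What remains is the residual handed on to anomaly detection.
    residual = values - seasonal - trend
    return trend, seasonal, residual


# Synthetic 60-day hourly series (assumed data): slow drift, daily cycle,
# measurement noise, and one injected spike standing in for an event.
rng = np.random.default_rng(0)
t = np.arange(60 * 24, dtype=float)
series = (12.0 + 0.002 * t
          + 3.0 * np.sin(2.0 * np.pi * t / 24.0)
          + rng.normal(0.0, 0.2, t.size))
series[700] += 4.0

trend, seasonal, residual = decompose(series, t)
print("largest residual at hour", int(np.argmax(np.abs(residual))))
```

Once the daily cycle and slow drift are removed, the injected spike stands out in the residual even though it is of similar size to the daily swing in the raw series.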

The residual signals are fed into an unsupervised Isolation Forest model to automatically detect anomalies across multiple parameters simultaneously. Isolation Forest is an anomaly detection algorithm that learns the underlying structure of the data without labelled examples and isolates observations that significantly deviate from normal patterns.
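As an illustration of this step, the sketch below runs scikit-learn's `IsolationForest` on synthetic residuals for three parameters. It is a sketch under stated assumptions, not the project's configuration: the parameter names, the injected events, the number of trees and the `contamination` value are all invented for the example.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(42)
parameters = ["dissolved_oxygen", "ammonia", "turbidity"]  # assumed columns

# Synthetic standardised residuals: one row per timestamp, one column per
# parameter (stand-ins for the output of the decomposition step).
residuals = rng.normal(0.0, 1.0, size=(2000, len(parameters)))

# Inject ten hypothetical events: dissolved oxygen drops while ammonia rises.
event_rows = rng.choice(residuals.shape[0], size=10, replace=False)
residuals[event_rows, 0] -= 6.0
residuals[event_rows, 1] += 6.0

# No labels are used; contamination is the assumed share of anomalous rows.
model = IsolationForest(n_estimators=200, contamination=0.02, random_state=0)
labels = model.fit_predict(residuals)    # -1 = anomaly, 1 = normal
scores = model.score_samples(residuals)  # lower = more anomalous

flagged = np.flatnonzero(labels == -1)
recovered = np.intersect1d(flagged, event_rows).size
print(f"flagged {flagged.size} rows; "
      f"{recovered} of {event_rows.size} injected events among them")
```

Because the forest sees all parameters at once, it can flag rows where the combination of values is unusual even when no single parameter is extreme.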

Once anomalous observations are identified, K-Means clustering is applied to group these anomalies into clusters of events with similar signatures. Each cluster represents a distinct pattern of abnormal behaviour, often corresponding to specific types of events in the river system.
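A minimal sketch of the clustering step with scikit-learn's `KMeans` follows. The two event signatures, the parameter names and the choice of two clusters are assumptions made for illustration. In practice the number of clusters would need to be chosen from the data.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(7)
parameters = ["dissolved_oxygen", "ammonia", "turbidity", "conductivity"]

# Hypothetical residuals of already-flagged anomalies with two signatures:
# an oxygen sag with raised ammonia, and a turbidity/conductivity pulse.
oxygen_sag = rng.normal([-5.0, 4.0, 0.0, 0.0], 0.5, size=(30, 4))
sediment_pulse = rng.normal([0.0, 0.0, 6.0, 3.0], 0.5, size=(30, 4))
anomalies = np.vstack([oxygen_sag, sediment_pulse])

# Standardise so no parameter dominates the Euclidean distances.
scaled = StandardScaler().fit_transform(anomalies)
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
cluster_ids = kmeans.fit_predict(scaled)

# Summarise each cluster by its mean residual per parameter.
for cluster in range(kmeans.n_clusters):
    centre = anomalies[cluster_ids == cluster].mean(axis=0)
    summary = ", ".join(
        f"{name}={value:+.1f}" for name, value in zip(parameters, centre)
    )
    print(f"cluster {cluster}: {summary}")
```

The per-cluster means give each group a readable signature, which is what allows a cluster to be matched against a known type of event.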

To enhance interpretability, PCA (Principal Component Analysis) biplots are used to visualise the clustered anomalies in reduced-dimensional space. These biplots not only display the separation of event clusters but also illustrate how individual water quality parameters contribute to each anomaly pattern.
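The sketch below shows one common way to compute the two ingredients of a biplot with scikit-learn: the 2D scores (one point per anomaly) and the loadings (one arrow per parameter). The data, parameter names and helper functions are hypothetical. The optional plotting helper assumes matplotlib is installed.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler


def biplot_coordinates(values):
    """Return 2D scores, per-parameter loadings and explained variance."""
    scaled = StandardScaler().fit_transform(values)
    pca = PCA(n_components=2)
    scores = pca.fit_transform(scaled)
    # Scale component directions by the component standard deviation so
    # arrow length reflects how strongly a parameter drives each axis.
    loadings = pca.components_.T * np.sqrt(pca.explained_variance_)
    return scores, loadings, pca.explained_variance_ratio_


def draw_biplot(scores, loadings, cluster_ids, names, path="biplot.png"):
    """Optional helper: scatter the scores and overlay loading arrows."""
    import matplotlib
    matplotlib.use("Agg")  # render to file, no display needed
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    ax.scatter(scores[:, 0], scores[:, 1], c=cluster_ids, s=15)
    for (x, y), name in zip(loadings, names):
        ax.arrow(0.0, 0.0, x, y, color="red", head_width=0.05)
        ax.text(x * 1.1, y * 1.1, name)
    ax.set_xlabel("PC1")
    ax.set_ylabel("PC2")
    fig.savefig(path)
    plt.close(fig)


# Hypothetical clustered anomalies (same two signatures in each group).
rng = np.random.default_rng(7)
parameters = ["dissolved_oxygen", "ammonia", "turbidity", "conductivity"]
anomalies = np.vstack([
    rng.normal([-5.0, 4.0, 0.0, 0.0], 0.5, size=(30, 4)),
    rng.normal([0.0, 0.0, 6.0, 3.0], 0.5, size=(30, 4)),
])
cluster_ids = np.repeat([0, 1], 30)

scores, loadings, variance_ratio = biplot_coordinates(anomalies)
print("explained variance ratio:", np.round(variance_ratio, 3))
# draw_biplot(scores, loadings, cluster_ids, parameters)  # writes biplot.png
```

Points that sit far out along an arrow's direction are anomalies in which that parameter deviates strongly, which is how the biplot links clusters back to individual parameters.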

Figure 1: High-level framework for transforming the residuals of water quality time series into clusters. The model first estimates the residuals by removing trends and seasonality from the time series, then projects them into a low-dimensional (2D) space using unsupervised ML. This approach enables the identification of distinct anomaly patterns, which might otherwise remain hidden, that could correspond to potential events.

 
Testing shows that the model can generate reliable clusters that correspond to a variety of events in the given water body. Visual outputs, from time series decomposition to anomaly overlays and evolving cluster maps, allow rapid insight into the dynamics of river systems.

Our open-source anomaly detection model offers several advantages over industry-standard alternatives: (1) cost savings; (2) easy customisation and improvement; and (3) ready embedding in water company data systems and platforms. In addition, it has been designed specifically to process high-frequency water quality monitoring data, whereas industry alternatives are more generic and intended for a wide range of data types.

A collaborative and open-sourced approach

The aim of River Deep Mountain AI is to bring key stakeholders involved in waterbody health together and collaboratively develop open-source AI/ML models that can inform effective actions to tackle waterbody pollution.   

All our models have been released open source to democratise the use of artificial intelligence and benefit the water sector. The first iterations of our models were released in May 2025, and the final iterations were released between November 2025 and February 2026.

Access the Open Anomaly Detection Model via GitHub

River Deep Mountain AI is funded by the Ofwat Innovation Fund and consists of 6 costed partners: Northumbrian Water, Cognizant Ocean, Xylem Inc, Water Research Centre Limited, The Rivers Trust and ADAS. The project is further supported by 6 water companies across the United Kingdom and Ireland and the Stream initiative.

 

