Explainable AI for aquatic environmental intelligence: a SHAP-enhanced LSTM approach using high-frequency water quality data in a river system

Sait Mutlu Karahan, Wouter Vandenbruwaene, Janelcy Alferes Castano and Jan Verwaeren

November 24, 2025

High-frequency IoT (Internet of Things) sensor data are essential for monitoring water quality, with electrical conductivity (EC) serving as a key indicator of salinity in river systems. Accurate EC forecasting is critical where water is withdrawn for supply purposes. Machine learning models, such as Long Short-Term Memory (LSTM) networks, offer valuable predictive capabilities but are often criticized for their ‘black-box’ nature. In this study, an LSTM model was employed not primarily for forecasting but as a data-driven framework to capture nonlinear and temporal dependencies among hydrometeorological variables. Model outputs were interpreted using Shapley Additive Explanations (SHAP) to provide explanatory insight into the dynamics of EC during the summer season, when salinity fluctuations are most pronounced. Rather than pursuing high predictive accuracy, the analysis focused on identifying the dominant drivers of EC peaks and on comparing spatial variability between upstream and downstream regions. Results indicate that, although the model's performance was limited by dynamics not seen during training, SHAP analysis revealed physically consistent feature influences, including those of neighbouring sensors, discharge, and temperature. The study demonstrates the usefulness of integrating high-frequency data, temporal modelling, and explainable AI to improve understanding and management of salinity-related risks.
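To make the attribution idea concrete, the following minimal sketch computes exact Shapley values for a single prediction of a toy model. It is an illustration only and not the study's implementation: the function `toy_ec_model`, its coefficients, the feature names, and all numeric values are hypothetical stand-ins for a trained LSTM and real sensor data, and features outside a coalition are simply replaced by a baseline value (one common convention, assumed here).

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one input `x` relative to `baseline`.

    Features that are absent from a coalition take their baseline value.
    The cost is exponential in the number of features, so this is only
    practical for a handful of inputs.
    """
    names = list(x)
    n = len(names)
    phi = {}
    for name in names:
        others = [m for m in names if m != name]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size: |S|! (n-|S|-1)! / n!
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for coalition in combinations(others, size):
                without = {
                    m: (x[m] if m in coalition else baseline[m]) for m in names
                }
                with_feature = dict(without)
                with_feature[name] = x[name]
                # Marginal contribution of `name` when added to this coalition
                total += weight * (predict(with_feature) - predict(without))
        phi[name] = total
    return phi


def toy_ec_model(f):
    """Hypothetical stand-in for a trained model; NOT the study's LSTM."""
    return (
        0.8 * f["upstream_ec"]
        - 0.5 * f["discharge"]
        + 0.001 * f["temperature"] * f["upstream_ec"]
    )


# Made-up reference conditions and a made-up high-EC sample
baseline = {"upstream_ec": 600.0, "discharge": 40.0, "temperature": 18.0}
sample = {"upstream_ec": 900.0, "discharge": 15.0, "temperature": 24.0}

phi = shapley_values(toy_ec_model, sample, baseline)
for name, value in phi.items():
    print(f"{name}: {value:+.2f}")

# Additivity: the attributions sum to the prediction gap versus the baseline
gap = toy_ec_model(sample) - toy_ec_model(baseline)
print(f"sum of attributions: {sum(phi.values()):.2f}, prediction gap: {gap:.2f}")
```

The final check illustrates the additive property that gives SHAP its name: the per-feature attributions sum to the difference between the prediction for the sample and the prediction for the baseline. For a sequence model with many inputs, exact enumeration is infeasible and approximate explainers are typically used instead.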