Random Forest Model for Optimizing Coagulant Doses in Drinking Water Treatment: Application at the Miguel de la Cuba Ibarra Plant
Author(s)
Ronny Ivan Gonzales Medina
Juan Adriel Carlos Mendoza
Eduardo José Zuñiga Goyzueta
Rosa María Morán-Silva
Date Issued
December 30, 2025
Type
Article
Volume
13
Issue
1
Start Page
17
End Page
17
Abstract
Optimizing coagulant dosages in Drinking Water Treatment Plants (DWTPs) is critical for reducing operational costs, minimizing chemical waste, mitigating environmental impacts, and ensuring consistent water quality, particularly in resource-constrained settings where conventional jar tests are labor-intensive and poorly suited to real-time demands. This study develops and validates a Random Forest (RF) machine learning model to predict optimal dosages of aluminum sulfate, polyaluminum chloride, and a polymer flocculant at the Miguel de la Cuba Ibarra DWTP in Peru, addressing the need for an efficient, real-time decision support system. Using a historical dataset of 2556 jar tests, a univariate RF model was developed to predict settled water turbidity, tailored to the plant’s typical operational range. The model demonstrated robust predictive performance, achieving a coefficient of determination (R²) of 0.92 during training and 0.76 during validation with unseen data, alongside a Root Mean Square Error (RMSE) of 0.11 NTU and a Mean Absolute Percentage Error (MAPE) of 0.11 in the training phase. Integrated into a digital platform, the model generates real-time NTU-ppm dosing curves, providing a practical and responsive tool to enhance operational efficiency for DWTP operators. This work offers a scalable, data-driven solution to improve water treatment processes in resource-limited contexts.
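As an illustration of the quantities named in the abstract, the sketch below computes R², RMSE, and MAPE from paired observed and predicted settled-water turbidities. It also shows one way a turbidity-versus-dose curve might be swept to pick a dose. It is a minimal sketch and not the authors' implementation: the function names, the `predict` callable standing in for a trained model, and the target-turbidity selection rule are all assumptions made for illustration.

```python
import math
from typing import Callable, List, Optional, Sequence, Tuple


def r2_score(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def rmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Root mean square error, in the same unit as the inputs (e.g. NTU)."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def mape(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean absolute percentage error, returned as a fraction (0.11 = 11%).

    Assumes no observed value is zero.
    """
    n = len(y_true)
    return sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / n


def dosing_curve(
    predict: Callable[[float, float], float],  # hypothetical model wrapper
    raw_turbidity_ntu: float,
    doses_ppm: Sequence[float],
) -> List[Tuple[float, float]]:
    """Pair each candidate dose (ppm) with the predicted settled turbidity (NTU)."""
    return [(dose, predict(raw_turbidity_ntu, dose)) for dose in doses_ppm]


def lowest_dose_meeting_target(
    curve: Sequence[Tuple[float, float]], target_ntu: float
) -> Optional[float]:
    """Return the smallest dose whose predicted turbidity is at or below the
    target, or None if no candidate dose reaches it (an assumed selection rule)."""
    for dose, settled in sorted(curve):
        if settled <= target_ntu:
            return dose
    return None
```

In use, `predict` would wrap whatever trained regressor is available, for example a function that calls a Random Forest model's prediction method on a single (raw turbidity, dose) pair. The metric functions can then be applied separately to training and validation data to obtain figures comparable in kind to those reported above.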
Subjects