Advancing River Water Quality Prediction: A Comparative Study of Anomaly Detection Techniques for Optimizing Dissolved Oxygen Level Forecasting
(1) Universitas Katolik Indonesia Atma Jaya, Indonesia
(*) Corresponding Author
Abstract
Dissolved oxygen (DO) is a critical indicator of aquatic ecosystem health, and predicting it accurately is a central challenge in river water quality monitoring. Environmental datasets often contain anomalies that skew predictive models and undermine the reliability of DO forecasts. This study applies and evaluates three anomaly detection methods (One-Class SVM, Isolation Forest, and Autoencoders) to improve predictive accuracy and address gaps in existing research methodologies. The methodology comprises data collection, preprocessing, anomaly detection, and evaluation, using a dataset of five indicators across eight monitoring stations. Data preparation ensured dataset integrity and uniformity. The methods differed in outlier detection sensitivity: One-Class SVM identified 23 outliers, Isolation Forest 38, and the Autoencoders 88. Measured by the RMSE of the resulting predictive model, Isolation Forest performed best, achieving the lowest RMSE of 0.9668, which indicates more effective anomaly mitigation and a cleaner dataset. The Autoencoders, despite flagging the most anomalies, yielded the highest RMSE, suggesting a tendency to overfit and to misclassify normal data variation as anomalous. These results show that the choice of anomaly detection method must suit the characteristics of the dataset, because it strongly influences predictive model performance. Isolation Forest's performance here points to its potential as a robust method for environmental data analysis that balances outlier detection accuracy with predictive precision.
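The comparison workflow described in the abstract (flag anomalies, remove them, then score a DO predictor by RMSE) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the study's implementation: the data are synthetic, the regressor (linear regression), the contamination and nu settings, the 70/30 split, and the helper name `rmse_after_filter` are all hypothetical choices, and the Autoencoder detector is omitted to keep the sketch short. It uses scikit-learn and NumPy.

```python
import numpy as np
from sklearn.ensemble import IsolationForest
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)

# Synthetic stand-in for the monitoring data (NOT the study's dataset):
# four predictor indicators and a DO-like target.
n = 500
X = rng.normal(size=(n, 4))
y = 8.0 + X @ np.array([0.8, -0.5, 0.3, 0.1]) + rng.normal(scale=0.3, size=n)

# Inject gross errors into 5% of the rows to mimic faulty readings.
bad = rng.choice(n, size=25, replace=False)
y[bad] += rng.normal(scale=6.0, size=25)


def rmse_after_filter(detector, X, y):
    """Drop rows the detector flags, then return (rows removed, hold-out RMSE)."""
    data = np.column_stack([X, y])
    if detector is None:
        keep = np.ones(len(y), dtype=bool)  # baseline: no filtering
    else:
        # scikit-learn detectors label inliers as 1 and outliers as -1.
        keep = detector.fit_predict(data) == 1
    X_tr, X_te, y_tr, y_te = train_test_split(
        X[keep], y[keep], test_size=0.3, random_state=0
    )
    model = LinearRegression().fit(X_tr, y_tr)
    rmse = float(np.sqrt(mean_squared_error(y_te, model.predict(X_te))))
    return int((~keep).sum()), rmse


# Hyperparameters below are illustrative assumptions, not values from the paper.
detectors = {
    "none": None,
    "isolation_forest": IsolationForest(contamination=0.05, random_state=0),
    "one_class_svm": OneClassSVM(nu=0.05, kernel="rbf", gamma="scale"),
}

results = {name: rmse_after_filter(det, X, y) for name, det in detectors.items()}
for name, (n_removed, rmse) in results.items():
    print(f"{name}: {n_removed} rows removed, RMSE = {rmse:.4f}")
```

In this sketch the detectors see the predictors and the target together, and filtering happens before the train/test split. Whether the study filtered before or after splitting is not stated in the abstract, so that ordering is also an assumption.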
DOI: https://doi.org/10.30645/kesatria.v5i1.327
DOI (PDF): https://doi.org/10.30645/kesatria.v5i1.327.g324