Document Type: Original Article


1 Department of Industrial Engineering, Science and Research Branch, Islamic Azad University, Tehran, Iran.

2 Faculty of Technical Sciences, University of Novi Sad, Trg Dositeja Obradovica-6, 21000 Novi Sad, Serbia.


Research on Machine Learning (ML) for predicting water pump failures has grown rapidly in recent years. Some studies have used supervised approaches, while others have used unsupervised methods. A remaining challenge is identifying the variables that matter most for accurate failure prediction. This study addresses that gap by consulting domain experts to identify essential variables, including water catchment area level, water quality index, lubrication frequency, water reservoir temperature, operating time, and number of power interruptions. Using two supervised ML methods, multiple regression and the Classification and Regression Tree (CART) decision tree, the research aims to improve the precision of failure predictions and to highlight less-explored variables that play a significant role in pump failure.
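The two methods named in the abstract can be illustrated with a minimal sketch. It is not the study's implementation: the data are synthetic, the feature names are hypothetical labels for the six expert-selected variables, and the target relationship is invented only so that the models have something to fit. The sketch fits a multiple linear regression by ordinary least squares and a small CART-style regression tree, then compares them by R² on held-out rows.

```python
import random

# Hypothetical names for the six variables listed in the abstract.
FEATURES = ["catchment_level", "water_quality_index", "lubrication_frequency",
            "reservoir_temperature", "operating_time", "power_interruptions"]

def make_synthetic_data(n=200, seed=0):
    """Fabricated illustration data, NOT the study's dataset."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        row = [rng.uniform(0.0, 1.0) for _ in FEATURES]
        # Assumed relationship, chosen only for demonstration.
        y.append(2.0 * row[4] + 1.5 * row[5] - 1.0 * row[2] + rng.gauss(0.0, 0.05))
        X.append(row)
    return X, y

def fit_linear(X, y):
    """Multiple regression via the normal equations (intercept first)."""
    A = [[1.0] + row for row in X]
    k = len(A[0])
    # Augmented matrix [A^T A | A^T y], solved by Gauss-Jordan elimination.
    M = [[sum(a[i] * a[j] for a in A) for j in range(k)]
         + [sum(a[i] * t for a, t in zip(A, y))] for i in range(k)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(k):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [v - f * p for v, p in zip(M[r], M[col])]
    return [M[i][k] / M[i][i] for i in range(k)]

def predict_linear(coef, row):
    return coef[0] + sum(c * v for c, v in zip(coef[1:], row))

def sse(values):
    """Sum of squared deviations from the mean (0 for an empty list)."""
    if not values:
        return 0.0
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values)

def fit_tree(X, y, depth=0, max_depth=3, min_leaf=5):
    """CART-style regression tree: greedy binary splits minimising squared error."""
    best = None
    if depth < max_depth and len(y) >= 2 * min_leaf:
        for j in range(len(X[0])):
            for thr in {row[j] for row in X}:
                left = [t for row, t in zip(X, y) if row[j] <= thr]
                right = [t for row, t in zip(X, y) if row[j] > thr]
                if len(left) < min_leaf or len(right) < min_leaf:
                    continue
                cost = sse(left) + sse(right)
                if best is None or cost < best[0]:
                    best = (cost, j, thr)
    if best is None:
        return sum(y) / len(y)  # leaf: predict the mean target
    _, j, thr = best
    lo = [(row, t) for row, t in zip(X, y) if row[j] <= thr]
    hi = [(row, t) for row, t in zip(X, y) if row[j] > thr]
    return (j, thr,
            fit_tree([r for r, _ in lo], [t for _, t in lo], depth + 1, max_depth, min_leaf),
            fit_tree([r for r, _ in hi], [t for _, t in hi], depth + 1, max_depth, min_leaf))

def predict_tree(node, row):
    # Internal nodes are tuples (feature index, threshold, left, right).
    while isinstance(node, tuple):
        j, thr, left, right = node
        node = left if row[j] <= thr else right
    return node

def r2(actual, predicted):
    mean = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot

X, y = make_synthetic_data()
X_train, y_train, X_test, y_test = X[:150], y[:150], X[150:], y[150:]

coef = fit_linear(X_train, y_train)
tree = fit_tree(X_train, y_train)

r2_linear = r2(y_test, [predict_linear(coef, row) for row in X_test])
r2_tree = r2(y_test, [predict_tree(tree, row) for row in X_test])
print(f"multiple regression R^2: {r2_linear:.3f}")
print(f"CART regression tree R^2: {r2_tree:.3f}")
```

In the fitted tree, the features chosen for the top splits give a rough indication of which variables drive the predictions, which is the kind of variable-importance reading the abstract points to. On real pump data the relative performance of the two models could differ from this synthetic case.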

