Evaluate effect of 126 pre-processing methods on various artificial intelligence models accuracy versus normal mode to predict groundwater level (case study: Hamedan-Bahar Plain, Iran)


Saroughi M., Mirzania E., Achite M., KATİPOĞLU O. M., Al-Ansari N., Vishwakarma D. K., ...

Heliyon, vol.10, no.7, 2024 (SCI-Expanded)

  • Publication Type: Article / Full Article
  • Volume: 10 Issue: 7
  • Publication Date: 2024
  • DOI: 10.1016/j.heliyon.2024.e29006
  • Journal Name: Heliyon
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, CAB Abstracts, Food Science & Technology Abstracts, Veterinary Science Database, Directory of Open Access Journals
  • Keywords: Artificial intelligence, Deep learning, Groundwater level, Hybrid algorithm, Machine learning
  • Affiliated with Erzincan Binali Yıldırım University: Yes

Abstract

Estimating groundwater levels is an important step in ensuring the sustainable management of water resources. This paper uses data from selected piezometers of the Hamedan-Bahar plain in western Iran. The main objective of the study is to compare the effect of various pre-processing methods applied to the input data of different artificial intelligence (AI) models for predicting groundwater levels (GWLs). The observed GWL, evaporation, precipitation, and temperature were used as input variables to the AI algorithms. First, 126 data pre-processing methods were implemented in Python and grouped into three classes: (1) statistical methods, (2) wavelet transform methods, and (3) decomposition methods. The pre-processed data were then fed to four widely used AI models with different kernels: Support Vector Regression (SVR), Artificial Neural Network (ANN), Long Short-Term Memory (LSTM), and the Pelican Optimization Algorithm combined with an Artificial Neural Network (POA-ANN). These models fall into three classes: (1) machine learning (ML: SVR and ANN), (2) deep learning (DL: LSTM), and (3) hybrid-ML (POA-ANN). The Akaike Information Criterion (AIC) was used to evaluate and validate the predictive accuracy of the algorithms. Based on the summed AIC values (training and test phases) of 1778 models, the average AIC values for the ML, DL, and hybrid-ML classes changed by −25.3%, −29.6%, and −57.8%, respectively. The results therefore show that not all data pre-processing methods improve prediction accuracy, and that they must be selected carefully by trial and error. In conclusion, the wavelet-ANN model with the Daubechies 13 wavelet and 25 neurons (db13_ANN_25) is the best model for predicting GWL: its AIC of −204.9 is 5.23% better than the AIC of −194.7 obtained without any pre-processing (ANN_Relu_25).
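To make the AIC-based comparison concrete, here is a minimal sketch in Python. The abstract does not state which form of AIC the authors used, so the common least-squares form, AIC = n ln(RSS/n) + 2k, is assumed here. The function names, the sample GWL values, and the parameter count are hypothetical and serve only as illustration.

```python
import math


def aic_least_squares(observed, predicted, n_params):
    """AIC for a least-squares fit, assumed form: n * ln(RSS / n) + 2 * k."""
    n = len(observed)
    rss = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return n * math.log(rss / n) + 2 * n_params


def relative_change(new, baseline):
    """Signed change of `new` relative to the magnitude of `baseline`."""
    return (new - baseline) / abs(baseline)


# Hypothetical GWL observations (m) and two hypothetical sets of predictions.
observed = [10.0, 10.2, 10.1, 9.9, 9.8, 10.0]
baseline_pred = [10.1, 10.0, 10.2, 10.0, 9.7, 10.1]        # no pre-processing
preproc_pred = [10.02, 10.18, 10.12, 9.91, 9.79, 10.01]    # with pre-processing

# Same (hypothetical) parameter count for both models.
aic_baseline = aic_least_squares(observed, baseline_pred, n_params=3)
aic_preproc = aic_least_squares(observed, preproc_pred, n_params=3)

# A lower (more negative) AIC indicates the preferred model.
better = "pre-processed" if aic_preproc < aic_baseline else "baseline"

# Relative change between the two AIC values reported in the abstract.
reported_change = relative_change(-204.9, -194.7)

print(f"baseline AIC: {aic_baseline:.2f}, pre-processed AIC: {aic_preproc:.2f}")
print(f"preferred model: {better}")
print(f"reported change: {reported_change:.2%}")
```

Under this convention, a more negative AIC indicates the better model, which is why db13_ANN_25 at −204.9 is preferred over ANN_Relu_25 at −194.7. Recomputing the relative change from the two rounded values gives roughly 5.2%, in line with the 5.23% reported.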