Irrigation water quality classification using boosting and CNN-BiGRU-Attention models: feature importance assessment using Hellinger distance


Zerouali B., Derdour A., Almalki A. S., KATİPOĞLU O. M., Arafat A. A., Santos C. A. G., ...

Hydrological Sciences Journal, 2025 (SCI-Expanded, Scopus)

  • Publication Type: Article / Full Article
  • Publication Date: 2025
  • DOI Number: 10.1080/02626667.2025.2574864
  • Journal Name: Hydrological Sciences Journal
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, IBZ Online, Compendex, Geobase, INSPEC
  • Keywords: Chi-square, feature selection, mutual information score, sensitivity analysis, strong performance, water quality management
  • Affiliated with Erzincan Binali Yıldırım University: Yes

Abstract

This study addresses a gap in irrigation water quality classification by comparing machine learning (ML) algorithms and deep learning (DL) models, with a focus on feature selection. Feature relevance was assessed using Hellinger distance, mutual information, and Chi-square analysis. Classifiers included boosting algorithms (AdaBoost, XGBoost, CatBoost, LightGBM, GradientBoosting, HistGradientBoosting), Extra Trees, and DL models (CNN-BiGRU and CNN-BiGRU-Attention). Performance was evaluated using standard metrics on 166 samples from Algeria’s Naama region. DL models achieved the highest accuracies (0.82–0.84) and AUC scores (0.98–0.981), followed by XGBoost and CatBoost. Feature analysis identified qiCl as the most impactful feature (Hellinger distance = 0.248) and qiNa as the most informative (mutual information = 1.0212). These findings underscore the strength of DL approaches and the value of integrated feature selection. The results offer practical insights for optimizing irrigation strategies, supporting risk assessment, and guiding sustainable water resource management.
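The abstract does not say how the Hellinger distance was computed for each feature. As a minimal sketch of one common approach, the Python below bins a single feature, builds one histogram per class, and averages the pairwise Hellinger distances between those histograms. The function names, the equal-width binning, the bin count, and the averaging over class pairs are all assumptions made for illustration, not the authors' procedure.

```python
import math


def hellinger(p, q):
    """Hellinger distance between two discrete distributions.

    p and q are equal-length sequences of probabilities; the result lies
    in [0, 1], with 0 for identical and 1 for non-overlapping distributions.
    """
    if len(p) != len(q):
        raise ValueError("distributions must have the same number of bins")
    total = sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q))
    return math.sqrt(total / 2.0)


def histogram(values, edges):
    """Relative frequencies of values over the bins defined by edges."""
    counts = [0] * (len(edges) - 1)
    last = len(counts) - 1
    for v in values:
        for i in range(len(counts)):
            # Bins are half-open, except the last one, which includes its right edge.
            if edges[i] <= v < edges[i + 1] or (i == last and v == edges[-1]):
                counts[i] += 1
                break
    n = sum(counts)
    return [c / n for c in counts]


def feature_hellinger(values, labels, n_bins=5):
    """Mean pairwise Hellinger distance between class-conditional histograms.

    Assumption for this sketch: equal-width bins over the observed range
    of the feature, and a plain average over all pairs of classes.
    """
    lo, hi = min(values), max(values)
    if lo == hi:
        return 0.0  # a constant feature cannot separate classes
    width = (hi - lo) / n_bins
    edges = [lo + i * width for i in range(n_bins)] + [hi]

    by_class = {}
    for v, y in zip(values, labels):
        by_class.setdefault(y, []).append(v)

    hists = [histogram(vs, edges) for vs in by_class.values()]
    pairs = [(i, j) for i in range(len(hists)) for j in range(i + 1, len(hists))]
    if not pairs:
        return 0.0  # only one class present
    return sum(hellinger(hists[i], hists[j]) for i, j in pairs) / len(pairs)
```

Under these assumptions, a feature whose class-conditional histograms barely overlap scores close to 1, and a feature distributed identically across classes scores 0. Ranking features such as qiCl and qiNa by this score would then give a Hellinger-based importance ordering.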