Generalizability evaluations of heterogeneous ensembles for river health predictions

Taeseung Park, Jihoon Shin, Baekyung Park, Jeongsuk Moon, Yoon Kyung Cha

Research output: Contribution to journal › Article › peer-review

1 Scopus citation

Abstract

Predictive models leverage the relationships between environmental factors and river health to predict river health at unmonitored sites. Such models should be generalizable to unseen data. Among various machine learning models, heterogeneous ensembles are known to be generalizable owing to their structural diversity. The present study compares the generalizability of heterogeneous ensembles with that of homogeneous ensembles and single models. The models classified five grades (very good to very poor) of river health indices (RHIs) for three taxa (benthic macroinvertebrates, fish, and diatoms) given various environmental factors (water quality, hydrology, meteorology, land cover, and stream properties) as inputs. The data were collected at 2915 sites in the four major river watersheds in South Korea during the 2016–2021 period. The results indicated that the heterogeneous and homogeneous ensembles generalized better than single models. Moreover, heterogeneous ensembles tended to show higher generalizability than homogeneous ensembles, although the differences were marginal. Weighted soft voting was the most generalizable of the heterogeneous ensembles, with losses ranging from 0.49 to 0.59 across the three taxa. Weighted soft voting also delivered acceptable classification performance on the test set, with accuracies ranging from 0.42 to 0.52 across the taxa. The relative contributions of the environmental factors to RHI predictions and the directions of their effects agreed with established knowledge, confirming the reliability of the predictions. However, as heterogeneous ensembles have rarely been applied to RHI prediction, the extent to which they improve the generalizability of predictions must be investigated in future studies.
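As a rough illustration of the weighted soft voting named in the abstract, the sketch below averages the class-probability vectors of several base models using per-model weights and picks the grade with the highest combined probability. It is not the authors' implementation: the function names, the three example probability vectors, and the weights are all hypothetical, and the abstract does not say how the weights were chosen. Grades are encoded here as indices 0 (very good) to 4 (very poor), which is also an assumption.

```python
from typing import List, Sequence


def weighted_soft_vote(probas: Sequence[Sequence[float]],
                       weights: Sequence[float]) -> List[float]:
    """Return the weighted average of per-model class-probability vectors."""
    if not probas or len(probas) != len(weights):
        raise ValueError("need one weight per model and at least one model")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    n_classes = len(probas[0])
    combined = [0.0] * n_classes
    for p, w in zip(probas, weights):
        if len(p) != n_classes:
            raise ValueError("all models must output the same number of classes")
        for k in range(n_classes):
            # Normalizing by the weight total keeps the result a probability vector.
            combined[k] += w * p[k] / total
    return combined


def predict_grade(probas: Sequence[Sequence[float]],
                  weights: Sequence[float]) -> int:
    """Return the index of the grade with the highest combined probability."""
    combined = weighted_soft_vote(probas, weights)
    return max(range(len(combined)), key=combined.__getitem__)


# Hypothetical outputs of three structurally different base models for one site,
# over five grades (index 0 = very good, index 4 = very poor).
model_a = [0.1, 0.5, 0.2, 0.1, 0.1]
model_b = [0.2, 0.2, 0.4, 0.1, 0.1]
model_c = [0.1, 0.3, 0.4, 0.1, 0.1]
site_probas = [model_a, model_b, model_c]

# Made-up weights: the first favors model_a, the second favors model_c.
combined_a = weighted_soft_vote(site_probas, [2.0, 1.0, 1.0])
grade_a = predict_grade(site_probas, [2.0, 1.0, 1.0])
grade_c = predict_grade(site_probas, [1.0, 1.0, 3.0])
```

With these made-up numbers, the predicted grade follows whichever model carries more weight, which is how weighting lets stronger base models dominate the vote while the others still contribute.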

Original language: English
Article number: 102719
Journal: Ecological Informatics
Volume: 82
State: Published - Sep 2024

Keywords

  • Bias–variance decomposition
  • Generalizability
  • Heterogeneous ensembles
  • Multimetric biological indices
  • River health
  • SHapley additive exPlanations
