TY - JOUR
T1 - Applications of geographically weighted machine learning models for predicting soil heavy metal concentrations across mining sites
AU - Jeong, Hyemin
AU - Lee, Younghun
AU - Lee, Byeongwon
AU - Jung, Euisoo
AU - Lee, Jai Young
AU - Lee, Sangchul
N1 - Publisher Copyright: © 2024 Elsevier B.V.
PY - 2024/12/20
Y1 - 2024/12/20
N2 - Accurate prediction of soil heavy metal contamination is crucial for the effective environmental management of abandoned mining areas. However, conventional machine learning models (CMLMs) often fail to account for the spatial heterogeneity of soil contamination, which limits their predictive accuracy. This study evaluated the performance of geographically weighted machine learning models (GWMLMs) in predicting soil Cd and Pb concentrations at abandoned mines in the Republic of Korea. We compared two GWMLMs (Geographically Weighted Random Forest and Geographically Weighted Extreme Gradient Boosting) with four CMLMs (Random Forest, Gradient Boosting, Light Gradient Boosting, and Extreme Gradient Boosting). The data comprised soil samples from six abandoned mining sites, together with various geographical and soil input variables. The GWMLMs consistently outperformed the CMLMs in predicting heavy metal contamination. For Cd predictions, the GWMLMs showed root mean square error and mean absolute error values that were on average 0.02 lower, and R² values that were 0.26 higher, than those of the CMLMs. Similarly, for Pb predictions, the GWMLMs showed root mean square error and mean absolute error values that were 0.18 and 0.13 lower, respectively, and R² values that were 0.17 higher than those of the CMLMs. These findings demonstrate the usefulness of GWMLMs for predicting the spatial distribution of soil heavy metals. SHapley Additive exPlanations analysis identified elevation and distance from abandoned mining sites as the most influential factors in predicting both Cd and Pb concentrations. This study highlights the value of GWMLMs, which incorporate spatial heterogeneity into CMLMs, for enhancing prediction accuracy and providing crucial insights for environmental management in mining-impacted regions.
KW - Conventional machine learning model (CMLM)
KW - Geographically weighted machine learning model (GWMLM)
KW - Soil heavy metal
KW - Spatial heterogeneity
UR - http://www.scopus.com/inward/record.url?scp=85210064276&partnerID=8YFLogxK
U2 - 10.1016/j.scitotenv.2024.177667
DO - 10.1016/j.scitotenv.2024.177667
M3 - Article
C2 - 39579881
AN - SCOPUS:85210064276
SN - 0048-9697
VL - 957
JO - Science of the Total Environment
JF - Science of the Total Environment
M1 - 177667
ER -