Abstract
Improving prediction accuracy is a central goal in machine learning. One way to improve it is to build a separate machine learning model for each cluster of observations, where the clusters are generally derived by cluster analysis from attribute similarities alone. Spatial data, however, calls for clustering that also reflects spatial similarity, because accounting for spatial autocorrelation in a model increases prediction accuracy. This study therefore examines how spatial clustering that incorporates spatial similarity affects the prediction accuracy of machine learning. Specifically, it compares the prediction accuracy of models built on clusters that consider only attribute similarity with that of models built on clusters that consider both spatial and attribute similarity. The machine learning techniques are linear regression, random forest, and gradient boosting, and the target is average daily ridership. The 11 independent variables describe land, population, and facility characteristics. The results show that considering both spatial and attribute similarity yields significantly higher prediction accuracy for all three models. This study contributes to the literature on spatial data analysis by demonstrating that considering spatial similarity can improve prediction accuracy when machine learning is applied to spatial data.
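The comparison described above can be illustrated with a minimal sketch. The abstract does not name the clustering algorithm, so the code below assumes a simple greedy agglomerative procedure that merges only adjacent clusters (a contiguity constraint) and uses a regular grid with rook adjacency as a stand-in for real spatial units. The attribute-only baseline reuses the same routine with every pair of units treated as adjacent. The data are synthetic, only linear regression is shown, and all function names (`rook_neighbors`, `contiguous_clusters`, `per_cluster_rmse`) are hypothetical.

```python
import numpy as np

def rook_neighbors(n_rows, n_cols):
    """Adjacency sets for the cells of a regular grid (shared edges only)."""
    nbrs = {i: set() for i in range(n_rows * n_cols)}
    for r in range(n_rows):
        for c in range(n_cols):
            i = r * n_cols + c
            if c + 1 < n_cols:              # neighbor to the right
                nbrs[i].add(i + 1)
                nbrs[i + 1].add(i)
            if r + 1 < n_rows:              # neighbor below
                nbrs[i].add(i + n_cols)
                nbrs[i + n_cols].add(i)
    return nbrs

def contiguous_clusters(X, nbrs, k):
    """Greedy agglomerative clustering that merges only adjacent clusters.

    At each step the two adjacent clusters with the closest attribute
    centroids are merged, until k clusters remain.
    """
    members = {i: [i] for i in range(len(X))}
    adj = {i: set(v) for i, v in nbrs.items()}
    while len(members) > k:
        best = None
        for a in members:
            centroid_a = X[members[a]].mean(axis=0)
            for b in adj[a]:
                if a < b:                   # visit each adjacent pair once
                    d = np.sum((centroid_a - X[members[b]].mean(axis=0)) ** 2)
                    if best is None or d < best[0]:
                        best = (d, a, b)
        if best is None:                    # no adjacent pairs left to merge
            break
        _, a, b = best
        members[a].extend(members.pop(b))   # absorb cluster b into cluster a
        for c in adj.pop(b):                # rewire b's neighbors to a
            adj[c].discard(b)
            if c != a:
                adj[c].add(a)
                adj[a].add(c)
    labels = np.empty(len(X), dtype=int)
    for lab, idx in enumerate(members.values()):
        labels[idx] = lab
    return labels

def per_cluster_rmse(X, y, labels):
    """Fit one least-squares linear model per cluster; return in-sample RMSE."""
    resid = np.zeros(len(y))
    for lab in np.unique(labels):
        m = labels == lab
        A = np.column_stack([np.ones(m.sum()), X[m]])
        coef, *_ = np.linalg.lstsq(A, y[m], rcond=None)
        resid[m] = y[m] - A @ coef
    return float(np.sqrt(np.mean(resid ** 2)))

# Synthetic stand-in data: 36 grid cells, 2 attributes, 1 target.
rng = np.random.default_rng(0)
n_rows, n_cols, k = 6, 6, 3
n = n_rows * n_cols
X = rng.normal(size=(n, 2))
y = X @ np.array([1.0, -0.5]) + rng.normal(scale=0.1, size=n)

grid = rook_neighbors(n_rows, n_cols)                 # contiguity constraint
full = {i: set(range(n)) - {i} for i in range(n)}     # no spatial constraint

labels_spatial = contiguous_clusters(X, grid, k)
labels_attr = contiguous_clusters(X, full, k)
print("spatial + attribute:", per_cluster_rmse(X, y, labels_spatial))
print("attribute only:     ", per_cluster_rmse(X, y, labels_attr))
```

Because the data here are random and carry no spatial structure, the two printed errors say nothing about the study's findings. The sketch only shows where the contiguity constraint enters the pipeline. A faithful replication would use the study's 11 variables, held-out evaluation, and the random forest and gradient boosting models as well.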
Translated title of the contribution | Spatial Clustering with Contiguity Constraint for Improving Prediction Accuracy in Spatial Machine Learning |
---|---|
Original language | Korean |
Pages (from-to) | 327-338 |
Number of pages | 12 |
Journal | Journal of the Korean Society of Surveying, Geodesy, Photogrammetry and Cartography |
Volume | 42 |
Issue number | 4 |
State | Published - 2024 |
Keywords
- Clustering Analysis
- Machine Learning
- Spatial Autocorrelation
- Spatial Clustering Analysis