멀티모달 데이터를 활용한 Multi-Encoder U-Net 기반 시가화 지역의 의미론적 분할

Translated title of the contribution: Semantic Segmentation of Urbanized Areas Using Multi-Encoder U-Net Based on Multi-Modal Data
  • Sung Hyun Gong
  • Hyung Sup Jung
  • Geun Han Kim
  • Geun Hyouk Han
  • Il Hoon Choi
  • Jin Sung Hong

Research output: Contribution to journal › Article › peer-review

1 Scopus citation

Abstract

Air pollution has emerged as a critical global environmental issue, posing severe threats to human health and ecosystems worldwide. In East Asia, rapid urbanization, industrialization, and the transboundary inflow of air pollutants from neighboring countries have further degraded air quality. The Korean Peninsula, in particular, frequently experiences severe air pollution events due to the combined influence of domestically emitted pollutants and those transported from China by the prevailing westerlies. This situation calls for quantitative assessment and effective mitigation strategies. However, conventional ground-based air quality monitoring networks have limited spatial coverage and are therefore insufficient for analyzing the relationship between urban expansion and air pollution on a regional scale. Furthermore, acquiring information on large-scale emission sources, both domestic and international, remains challenging due to constraints on accessibility and data security. To address these limitations, this study combined several remote sensing datasets with artificial intelligence (AI) based semantic segmentation techniques to classify urbanized areas, which are major emission sources of air pollutants. The study focused on urban regions along China's eastern coast and in South Korea. A multi-modal AI dataset was constructed by integrating Landsat 8/9 optical satellite imagery, Geostationary Environment Monitoring Spectrometer (GEMS) data, and ground-based air quality monitoring network data. To apply deep learning to the multi-modal data, a multi-encoder U-Net model was designed, and its performance in classifying urban areas from multiple input sources was evaluated.
Experimental results showed that the proposed multi-encoder U-Net model achieved a pixel accuracy of 0.9523 and an Intersection over Union (IoU) of 0.8789, a slight improvement over the single-encoder U-Net model (pixel accuracy 0.9506, IoU 0.8746), corresponding to gains of 0.0017 and 0.0043, respectively. These findings indicate that integrating heterogeneous environmental data sources yields a modest performance gain. The outcomes of this study are expected to serve as a quantitative foundation for long-term air quality monitoring and policy development.
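
The two metrics reported above can be illustrated with a minimal sketch. The abstract does not say how the metrics were computed (for example, whether IoU was taken for the urban class only or averaged over classes), so the code below assumes a binary mask in which 1 marks an urbanized pixel and computes IoU for that class only. The function names and the toy masks are hypothetical and are not taken from the paper.

```python
def pixel_accuracy(pred, truth):
    """Fraction of pixels whose predicted label equals the reference label."""
    total = 0
    correct = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            total += 1
            correct += int(p == t)
    return correct / total


def iou(pred, truth, positive=1):
    """Intersection over Union for one class (assumed here: 1 = urbanized)."""
    intersection = 0
    union = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            p_pos = p == positive
            t_pos = t == positive
            intersection += int(p_pos and t_pos)
            union += int(p_pos or t_pos)
    # Convention chosen for this sketch: if the class is absent from both
    # masks, treat the prediction as a perfect match.
    return intersection / union if union else 1.0


# Toy 4x4 masks, invented for illustration only.
pred = [
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 0, 1],
]
truth = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 0, 0],
]

print(pixel_accuracy(pred, truth))  # 14 of 16 pixels agree
print(iou(pred, truth))             # 5 shared urban pixels out of 7 in the union
```

On these toy masks the pixel accuracy (14/16) is higher than the IoU (5/7). Pixel accuracy also counts correctly classified background pixels, whereas IoU only considers pixels that either mask labels as urban, which is one reason the two reported scores can differ in magnitude.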

Original language: Korean
Pages (from-to): 461-474
Number of pages: 14
Journal: Korean Journal of Remote Sensing
Volume: 41
Issue number: 2
State: Published - 2025

Keywords

  • Deep learning
  • Multi-modal
  • Remote sensing
  • Semantic segmentation
  • U-Net
  • Urbanized area
