Advancing Air Pollution Exposure Models with Open-Vocabulary Object Detection and Semantic Segmentation of Street-View Images

Research output: Contribution to journal › Article › Academic › peer-review

Abstract

Mobile monitoring campaigns combined with land use regression (LUR) models effectively capture fine-scale spatial variations in urban air pollution. However, traditional predictor variables often fail to capture the nuances of the built environment and undocumented emission sources. To address this, we developed a framework integrating customizable object-level and segmentation-level visual features from street-view images into stepwise regression and random-forest-based LUR models. Using 5.7 million mobile air pollution measurements (2019-2020) and 0.37 million street-view images (2008-2024), we mapped nitrogen dioxide (NO₂), black carbon (BC), and ultrafine particles (UFP) across 46,664 road segments in Amsterdam, The Netherlands. Incorporating street-view images improved model performance, increasing R² by 0.01-0.05 and reducing mean absolute errors by 0.7-10.3%. Sensitivity analyses indicated that key street-view-derived visual features remained stable across years and seasons. Using images from nearby years expanded training instances, thereby enhancing alignment with mobile measurements at fine granularity. Our open-vocabulary object detection module identified influential but previously unrecognized object predictors, such as chimneys, traffic lights, and shops. Combined with segmentation-derived features (e.g., walls, roads, grass), street-view images contributed 8-18% feature importance to model predictions. These findings highlight the potential of visual data in enhancing hyperlocal air pollution mapping and exposure assessment.
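The abstract reports three kinds of summary numbers: a gain in R², a percentage reduction in mean absolute error (MAE), and the share of feature importance attributable to street-view predictors. The sketch below shows one plausible way such quantities could be computed. It is not the authors' code. The function names, the `sv_` prefix used to mark street-view features, and all numeric values are hypothetical and chosen only for illustration. It also assumes the MAE reduction is expressed relative to the baseline model's MAE.

```python
from statistics import mean


def r2(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    obs_mean = mean(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - obs_mean) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def mae(observed, predicted):
    """Mean absolute error between observations and predictions."""
    return mean(abs(o - p) for o, p in zip(observed, predicted))


def compare_models(observed, baseline_pred, streetview_pred):
    """Return (R² gain, % MAE reduction) of the street-view model
    over the baseline. The reduction is relative to the baseline MAE,
    which is an assumption about how the percentage is defined."""
    r2_gain = r2(observed, streetview_pred) - r2(observed, baseline_pred)
    mae_base = mae(observed, baseline_pred)
    mae_sv = mae(observed, streetview_pred)
    mae_reduction_pct = 100.0 * (mae_base - mae_sv) / mae_base
    return r2_gain, mae_reduction_pct


def group_importance_share(importances, prefix="sv_"):
    """Percentage of total feature importance carried by features
    whose (hypothetical) names start with `prefix`."""
    total = sum(importances.values())
    group = sum(v for k, v in importances.items() if k.startswith(prefix))
    return 100.0 * group / total


# Toy values, invented for illustration; not data from the study.
observed = [20.0, 25.0, 30.0, 35.0, 40.0]
baseline_pred = [22.0, 24.0, 27.0, 37.0, 38.0]
streetview_pred = [21.0, 25.0, 28.0, 36.0, 39.0]

r2_gain, mae_reduction_pct = compare_models(
    observed, baseline_pred, streetview_pred
)

# Hypothetical importances, e.g. as a random forest might report them.
importances = {
    "traffic_intensity": 0.50,
    "population_density": 0.35,
    "sv_chimney": 0.05,
    "sv_traffic_light": 0.04,
    "sv_wall": 0.06,
}
sv_share = group_importance_share(importances)
```

In a real workflow the predictions would come from fitted LUR models evaluated on held-out road segments, and the importances from the fitted random forest. The toy numbers here are deliberately exaggerated and do not resemble the reported ranges.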

Original language: English
Pages (from-to): 21237-21247
Number of pages: 11
Journal: Environmental Science & Technology
Volume: 59
Issue number: 39
Publication status: Published - 7 Oct 2025

Keywords

  • air pollution
  • deep learning
  • exposure assessment
  • land use regression (LUR)
  • mobile sensing
  • street-view image
  • vision-language model (VLM)
  • vision-transformer models (ViT)
