TY - JOUR
T1 - Advancing Air Pollution Exposure Models with Open-Vocabulary Object Detection and Semantic Segmentation of Street-View Images
AU - Yuan, Zhendong
AU - Kerckhoffs, Jules
AU - Lin, Pi-I Debby
AU - Suel, Esra
AU - Li, Hao
AU - Yi, Li
AU - Jimenez, Marcia Pescador
AU - James, Peter
AU - de Hoogh, Kees
AU - Hoek, Gerard
AU - Vermeulen, Roel
PY - 2025/10/7
Y1 - 2025/10/7
N2 - Mobile monitoring campaigns combined with land use regression (LUR) models effectively capture fine-scale spatial variations in urban air pollution. However, traditional predictor variables often fail to capture the nuances of the built environment and undocumented emission sources. To address this, we developed a framework integrating customizable object-level and segmentation-level visual features from street-view images into stepwise regression and random-forest-based LUR models. Using 5.7 million mobile air pollution measurements (2019-2020) and 0.37 million street-view images (2008-2024), we mapped nitrogen dioxide (NO₂), black carbon (BC), and ultrafine particles (UFP) across 46,664 road segments in Amsterdam, The Netherlands. Incorporating street-view images improved model performance, increasing R² by 0.01-0.05 and reducing mean absolute errors by 0.7-10.3%. Sensitivity analyses indicated that key street-view-derived visual features remained stable across years and seasons. Using images from nearby years expanded training instances, thereby enhancing alignment with mobile measurements at fine granularity. Our open-vocabulary object detection module identified influential but previously unrecognized object predictors, such as chimneys, traffic lights, and shops. Combined with segmentation-derived features (e.g., walls, roads, grass), street-view images contributed 8-18% feature importance to model predictions. These findings highlight the potential of visual data in enhancing hyperlocal air pollution mapping and exposure assessment.
KW - air pollution
KW - deep learning
KW - exposure assessment
KW - land use regression (LUR)
KW - mobile sensing
KW - street-view image
KW - vision-language model (VLM)
KW - vision-transformer models (ViT)
UR - https://www.scopus.com/pages/publications/105017982649
U2 - 10.1021/acs.est.5c09687
DO - 10.1021/acs.est.5c09687
M3 - Article
C2 - 41014621
AN - SCOPUS:105017982649
SN - 0013-936X
VL - 59
SP - 21237
EP - 21247
JO - Environmental Science & Technology
JF - Environmental Science & Technology
IS - 39
ER -