TY - JOUR
T1 - Performance of Prediction Algorithms for Modeling Outdoor Air Pollution Spatial Surfaces
AU - Kerckhoffs, Jules
AU - Hoek, Gerard
AU - Portengen, Lützen
AU - Brunekreef, Bert
AU - Vermeulen, Roel C.H.
PY - 2019/2/5
Y1 - 2019/2/5
N2 - Land use regression (LUR) models for air pollutants are often developed using multiple linear regression techniques. However, in the past decade linear (stepwise) regression methods have been criticized for their lack of flexibility, their ignorance of potential interaction between predictors, and their limited ability to incorporate highly correlated predictors. We used two training sets of ultrafine particles (UFP) data (mobile measurements (8200 segments, 25 s monitoring per segment), and short-term stationary measurements (368 sites, 3 × 30 min per site)) to evaluate different modeling approaches to estimate long-term UFP concentrations by estimating precision and bias based on an independent external data set (42 sites, average of three 24-h measurements). Higher training data R² did not equate to higher test R² for the external long-term average exposure estimates, making the argument that external validation data are critical to compare model performance. Machine learning algorithms trained on mobi...
AB - Land use regression (LUR) models for air pollutants are often developed using multiple linear regression techniques. However, in the past decade linear (stepwise) regression methods have been criticized for their lack of flexibility, their ignorance of potential interaction between predictors, and their limited ability to incorporate highly correlated predictors. We used two training sets of ultrafine particles (UFP) data (mobile measurements (8200 segments, 25 s monitoring per segment), and short-term stationary measurements (368 sites, 3 × 30 min per site)) to evaluate different modeling approaches to estimate long-term UFP concentrations by estimating precision and bias based on an independent external data set (42 sites, average of three 24-h measurements). Higher training data R² did not equate to higher test R² for the external long-term average exposure estimates, making the argument that external validation data are critical to compare model performance. Machine learning algorithms trained on mobi...
U2 - 10.1021/acs.est.8b06038
DO - 10.1021/acs.est.8b06038
M3 - Article
SN - 0013-936X
VL - 53
SP - 1413
EP - 1421
JO - Environmental Science and Technology
JF - Environmental Science and Technology
IS - 3
ER -