A new technique for evaluating land use regression models and their impact on health effect estimates

Meng Wang*, Bert Brunekreef, Ulrike Gehring, Adam Szpiro, Gerard Hoek, Rob Beelen

*Corresponding author for this work

    Research output: Contribution to journal › Article › Academic › peer-review

    Abstract

    BACKGROUND: Leave-one-out cross-validation that fails to account for variable selection does not properly reflect prediction accuracy when the number of training sites is small. The impact on health effect estimates has rarely been studied.

    METHODS: We randomly generated ten training and test sets for nitrogen dioxide and particulate matter. For each training set, we developed models and evaluated them using a cross-holdout validation approach. Cross-holdout validation develops new models for each evaluation, whereas standard leave-one-out cross-validation refits the model without variable selection. We also implemented holdout validation, which evaluates model predictions using independent test sets. We evaluated the relationship between cross-holdout validation and holdout validation R² and estimates of the association between air pollution and forced vital capacity in the Dutch birth cohort.

    RESULTS: Cross-holdout validation R² values were generally identical to holdout validation R² values, but were notably smaller than leave-one-out cross-validation R² values. Decreases in forced vital capacity in relation to air pollution exposure were larger for land-use regression models that had larger holdout validation and cross-holdout validation R² values, rather than larger leave-one-out cross-validation R² values.

    CONCLUSIONS: Cross-holdout validation accurately reflects the predictive ability of land-use regression models and is a useful validation approach for small datasets. Land-use regression predictive ability in terms of holdout validation and cross-holdout validation, rather than leave-one-out cross-validation, was associated with the magnitude of health effect estimates in a case study.
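
The contrast drawn in METHODS, between repeating variable selection for each evaluation and only refitting an already selected model, can be illustrated with a small simulation. The sketch below is not the authors' procedure: the one-predictor selection rule, the leave-one-out fold structure used for both variants, the R² definition (1 - SSE/SST), the synthetic data, and all function names (`select_predictor`, `validation_r2`, `simulate`) are assumptions made for illustration only.

```python
import random

def mean(values):
    return sum(values) / len(values)

def fit_line(x, y):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx if sxx > 0 else 0.0
    return my - b * mx, b

def select_predictor(X, y, rows):
    """Toy variable selection (an assumption): pick the candidate column
    with the largest squared correlation with y on the given rows."""
    ys = [y[i] for i in rows]
    my = mean(ys)
    syy = sum((v - my) ** 2 for v in ys)
    best_j, best_score = 0, -1.0
    for j in range(len(X[0])):
        xs = [X[i][j] for i in rows]
        mx = mean(xs)
        sxx = sum((v - mx) ** 2 for v in xs)
        sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
        score = sxy * sxy / (sxx * syy) if sxx > 0 and syy > 0 else 0.0
        if score > best_score:
            best_j, best_score = j, score
    return best_j

def validation_r2(X, y, reselect):
    """Leave-one-out validation R^2, computed as 1 - SSE/SST.

    reselect=False: the predictor is chosen once using all sites and only
    the coefficients are refitted in each fold (standard LOOCV).
    reselect=True: the predictor is chosen again within each fold, so a
    new model is developed for every evaluation (cross-holdout style).
    """
    all_rows = list(range(len(y)))
    fixed_j = select_predictor(X, y, all_rows)
    predictions = []
    for held_out in all_rows:
        train = [i for i in all_rows if i != held_out]
        j = select_predictor(X, y, train) if reselect else fixed_j
        a, b = fit_line([X[i][j] for i in train], [y[i] for i in train])
        predictions.append(a + b * X[held_out][j])
    my = mean(y)
    sse = sum((yi - pi) ** 2 for yi, pi in zip(y, predictions))
    sst = sum((yi - my) ** 2 for yi in y)
    return 1.0 - sse / sst

def simulate(n_sites=20, n_candidates=30, seed=0):
    """Synthetic data (an assumption): many noise predictors, few sites,
    and an outcome weakly related to the first predictor only."""
    rng = random.Random(seed)
    X = [[rng.gauss(0, 1) for _ in range(n_candidates)] for _ in range(n_sites)]
    y = [0.5 * row[0] + rng.gauss(0, 1) for row in X]
    return X, y

if __name__ == "__main__":
    X, y = simulate()
    print("LOOCV R2 (selection fixed):", validation_r2(X, y, reselect=False))
    print("Cross-holdout R2 (selection repeated):", validation_r2(X, y, reselect=True))
```

With few sites and many candidate predictors, the fixed-selection variant would typically be expected to report the more optimistic value, because the held-out site has already influenced which predictor was chosen. When there is only one candidate predictor, selection is trivial and the two variants coincide.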

    Original language: English
    Pages (from-to): 51-56
    Number of pages: 6
    Journal: Epidemiology
    Volume: 27
    Issue number: 1
    Publication status: Published - Jan 2016
